[BUG] Reinforcement PPO and pendulum failing against 2.6 binaries #3195

Closed
@svekars

Add Link

https://pytorch.org/tutorials/intermediate/reinforcement_ppo.html
https://pytorch.org/tutorials/advanced/pendulum.html

Describe the bug

Error for pendulum.py

Unexpected failing examples:
/var/lib/workspace/advanced_source/pendulum.py failed leaving traceback:
Traceback (most recent call last):
  File "/var/lib/workspace/advanced_source/pendulum.py", line 606, in <module>
    UnsqueezeTransform(
TypeError: UnsqueezeTransform.__init__() got an unexpected keyword argument 'unsqueeze_dim'

build log
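
The `TypeError` above says the installed `UnsqueezeTransform` no longer accepts `unsqueeze_dim`. Assuming the keyword was renamed to `dim` (an assumption, not confirmed in this issue; check it against the installed TorchRL release), the direct fix is to update the call in `pendulum.py`. If the tutorial has to run against both old and new releases, a small helper can pick whichever keyword the constructor accepts. The sketch below uses hypothetical stand-in classes rather than the real TorchRL transform, and `pick_dim_kwarg` is an illustrative name:

```python
import inspect


def pick_dim_kwarg(transform_cls, value):
    """Return the keyword dict ('dim' or 'unsqueeze_dim') that the class accepts."""
    params = inspect.signature(transform_cls.__init__).parameters
    for name in ("dim", "unsqueeze_dim"):
        if name in params:
            return {name: value}
    raise TypeError(
        f"{transform_cls.__name__} accepts neither 'dim' nor 'unsqueeze_dim'"
    )


# Hypothetical stand-ins for two library versions (NOT the real TorchRL classes).
class OldUnsqueeze:
    def __init__(self, unsqueeze_dim, in_keys=None):
        self.dim = unsqueeze_dim
        self.in_keys = in_keys


class NewUnsqueeze:
    def __init__(self, dim, in_keys=None):
        self.dim = dim
        self.in_keys = in_keys


# The same call site works with either signature.
old = OldUnsqueeze(**pick_dim_kwarg(OldUnsqueeze, -1), in_keys=["th"])
new = NewUnsqueeze(**pick_dim_kwarg(NewUnsqueeze, -1), in_keys=["th"])
```

For a tutorial pinned to a single release, renaming the keyword at the call site is simpler than this kind of shim.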

Error for reinforcement_ppo.py:

Unexpected failing examples:
/var/lib/workspace/intermediate_source/reinforcement_ppo.py failed leaving traceback:
Traceback (most recent call last):
  File "/var/lib/workspace/intermediate_source/reinforcement_ppo.py", line 644, in <module>
    eval_rollout = env.rollout(1000, policy_module)
  File "/usr/local/lib/python3.10/dist-packages/torchrl/envs/common.py", line 2635, in rollout
    tensordicts = self._rollout_stop_early(
  File "/usr/local/lib/python3.10/dist-packages/torchrl/envs/common.py", line 2722, in _rollout_stop_early
    tensordict = policy(tensordict)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensordict/nn/common.py", line 314, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensordict/nn/utils.py", line 359, in wrapper
    result = func(_self, tensordict, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensordict/nn/probabilistic.py", line 622, in forward
    return self.module[-1](tensordict_out, _requires_sample=self._requires_sample)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensordict/nn/common.py", line 314, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensordict/nn/utils.py", line 359, in wrapper
    result = func(_self, tensordict, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensordict/nn/probabilistic.py", line 393, in forward
    out_tensors = self._dist_sample(dist, interaction_type=interaction_type())
  File "/usr/local/lib/python3.10/dist-packages/tensordict/nn/probabilistic.py", line 490, in _dist_sample
    if hasattr(dist, "mean"):
  File "/usr/local/lib/python3.10/dist-packages/torchrl/modules/distributions/continuous.py", line 551, in mean
    raise NotImplementedError(
NotImplementedError: TanhNormal does not have a closed form formula for the average. Am estimate of this value can be computed using dist.sample((N,)).mean(dim=0), where N is a large number of samples.

build log
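
The last two frames of this traceback show why the error surfaces: `_dist_sample` probes the distribution with `hasattr(dist, "mean")`, and `hasattr` only swallows `AttributeError`, so the `NotImplementedError` raised by the `mean` property propagates to the caller. The sketch below reproduces that mechanism in plain Python and shows the sampling-based estimate the error message suggests. `SquashedNormal` and `estimate_mean` are hypothetical names for illustration, not TorchRL or tensordict code:

```python
import math
import random


class SquashedNormal:
    """Hypothetical stand-in for a tanh-squashed Gaussian (not TorchRL's TanhNormal)."""

    def __init__(self, loc, scale, seed=0):
        self.loc = loc
        self.scale = scale
        self._rng = random.Random(seed)

    @property
    def mean(self):
        # Mirrors the traceback: the attribute exists but raises when accessed.
        raise NotImplementedError("no closed-form mean")

    def sample(self):
        return math.tanh(self._rng.gauss(self.loc, self.scale))


def estimate_mean(dist, n=20_000):
    """Monte Carlo estimate, a plain-Python analogue of dist.sample((N,)).mean(dim=0)."""
    return sum(dist.sample() for _ in range(n)) / n


dist = SquashedNormal(loc=0.0, scale=1.0)

try:
    hasattr(dist, "mean")  # hasattr only suppresses AttributeError
    propagated = False
except NotImplementedError:
    propagated = True

approx = estimate_mean(dist)
```

Here `propagated` ends up `True`, and `approx` lands close to zero because the squashed distribution is symmetric around zero when `loc=0.0`. Whether the tutorial should switch to a sampled estimate or to a different interaction type is a design choice this issue does not settle.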

Please submit fixes against the 2.6-RC-TEST branch and re-enable both tutorials in validate_tutorials.py.

CC: @vmoens

Describe your environment

  • PyTorch 2.6

cc @vmoens @nairbv


Labels: 2.6, bug, rl
