DQN Cartpole Example Does Not Learn

Opening because #683 is closed and the issue persists. When I run the unmodified code from the [Q-learning example](https://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html), the agent does not learn, even after 10k episodes. The reward stays constant at 1.0, and the loss increases over time.

```
$ python3 -c 'import torch;print(torch.__version__)'
1.10.0a0
```

![image](https://user-images.githubusercontent.com/4565287/144773417-99c1dfad-bf3a-4c71-bec2-24c9a3e30edb.png)
![image](https://user-images.githubusercontent.com/4565287/144775512-d7018e80-fb88-44d6-92e3-4f8ae0c27ecf.png)


cc @vmoens @nairbv
