Closed
Description
I am unable to get the CartPole example working. It fails to learn even after 2000 iterations. Please check what may be wrong. My test notebook, based entirely on your notebook, is here: https://github.com/poedator/otus_data_science/blob/master/project/reinforcement_q_learning_torch_example_tested.ipynb