DQN tutorial fixes #2026
Conversation
Thanks @SiftingSands for this.
LGTM. If you don't mind, I'll take a deeper look at the tutorial and check why it doesn't learn before merging.
Forgive me for putting this on hold!
@vmoens No worries at all. I did try tweaking various hyperparameters and doing reward shaping to get it to train within 500 to 1000 episodes. As others have reported, I saw a non-monotonic reward history, where the agent would perform well and then forget how to solve the problem again. I believe this is to be expected for vanilla DQN learning from pixels instead of the raw state variables. I can send over my best results in a new merge request later this evening (GMT-4) if you're interested (I'm still only able to consistently reach ~100 timesteps by 1000 episodes, but I haven't tried saving the "best" model to see how it generalizes). I tried to keep my changes minor relative to Adam's original work, such as only using the "duration" in the reward and not using any state variables such as angle or position.
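For context, here is a minimal sketch of the two reward choices mentioned above, in plain Python. The shaped variant is hypothetical, the kind of state-based shaping the comment deliberately avoids; the thresholds mirror CartPole's termination bounds and are assumptions, not values from this PR.

```python
import math

def duration_reward(terminated: bool) -> float:
    # CartPole's default signal: +1 for every timestep the pole stays up.
    return 0.0 if terminated else 1.0

def shaped_reward(state, terminated: bool,
                  x_threshold: float = 2.4,
                  theta_threshold: float = 12 * math.pi / 180) -> float:
    # Hypothetical shaping using the raw state variables (cart position x,
    # pole angle theta), i.e. exactly what the comment above chose NOT to do.
    x, _x_dot, theta, _theta_dot = state
    if terminated:
        return -1.0
    return 1.0 - abs(x) / x_threshold - abs(theta) / theta_threshold
```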
Sure, let's keep iterating on this. There are some things I'm trying on my side too!
Unfortunately, I don't know if it trained successfully in the past. I looked through a few blogs to see if anyone reported it working, but all the successful examples involved significant modifications, such as using the state variables instead of pixels. I wonder if it's always been somewhat "broken", since there's this issue from 2018 (#209) about the missing target network. Just speculating here, though...
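For reference, a minimal sketch of the target-network bookkeeping that issue #209 was about, written against a generic PyTorch Q-network. The network shape, the `TARGET_UPDATE` period, and the function names are illustrative assumptions, not code from the tutorial.

```python
import copy
import torch
import torch.nn as nn

# Illustrative Q-network (4 state inputs, 2 actions, as in a raw-state CartPole setup).
policy_net = nn.Sequential(nn.Linear(4, 128), nn.ReLU(), nn.Linear(128, 2))

# The target network starts as a frozen copy of the policy network.
target_net = copy.deepcopy(policy_net)
target_net.eval()

TARGET_UPDATE = 10  # hard-update period in episodes (hyperparameter, an assumption)

def maybe_sync_target(episode: int) -> None:
    # Periodically copy the policy weights into the target network so the
    # bootstrapped Q-targets lag behind the online network.
    if episode % TARGET_UPDATE == 0:
        target_net.load_state_dict(policy_net.state_dict())

@torch.no_grad()
def td_target(reward, next_state, done, gamma: float = 0.999):
    # Targets are computed with target_net, not policy_net.
    next_q = target_net(next_state).max(dim=1).values
    return reward + gamma * next_q * (1.0 - done.float())
```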
Add a cell for installing `gym` on Colab, by adding pygame to the requirements and bumping gym to 0.25.0 (to account for changes introduced by #2026).
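A Colab cell along these lines would cover the pinned dependencies; the exact pin and extras here are an assumption, not copied from the PR diff.

```python
# Run once at the top of the Colab notebook (illustrative pins, not the PR's exact cell).
!pip install "gym==0.25.0" pygame
```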
Gym's domain name has changed, as noted in #1928 (comment).
Minor updates to Gym calls, which would otherwise produce errors with the latest version of Gym (0.25.2) (possibly fixes #1432 (comment)).
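As a rough illustration of the kind of call-site changes involved (this sketch assumes the newer Gym conventions, roughly the 0.26+ defaults; on 0.25.x some of these behaviours were opt-in, and it is not the PR's exact diff):

```python
import gym

# render_mode is now fixed when the environment is constructed,
# instead of being passed to env.render(mode=...).
env = gym.make("CartPole-v1", render_mode="rgb_array")

# reset() takes the seed and returns (observation, info).
obs, info = env.reset(seed=0)

frame = env.render()  # returns an RGB array because of render_mode above

# step() returns five values; the old `done` flag is split into terminated/truncated.
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
done = terminated or truncated

env.close()
```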
Still doesn't train properly, as noted in #1755 (comment), but at least it runs now, as shown in the figure below (I didn't touch any other settings in the tutorial).