-
Notifications
You must be signed in to change notification settings - Fork 6k
Multiply lr scheduler steps by num_processes
.
#3983
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The documentation is not available anymore as the PR was closed or merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks much!
Is this supposed to be merged now? |
Oh I was waiting for @muellerzr's eyes on this too. |
Please do not merge yet. It seems that we should not multiply the steps by accumulation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
* Multiply lr scheduler steps by `num_processes`. * Stop multiplying steps by gradient accumulation.
* Multiply lr scheduler steps by `num_processes`. * Stop multiplying steps by gradient accumulation.
* Multiply lr scheduler steps by `num_processes`. * Stop multiplying steps by gradient accumulation.
* Multiply lr scheduler steps by `num_processes`. * Stop multiplying steps by gradient accumulation.
What does this PR do?
Fixes #3954.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul @muellerzr