Fix TF32 convergence issue with TF32 #1244

Merged (2 commits) on Nov 17, 2020
zasdfgbnm (Contributor) commented Nov 16, 2020

Fixes pytorch/pytorch#47844

Output before the fix:

99 581.02685546875
199 10.408246994018555
299 8.866250991821289
399 21.11296844482422
499 21.725337982177734

and

99 244.06027221679688
199 11.857839584350586
299 6.741735935211182
399 8.316146850585938
499 18.457813262939453

Output after the fix:

99 645.8162841796875
199 6.169060707092285
299 0.08816853910684586
399 0.0016538957133889198
499 0.00013183383271098137

and

99 604.789794921875
199 10.036174774169922
299 0.269910991191864
399 0.009123876690864563
499 0.0006364969885908067
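
For context on why the loss can plateau as in the "before" runs: TF32 keeps float32's 8-bit exponent but only 10 of its 23 mantissa bits for matmul inputs. The sketch below is not code from this PR. It is a stdlib-only illustration with a hypothetical helper, `to_tf32`, which emulates the reduced input precision by truncation (real hardware rounding may differ) and accumulates in Python doubles rather than float32.

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 input precision (hypothetical helper, truncation only).

    The value is first rounded to float32, then the low 13 of its
    23 mantissa bits are cleared, leaving the 10 bits TF32 keeps.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # zero the 13 least significant mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(a, b, round_inputs=False):
    """Dot product; optionally truncate each input to TF32 precision first."""
    total = 0.0
    for x, y in zip(a, b):
        if round_inputs:
            x, y = to_tf32(x), to_tf32(y)
        total += x * y
    return total


# Illustrative data only: values near 1.0 that need more than 10 mantissa bits.
a = [1.0 + i * 1e-4 for i in range(1000)]
b = [1.0 - i * 1e-4 for i in range(1000)]

exact = dot(a, b)
approx = dot(a, b, round_inputs=True)
rel_err = abs(approx - exact) / abs(exact)
print(f"relative error with TF32-truncated inputs: {rel_err:.2e}")
```

An error of this size in every matmul can be enough to keep a loss from shrinking toward the small values seen in the "after" runs. PyTorch exposes `torch.backends.cuda.matmul.allow_tf32` to turn TF32 matmuls off; whether this PR uses that switch is not shown in this conversation.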

netlify bot commented Nov 16, 2020

Deploy preview for pytorch-tutorials-preview ready!

Built with commit d7dda23

https://deploy-preview-1244--pytorch-tutorials-preview.netlify.app

netlify bot commented Nov 16, 2020

Deploy preview for pytorch-tutorials-preview ready!

Built with commit 5e8c110

https://deploy-preview-1244--pytorch-tutorials-preview.netlify.app

zasdfgbnm (Contributor, Author) commented

cc: @ngimel

@soumith soumith merged commit 9ecc44a into pytorch:master Nov 17, 2020
rodrigo-techera pushed a commit to Experience-Monks/tutorials that referenced this pull request Nov 29, 2021
* Fix TF32 convergence issue with TF32

* save

Successfully merging this pull request may close these issues.

Incorrect output loss value under specific CUDA version