Skip to content

Update ddp_tutorial to work on Windows platform #1182

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Oct 27, 2020

Conversation

gunandrose4u
Copy link
Contributor

Due to on Windows platform, DDP can only work with Gloo backend and FileStore. Update ddp_tutorial to clarify how to make ddp working on Windows platform.

gunandrose4u and others added 2 commits October 9, 2020 20:23
update ddp_tutorial to work on Windows platform
@netlify
Copy link

netlify bot commented Oct 12, 2020

Deploy preview for pytorch-tutorials-preview failed.

Built with commit 1f0b816

https://app.netlify.com/sites/pytorch-tutorials-preview/deploys/5f9830fb6b616f0008a82ccb

Copy link
Contributor

@mrshenli mrshenli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@gunandrose4u would like you to add the following to the beginning of this doc:

Edited by: Joe Zhu <https://github.com/gunandrose4u>_

See below as an example:

**Edited by**: `Teng Li <https://github.com/teng-li>`_

# initialize the process group
dist.init_process_group("gloo", rank=rank, world_size=world_size)
if sys.platform == 'win32':
# Distributed package only covers collective communications with Gloo backend, FileStore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: backend, -> backend and

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved

dist.init_process_group("gloo", rank=rank, world_size=world_size)
if sys.platform == 'win32':
# Distributed package only covers collective communications with Gloo backend, FileStore
# on Windows platform. Set init_method parameter in init_process_group and point it to a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: and point it to -> to

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved

# initialize the process group
dist.init_process_group("gloo", rank=rank, world_size=world_size)
if sys.platform == 'win32':
# Distributed package only covers collective communications with Gloo backend and FileStore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: below is how the built doc looks like. Shall we break the lines into shorter ones to avoid line wrapping?

Screen Shot 2020-10-12 at 10 59 03 AM

# Distributed package only covers collective communications with 
# Gloo backend and FileStore on Windows platform. Set init_method 
# parameter in init_process_group to a local file.
# Example init_method="file:///f:/libtmp/some_file"
...

dist.init_process_group(
    "gloo", 
    init_method=init_method, 
    rank=rank, 
    world_size=world_size
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should wrap it. Just resolved.

@mrshenli
Copy link
Contributor

@jlin27 this PR LGTM. Shall we land it (if tests pass, the current failure does not seem relevant).

@jspisak jspisak requested a review from brianjo October 15, 2020 16:05
@brianjo brianjo merged commit 8eb4e3d into pytorch:master Oct 27, 2020
rodrigo-techera pushed a commit to Experience-Monks/tutorials that referenced this pull request Nov 29, 2021
* Update ddp_tutorial.rst

update ddp_tutorial to work on Windows platform

* Update ddp tutorial to contain script for Windows and non-Windows platform together

* Resolved code review comments

* Wrap init_process_group method since it is too long in built doc

* The comment is too long in one line, make it short.

* Update

Co-authored-by: Joe Zhu <jozh@microsoft.com>
Co-authored-by: Brian Johnson <brianjo@fb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants