📚 The doc issue
In the docs tutorial on how to set up multi-GPU training, the following is suggested as the proper way to set up each process: first initialize the (e.g., NCCL) process group, then call torch.cuda.set_device(rank):
import os

import torch
from torch.distributed import init_process_group


def ddp_setup(rank: int, world_size: int):
    """
    Args:
        rank: Unique identifier of each process
        world_size: Total number of processes
    """
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "12355"
    init_process_group(backend="nccl", rank=rank, world_size=world_size)
    # The tutorial sets the CUDA device *after* creating the process group.
    torch.cuda.set_device(rank)
However, the issues below suggest that the proper way is to call set_device before initializing the process group (a sketch of that ordering follows the list):
- Call to CUDA function failed. with DDP using 4 GPUs pytorch#54550 (comment)
- distributed.all_gather function stuck when using NCCL backend pytorch#18689 (comment)
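For concreteness, here is a minimal sketch of the ordering described in those issues, with set_device called before init_process_group. The function name ddp_setup_alternative is made up for this example, and the MASTER_ADDR/MASTER_PORT values are just the placeholders from the tutorial snippet above:

```python
import os

import torch
from torch.distributed import init_process_group


def ddp_setup_alternative(rank: int, world_size: int):
    """Same setup as above, but the device is bound before the process group is created.

    Args:
        rank: Unique identifier of each process
        world_size: Total number of processes
    """
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "12355"
    # Bind this process to its GPU first, so any CUDA context created during
    # NCCL initialization lands on the intended device (the concern raised in
    # the linked issues).
    torch.cuda.set_device(rank)
    init_process_group(backend="nccl", rank=rank, world_size=world_size)
```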
Which is the correct order, and are there pauses or slowdowns if the order is changed?
Suggest a potential alternative/fix
No response
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225