
Commit 7a7862d

make_distribute_tutorial_work_in_google_colab
1 parent 748e52b commit 7a7862d

1 file changed: +8 -2 lines changed

intermediate_source/dist_tuto.rst

Lines changed: 8 additions & 2 deletions
@@ -47,6 +47,7 @@ the following template.
 """run.py:"""
 #!/usr/bin/env python
 import os
+import sys
 import torch
 import torch.distributed as dist
 import torch.multiprocessing as mp
@@ -66,7 +67,11 @@ the following template.
 if __name__ == "__main__":
     size = 2
     processes = []
-    mp.set_start_method("spawn")
+    if "google.colab" in sys.modules:
+        print("Running in Google Colab")
+        mp.get_context("spawn")
+    else:
+        mp.set_start_method("spawn")
     for rank in range(size):
         p = mp.Process(target=init_process, args=(rank, size, run))
         p.start()
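
For context, a minimal standalone sketch of the start-method selection this hunk introduces; the worker function is a placeholder assumption standing in for the tutorial's init_process/run pair, and the comment about why Colab needs the different path is an inference, not something stated in the commit:

import sys
import torch.multiprocessing as mp

def worker(rank, size):
    # Placeholder assumption: stands in for the tutorial's init_process(rank, size, run).
    print(f"worker {rank} of {size} started")

if __name__ == "__main__":
    size = 2
    processes = []
    if "google.colab" in sys.modules:
        # Assumption: the Colab kernel has already fixed a start method, so
        # mp.set_start_method("spawn") would raise RuntimeError here;
        # mp.get_context("spawn") only requests a spawn-based context and
        # leaves the global setting untouched.
        print("Running in Google Colab")
        mp.get_context("spawn")
    else:
        mp.set_start_method("spawn")
    for rank in range(size):
        p = mp.Process(target=worker, args=(rank, size))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()

Checking "google.colab" in sys.modules also avoids importing the Colab package on machines where it is not installed.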
@@ -156,7 +161,8 @@ we should not modify the sent tensor nor access the received tensor before ``req
 In other words,

 - writing to ``tensor`` after ``dist.isend()`` will result in undefined behaviour.
-- reading from ``tensor`` after ``dist.irecv()`` will result in undefined behaviour.
+- reading from ``tensor`` after ``dist.irecv()`` will result in undefined behaviour,
+  until ``req.wait()`` has been executed.

 However, after ``req.wait()``
 has been executed we are guaranteed that the communication took place,
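
A minimal sketch of the rule these bullets describe, assuming the process group for two ranks has already been initialised (for example via the tutorial's init_process); the tensor shape is an illustrative assumption:

import torch
import torch.distributed as dist

def run(rank, size):
    """Non-blocking point-to-point exchange between ranks 0 and 1."""
    tensor = torch.zeros(1)
    if rank == 0:
        tensor += 1
        req = dist.isend(tensor=tensor, dst=1)  # returns immediately
        # Writing to `tensor` here would be undefined behaviour: the send
        # may still be reading from it.
    else:
        req = dist.irecv(tensor=tensor, src=0)  # returns immediately
        # Reading `tensor` here would be undefined behaviour: the data may
        # not have arrived yet.
    req.wait()
    # After wait() the transfer has completed and `tensor` is safe to use.
    print(f"Rank {rank} has data {tensor[0]}")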
