Closed
Hi!
The "Writing Distributed Applications with PyTorch" tutorial seems to be outdated. If you run the first code example, you get the following error from both worker processes:
Process Process-1:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "run.py", line 18, in init_processes
    dist.init_process_group(backend, rank=rank, world_size=size)
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 335, in init_process_group
    backend = Backend(backend)
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 57, in __new__
    raise ValueError("TCP backend has been deprecated. Please use "
ValueError: TCP backend has been deprecated. Please use Gloo or MPI backend for collective operations on CPU tensors.
Process Process-2:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "run.py", line 18, in init_processes
    dist.init_process_group(backend, rank=rank, world_size=size)
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 335, in init_process_group
    backend = Backend(backend)
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 57, in __new__
    raise ValueError("TCP backend has been deprecated. Please use "
ValueError: TCP backend has been deprecated. Please use Gloo or MPI backend for collective operations on CPU tensors.
The obvious fix is to have the tutorial use the Gloo backend from the start, since it is included in PyTorch anyway.
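For reference, here is a minimal sketch of what the first example might look like with Gloo as the default backend. The names `run` and `init_processes` follow the traceback above. The master address and port, the world size of 2, and the `all_reduce` body in `run` are illustrative assumptions, not the tutorial's actual text.

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def run(rank, size):
    """Illustrative workload (an assumption): sum a tensor of ones across ranks."""
    tensor = torch.ones(1)
    dist.all_reduce(tensor, op=dist.ReduceOp.SUM)
    print(f"rank {rank} of {size} has {tensor.item()}")


def init_processes(rank, size, fn, backend="gloo"):
    """Set up the process group with Gloo instead of the removed TCP backend."""
    # Assumed rendezvous settings for a single-machine run.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group(backend, rank=rank, world_size=size)
    fn(rank, size)
    dist.destroy_process_group()


if __name__ == "__main__":
    size = 2  # assumed number of worker processes
    processes = []
    for rank in range(size):
        p = mp.Process(target=init_processes, args=(rank, size, run))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
```

The only change that matters for the error above is the backend string passed to `dist.init_process_group`. Everything else is scaffolding to make the sketch runnable on one machine.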
Best
René