Closed
Description
The Writing Distributed Applications with PyTorch tutorial does not run as written with either the `tcp` or `mpi` backends. With `gloo`, it works.
Can you please replace code snippets that reference `tcp`, such as this one:
dist.init_process_group(init_method='tcp://10.1.1.20:23456', rank=args.rank, world_size=4)
with the appropriate ones that reference `gloo`? Even better, if you could remove the references to TCP and MPI throughout the tutorial, that would be great.
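For reference, here is a minimal sketch of what a `gloo` version of that snippet might look like. The helper names `gloo_init_kwargs` and `init_gloo` are hypothetical and not part of the tutorial; the address and port are copied from the snippet above. Note that the `tcp://` URL passed as `init_method` is only the rendezvous address, which is separate from the `backend` argument, so it can stay even when the backend is `gloo`.

```python
def gloo_init_kwargs(rank, world_size, host="10.1.1.20", port=23456):
    """Build keyword arguments for dist.init_process_group with the gloo backend.

    host and port default to the values from the tutorial snippet; the
    tcp:// URL is the rendezvous address, not the communication backend.
    """
    return {
        "backend": "gloo",
        "init_method": f"tcp://{host}:{port}",
        "rank": rank,
        "world_size": world_size,
    }


def init_gloo(rank, world_size, **address):
    """Initialize the default process group using gloo (hypothetical helper)."""
    # Imported here so gloo_init_kwargs can be inspected without torch installed.
    import torch.distributed as dist

    dist.init_process_group(**gloo_init_kwargs(rank, world_size, **address))
    return dist
```

Each of the four processes would then call something like `init_gloo(rank=args.rank, world_size=4)`, with `args.rank` coming from the command line as in the tutorial.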
Seems I'm not the only one to notice this; see #618.
Thanks, and let me know if I can help!