intermediate_source/TP_tutorial.rst (1 addition & 1 deletion)
@@ -358,4 +358,4 @@ Conclusion
 This tutorial demonstrates how to train a large Transformer-like model across hundreds to thousands of GPUs using Tensor Parallel in combination with Fully Sharded Data Parallel.
 It explains how to apply Tensor Parallel to different parts of the model, with **no code changes** to the model itself. Tensor Parallel is an efficient model parallelism technique for large-scale training.

-To see the complete end to end code example explained in this tutorial, please refer to the `Tensor Parallel examples <https://github.com/pytorch/examples/blob/main/distributed/tensor_parallelism/fsdp_tp_example.py>`__ in the pytorch/examples repository.
+To see the complete end-to-end code example explained in this tutorial, please refer to the `Tensor Parallel examples <https://github.com/pytorch/examples/blob/main/distributed/tensor_parallelism/fsdp_tp_example.py>`__ in the pytorch/examples repository.
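For context on the TP + FSDP composition this conclusion summarizes, here is a minimal sketch of the pattern using PyTorch's public `init_device_mesh`, `parallelize_module`, and `FSDP` APIs. It is not the tutorial's actual example (that is the linked `fsdp_tp_example.py`); the toy `FeedForward` module and the `tp_size` choice are illustrative assumptions. Run it under `torchrun` so the `WORLD_SIZE` and `LOCAL_RANK` environment variables are set.

```python
# Hedged sketch of combining Tensor Parallel with FSDP on a 2D device mesh.
# The FeedForward model and tp_size=2 are illustrative assumptions, not the
# tutorial's model. Launch with: torchrun --nproc-per-node=4 sketch.py
import os

import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)


class FeedForward(nn.Module):
    """Toy two-layer MLP standing in for a Transformer block."""

    def __init__(self, dim: int = 1024, hidden: int = 4096):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden)
        self.w2 = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.w2(torch.relu(self.w1(x)))


local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

world_size = int(os.environ["WORLD_SIZE"])
tp_size = 2  # shard each weight across 2 GPUs; data parallel spans the rest

# 2D mesh: outer "dp" dimension for FSDP, inner "tp" dimension for TP.
mesh_2d = init_device_mesh(
    "cuda", (world_size // tp_size, tp_size), mesh_dim_names=("dp", "tp")
)

model = FeedForward().cuda()

# Tensor Parallel: shard w1 column-wise and w2 row-wise on the "tp" mesh
# dimension. The plan is applied from outside the model definition, so the
# model code itself is unchanged.
model = parallelize_module(
    model,
    mesh_2d["tp"],
    {"w1": ColwiseParallel(), "w2": RowwiseParallel()},
)

# FSDP then shards the already-tensor-parallel parameters along "dp".
model = FSDP(model, device_mesh=mesh_2d["dp"], use_orig_params=True)

out = model(torch.randn(8, 1024, device="cuda"))
out.sum().backward()
```

The design point worth noting is the one the conclusion's "**no code changes**" claim makes: the parallelization plan is attached from the outside via `parallelize_module`, so the same model definition runs unmodified on one GPU or on a 2D mesh.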