Commit beaab5c

mend
1 parent 4bc0657 commit beaab5c

1 file changed: +5 −9 lines changed

recipes_source/distributed_comm_debug_mode.rst

@@ -18,10 +18,8 @@ unified abstraction that can bridge these different parallelism strategies. To a
 issue, PyTorch has proposed `DistributedTensor(DTensor)
 <https://github.com/pytorch/pytorch/blob/main/torch/distributed/_tensor/examples/comm_mode_features_example.py>`_
 which abstracts away the complexities of tensor communication in distributed training,
-providing a seamless user experience. However, when dealing with existing parallelism solutions
-and developing parallelism solutions using the unified abstraction like DTensor, the lack of
-transparency about what and when the collective communications happens under the hood could
-make it challenging for advanced users to identify and resolve issues. To address this challenge,
+providing a seamless user experience. However, this abstraction creates a lack of transparency
+that can make it challenging for users to identify and resolve issues. To address this challenge,
 ``CommDebugMode``, a Python context manager will serve as one of the primary debugging tools for
 DTensors, enabling users to view when and why collective operations are happening when using DTensors,
 effectively addressing this issue.
@@ -34,7 +32,6 @@ Here is how you can use ``CommDebugMode``:

 .. code-block:: python

-    # The model used in this example is a MLPModule that applies Tensor Parallel
     comm_mode = CommDebugMode()
     with comm_mode:
         output = model(inp)
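
The snippet in this hunk is not self-contained; the surrounding recipe supplies the model and input. Below is a minimal runnable sketch, assuming ``CommDebugMode`` is importable from ``torch.distributed._tensor.debug`` (the import path varies across PyTorch versions) and substituting a plain ``nn.Sequential`` stand-in for the recipe's tensor-parallel ``MLPModule``:

.. code-block:: python

    import torch
    import torch.nn as nn
    # Assumed import path; newer PyTorch versions may expose this as
    # torch.distributed.tensor.debug.CommDebugMode instead.
    from torch.distributed._tensor.debug import CommDebugMode

    # Hypothetical stand-in for the recipe's tensor-parallel MLPModule.
    model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))
    inp = torch.rand(4, 8)

    comm_mode = CommDebugMode()
    with comm_mode:
        output = model(inp)

    # Without DTensor sharding, no collectives fire, so this prints 0;
    # with the recipe's tensor-parallel model it reports the collective count.
    print(comm_mode.get_total_counts())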
@@ -74,8 +71,8 @@ you want to use to display the data. You can also use a ``noise_level`` argument
 level of displayed information. Here is what each noise level displays:

 | 0. Prints module-level collective counts
-| 1. Prints DTensor operations (not including trivial operations), module sharding information
-| 2. Prints tensor operations (not including trivial operations)
+| 1. Prints dTensor operations not included in trivial operations, module information
+| 2. Prints operations not included in trivial operations
 | 3. Prints all operations

 In the example above, you can see that the collective operation, all_reduce, occurs once in the forward pass
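
For reference, the ``noise_level`` argument described in this hunk is passed when generating the tracing table; a short sketch, assuming the recipe's ``generate_comm_debug_tracing_table`` method and a ``comm_mode`` captured as above:

.. code-block:: python

    # noise_level=0 prints only module-level collective counts;
    # higher values add the operation-level detail listed above.
    print(comm_mode.generate_comm_debug_tracing_table(noise_level=0))
    print(comm_mode.generate_comm_debug_tracing_table(noise_level=2))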
@@ -197,8 +194,7 @@ Below is the interactive module tree visualization that you can use to upload yo
 Conclusion
 ------------------------------------------

-In this recipe, we have learned how to use ``CommDebugMode`` to debug Distributed Tensors and
-parallelism solutions that uses communication collectives with PyTorch. You can use your
+In this recipe, we have learned how to use ``CommDebugMode`` to debug Distributed Tensors. You can use your
 own JSON outputs in the embedded visual browser.

 For more detailed information about ``CommDebugMode``, see
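
The JSON output mentioned in the conclusion can be produced from the same ``comm_mode`` object; a hedged sketch, assuming the recipe's ``generate_json_dump`` method and a hypothetical output file name:

.. code-block:: python

    # Writes the collective-tracing log to a JSON file that can then be
    # uploaded to the embedded visual browser (the file name is illustrative).
    comm_mode.generate_json_dump(file_name="comm_mode_log.json", noise_level=2)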
