Commit 4c6b4b1

committed
mend
1 parent beaab5c commit 4c6b4b1

1 file changed

+7
-5
lines changed

recipes_source/distributed_comm_debug_mode.rst

Lines changed: 7 additions & 5 deletions
@@ -18,11 +18,13 @@ unified abstraction that can bridge these different parallelism strategies. To a
 issue, PyTorch has proposed `DistributedTensor(DTensor)
 <https://github.com/pytorch/pytorch/blob/main/torch/distributed/_tensor/examples/comm_mode_features_example.py>`_
 which abstracts away the complexities of tensor communication in distributed training,
-providing a seamless user experience. However, this abstraction creates a lack of transparency
-that can make it challenging for users to identify and resolve issues. To address this challenge,
-``CommDebugMode``, a Python context manager will serve as one of the primary debugging tools for
-DTensors, enabling users to view when and why collective operations are happening when using DTensors,
-effectively addressing this issue.
+providing a seamless user experience. However, when dealing with existing parallelism solutions and
+developing parallelism solutions using the unified abstraction like DTensor, the lack of transparency
+about what and when the collective communications happens under the hood could make it challenging
+for advanced users to identify and resolve issues. To address this challenge, ``CommDebugMode``, a
+Python context manager will serve as one of the primary debugging tools for DTensors, enabling
+users to view when and why collective operations are happening when using DTensors, effectively
+addressing this issue.
 
 
 How to use CommDebugMode
