@@ -34,6 +34,7 @@ Here is how you can use ``CommDebugMode``:
.. code-block:: python
+    # The model used in this example is an MLPModule with Tensor Parallel applied
    comm_mode = CommDebugMode()
    with comm_mode:
        output = model(inp)
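
For context, here is a sketch of how the model referenced above might be set up before the snippet runs. This is not part of the recipe's diff: the MLP layout, mesh size, sharding plan, and import paths are illustrative assumptions (import locations can differ across PyTorch versions).

.. code-block:: python

    # Illustrative sketch (not from the recipe): build a small MLP, apply
    # Tensor Parallel sharding, and record collectives with CommDebugMode.
    # Assumes the default process group is already initialized (e.g. via torchrun).
    import torch
    import torch.nn as nn
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor.debug import CommDebugMode
    from torch.distributed.tensor.parallel import (
        ColwiseParallel,
        RowwiseParallel,
        parallelize_module,
    )

    class MLPModule(nn.Module):
        def __init__(self, dim=16):
            super().__init__()
            self.net1 = nn.Linear(dim, dim * 4)
            self.relu = nn.ReLU()
            self.net2 = nn.Linear(dim * 4, dim)

        def forward(self, x):
            return self.net2(self.relu(self.net1(x)))

    world_size = torch.distributed.get_world_size()
    device_mesh = init_device_mesh("cuda", (world_size,))

    # Shard net1 column-wise and net2 row-wise; the row-wise layer all-reduces
    # its partial outputs in the forward pass.
    model = parallelize_module(
        MLPModule().to("cuda"),
        device_mesh,
        {"net1": ColwiseParallel(), "net2": RowwiseParallel()},
    )
    inp = torch.rand(8, 16, device="cuda")

    comm_mode = CommDebugMode()
    with comm_mode:
        output = model(inp)

    # Mapping from each collective op to how many times it was called.
    print(comm_mode.get_comm_counts())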
@@ -73,8 +74,8 @@ you want to use to display the data. You can also use a ``noise_level`` argument
level of displayed information. Here is what each noise level displays:
| 0. Prints module-level collective counts
- | 1. Prints dTensor operations not included in trivial operations, module information
- | 2. Prints operations not included in trivial operations
+ | 1. Prints DTensor operations (not including trivial operations) and module sharding information
+ | 2. Prints tensor operations (not including trivial operations)
| 3. Prints all operations
In the example above, you can see that the collective operation, all_reduce, occurs once in the forward pass
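
As a quick illustration of the noise levels above, here is a hedged sketch of passing ``noise_level`` when rendering the recorded data. The ``generate_comm_debug_tracing_table`` helper name is an assumption based on ``CommDebugMode``'s API; the recipe's earlier sections show the exact call it uses.

.. code-block:: python

    # Hedged sketch: the helper name and signature are assumptions based on
    # CommDebugMode's API rather than on the diff above.
    comm_mode = CommDebugMode()
    with comm_mode:
        output = model(inp)

    # noise_level=0 prints only module-level collective counts,
    # while noise_level=2 also includes non-trivial tensor operations.
    print(comm_mode.generate_comm_debug_tracing_table(noise_level=0))
    print(comm_mode.generate_comm_debug_tracing_table(noise_level=2))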
@@ -196,8 +197,9 @@ Below is the interactive module tree visualization that you can use to upload yo
Conclusion
------------------------------------------
- In this recipe, we have learned how to use ``CommDebugMode `` to debug Distributed Tensors. You can use your
- own JSON outputs in the embedded visual browser.
+ In this recipe, we have learned how to use ``CommDebugMode`` to debug Distributed Tensors and
+ parallelism solutions that use communication collectives with PyTorch. You can use your own
+ JSON outputs in the embedded visual browser.
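
To produce a JSON file for the visual browser, a minimal sketch might look like the following; the ``generate_json_dump`` helper and the file name are assumptions drawn from ``CommDebugMode``'s API rather than from this diff.

.. code-block:: python

    # Hedged sketch: dump the recorded trace to JSON so it can be loaded into
    # the embedded visual browser. The file name is illustrative.
    comm_mode.generate_json_dump(file_name="comm_mode_log.json", noise_level=2)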
For more detailed information about ``CommDebugMode``, see
`comm_mode_features_example.py