@@ -54,15 +54,17 @@ There are two required environment variables to get the initial version of fligh
there will be one file per rank output in the jobs running directory.
Optional settings:
- TORCH_NCCL_TRACE_CPP_STACK (true, false) true = enable cpp stack trace captures in flight recorder (for slow
- addr2line - see additinal settings)
+ addr2line - see additional settings)
- TORCH_NCCL_ENABLE_TIMING (true, false) true = enable additional cuda events at the start of each collective and
- record the ‘duration’ of each collective. May incur some CPU overhead.
+ record the `duration` of each collective. May incur some CPU overhead. In the collected data, we end up with a
+ `duration` field that indicates how long a collective took to execute.

Additional settings
-------------------
TORCH_SYMBOLIZE_MODE: {dladdr, addr2line, fast}: This setting controls the program that is used to retrieve C++ traces
from a running program. The default setting is `addr2line`. `fast` is a new experimental mode that is shown to be much
- faster than the traditional `addr2line`.
+ faster than the traditional `addr2line`. Use this setting in conjunction with `TORCH_NCCL_TRACE_CPP_STACK` to collect
+ C++ traces in `flight recorder` data.

Retrieving Flight Recorder Data via an API
------------------------------------------
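The optional settings documented in this hunk are plain environment variables, so one way to apply them is from Python before distributed setup begins. The sketch below is an illustration under assumptions and is not part of this change: `flight_recorder_optional_env` is a hypothetical helper, and it assumes the variables are read when NCCL is initialized in the process. Only the variable names and their listed values (`true`/`false`, and `dladdr`/`addr2line`/`fast`) come from the documentation text in the diff.

```python
import os

# Allowed values for TORCH_SYMBOLIZE_MODE, as listed in the documentation text.
SYMBOLIZE_MODES = ("dladdr", "addr2line", "fast")


def flight_recorder_optional_env(trace_cpp_stack=False, enable_timing=False,
                                 symbolize_mode=None):
    """Build the optional flight recorder settings as a dict of strings.

    Hypothetical helper for illustration; it is not a PyTorch API.
    """
    if symbolize_mode is not None and symbolize_mode not in SYMBOLIZE_MODES:
        raise ValueError(f"unknown TORCH_SYMBOLIZE_MODE: {symbolize_mode!r}")
    env = {
        "TORCH_NCCL_TRACE_CPP_STACK": "true" if trace_cpp_stack else "false",
        "TORCH_NCCL_ENABLE_TIMING": "true" if enable_timing else "false",
    }
    if symbolize_mode is not None:
        # Per the docs, this is used in conjunction with TORCH_NCCL_TRACE_CPP_STACK.
        env["TORCH_SYMBOLIZE_MODE"] = symbolize_mode
    return env


# Assumption: the variables must be in the environment before NCCL is set up.
os.environ.update(
    flight_recorder_optional_env(
        trace_cpp_stack=True, enable_timing=True, symbolize_mode="fast"
    )
)
```

Returning a dict rather than writing to `os.environ` directly keeps the helper easy to reuse, for example to pass the same settings to a launcher subprocess through its `env` argument.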