Commit a7e56bb

Small update to modes on torch.compile tutorial and explain why 2nd run in slower (#2619)
* make some updates to torch.compile mode and explain why 2nd run is slower
1 parent 8b1ed83 commit a7e56bb

1 file changed: +6 -2 lines changed

intermediate_source/torch_compile_tutorial.py

Lines changed: 6 additions & 2 deletions
@@ -195,11 +195,15 @@ def init_model():
 # GPU compute and the observed speedup may be less significant.
 #
 # You may also see different speedup results depending on the chosen ``mode``
-# argument. Since our model and data are small, we want to reduce overhead as
-# much as possible, and so we chose ``"reduce-overhead"``. For your own models,
+# argument. The ``"reduce-overhead"`` mode uses CUDA graphs to further reduce
+# the overhead of Python. For your own models,
 # you may need to experiment with different modes to maximize speedup. You can
 # read more about modes `here <https://pytorch.org/get-started/pytorch-2.0/#user-experience>`__.
 #
+# You may might also notice that the second time we run our model with ``torch.compile`` is significantly
+# slower than the other runs, although it is much faster than the first run. This is because the ``"reduce-overhead"``
+# mode runs a few warm-up iterations for CUDA graphs.
+#
 # For general PyTorch benchmarking, you can try using ``torch.utils.benchmark`` instead of the ``timed``
 # function we defined above. We wrote our own timing function in this tutorial to show
 # ``torch.compile``'s compilation latency.
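
The added comment describes a pattern where the first run is slowest, the second is slower than later runs, and the rest are fast. That pattern only shows up when each call is timed separately instead of averaged. The following is a minimal sketch of per-call timing and is not part of this commit: `per_call_seconds`, its `sync` parameter, and `WarmupStandIn` are hypothetical names, and the delays are simulated with `time.sleep`, not measured from `torch.compile`.

```python
import time


def per_call_seconds(fn, n_calls=5, sync=None):
    """Return the wall-clock duration of each call to fn, in call order."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        if sync is not None:
            # Optional hook that waits for asynchronous work to finish
            # before the clock is read (assumed to take no arguments).
            sync()
        durations.append(time.perf_counter() - start)
    return durations


class WarmupStandIn:
    """Hypothetical stand-in for a compiled model.

    The first call is slowest, the second is slower than later calls,
    and the rest are fast. The delays are simulated, not measured.
    """

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        delay = {1: 0.2, 2: 0.05}.get(self.calls, 0.001)
        time.sleep(delay)


durations = per_call_seconds(WarmupStandIn(), n_calls=4)
for i, seconds in enumerate(durations, start=1):
    print(f"call {i}: {seconds:.4f} s")
```

With PyTorch on a CUDA device, one would presumably pass a callable that runs the compiled model as `fn` and `torch.cuda.synchronize` as `sync`, so that queued GPU work is included in each measurement. Keeping the per-call list rather than a mean is the design choice that makes one-off compilation and warm-up costs visible.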
