@@ -162,11 +162,11 @@ You can use different compiler configs for the two compilations, for example, th
.. code:: python

- def train(model, x):
- model = torch.compile(model)
- loss = model(x).sum()
- torch._dynamo.config.compiled_autograd = True
- torch.compile(lambda: loss.backward(), fullgraph=True)()
+ def train(model, x):
+     model = torch.compile(model)
+     loss = model(x).sum()
+     torch._dynamo.config.compiled_autograd = True
+     torch.compile(lambda: loss.backward(), fullgraph=True)()

Or you can use the context manager, which will apply to all autograd calls within its scope.
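As a rough sketch of what that looks like in practice, the ``torch._dynamo.compiled_autograd.enable`` context manager that appears later in this diff can wrap the backward call directly (the model and input below are placeholders added for illustration, not part of the original document):

.. code:: python

    import torch

    model = torch.nn.Linear(4, 4)   # placeholder model for illustration
    x = torch.randn(2, 4)           # placeholder input
    loss = torch.compile(model)(x).sum()

    # Every backward call inside this scope is traced by Compiled Autograd.
    with torch._dynamo.compiled_autograd.enable(torch.compile(backend="aot_eager")):
        loss.backward()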
@@ -213,8 +213,8 @@ Compiled Autograd addresses certain limitations of AOTAutograd
assert(torch._dynamo.utils.counters["stats"]["unique_graphs"] == 1)

- In the ``1. base torch.compile`` case, we see that 3 backward graphs were produced due to the 2 graph breaks in the compiled function ``fn``.
- Whereas in ``2. torch.compile with compiled autograd``, we see that a full backward graph was traced despite the graph breaks.
+ In the first ``torch.compile`` case, we see that 3 backward graphs were produced due to the 2 graph breaks in the compiled function ``fn``.
+ Whereas in the second case, ``torch.compile`` with compiled autograd, we see that a full backward graph was traced despite the graph breaks.

2. Backward hooks are not captured
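As an illustrative sketch only (the document's own example for this item is partly elided from the diff), a tensor backward hook registered with the standard ``Tensor.register_hook`` API, run under the same ``enable`` context manager shown in the next hunk, might look like this:

.. code:: python

    import torch

    @torch.compile(backend="aot_eager")
    def fn(x):
        return x.sum()

    x = torch.randn(10, requires_grad=True)
    x.register_hook(lambda grad: grad * 2)  # backward hook on the input tensor
    loss = fn(x)

    # Compiled Autograd records the hook as a call_hook node in the backward graph.
    with torch._dynamo.compiled_autograd.enable(torch.compile(backend="aot_eager")):
        loss.backward()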
@@ -231,7 +231,7 @@ Whereas in ``2. torch.compile with compiled autograd``, we see that a full backw
with torch._dynamo.compiled_autograd.enable(torch.compile(backend="aot_eager")):
    loss.backward()

- There should be a ``call_hook`` node in the graph, which dynamo will later inline into
+ There should be a ``call_hook`` node in the graph, which dynamo will later inline into the following:

.. code:: python
@@ -249,7 +249,7 @@ There should be a ``call_hook`` node in the graph, which dynamo will later inlin
Common recompilation reasons for Compiled Autograd
--------------------------------------------------
- 1. Due to changes in the autograd structure of the loss value
+ 1. Due to changes in the autograd structure of the loss value:

.. code:: python
@@ -274,7 +274,7 @@ In the example above, we call a different operator on each iteration, leading to
...
"""

- 2. Due to tensors changing shapes
+ 2. Due to tensors changing shapes:

.. code:: python