recipes_source/torch_compile_caching_configuration_tutorial.rst

Inductor Cache Settings
----------------------------
Most of these caches are in-memory, only used within the same process, and are transparent to the user. An exception is caches that store compiled FX graphs (``FXGraphCache``, ``AOTAutogradCache``). These caches allow Inductor to avoid recompilation across process boundaries when it encounters the same graph with the same Tensor input shapes (and the same configuration). The default implementation stores compiled artifacts in the system temp directory. An optional feature also supports sharing those artifacts within a cluster by storing them in a Redis database.
There are a few settings relevant to caching and to FX graph caching in particular.
The settings are accessible via the environment variables listed below, or they can be hard-coded in Inductor’s config file.
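For example, a minimal sketch of both styles; the ``fx_graph_cache`` attribute name below is an assumption and should be verified against ``torch._inductor.config`` for your PyTorch version:

.. code-block:: python

    import os

    # Environment-variable style: set before importing torch so that
    # Inductor picks the value up when its config module initializes.
    os.environ["TORCHINDUCTOR_FX_GRAPH_CACHE"] = "1"

    import torch

    # Config style: the equivalent hard-coded setting (attribute name
    # assumed; check torch._inductor.config in your version).
    torch._inductor.config.fx_graph_cache = True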
TORCHINDUCTOR_AUTOGRAD_CACHE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This setting extends ``FXGraphCache`` to store cached results at the AOTAutograd level, instead of at the Inductor level. ``1`` enables it, and any other value disables it.
By default, the disk location is per username, but users can enable sharing across usernames by specifying ``TORCHINDUCTOR_CACHE_DIR`` (below).
``TORCHINDUCTOR_AUTOGRAD_CACHE`` requires ``TORCHINDUCTOR_FX_GRAPH_CACHE`` to work. The same cache directory stores cache entries for ``AOTAutogradCache`` (under ``{TORCHINDUCTOR_CACHE_DIR}/aotautograd``) and ``FXGraphCache`` (under ``{TORCHINDUCTOR_CACHE_DIR}/fxgraph``).
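Since the two caches work together, a minimal sketch of enabling them jointly, with the on-disk layout noted in comments:

.. code-block:: python

    import os

    # AOTAutograd caching only takes effect when FX graph caching is also enabled.
    os.environ["TORCHINDUCTOR_FX_GRAPH_CACHE"] = "1"
    os.environ["TORCHINDUCTOR_AUTOGRAD_CACHE"] = "1"

    # After a compiled run, both caches share one root directory:
    #   {TORCHINDUCTOR_CACHE_DIR}/fxgraph      <- FXGraphCache entries
    #   {TORCHINDUCTOR_CACHE_DIR}/aotautograd  <- AOTAutogradCache entries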
TORCHINDUCTOR_CACHE_DIR
~~~~~~~~~~~~~~~~~~~~~~~~
This setting specifies the location of all on-disk caches. By default, the location is in the system temp directory under ``torchinductor_<username>``, for example, ``/tmp/torchinductor_myusername``.
Note that if ``TRITON_CACHE_DIR`` is not set in the environment, Inductor sets the Triton cache directory to this same temp location, under the Triton subdirectory.
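For jobs that need the caches somewhere more durable than the temp directory, a sketch (the shared path is a hypothetical example):

.. code-block:: python

    import os

    # Redirect all of Inductor's on-disk caches; set before importing torch.
    os.environ["TORCHINDUCTOR_CACHE_DIR"] = "/shared/inductor-cache"  # hypothetical path

    # Unless TRITON_CACHE_DIR is set explicitly, Triton artifacts will then
    # also land under a Triton subdirectory of this location.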
TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This setting enables the remote FX graph cache feature. The current implementation uses Redis. ``1`` enables caching, and any other value disables it. The following environment variables configure the host and port of the Redis server:
``TORCHINDUCTOR_REDIS_HOST`` (defaults to ``localhost``)
``TORCHINDUCTOR_REDIS_PORT`` (defaults to ``6379``)
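Putting the variables together, a sketch of enabling the remote FX graph cache (the Redis host name is a hypothetical placeholder):

.. code-block:: python

    import os

    # Set before importing torch; every worker should point at the same server.
    os.environ["TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE"] = "1"
    os.environ["TORCHINDUCTOR_REDIS_HOST"] = "redis.example.internal"  # hypothetical host
    os.environ["TORCHINDUCTOR_REDIS_PORT"] = "6379"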
Note that if Inductor locates a remote cache entry, it stores the compiled artifact in the local on-disk cache; that local artifact would be used on subsequent runs on the same machine.
TORCHINDUCTOR_AUTOGRAD_REMOTE_CACHE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Like ``TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE``, this setting enables the remote ``AOTAutogradCache`` feature. The current implementation uses Redis. ``1`` enables caching, and any other value disables it. The following environment variables configure the host and port of the Redis server:
``TORCHINDUCTOR_REDIS_HOST`` (defaults to ``localhost``)
``TORCHINDUCTOR_REDIS_PORT`` (defaults to ``6379``)
``TORCHINDUCTOR_AUTOGRAD_REMOTE_CACHE`` requires ``TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE`` to be enabled in order to work. The same Redis server can store both AOTAutograd and FXGraph cache results.
TORCHINDUCTOR_AUTOTUNE_REMOTE_CACHE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This setting enables a remote cache for ``TorchInductor``’s autotuner. As with the remote FX graph cache, the current implementation uses Redis. ``1`` enables caching, and any other value disables it. The same host and port environment variables listed above apply to this cache.
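As a closing sketch, all three remote caches can share one Redis server; note the dependency of the AOTAutograd remote cache on the FX graph remote cache described above (the host name is a hypothetical placeholder):

.. code-block:: python

    import os

    # TORCHINDUCTOR_AUTOGRAD_REMOTE_CACHE only works when
    # TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE is enabled as well.
    for var in (
        "TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE",
        "TORCHINDUCTOR_AUTOGRAD_REMOTE_CACHE",
        "TORCHINDUCTOR_AUTOTUNE_REMOTE_CACHE",
    ):
        os.environ[var] = "1"

    os.environ["TORCHINDUCTOR_REDIS_HOST"] = "redis.example.internal"  # hypothetical host
    os.environ["TORCHINDUCTOR_REDIS_PORT"] = "6379"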
recipes_source/torch_compile_caching_tutorial.rst

Caching Offerings
------------------
``torch.compile`` provides the following caching offerings:
* End to end caching (also known as ``Mega-Cache``)
* Modular caching of ``TorchDynamo``, ``TorchInductor``, and ``Triton``
It is important to note that caching validates that the cache artifacts are used with the same PyTorch and Triton versions, as well as the same GPU when the device is set to ``cuda``.

End to end caching, from here onwards referred to as ``Mega-Cache``, is the ideal solution for users looking for a portable caching solution that can be stored in a database and can later be fetched, possibly on a separate machine.
``Mega-Cache`` provides two compiler APIs:
* ``torch.compiler.save_cache_artifacts()``
* ``torch.compiler.load_cache_artifacts()``
The intended use case is as follows: after compiling and executing a model, the user calls ``torch.compiler.save_cache_artifacts()``, which returns the compiler artifacts in a portable form. Later, potentially on a different machine, the user may call ``torch.compiler.load_cache_artifacts()`` with these artifacts to pre-populate the ``torch.compile`` caches in order to jump-start their cache.
An example of this is as follows. First, compile and save the cache artifacts.
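A minimal sketch of that save step; the toy function, tensor shapes, and the ``compile_artifacts.bin`` file name are illustrative assumptions:

.. code-block:: python

    import torch

    @torch.compile
    def fn(x, y):
        # Toy workload; any compiled model behaves the same way
        return x.sin() @ y

    a = torch.rand(100, 100)
    b = torch.rand(100, 100)

    # Run once so compilation happens and the caches are populated
    result = fn(a, b)

    # Returns None if there is nothing to save
    artifacts = torch.compiler.save_cache_artifacts()
    assert artifacts is not None
    artifact_bytes, cache_info = artifacts

    # A local file stands in for a database or object store here
    with open("compile_artifacts.bin", "wb") as f:
        f.write(artifact_bytes)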
Later, the user can jump-start their cache as follows.
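A matching sketch of the load step, reading back the blob saved above (again, the local file stands in for the user's real storage layer):

.. code-block:: python

    import torch

    # Fetch the blob produced by torch.compiler.save_cache_artifacts();
    # in practice this might come from a database rather than a file.
    with open("compile_artifacts.bin", "rb") as f:
        artifact_bytes = f.read()

    # Pre-populate the torch.compile caches before compiling anything
    torch.compiler.load_cache_artifacts(artifact_bytes)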
This operation populates all the modular caches that will be discussed in the next section, including ``PGO``, ``AOTAutograd``, ``Inductor``, ``Triton``, and ``Autotuning``.
Modular caching of ``TorchDynamo``, ``TorchInductor``, and ``Triton``
----------------------------------------------------------------------
The above described ``Mega-Cache`` is composed of individual components that can be used without any user intervention. By default, the PyTorch compiler comes with local on-disk caches for ``TorchDynamo``, ``TorchInductor``, and ``Triton``. These caches are as follows:
* ``FXGraphCache``: cache of graph-based IR components used in compilation
* ``Triton Cache``: cache of Triton compilation results (``cubin`` files generated by Triton, as well as other caching artifacts)
* ``InductorCache``: bundling of ``FXGraphCache`` and the Triton cache
* ``AOTAutogradCache``: caching of joint graph artifacts
* ``PGO-cache``: cache of dynamic shape decisions to reduce the number of recompilations
All these cache artifacts are written to ``TORCHINDUCTOR_CACHE_DIR``, which by default will look like ``/tmp/torchinductor_myusername``.
Remote Caching
----------------
We also provide a remote caching option for users who would like to take advantage of a Redis-based cache. Check out `Compile Time Caching Configurations <https://pytorch.org/tutorials/recipes/torch_compile_caching_configuration_tutorial.html>`_ to learn more about how to enable Redis-based caching.