Commit 8d959ca

ZhaoqiongZ and svekars authored
Add tutorial inductor on Windows CPU (#3062)
* add tutorial for inductor on windows cpu --------- Co-authored-by: Svetlana Karslioglu <svekars@meta.com>
1 parent 01d2270 commit 8d959ca

2 files changed

+136
-0
lines changed

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
1+
How to use TorchInductor on Windows CPU
2+
=======================================
3+
4+
**Author**: `Zhaoqiong Zheng <https://github.com/ZhaoqiongZ>`_, `Xu, Han <https://github.com/xuhancn>`_
5+
6+
7+
8+
TorchInductor is a compiler backend that transforms FX Graphs generated by TorchDynamo into highly optimized C++/Triton kernels.
9+
This tutorial shows how to use TorchInductor on Windows CPU.
10+
11+
.. grid:: 2
12+
13+
.. grid-item-card:: :octicon:`mortar-board;1em;` What you will learn
14+
:class-card: card-prerequisites
15+
16+
* How to compile and execute a Python function with PyTorch, optimized for Windows CPU
17+
* Basics of TorchInductor's optimization using C++/Triton kernels.
18+
19+
.. grid-item-card:: :octicon:`list-unordered;1em;` Prerequisites
20+
:class-card: card-prerequisites
21+
22+
* PyTorch v2.5 or later
23+
* Microsoft Visual C++ (MSVC)
24+
* Miniforge for Windows
25+
26+
Install the Required Software
27+
-----------------------------
28+
29+
First, let's install the required software. TorchInductor needs a C++ compiler to build its optimized kernels.
30+
We will use Microsoft Visual C++ (MSVC) for this example.
31+
32+
1. Download and install `MSVC <https://visualstudio.microsoft.com/downloads/>`_.
33+
34+
2. During installation, select **Desktop Development with C++** in the **Desktop & Mobile** section of the **Workloads** tab, then complete the installation.
35+
36+
.. note::
37+
38+
For better performance, we recommend the `Clang <https://github.com/llvm/llvm-project/releases>`_ or `Intel Compiler <https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html>`_ C++ compilers.
39+
See `Using an Alternative Compiler for Better Performance`_ for details.
40+
41+
3. Download and install `Miniforge3-Windows-x86_64.exe <https://github.com/conda-forge/miniforge/releases/latest/>`__.
42+
43+
Set Up the Environment
44+
----------------------
45+
46+
#. Open the command line environment via ``cmd.exe``.
47+
#. Activate the MSVC build environment with the following command:
48+
49+
.. code-block:: sh
50+
51+
"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Auxiliary/Build/vcvars64.bat"
52+
#. Activate ``conda`` with the following command:
53+
54+
.. code-block:: sh
55+
56+
"C:/ProgramData/miniforge3/Scripts/activate.bat"
57+
#. Create and activate a custom conda environment:
58+
59+
.. code-block:: sh
60+
61+
conda create -n inductor_cpu_windows python=3.10 -y
62+
conda activate inductor_cpu_windows
63+
64+
#. Install `PyTorch 2.5 <https://pytorch.org/get-started/locally/>`_ or later.
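
Before compiling anything, it can help to confirm that the setup steps above took effect. The following is a minimal sketch rather than part of the original tutorial. It uses only the Python standard library, the helper name ``check_environment`` is hypothetical, and it assumes that ``vcvars64.bat`` puts ``cl`` on ``PATH`` and that conda records the active environment name in ``CONDA_DEFAULT_ENV``.

```python
import os
import shutil


def check_environment(expected_env="inductor_cpu_windows", which=shutil.which, environ=None):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    environ = os.environ if environ is None else environ
    problems = []
    # vcvars64.bat is expected to put the MSVC compiler driver (cl) on PATH.
    if which("cl") is None:
        problems.append("cl not found on PATH; run vcvars64.bat first")
    # conda records the name of the active environment in CONDA_DEFAULT_ENV.
    active = environ.get("CONDA_DEFAULT_ENV")
    if active != expected_env:
        problems.append(
            f"active conda environment is {active!r}, expected {expected_env!r}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run the script from the same ``cmd.exe`` session used for the steps above; no output means both checks passed.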
65+
66+
Using TorchInductor on Windows CPU
67+
----------------------------------
68+
69+
Here’s a simple example to demonstrate how to use TorchInductor:
70+
71+
.. code-block:: python

    import torch


    def foo(x, y):
        # y is accepted but not used; only x contributes to the result.
        a = torch.sin(x)
        b = torch.cos(x)
        return a + b


    # torch.compile uses TorchInductor as its default backend.
    opt_foo1 = torch.compile(foo)
    print(opt_foo1(torch.randn(10, 10), torch.randn(10, 10)))
81+
82+
The code above prints output similar to the following (exact values vary because the inputs are random):
83+
84+
.. code-block:: sh
85+
86+
tensor([[-3.9074e-02, 1.3994e+00, 1.3894e+00, 3.2630e-01, 8.3060e-01,
87+
1.1833e+00, 1.4016e+00, 7.1905e-01, 9.0637e-01, -1.3648e+00],
88+
[ 1.3728e+00, 7.2863e-01, 8.6888e-01, -6.5442e-01, 5.6790e-01,
89+
5.2025e-01, -1.2647e+00, 1.2684e+00, -1.2483e+00, -7.2845e-01],
90+
[-6.7747e-01, 1.2028e+00, 1.1431e+00, 2.7196e-02, 5.5304e-01,
91+
6.1945e-01, 4.6654e-01, -3.7376e-01, 9.3644e-01, 1.3600e+00],
92+
[-1.0157e-01, 7.7200e-02, 1.0146e+00, 8.8175e-02, -1.4057e+00,
93+
8.8119e-01, 6.2853e-01, 3.2773e-01, 8.5082e-01, 8.4615e-01],
94+
[ 1.4140e+00, 1.2130e+00, -2.0762e-01, 3.3914e-01, 4.1122e-01,
95+
8.6895e-01, 5.8852e-01, 9.3310e-01, 1.4101e+00, 9.8318e-01],
96+
[ 1.2355e+00, 7.9290e-02, 1.3707e+00, 1.3754e+00, 1.3768e+00,
97+
9.8970e-01, 1.1171e+00, -5.9944e-01, 1.2553e+00, 1.3394e+00],
98+
[-1.3428e+00, 1.8400e-01, 1.1756e+00, -3.0654e-01, 9.7973e-01,
99+
1.4019e+00, 1.1886e+00, -1.9194e-01, 1.3632e+00, 1.1811e+00],
100+
[-7.1615e-01, 4.6622e-01, 1.2089e+00, 9.2011e-01, 1.0659e+00,
101+
9.0892e-01, 1.1932e+00, 1.3888e+00, 1.3898e+00, 1.3218e+00],
102+
[ 1.4139e+00, -1.4000e-01, 9.1192e-01, 3.0175e-01, -9.6432e-01,
103+
-1.0498e+00, 1.4115e+00, -9.3212e-01, -9.0964e-01, 1.0127e+00],
104+
[ 5.7244e-04, 1.2799e+00, 1.3595e+00, 1.0907e+00, 3.7191e-01,
105+
1.4062e+00, 1.3672e+00, 6.8502e-02, 8.5216e-01, 8.6046e-01]])
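
Whatever the random inputs are, every printed entry should be bounded: since sin(x) + cos(x) = √2 · sin(x + π/4), each value lies in [-√2, √2], roughly [-1.4142, 1.4142]. The sketch below checks this identity numerically with the standard library only, as a quick sanity check on the numbers a compiled function returns. The helper name ``sin_plus_cos`` is illustrative and is not part of the tutorial.

```python
import math
import random


def sin_plus_cos(x):
    """Reference value for one element of foo(x, y), computed with the math module."""
    return math.sin(x) + math.cos(x)


random.seed(0)
# Standard normal samples, mirroring the distribution used by torch.randn.
samples = [random.gauss(0.0, 1.0) for _ in range(1000)]
bound = math.sqrt(2.0)

for x in samples:
    value = sin_plus_cos(x)
    # Identity: sin(x) + cos(x) == sqrt(2) * sin(x + pi/4)
    assert math.isclose(value, bound * math.sin(x + math.pi / 4), abs_tol=1e-12)
    # Consequently every output element lies within [-sqrt(2), sqrt(2)].
    assert abs(value) <= bound + 1e-12
```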
106+
107+
Using an Alternative Compiler for Better Performance
108+
----------------------------------------------------
109+
110+
To improve Inductor performance on Windows, you can use the Intel Compiler or the LLVM Compiler. Both rely on the runtime libraries from Microsoft Visual C++ (MSVC), so install MSVC first.
111+
112+
Intel Compiler
113+
^^^^^^^^^^^^^^
114+
115+
#. Download and install the Windows version of the `Intel Compiler <https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler-download.html>`_.
116+
#. Select it as the compiler that Inductor uses on Windows by setting the ``CXX`` environment variable: ``set CXX=icx-cl``.
117+
118+
LLVM Compiler
119+
^^^^^^^^^^^^^
120+
121+
#. Download and install the win64 version of the `LLVM Compiler <https://github.com/llvm/llvm-project/releases>`_.
122+
#. Select it as the compiler that Inductor uses on Windows by setting the ``CXX`` environment variable: ``set CXX=clang-cl``.
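
The ``set CXX=...`` commands above apply only to the current ``cmd.exe`` session. As an alternative, the variable can be set from Python for the current process. The sketch below is an illustration under stated assumptions, not the tutorial's method: it assumes Inductor reads ``CXX`` when it is first loaded, so the variable is set before ``torch`` is imported, and the helper name ``choose_compiler`` and its preference order are hypothetical.

```python
import os
import shutil


def choose_compiler(candidates=("icx-cl", "clang-cl", "cl"), which=shutil.which):
    """Return the first candidate compiler found on PATH, or None if none is found."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


compiler = choose_compiler()
if compiler is not None:
    # Same effect as `set CXX=<compiler>` in cmd.exe, but only for this process.
    os.environ["CXX"] = compiler
    print(f"CXX set to {compiler}")
else:
    print("No supported C++ compiler found on PATH; CXX left unchanged")

# Import torch only after CXX is set (assumption: the variable is read at load time).
# import torch
```

Listing ``cl`` last keeps MSVC as the fallback when neither alternative compiler is installed.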
123+
124+
Conclusion
125+
----------
126+
127+
In this tutorial, we have learned how to use Inductor on Windows CPU with PyTorch. In addition, we discussed
128+
further performance improvements with the Intel Compiler and the LLVM Compiler.

prototype_source/prototype_index.rst

Lines changed: 8 additions & 0 deletions
@@ -217,6 +217,13 @@ Prototype features are not available as part of binary distributions like PyPI o
217217
:link: ../prototype/inductor_cpp_wrapper_tutorial.html
218218
:tags: Model-Optimization
219219

220+
.. customcarditem::
221+
:header: Inductor Windows CPU Tutorial
222+
:card_description: Speed up your models with Inductor on Windows CPU
223+
:image: ../_static/img/thumbnails/cropped/generic-pytorch-logo.png
224+
:link: ../prototype/inductor_windows_cpu.html
225+
:tags: Model-Optimization
226+
220227
.. Distributed
221228
.. customcarditem::
222229
:header: Flight Recorder Tutorial
@@ -249,6 +256,7 @@ Prototype features are not available as part of binary distributions like PyPI o
249256
prototype/flight_recorder_tutorial.html
250257
prototype/graph_mode_dynamic_bert_tutorial.html
251258
prototype/inductor_cpp_wrapper_tutorial.html
259+
prototype/inductor_windows_cpu.html
252260
prototype/pt2e_quantizer.html
253261
prototype/pt2e_quant_ptq.html
254262
prototype/pt2e_quant_qat.html
