|
| 1 | +How to use TorchInductor on Windows CPU |
| 2 | +======================================= |
| 3 | + |
| 4 | +**Author**: `Zhaoqiong Zheng <https://github.com/ZhaoqiongZ>`_, `Xu, Han <https://github.com/xuhancn>`_ |
| 5 | + |
| 6 | + |
| 7 | + |
| 8 | +TorchInductor is a compiler backend that transforms FX Graphs generated by TorchDynamo into highly optimized C++/Triton kernels. |
| 9 | +This tutorial will guide you through the process of using TorchInductor on a Windows CPU. |
| 10 | + |
| 11 | +.. grid:: 2 |
| 12 | + |
| 13 | +    .. grid-item-card:: :octicon:`mortar-board;1em;` What you will learn |
| 14 | +       :class-card: card-prerequisites |
| 15 | + |
| 16 | +       * How to compile and execute a Python function with PyTorch, optimized for Windows CPU |
| 17 | +       * Basics of TorchInductor's optimization using C++/Triton kernels |
| 18 | + |
| 19 | +    .. grid-item-card:: :octicon:`list-unordered;1em;` Prerequisites |
| 20 | +       :class-card: card-prerequisites |
| 21 | + |
| 22 | +       * PyTorch v2.5 or later |
| 23 | +       * Microsoft Visual C++ (MSVC) |
| 24 | +       * Miniforge for Windows |
| 25 | + |
| 26 | +Install the Required Software |
| 27 | +----------------------------- |
| 28 | + |
| 29 | +First, install the required software. TorchInductor needs a C++ compiler to build the kernels it generates. |
| 30 | +This example uses Microsoft Visual C++ (MSVC). |
| 31 | + |
| 32 | +1. Download and install `MSVC <https://visualstudio.microsoft.com/downloads/>`_. |
| 33 | + |
| 34 | +2. During installation, select **Desktop Development with C++** in the **Desktop & Mobile** section of the **Workloads** tab, then complete the installation. |
| 35 | + |
| 36 | +.. note:: |
| 37 | + |
| 38 | +   For better performance, we recommend the `Clang <https://github.com/llvm/llvm-project/releases>`_ or `Intel Compiler <https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html>`_ C++ compilers. |
| 39 | +   See `Using an Alternative Compiler for Better Performance <#using-an-alternative-compiler-for-better-performance>`_ for details. |
| 40 | + |
| 41 | +3. Download and install `Miniforge3-Windows-x86_64.exe <https://github.com/conda-forge/miniforge/releases/latest/>`__. |
| 42 | + |
| 43 | +Set Up the Environment |
| 44 | +---------------------- |
| 45 | + |
| 46 | +#. Open the command line environment via ``cmd.exe``. |
| 47 | +#. Activate ``MSVC`` with the following command: |
| 48 | + |
| 49 | +   .. code-block:: sh |
| 50 | + |
| 51 | +      "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Auxiliary/Build/vcvars64.bat" |
| 52 | +#. Activate ``conda`` with the following command: |
| 53 | + |
| 54 | +   .. code-block:: sh |
| 55 | + |
| 56 | +      "C:/ProgramData/miniforge3/Scripts/activate.bat" |
| 57 | +#. Create and activate a custom conda environment: |
| 58 | + |
| 59 | +   .. code-block:: sh |
| 60 | + |
| 61 | +      conda create -n inductor_cpu_windows python=3.10 -y |
| 62 | +      conda activate inductor_cpu_windows |
| 63 | + |
| 64 | +#. Install `PyTorch 2.5 <https://pytorch.org/get-started/locally/>`_ or later. |
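Before compiling anything, it can help to confirm that the environment set up above is actually visible to Python. The following is a minimal sketch, not part of the official setup: the helper name ``check_inductor_build_env`` is hypothetical, and it assumes the compiler executables are named ``cl``, ``clang-cl``, and ``icx-cl`` and that ``conda activate`` sets the ``CONDA_DEFAULT_ENV`` variable.

```python
import os
import shutil
import sys


def check_inductor_build_env(compilers=("cl", "clang-cl", "icx-cl")):
    """Report which C++ compilers are on PATH and which conda env is active.

    Hypothetical helper for illustration; it only inspects the environment
    and does not import or configure PyTorch.
    """
    # shutil.which returns the full path of the executable, or None if absent.
    found = {name: shutil.which(name) for name in compilers}
    return {
        "python": sys.version.split()[0],
        # CONDA_DEFAULT_ENV is None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "compilers": found,
    }


if __name__ == "__main__":
    report = check_inductor_build_env()
    print(f"Python {report['python']}, conda env: {report['conda_env']}")
    for name, path in report["compilers"].items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If ``cl`` is reported as not found, the ``vcvars64.bat`` step was most likely skipped or run in a different ``cmd.exe`` window.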
| 65 | + |
| 66 | +Using TorchInductor on Windows CPU |
| 67 | +---------------------------------- |
| 68 | + |
| 69 | +Here’s a simple example to demonstrate how to use TorchInductor: |
| 70 | + |
| 71 | +.. code-block:: python |
| 72 | + |
| 73 | + |
| 74 | +    import torch |
| 75 | +    def foo(x, y): |
| 76 | +        a = torch.sin(x) |
| 77 | +        b = torch.cos(x) |
| 78 | +        return a + b |
| 79 | +    opt_foo1 = torch.compile(foo)  # TorchInductor is the default backend |
| 80 | +    print(opt_foo1(torch.randn(10, 10), torch.randn(10, 10))) |
| 81 | + |
| 82 | +The code above prints output similar to the following (the exact values vary because the inputs are random): |
| 83 | + |
| 84 | +.. code-block:: sh |
| 85 | + |
| 86 | + tensor([[-3.9074e-02, 1.3994e+00, 1.3894e+00, 3.2630e-01, 8.3060e-01, |
| 87 | + 1.1833e+00, 1.4016e+00, 7.1905e-01, 9.0637e-01, -1.3648e+00], |
| 88 | + [ 1.3728e+00, 7.2863e-01, 8.6888e-01, -6.5442e-01, 5.6790e-01, |
| 89 | + 5.2025e-01, -1.2647e+00, 1.2684e+00, -1.2483e+00, -7.2845e-01], |
| 90 | + [-6.7747e-01, 1.2028e+00, 1.1431e+00, 2.7196e-02, 5.5304e-01, |
| 91 | + 6.1945e-01, 4.6654e-01, -3.7376e-01, 9.3644e-01, 1.3600e+00], |
| 92 | + [-1.0157e-01, 7.7200e-02, 1.0146e+00, 8.8175e-02, -1.4057e+00, |
| 93 | + 8.8119e-01, 6.2853e-01, 3.2773e-01, 8.5082e-01, 8.4615e-01], |
| 94 | + [ 1.4140e+00, 1.2130e+00, -2.0762e-01, 3.3914e-01, 4.1122e-01, |
| 95 | + 8.6895e-01, 5.8852e-01, 9.3310e-01, 1.4101e+00, 9.8318e-01], |
| 96 | + [ 1.2355e+00, 7.9290e-02, 1.3707e+00, 1.3754e+00, 1.3768e+00, |
| 97 | + 9.8970e-01, 1.1171e+00, -5.9944e-01, 1.2553e+00, 1.3394e+00], |
| 98 | + [-1.3428e+00, 1.8400e-01, 1.1756e+00, -3.0654e-01, 9.7973e-01, |
| 99 | + 1.4019e+00, 1.1886e+00, -1.9194e-01, 1.3632e+00, 1.1811e+00], |
| 100 | + [-7.1615e-01, 4.6622e-01, 1.2089e+00, 9.2011e-01, 1.0659e+00, |
| 101 | + 9.0892e-01, 1.1932e+00, 1.3888e+00, 1.3898e+00, 1.3218e+00], |
| 102 | + [ 1.4139e+00, -1.4000e-01, 9.1192e-01, 3.0175e-01, -9.6432e-01, |
| 103 | + -1.0498e+00, 1.4115e+00, -9.3212e-01, -9.0964e-01, 1.0127e+00], |
| 104 | + [ 5.7244e-04, 1.2799e+00, 1.3595e+00, 1.0907e+00, 3.7191e-01, |
| 105 | + 1.4062e+00, 1.3672e+00, 6.8502e-02, 8.5216e-01, 8.6046e-01]]) |
| 106 | + |
| 107 | +Using an Alternative Compiler for Better Performance |
| 108 | +---------------------------------------------------- |
| 109 | + |
| 110 | +To improve TorchInductor performance on Windows, you can use the Intel Compiler or the LLVM Compiler instead of MSVC. Both rely on the runtime libraries from Microsoft Visual C++ (MSVC), so install MSVC first. |
| 111 | + |
| 112 | +Intel Compiler |
| 113 | +^^^^^^^^^^^^^^ |
| 114 | + |
| 115 | +#. Download and install the Windows version of the `Intel Compiler <https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler-download.html>`_. |
| 116 | +#. Select it as the Inductor C++ compiler by setting the ``CXX`` environment variable: ``set CXX=icx-cl``. |
| 117 | + |
| 118 | +LLVM Compiler |
| 119 | +^^^^^^^^^^^^^ |
| 120 | + |
| 121 | +#. Download and install the win64 version of the `LLVM Compiler <https://github.com/llvm/llvm-project/releases>`_. |
| 122 | +#. Select it as the Inductor C++ compiler by setting the ``CXX`` environment variable: ``set CXX=clang-cl``. |
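The ``CXX`` variable can also be set from Python instead of the shell, which is convenient in scripts. The sketch below is illustrative only: the helper ``select_inductor_compiler`` is hypothetical, the preference order (Intel, then LLVM, then MSVC) is an assumption rather than a recommendation from PyTorch, and it assumes Inductor reads ``CXX`` when it is first imported, so the variable should be set before ``import torch``.

```python
import os
import shutil


def select_inductor_compiler(candidates=("icx-cl", "clang-cl", "cl")):
    """Set CXX to the first candidate found on PATH and return its name.

    Hypothetical helper; returns None and leaves CXX untouched when none
    of the candidates is available.
    """
    for name in candidates:
        if shutil.which(name):
            os.environ["CXX"] = name
            return name
    return None


chosen = select_inductor_compiler()
print(f"CXX = {chosen}")

# Import torch only after CXX is set (assumption: Inductor reads CXX on import).
# import torch
```

Setting the variable this way affects only the current Python process, so it does not change the compiler used by other shells or scripts.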
| 123 | + |
| 124 | +Conclusion |
| 125 | +---------- |
| 126 | + |
| 127 | +In this tutorial, you learned how to use TorchInductor on a Windows CPU with PyTorch. We also discussed |
| 128 | +how the Intel Compiler and the LLVM Compiler can further improve performance. |