Commit 4a2d17d

Fix pt2e quant ptq x86 title issue
1 parent 219a9e3 · commit 4a2d17d

File tree

1 file changed: +10 −10 lines


prototype_source/pt2e_quant_ptq_x86_inductor.rst

Lines changed: 10 additions & 10 deletions
@@ -4,14 +4,14 @@ PyTorch 2 Export Post Training Quantization with X86 Backend through Inductor
 **Author**: `Leslie Fang <https://github.com/leslie-fang-intel>`_, `Weiwen Xia <https://github.com/Xia-Weiwen>`_, `Jiong Gong <https://github.com/jgong5>`_, `Jerry Zhang <https://github.com/jerryzh168>`_
 
 Prerequisites
-^^^^^^^^^^^^^^^
+-------------
 
 - `PyTorch 2 Export Post Training Quantization <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq.html>`_
 - `TorchInductor and torch.compile concepts in PyTorch <https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html>`_
 - `Inductor C++ Wrapper concepts <https://pytorch.org/tutorials/prototype/inductor_cpp_wrapper_tutorial.html>`_
 
 Introduction
-^^^^^^^^^^^^^^
+------------
 
 This tutorial introduces the steps for utilizing the PyTorch 2 Export Quantization flow to generate a quantized model customized
 for the x86 inductor backend and explains how to lower the quantized model into the inductor.
@@ -63,8 +63,8 @@ further boost the models' performance by leveraging the
 
 Now, we will walk you through a step-by-step tutorial for how to use it with `torchvision resnet18 model <https://download.pytorch.org/models/resnet18-f37072fd.pth>`_.
 
-1. Capture FX Graph
----------------------
+Capture FX Graph
+----------------
 
 We will start by performing the necessary imports, capturing the FX Graph from the eager module.
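For context, the "Capture FX Graph" step that this renamed heading introduces looks roughly like the sketch below. It assumes the PyTorch 2.1-era PT2E export entry point ``capture_pre_autograd_graph``; the tutorial's exact code may differ.

    import torch
    import torchvision.models as models
    from torch._export import capture_pre_autograd_graph

    # Load a pretrained eager-mode resnet18 and switch to inference mode.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
    example_inputs = (torch.randn(1, 3, 224, 224),)

    # Trace the eager module into an FX GraphModule for PT2E quantization.
    exported_model = capture_pre_autograd_graph(model, example_inputs)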

@@ -110,8 +110,8 @@ We will start by performing the necessary imports, capturing the FX Graph from t
 
 Next, we will have the FX Module to be quantized.
 
-2. Apply Quantization
-----------------------------
+Apply Quantization
+------------------
 
 After we capture the FX Module to be quantized, we will import the Backend Quantizer for X86 CPU and configure how to
 quantize the model.
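A minimal sketch of the "Apply Quantization" step, continuing from the capture sketch above and assuming the ``X86InductorQuantizer`` API referenced later in this diff; the calibration loop is a placeholder for real representative data.

    import torch
    import torch.ao.quantization.quantizer.x86_inductor_quantizer as xiq
    from torch.ao.quantization.quantizer.x86_inductor_quantizer import X86InductorQuantizer
    from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e

    # Configure the backend quantizer with the default x86 quantization config.
    quantizer = X86InductorQuantizer()
    quantizer.set_global(xiq.get_default_x86_inductor_quantization_config())

    # Insert observers, calibrate on representative inputs, then convert.
    prepared_model = prepare_pt2e(exported_model, quantizer)
    with torch.no_grad():
        for _ in range(4):
            prepared_model(*example_inputs)  # placeholder calibration data
    quantized_model = convert_pt2e(prepared_model)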
@@ -159,8 +159,8 @@ Finally, we will convert the calibrated Model to a quantized Model. ``convert_pt
 After these steps, we finished running the quantization flow and we will get the quantized model.
 
 
-3. Lower into Inductor
-------------------------
+Lower into Inductor
+-------------------
 
 After we get the quantized model, we will further lower it to the inductor backend. The default Inductor wrapper
 generates Python code to invoke both generated kernels and external kernels. Additionally, Inductor supports
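The "Lower into Inductor" step can be sketched as below, continuing from the quantized model above. Enabling the C++ wrapper and freezing are assumed optional knobs here (the C++ wrapper is the feature the linked tutorial covers); exact flag names may vary by release.

    import torch
    import torch._inductor.config as inductor_config

    # Optional: emit C++ rather than Python wrapper code, and freeze weights
    # so Inductor can fold quantization constants at compile time.
    inductor_config.cpp_wrapper = True
    inductor_config.freezing = True

    with torch.no_grad():
        optimized_model = torch.compile(quantized_model)
        result = optimized_model(*example_inputs)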
@@ -222,8 +222,8 @@ With PyTorch 2.1 release, all CNN models from TorchBench test suite have been me
222222
to `this document <https://dev-discuss.pytorch.org/t/torchinductor-update-6-cpu-backend-performance-update-and-new-features-in-pytorch-2-1/1514#int8-inference-with-post-training-static-quantization-3>`_
223223
for detail benchmark number.
224224

225-
4. Conclusion
226-
---------------
225+
Conclusion
226+
----------
227227

228228
With this tutorial, we introduce how to use Inductor with X86 CPU in PyTorch 2 Quantization. Users can learn about
229229
how to use ``X86InductorQuantizer`` to quantize a model and lower it into the inductor with X86 CPU devices.
