Add how to use C++ wrapper with X86InductorQuantizer #2716

Merged
14 changes: 13 additions & 1 deletion prototype_source/pt2e_quant_ptq_x86_inductor.rst
@@ -8,6 +8,7 @@ Prerequisites

- `PyTorch 2 Export Post Training Quantization <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq.html>`_
- `TorchInductor and torch.compile concepts in PyTorch <https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html>`_
+- `Inductor C++ Wrapper concepts <https://pytorch.org/tutorials/prototype/inductor_cpp_wrapper_tutorial.html>`_

Introduction
^^^^^^^^^^^^^^
@@ -161,7 +162,18 @@ After these steps, we finished running the quantization flow and we will get the
3. Lower into Inductor
------------------------

-After we get the quantized model, we will further lower it to the inductor backend.
+After we get the quantized model, we will further lower it to the Inductor backend. The default Inductor wrapper
+generates Python code to invoke both generated kernels and external kernels. Inductor also supports a
+C++ wrapper that generates pure C++ code. This allows the generated and external kernels to be integrated seamlessly,
+reducing Python overhead. In the future, the C++ wrapper could also be extended
+to support pure C++ deployment. For more details about the C++ wrapper, please refer to the
+`Inductor C++ Wrapper Tutorial <https://pytorch.org/tutorials/prototype/inductor_cpp_wrapper_tutorial.html>`_.
+
+::
+
+    # Optional: use the C++ wrapper instead of the default Python wrapper
+    import torch._inductor.config as config
+    config.cpp_wrapper = True

::

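As an illustration of the optional C++ wrapper setting added in this change, the following is a minimal sketch of enabling it only for one compilation instead of setting it globally. It assumes that ``torch._inductor.config.patch`` is available as a context manager in the installed PyTorch version (it lives in a private module and may change) and that a working C++ compiler is present. The helper name ``run_with_cpp_wrapper`` and its arguments are hypothetical and are not part of the tutorial.

```python
import torch
import torch._inductor.config as inductor_config


def run_with_cpp_wrapper(model, example_inputs):
    """Compile `model` with the Inductor C++ wrapper and run it once.

    Hypothetical helper for illustration only. `model` stands in for the
    quantized model produced by the earlier steps, and `example_inputs`
    is a tuple of input tensors for it.
    """
    # The setting is applied only inside this block; the previous value of
    # `cpp_wrapper` is restored when the block exits.
    with inductor_config.patch({"cpp_wrapper": True}), torch.no_grad():
        optimized_model = torch.compile(model)
        # torch.compile is lazy: compilation is triggered by this first
        # call, so the call must happen while the setting is active.
        return optimized_model(*example_inputs)
```

A call such as ``run_with_cpp_wrapper(quantized_model, example_inputs)``, where both names are placeholders, would then run the model once with the C++ wrapper. Scoping the option this way avoids changing the wrapper choice for other ``torch.compile`` calls in the same process.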