
[quant][pt2e][tutorials] Some fixes to pt2e quant docs #2513


Merged: 2 commits, Jul 29, 2023
5 changes: 4 additions & 1 deletion prototype_source/pt2e_quant_ptq_static.rst
@@ -302,12 +302,13 @@ For post training quantization, we'll need to set the model to eval mode.
``Quantizer`` is backend specific, and each ``Quantizer`` provides its own way for users to configure their model. As an example, here are the different configuration APIs supported by ``XNNPackQuantizer``:

.. code:: python

    quantizer.set_global(qconfig_opt)  # qconfig_opt is an optional quantization config: a valid config or None
    quantizer.set_object_type(torch.nn.Conv2d, qconfig_opt)  # configure by module type
    quantizer.set_object_type(torch.nn.functional.linear, qconfig_opt)  # or by torch functional op
    quantizer.set_module_name("foo.bar", qconfig_opt)  # or by fully qualified submodule name
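
For illustration, here is a minimal sketch of how such a configuration could look end to end with ``XNNPACKQuantizer``. The import path and the ``get_symmetric_quantization_config`` helper follow the ``torch.ao.quantization`` namespace linked later in this PR; treat the exact names as assumptions.

.. code:: python

    # Hedged sketch: configure XNNPACKQuantizer globally and per module name.
    # Import path and helper name are assumptions based on torch.ao.quantization.
    from torch.ao.quantization.quantizer.xnnpack_quantizer import (
        XNNPACKQuantizer,
        get_symmetric_quantization_config,
    )

    quantizer = XNNPACKQuantizer()
    quantization_config = get_symmetric_quantization_config()
    quantizer.set_global(quantization_config)   # default config for the whole model
    quantizer.set_module_name("foo.bar", None)  # passing None skips quantizing submodule "foo.bar"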

-We have another `tutorial <https://pytorch.org/tutorials/prototype/quantization_in_pytorch_2_0_export_tutorial.html>`_ that talks about how to write a new ``Quantizer``.
+We have another `tutorial <https://pytorch.org/tutorials/prototype/pt2e_quantizer.html>`_ that talks about how to write a new ``Quantizer``.

6. Prepare the Model for Post Training Static Quantization
----------------------------------------------------------
@@ -396,6 +397,7 @@ quantization you are using to learn more about how you can have more control over
We'll show how to save and load the quantized model.

.. code:: python

    # 1. Save state_dict
    pt2e_quantized_model_file_path = saved_model_dir + "resnet18_pt2e_quantized.pth"
    torch.save(quantized_model.state_dict(), pt2e_quantized_model_file_path)
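
For completeness, a hedged sketch of the matching load step follows. It assumes ``quantized_model`` has been re-created by re-running the export, prepare, and convert steps shown earlier, since the ``state_dict`` alone does not capture the graph.

.. code:: python

    # 2. Load the state_dict back into a re-created quantized model (sketch).
    # Assumes `quantized_model` was rebuilt via the same export/prepare/convert
    # steps; torch.save/torch.load only handle the weights here.
    loaded_state_dict = torch.load(pt2e_quantized_model_file_path)
    quantized_model.load_state_dict(loaded_state_dict)
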
@@ -434,6 +436,7 @@ We'll show how to save and load the quantized model.

11. Debugging Quantized Model
-----------------------------

We have `Numeric Suite <https://pytorch.org/docs/stable/quantization-accuracy-debugging.html#numerical-debugging-tooling-prototype>`_ that can help with debugging in eager mode and FX graph mode. The new version of Numeric Suite that works with PyTorch 2.0 Export models is still under development.
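
Until then, a simple manual check is to compare the float and quantized outputs on the same input. The sketch below uses plain PyTorch (not Numeric Suite) and assumes ``float_model`` and ``quantized_model`` from the earlier steps; the input shape is an assumption matching a resnet18-style model.

.. code:: python

    # Manual debugging sketch (plain PyTorch, not Numeric Suite): compare the
    # float and quantized model outputs on the same input and report SQNR.
    import torch

    def sqnr(float_out, quant_out):
        noise = float_out - quant_out
        return (10 * torch.log10(float_out.pow(2).mean() / noise.pow(2).mean())).item()

    example_input = torch.randn(1, 3, 224, 224)  # assumed resnet18-style input
    with torch.no_grad():
        print("SQNR (dB):", sqnr(float_model(example_input), quantized_model(example_input)))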

12. Lowering and Performance Evaluation
10 changes: 7 additions & 3 deletions prototype_source/pt2e_quantizer.rst
@@ -10,12 +10,16 @@ Prerequisites:

Required:
- `Torchdynamo concepts in PyTorch <https://pytorch.org/docs/stable/dynamo/index.html>`__

- `Quantization concepts in PyTorch <https://pytorch.org/docs/master/quantization.html#quantization-api-summary>`__

- `(prototype) PyTorch 2.0 Export Post Training Static Quantization <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq_static.html>`__

Optional:
- `FX Graph Mode post training static quantization <https://pytorch.org/tutorials/prototype/fx_graph_mode_ptq_static.html>`__

- `BackendConfig in PyTorch Quantization FX Graph Mode <https://pytorch.org/tutorials/prototype/backend_config_tutorial.html?highlight=backend>`__

- `QConfig and QConfigMapping in PyTorch Quantization FX Graph Mode <https://pytorch.org/tutorials/prototype/backend_config_tutorial.html#set-up-qconfigmapping-that-satisfies-the-backend-constraints>`__

Introduction
@@ -25,12 +29,12 @@ Introduction
(1) What quantized operators or patterns are supported in the backend
(2) How users can express the way they want their floating point model to be quantized, for example, quantize the whole model with int8 symmetric quantization, or quantize only linear layers, etc.

-Please see `here <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq_static.html>`__ For motivations for ``Quantizer``.
+Please see `here <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq_static.html#motivation-of-pytorch-2-0-export-quantization>`__ for motivations for the new API and ``Quantizer``.

An existing quantizer object defined for ``XNNPACK`` is in
`XNNPACKQuantizer <https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/pt2e/quantizer/xnnpack_quantizer.py>`__

-Annotation API:
+Annotation API
^^^^^^^^^^^^^^^^^^^

``Quantizer`` uses the annotation API to convey quantization intent for different operators and patterns.
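
As a rough illustration, here is a hedged sketch of what an annotation could look like inside a custom ``Quantizer``'s ``annotate`` method. The class and field names follow ``torch.ao.quantization.quantizer``; the exact aten operator target, the observer choice, and the ``model`` variable (the captured graph module) are assumptions.

.. code:: python

    # Hedged sketch: annotate convolution nodes with an int8 per-tensor spec
    # inside a custom Quantizer's annotate() method. `model` is assumed to be
    # the captured GraphModule; the aten target is an assumption.
    import torch
    from torch.ao.quantization.observer import HistogramObserver
    from torch.ao.quantization.quantizer import (
        QuantizationAnnotation,
        QuantizationSpec,
    )

    act_qspec = QuantizationSpec(
        dtype=torch.int8,
        quant_min=-128,
        quant_max=127,
        qscheme=torch.per_tensor_affine,
        observer_or_fake_quant_ctr=HistogramObserver,
    )

    for node in model.graph.nodes:
        if node.target is torch.ops.aten.convolution.default:
            node.meta["quantization_annotation"] = QuantizationAnnotation(
                input_qspec_map={node.args[0]: act_qspec},  # quantize the input activation
                output_qspec=act_qspec,                     # and the output activation
                _annotated=True,
            )
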
@@ -269,4 +273,4 @@ Conclusion
With this tutorial, we introduce the new quantization path in PyTorch 2.0. Users can learn
how to define a ``BackendQuantizer`` with the ``QuantizationAnnotation API`` and integrate it into the quantization 2.0 flow.
Examples of ``QuantizationSpec``, ``SharedQuantizationSpec``, ``FixedQParamsQuantizationSpec``, and ``DerivedQuantizationSpec``
-are given for specific annotation use case. This is a prerequisite to be able to quantize a model in PyTorch 2.0 Export Quantization flow. Please follow `this tutorial <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq_static.html>`_ to actually quantize a model.
+are given for specific annotation use cases. This is a prerequisite for quantizing a model in the PyTorch 2.0 Export Quantization flow. You can use `XNNPACKQuantizer <https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/quantizer/xnnpack_quantizer.py>`_ as an example to start implementing your own ``Quantizer``. After that, please follow `this tutorial <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq_static.html>`_ to actually quantize your model.