prototype_source/pt2e_quant_ptq_static.rst
Lines changed: 34 additions & 15 deletions
@@ -430,23 +430,42 @@ Convert the Calibrated Model to a Quantized Model
print(quantized_model)
.. note::
- the model produced here also had some improvement upon the previous
- `representations <https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md>`_ in the FX graph mode quantization. Previously, all quantized operators were represented as ``dequantize -> fp32_op -> quantize``; in the new flow, we choose to represent some of the operators with integer computation so that it is closer to the computation that happens in hardware.
- For example, here is how we plan to represent a quantized linear operator:
+ At this step, we currently have two representations that you can choose from, but the exact representation
+ we offer in the long term might change based on feedback from users.
- .. code-block:: python
+ * Q/DQ Representation (default)
+ See the previous documentation for `representations <https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md>`_: all quantized operators are represented as ``dequantize -> fp32_op -> quantize``.
* Reference Quantized Model Representation (WIP, expected to be ready at end of August): we use a special representation for selected ops (e.g. quantized linear); other ops are represented as ``dq -> float32_op -> q``, and q/dq are decomposed into more primitive operators.
+
+ You can get this representation by calling ``convert_pt2e(..., use_reference_representation=True)``.
+
+ .. code-block:: python
+ # Reference Quantized Pattern for quantized linear
Please see `here <https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/pt2e/representation/rewrite.py>`_ for the most up-to-date reference representations.
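To make the difference between the two representations concrete, here is a minimal pure-Python sketch for a single output element of a linear operator (no bias). It is not the pattern from ``rewrite.py`` and does not use any PyTorch API. The helper names (``quantize``, ``dequantize``, ``linear_qdq``, ``linear_int``) and the sample scales and zero points are assumptions chosen for illustration. The scales are powers of two so that both paths agree exactly; with arbitrary scales the two results can differ by a rounding step.

```python
# Illustrative sketch only: hypothetical helpers, not the PyTorch reference pattern.

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Affine quantization of a real value to a clamped int8-range integer."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to a real value."""
    return (q - zero_point) * scale


def linear_qdq(x_q, w_q, x_scale, x_zp, w_scale, w_zp, out_scale, out_zp):
    """Q/DQ style: dequantize -> float op -> quantize."""
    x_fp = [dequantize(q, x_scale, x_zp) for q in x_q]
    w_fp = [dequantize(q, w_scale, w_zp) for q in w_q]
    acc_fp = sum(a * b for a, b in zip(x_fp, w_fp))
    return quantize(acc_fp, out_scale, out_zp)


def linear_int(x_q, w_q, x_scale, x_zp, w_scale, w_zp, out_scale, out_zp,
               qmin=-128, qmax=127):
    """Integer style: accumulate in integers, then rescale once at the end."""
    acc = sum((a - x_zp) * (b - w_zp) for a, b in zip(x_q, w_q))
    multiplier = x_scale * w_scale / out_scale  # single combined rescale factor
    q = round(acc * multiplier) + out_zp
    return max(qmin, min(qmax, q))


# Sample quantized activation and weight vectors (made-up values).
x_q, w_q = [10, -4, 7], [3, 5, -6]
params = dict(x_scale=0.5, x_zp=2, w_scale=0.25, w_zp=1, out_scale=0.5, out_zp=3)

out_qdq = linear_qdq(x_q, w_q, **params)
out_int = linear_int(x_q, w_q, **params)
print(out_qdq, out_int)
```

The integer path keeps the multiply-accumulate entirely in integer arithmetic and applies the scales only once, which is the sense in which such a representation is closer to what integer hardware computes.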