prototype_source/quantization_in_pytorch_2_0_export_tutorial.rst
Lines changed: 30 additions & 35 deletions
@@ -14,54 +14,46 @@ have significantly higher model coverage, better programmability, and
 a simplified UX.
 
 Prerequisites:
------------------------
+^^^^^^^^^^^^^^^^
 
-- `Understanding of torchdynamo concepts in PyTorch <https://pytorch.org/docs/stable/dynamo/index.html>`__
-- `Understanding of the quantization concepts in PyTorch <https://pytorch.org/docs/master/quantization.html#quantization-api-summary>`__
-- `Understanding of FX Graph Mode post training static quantization <https://pytorch.org/tutorials/prototype/fx_graph_mode_ptq_static.html>`__
-- `Understanding of BackendConfig in PyTorch Quantization FX Graph Mode <https://pytorch.org/tutorials/prototype/backend_config_tutorial.html?highlight=backend>`__
-- `Understanding of QConfig and QConfigMapping in PyTorch Quantization FX Graph Mode <https://pytorch.org/tutorials/prototype/backend_config_tutorial.html#set-up-qconfigmapping-that-satisfies-the-backend-constraints>`__
+- `Torchdynamo concepts in PyTorch <https://pytorch.org/docs/stable/dynamo/index.html>`__
+- `Quantization concepts in PyTorch <https://pytorch.org/docs/master/quantization.html#quantization-api-summary>`__
+- `FX Graph Mode post training static quantization <https://pytorch.org/tutorials/prototype/fx_graph_mode_ptq_static.html>`__
+- `BackendConfig in PyTorch Quantization FX Graph Mode <https://pytorch.org/tutorials/prototype/backend_config_tutorial.html?highlight=backend>`__
+- `QConfig and QConfigMapping in PyTorch Quantization FX Graph Mode <https://pytorch.org/tutorials/prototype/backend_config_tutorial.html#set-up-qconfigmapping-that-satisfies-the-backend-constraints>`__
+
+Introduction:
+^^^^^^^^^^^^^^^^
 
 Previously in ``FX Graph Mode Quantization`` we were using ``QConfigMapping`` for users to specify how the model to be quantized
 and ``BackendConfig`` to specify the supported ways of quantization in their backend.
 This API covers most use cases relatively well, but the main problem is that this API is not fully extensible
 without involvement of the quantization team:
 
-- This API has limitation around expressing quantization intentions for complicated operator patterns such as in the discussion of
-`Issue-96288 <https://github.com/pytorch/pytorch/issues/96288>`__ to support ``conv add`` fusion.
-Supporting ``conv add`` fusion also requires some changes to current already complicated pattern matching code such as in the