``Quantizer`` is backend specific, and each ``Quantizer`` provides its
own way to allow users to configure their model. As an example, here are
the different configuration APIs supported by ``XNNPackQuantizer``:
.. code-block:: python

  quantizer.set_global(qconfig_opt)  # qconfig_opt is an optional quantization config
      .set_object_type(torch.nn.Conv2d, qconfig_opt)  # can be a module type
      .set_object_type(torch.nn.functional.linear, qconfig_opt)  # or torch functional op
      .set_module_name("foo.bar", qconfig_opt)
* Q/DQ Representation (default)

  As described in the previous documentation for `representations <https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md>`_, all quantized operators are represented as ``dequantize -> fp32_op -> quantize``.
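As a rough, scalar-level sketch of this Q/DQ pattern for a linear op (the actual exported graph operates on tensors and uses ``torch.ops.quantized_decomposed`` quantize/dequantize operators; the helper names below are hypothetical, for illustration only):

.. code-block:: python

  def dequantize(q, scale, zero_point):
      # int8 value -> fp32: (q - zero_point) * scale
      return (q - zero_point) * scale

  def quantize(x, scale, zero_point):
      # fp32 value -> int8: clamp(round(x / scale) + zero_point, -128, 127)
      return max(-128, min(127, round(x / scale) + zero_point))

  def quantized_linear(x_int8, x_scale, x_zp, w_int8, w_scale, w_zp,
                       bias_fp32, out_scale, out_zp):
      # Q/DQ representation: dequantize inputs, run the op in fp32, requantize
      x_fp32 = [dequantize(q, x_scale, x_zp) for q in x_int8]
      w_fp32 = [dequantize(q, w_scale, w_zp) for q in w_int8]
      out_fp32 = sum(a * b for a, b in zip(x_fp32, w_fp32)) + bias_fp32
      return quantize(out_fp32, out_scale, out_zp)

Every quantized operator follows this same shape: the only quantized-specific pieces are the quantize/dequantize nodes around an ordinary fp32 computation, which makes the pattern easy for backends to match.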
* Reference Quantized Model Representation (available in the nightly build)

  We will have a special representation for selected ops, for example, quantized linear. Other ops are represented as ``dq -> float32_op -> q``, and ``q/dq`` are decomposed into more primitive operators.

  You can get this representation by using ``convert_pt2e(..., use_reference_representation=True)``.
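To illustrate what "special representation for selected ops" means, here is a hypothetical scalar-level sketch of an integer-arithmetic reference linear, where the accumulation happens on integer values and only the final rescale touches floating point (the actual representation operates on tensors in the exported graph; these names are illustrative, not the real ops):

.. code-block:: python

  def reference_quantized_linear(x_int8, x_scale, x_zp, w_int8, w_scale, w_zp,
                                 bias_int32, out_scale, out_zp):
      # Accumulate in int32 on the raw quantized values (zero points removed)
      acc_int32 = sum((xq - x_zp) * (wq - w_zp) for xq, wq in zip(x_int8, w_int8))
      # bias is assumed pre-quantized to int32 with scale x_scale * w_scale
      acc_int32 += bias_int32
      # single fp rescale back into the output quantization domain
      out = round(acc_int32 * (x_scale * w_scale / out_scale)) + out_zp
      return max(-128, min(127, out))

Compared with the Q/DQ representation, this form exposes the integer arithmetic directly, which is closer to what backend kernels actually execute.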
If you want to get better accuracy or performance, try configuring
``quantizer`` in different ways; each ``quantizer`` has its own way
of configuration, so please consult the documentation for the
quantizer you are using to learn more about how you can have more control