
Commit 29db287

jerryzh168 and Svetlana Karslioglu authored
[quant][2.1] fix formatting for the tutorial (#2561)
* [quant][2.1] fix formatting for the tutorial Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
1 parent 646c8b6 commit 29db287

File tree

1 file changed (+14, -16 lines)


prototype_source/pt2e_quant_ptq_static.rst

Lines changed: 14 additions & 16 deletions
@@ -10,25 +10,23 @@ this flow is expected to have significantly higher model coverage
 (`88% on 14K models <https://github.com/pytorch/pytorch/issues/93667#issuecomment-1601171596>`_),
 better programmability, and a simplified UX.
 
-Exportable by `torch._export.export` is a prerequisite to use the flow, you can
+Exportable by `torch.export.export` is a prerequisite to use the flow, you can
 find what are the constructs that's supported in `Export DB <https://pytorch.org/docs/main/generated/exportdb/index.html>`_.
 
 The high level architecture of quantization 2.0 with quantizer could look like
 this:
 
 ::
 
-    float_model(Python)                           Input
+    float_model(Python)                        Example Input
         \                                         /
          \                                       /
     —-------------------------------------------------------
-    |                         Export                        |
+    |                         export                        |
     —-------------------------------------------------------
                                 |
-                    FX Graph in ATen      XNNPACKQuantizer,
-                            |               or X86InductorQuantizer,
-                            |               or <Other Backend Quantizer>
-                            |                   /
+                    FX Graph in ATen      Backend Specific Quantizer
+                            |                   /
     —--------------------------------------------------------
     |                        prepare_pt2e                    |
     —--------------------------------------------------------
@@ -39,13 +37,13 @@ this:
     |                        convert_pt2e                    |
     —--------------------------------------------------------
                                 |
-                      Reference Quantized Model
+                          Quantized Model
                                 |
     —--------------------------------------------------------
     |                          Lowering                      |
     —--------------------------------------------------------
                                 |
-        Executorch, or Inductor, or <Other Backends>
+        Executorch, Inductor or <Other Backends>
 
 
 The PyTorch 2.0 export quantization API looks like this:
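
Strung together, the diagram's stages look roughly like the sketch below. Only ``torch.export.export``, ``XNNPACKQuantizer``, ``get_symmetric_quantization_config``, ``prepare_pt2e``, and ``convert_pt2e`` come from the diff itself; the toy model, example input, and calibration loop are assumptions, and the exact export entry point has varied across releases, so treat this as a sketch rather than the tutorial's own listing.

.. code-block:: python

    import torch
    from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
    from torch.ao.quantization.quantizer.xnnpack_quantizer import (
        XNNPACKQuantizer,
        get_symmetric_quantization_config,
    )

    # Toy model and example input (assumptions, for illustration only).
    float_model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval()
    example_inputs = (torch.randn(1, 4),)

    # export: float model + example input -> FX graph in ATen ops.
    exported_model = torch.export.export(float_model, example_inputs).module()

    # prepare_pt2e: insert observers as directed by a backend-specific quantizer.
    quantizer = XNNPACKQuantizer()
    quantizer.set_global(get_symmetric_quantization_config())
    prepared_model = prepare_pt2e(exported_model, quantizer)

    # Calibrate with representative inputs.
    with torch.no_grad():
        for _ in range(8):
            prepared_model(torch.randn(1, 4))

    # convert_pt2e: produce the quantized model, ready for lowering.
    quantized_model = convert_pt2e(prepared_model)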
@@ -377,15 +375,15 @@ The following code snippets describes how to quantize the model:
       get_symmetric_quantization_config,
     )
     quantizer = XNNPACKQuantizer()
-    quantizer.set_globa(get_symmetric_quantization_config())
+    quantizer.set_global(get_symmetric_quantization_config())
 
 ``Quantizer`` is backend specific, and each ``Quantizer`` will provide their
 own way to allow users to configure their model. Just as an example, here is
 the different configuration APIs supported by ``XNNPackQuantizer``:
 
 .. code-block:: python
 
-    quantizer.set_global(qconfig_opt)  # qconfig_opt is an optional qconfig, either a valid qconfig or None
+    quantizer.set_global(qconfig_opt)  # qconfig_opt is an optional quantization config
       .set_object_type(torch.nn.Conv2d, qconfig_opt) # can be a module type
       .set_object_type(torch.nn.functional.linear, qconfig_opt) # or torch functional op
       .set_module_name("foo.bar", qconfig_opt)
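
A concrete, hedged instance of the configuration API above: the ``is_per_channel`` keyword is an assumption about the signature of ``get_symmetric_quantization_config``, and ``"foo.bar"`` is the diff's own placeholder module name.

.. code-block:: python

    from torch.ao.quantization.quantizer.xnnpack_quantizer import (
        XNNPACKQuantizer,
        get_symmetric_quantization_config,
    )

    quantizer = XNNPACKQuantizer()
    # Default: symmetric per-tensor quantization for the whole model.
    quantizer.set_global(get_symmetric_quantization_config())
    # Override one submodule with per-channel weight quantization.
    quantizer.set_module_name(
        "foo.bar", get_symmetric_quantization_config(is_per_channel=True)
    )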
@@ -441,8 +439,7 @@ we offer in the long term might change based on feedback from PyTorch users.
 
 * Q/DQ Representation (default)
 
-  Previous documentation for `representations <https://github.com/pytorch/rfcs/blob/master/RFC-0019-
-  Extending-PyTorch-Quantization-to-Custom-Backends.md>`_ all quantized operators are represented as ``dequantize -> fp32_op -> qauntize``.
+  Previous documentation for `representations <https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md>`_ all quantized operators are represented as ``dequantize -> fp32_op -> qauntize``.
 
 .. code-block:: python
 
@@ -457,9 +454,10 @@ we offer in the long term might change based on feedback from PyTorch users.
         out_fp32, out_scale, out_zero_point, out_quant_min, out_quant_max, torch.int8)
     return out_i8
 
-* Reference Quantized Model Representation (WIP, expected to be ready at end of August): we have special representation for selected ops (for example, quantized linear), other ops are represented as (``dq -> float32_op -> q``), and ``q/dq`` are decomposed into more primitive operators.
+* Reference Quantized Model Representation (available in the nightly build)
 
-  You can get this representation by using ``convert_pt2e(..., use_reference_representation=True)``.
+  We will have a special representation for selected ops, for example, quantized linear. Other ops are represented as ``dq -> float32_op -> q`` and ``q/dq`` are decomposed into more primitive operators.
+  You can get this representation by using ``convert_pt2e(..., use_reference_representation=True)``.
 
 .. code-block:: python
 
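
A hedged sketch contrasting the two representations named above. It assumes ``prepared_model`` is a calibrated model coming out of ``prepare_pt2e`` and that ``convert_pt2e`` transforms its input in place, hence the copy; only the ``use_reference_representation`` flag itself comes from the diff.

.. code-block:: python

    import copy

    from torch.ao.quantization.quantize_pt2e import convert_pt2e

    # Default Q/DQ representation: quantized ops show up in the graph as
    # dequantize -> fp32_op -> quantize patterns.
    qdq_model = convert_pt2e(copy.deepcopy(prepared_model))

    # Reference representation: selected ops (for example, quantized linear)
    # get a dedicated form, and q/dq decompose into more primitive operators.
    ref_model = convert_pt2e(prepared_model, use_reference_representation=True)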
@@ -515,7 +513,7 @@ Now we can compare the size and model accuracy with baseline model.
 If you want to get better accuracy or performance, try configuring
 ``quantizer`` in different ways, and each ``quantizer`` will have its own way
 of configuration, so please consult the documentation for the
-quantization you are using to learn more about how you can have more control
+quantizer you are using to learn more about how you can have more control
 over how to quantize a model.
 
 Save and Load Quantized Model
