pt2e quantization tutorial related updates (#3106)
* pt2e quantization tutorial related updates
Summary:
* prototype_source/pt2e_quant_ptq_x86_inductor.rst and prototype_source/quantization_in_pytorch_2_0_export_tutorial.rst are redirects to the actual docs; they exist because the tutorials were renamed but the old page files are still there
* pt2e_quant_ptq.rst and pt2e_quant_qat.rst (a sketch of the updated flow follows the commit message):
  * Updates to the export API
  * Updates to the import path of the XNNPACK quantizer
  * Updates to print_model_size
---------
Co-authored-by: Svetlana Karslioglu <svekars@meta.com>
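As context for the export-API and quantizer-import updates listed above, here is a minimal, illustrative sketch of the pt2e post-training quantization flow the updated tutorials walk through. It assumes the pt2e APIs available in recent PyTorch releases (``torch.export.export_for_training``, ``XNNPACKQuantizer``, ``prepare_pt2e``/``convert_pt2e``); the toy model and example inputs are stand-ins, and the exact old/new import paths touched by this PR may differ.

.. code-block:: python

   # Illustrative pt2e PTQ flow, not the tutorial's exact code.
   import torch
   from torch.export import export_for_training
   from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
   from torch.ao.quantization.quantizer.xnnpack_quantizer import (
       XNNPACKQuantizer,
       get_symmetric_quantization_config,
   )

   class M(torch.nn.Module):
       def __init__(self):
           super().__init__()
           self.linear = torch.nn.Linear(8, 8)

       def forward(self, x):
           return self.linear(x)

   model = M().eval()
   example_inputs = (torch.randn(1, 8),)

   # Export-API update: export the model and grab the resulting GraphModule.
   exported_model = export_for_training(model, example_inputs).module()

   # XNNPACK quantizer with the default symmetric quantization config.
   quantizer = XNNPACKQuantizer()
   quantizer.set_global(get_symmetric_quantization_config())

   prepared_model = prepare_pt2e(exported_model, quantizer)
   prepared_model(*example_inputs)      # calibration pass
   quantized_model = convert_pt2e(prepared_model)

The captured diff hunks below show the related wording change in the reference quantized model representation section of the tutorial.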
-* Reference Quantized Model Representation (available in the nightly build)
+* Reference Quantized Model Representation
 
 We will have a special representation for selected ops, for example, quantized linear. Other ops are represented as ``dq -> float32_op -> q`` and ``q/dq`` are decomposed into more primitive operators.
 You can get this representation by using ``convert_pt2e(..., use_reference_representation=True)``.
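As a short, hedged illustration of that last point (the ``prepared_model`` name is reused from the sketch above and is not part of this diff):

.. code-block:: python

   from torch.ao.quantization.quantize_pt2e import convert_pt2e

   # On a freshly prepared and calibrated model: selected ops such as
   # quantized linear get a special representation; other ops stay as
   # dq -> float32_op -> q, with q/dq decomposed into more primitive operators.
   reference_quantized_model = convert_pt2e(
       prepared_model, use_reference_representation=True
   )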
@@ -485,8 +491,6 @@ Now we can compare the size and model accuracy with baseline model.
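The second hunk touches the size and accuracy comparison against the baseline model. The helper below is only a sketch of a ``print_model_size``-style utility in the spirit of the commit summary (the tutorial's actual helper may differ): it serializes the state dict to a temporary file and reports the on-disk size.

.. code-block:: python

   import os
   import tempfile

   import torch

   def print_model_size(model: torch.nn.Module, label: str) -> None:
       # Hypothetical helper: measure size by saving the state dict to a
       # temporary file and reading its size on disk.
       with tempfile.NamedTemporaryFile(delete=False) as f:
           torch.save(model.state_dict(), f)
           path = f.name
       size_mb = os.path.getsize(path) / 1e6
       os.remove(path)
       print(f"{label}: {size_mb:.2f} MB")

   # Using the models from the earlier sketch:
   print_model_size(model, "baseline fp32 model")
   print_model_size(quantized_model, "pt2e quantized model")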