
Commit ae209cc

jerryzh168 and Svetlana Karslioglu authored
Apply suggestions from code review
Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
1 parent 7e504c1 commit ae209cc

File tree

2 files changed: +23 -16 lines changed

prototype_source/pt2e_quant_ptq_static.rst

Lines changed: 8 additions & 5 deletions
@@ -436,11 +436,13 @@ Convert the Calibrated Model to a Quantized Model
     print(quantized_model)
 
 .. note::
-   At this step, we currently have two representations that you can choose from, but what exact representation
-   we offer in the long term might change based on feedbacks from users.
+   At this step, we currently have two representations that you can choose from, but exact representation
+   we offer in the long term might change based on feedback from PyTorch users.
 
 * Q/DQ Representation (default)
-  Previous documentation for `representations <https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md>`_ all quantized operators are represented as ``dequantize -> fp32_op -> qauntize``.
+
+  Previous documentation for `representations <https://github.com/pytorch/rfcs/blob/master/RFC-0019-
+  Extending-PyTorch-Quantization-to-Custom-Backends.md>`_ all quantized operators are represented as ``dequantize -> fp32_op -> qauntize``.
 
 .. code-block:: python
 
@@ -455,11 +457,12 @@ Convert the Calibrated Model to a Quantized Model
         out_fp32, out_scale, out_zero_point, out_quant_min, out_quant_max, torch.int8)
     return out_i8
 
-* Reference Quantized Model Representation (WIP, expected to be ready at end of August): we have special representation for selected ops (for example, quantized linear), other ops are represented as (dq -> float32_op -> q), and q/dq are decomposed into more primitive operators.
+* Reference Quantized Model Representation (WIP, expected to be ready at end of August): we have special representation for selected ops (for example, quantized linear), other ops are represented as (``dq -> float32_op -> q``), and ``q/dq`` are decomposed into more primitive operators.
 
-  You can get this representation by: ``convert_pt2e(..., use_reference_representation=True)``
+  You can get this representation by using ``convert_pt2e(..., use_reference_representation=True)``.
 
 .. code-block:: python
+
     # Reference Quantized Pattern for quantized linear
     def quantized_linear(x_int8, x_scale, x_zero_point, weight_int8, weight_scale, weight_zero_point, bias_fp32, output_scale, output_zero_point):
         x_int16 = x_int8.to(torch.int16)
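
The second hunk above shows only the last two lines of the default Q/DQ pattern for linear. As a minimal sketch of the full ``dequantize -> fp32_op -> quantize`` shape that pattern takes, assuming the decomposed ``quantized_decomposed`` ops PT2E lowered to at the time and standard int8 ranges (the tutorial's exact signature may differ):

.. code-block:: python

    # Hedged sketch: the Q/DQ-representation pattern whose tail appears in
    # the hunk above; op names are the decomposed quantized_decomposed ops
    # PT2E used at the time, and the -128..127 int8 ranges are assumptions.
    import torch

    def quantized_linear_qdq(x_int8, x_scale, x_zero_point,
                             weight_int8, weight_scale, weight_zero_point,
                             bias_fp32, out_scale, out_zero_point):
        # dequantize int8 activations and weights back to fp32
        x_fp32 = torch.ops.quantized_decomposed.dequantize_per_tensor(
            x_int8, x_scale, x_zero_point, -128, 127, torch.int8)
        weight_fp32 = torch.ops.quantized_decomposed.dequantize_per_tensor(
            weight_int8, weight_scale, weight_zero_point, -128, 127, torch.int8)
        # run the op itself in fp32 ...
        out_fp32 = torch.ops.aten.linear.default(x_fp32, weight_fp32, bias_fp32)
        # ... then quantize the result back to int8
        out_i8 = torch.ops.quantized_decomposed.quantize_per_tensor(
            out_fp32, out_scale, out_zero_point, -128, 127, torch.int8)
        return out_i8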

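To show where the ``use_reference_representation=True`` flag mentioned in the diff slots into the flow this file documents, here is a minimal sketch assuming the ``XNNPACKQuantizer`` setup from the surrounding tutorial; ``capture_pre_autograd_graph`` and the module paths were prototype APIs at the time of this commit and may have since moved:

.. code-block:: python

    # Hedged sketch of the end-to-end PT2E PTQ flow; module paths follow
    # the prototype API current at this commit and are assumptions.
    import torch
    from torch._export import capture_pre_autograd_graph
    from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
    from torch.ao.quantization.quantizer.xnnpack_quantizer import (
        XNNPACKQuantizer,
        get_symmetric_quantization_config,
    )

    model = torch.nn.Sequential(torch.nn.Linear(16, 8)).eval()
    example_inputs = (torch.randn(1, 16),)

    # Export, annotate with a quantizer, and calibrate.
    exported = capture_pre_autograd_graph(model, example_inputs)
    quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
    prepared = prepare_pt2e(exported, quantizer)
    prepared(*example_inputs)  # run representative data through the observers

    # Omitting the flag yields the default Q/DQ representation; the flag
    # opts in to the reference representation described in the note above.
    quantized_model = convert_pt2e(prepared, use_reference_representation=True)
    print(quantized_model)
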
prototype_source/pt2e_quantizer.rst

Lines changed: 15 additions & 11 deletions
@@ -146,14 +146,17 @@ parameters are shared with other tensors. Input of ``SharedQuantizationSpec`` is
 can be an input edge or an output value.
 
 .. note::
-  * Sharing is transitive
 
-  Some Tensors might be effectively using shared quantization spec due to (1) two nodes/edges are
-  configured to use SharedQuantizationSpec (2) there is existing sharing of some of the nodes
+  * Sharing is transitive
 
+  Some tensors might be effectively using shared quantization spec due to:
+
+  * Two nodes/edges are configured to use ``SharedQuantizationSpec``.
+  * There is existing sharing of some nodes.
+
 For example, let's say we have two ``conv`` nodes ``conv1`` and ``conv2``, and both of them are fed into a ``cat``
-node. `cat([conv1_out, conv2_out], ...)` Let's say output of ``conv1``, ``conv2`` and the first input of ``cat`` are configured
-with the same configurations of ``QuantizationSpec``, second input of ``cat`` is configured to use ``SharedQuantizationSpec``
+node: ``cat([conv1_out, conv2_out], ...)``. Let's say the output of ``conv1``, ``conv2``, and the first input of ``cat`` are configured
+with the same configurations of ``QuantizationSpec``. The second input of ``cat`` is configured to use ``SharedQuantizationSpec``
 with the first input.
 
 .. code-block::
@@ -163,15 +166,16 @@ can be an input edge or an output value.
     cat_input0: qspec1(dtype=torch.int8, ...)
     cat_input1: SharedQuantizationSpec((conv1, cat)) # conv1 node is the first input of cat
 
-First of all, the output of ``conv1`` is implicitly sharing quantization parameter (and observer object)
-with the first input of ``cat``, and same for output of ``conv2`` and the second input of ``cat``.
-So since user configures the two inputs of ``cat`` to share quantization parameters, by transitivity,
+First of all, the output of ``conv1`` is implicitly sharing quantization parameters (and observer object)
+with the first input of ``cat``, and the same is true for the output of ``conv2`` and the second input of ``cat``.
+Therefore, since the user configures the two inputs of ``cat`` to share quantization parameters, by transitivity,
 ``conv2_out`` and ``conv1_out`` will also be sharing quantization parameters. In the observed graph, you
-will see:
+will see the following:
+
 .. code-block::
 
-conv1 -> obs -> cat
-conv2 -> obs /
+  conv1 -> obs -> cat
+  conv2 -> obs /
 
 and both ``obs`` will be the same observer instance.
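
For the ``cat`` example this hunk rewrites, here is a minimal sketch of the annotation a ``Quantizer`` would emit, assuming the prototype ``torch.ao.quantization.quantizer`` API as of this commit (the observer choice and exact field values are assumptions):

.. code-block:: python

    # Hedged sketch: annotating cat so its two inputs share quantization
    # parameters, per the note above; module paths and observer choice
    # follow the prototype API at this commit and are assumptions.
    import torch
    from torch.ao.quantization.observer import HistogramObserver
    from torch.ao.quantization.quantizer import (
        QuantizationAnnotation,
        QuantizationSpec,
        SharedQuantizationSpec,
    )

    int8_qspec = QuantizationSpec(
        dtype=torch.int8,
        quant_min=-128,
        quant_max=127,
        qscheme=torch.per_tensor_affine,
        observer_or_fake_quant_ctr=HistogramObserver,
    )

    def annotate_cat(cat_node):
        # cat([conv1_out, conv2_out], ...): args[0] is the list of inputs
        conv1_out, conv2_out = cat_node.args[0]
        shared = SharedQuantizationSpec((conv1_out, cat_node))
        cat_node.meta["quantization_annotation"] = QuantizationAnnotation(
            input_qspec_map={
                conv1_out: int8_qspec,  # first input: ordinary qspec
                conv2_out: shared,      # second input: share with (conv1, cat)
            },
            output_qspec=shared,
            _annotated=True,
        )

Because the outputs of ``conv1`` and ``conv2`` implicitly share with the ``cat`` input edges they feed, this single ``SharedQuantizationSpec`` is enough to make ``conv1_out`` and ``conv2_out`` end up with one observer instance, which is the transitivity the diff spells out.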
