Commit a0d4ce6

Update script_optimized and vulkan_workflow docs with new optimization options (#2163)

Authored by salilsdesai
Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
Co-authored-by: Kimish Patel <kimishpatel@fb.com>

1 parent 8500f5e

File tree: prototype_source/vulkan_workflow.rst, recipes_source/script_optimized.rst

2 files changed: +20, -9 lines

prototype_source/vulkan_workflow.rst

Lines changed: 8 additions & 4 deletions
@@ -102,7 +102,7 @@ Python script to save pretrained mobilenet_v2 to a file:

     script_model = torch.jit.script(model)
     torch.jit.save(script_model, "mobilenet2.pt")

-The PyTorch 1.7 Vulkan backend supports only 32-bit float operators. The default model needs an additional optimization step that fuses operators:
+The PyTorch 1.7 Vulkan backend supports only 32-bit float operators. The default model needs an additional optimization step that fuses operators:

 ::
@@ -112,6 +112,10 @@

 The resulting model can be used only on the Vulkan backend, as it contains operators specific to the Vulkan backend.

+By default, ``optimize_for_mobile`` with ``backend='vulkan'`` rewrites the graph so that inputs are transferred to the Vulkan backend and outputs are transferred to the CPU backend; the model can therefore be run on CPU inputs and produce CPU outputs. To disable this, add the argument ``optimization_blocklist={MobileOptimizerType.VULKAN_AUTOMATIC_GPU_TRANSFER}`` to ``optimize_for_mobile``. (``MobileOptimizerType`` can be imported from ``torch.utils.mobile_optimizer``.)
+
+For more information, see the `torch.utils.mobile_optimizer` `API documentation <https://pytorch.org/docs/stable/mobile_optimizer.html>`_.
+
 Using Vulkan backend in code
 ----------------------------
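The lines added in the hunk above describe the automatic GPU-transfer rewrite and the blocklist argument that disables it. A minimal sketch of both calls, assuming the ``mobilenet2.pt`` file produced earlier; the variable names here are illustrative, not part of the commit:

::

    import torch
    from torch.utils.mobile_optimizer import optimize_for_mobile, MobileOptimizerType

    script_model = torch.jit.load("mobilenet2.pt")

    # Default: the rewritten graph moves inputs to the Vulkan backend and
    # outputs back to the CPU backend, so CPU tensors go in and come out.
    vulkan_model = optimize_for_mobile(script_model, backend="vulkan")
    torch.jit.save(vulkan_model, "mobilenet2-vulkan.pt")

    # With the automatic transfer blocked, the caller is responsible for
    # supplying Vulkan-backend inputs and handling Vulkan-backend outputs.
    vulkan_model_manual = optimize_for_mobile(
        script_model,
        optimization_blocklist={MobileOptimizerType.VULKAN_AUTOMATIC_GPU_TRANSFER},
        backend="vulkan",
    )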
@@ -219,19 +223,19 @@ Or if you need only specific abi you can set it as an argument:

 Add the prepared model ``mobilenet2-vulkan.pt`` to the test application assets:

 ::
-
+
     cp mobilenet2-vulkan.pt $PYTORCH_ROOT/android/test_app/app/src/main/assets/


-3. Build and install the test application on a connected Android device
+3. Build and install the test application on a connected Android device
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 ::

     cd $PYTORCH_ROOT
     gradle -p android test_app:installMbvulkanLocalBaseDebug

-After successful installation, the application named 'MBQ' can be launched on the device.
+After successful installation, the application named 'MBQ' can be launched on the device.
recipes_source/script_optimized.rst

Lines changed: 12 additions & 5 deletions
@@ -194,16 +194,23 @@ The optimized model can then be saved and deployed in mobile apps:

     optimized_torchscript_model.save("optimized_torchscript_model.pth")

-By default, `optimize_for_mobile` will perform the following types of optimizations:
+By default, for the CPU backend, `optimize_for_mobile` performs the following types of optimizations:

-* Conv2D and BatchNorm fusion which folds Conv2d-BatchNorm2d into Conv2d;
+* `Conv2D and BatchNorm fusion`, which folds Conv2d-BatchNorm2d pairs into Conv2d;

-* Insert and fold prepacked ops which rewrites the model graph to replace 2D convolutions and linear ops with their prepacked counterparts.
+* `Insert and fold prepacked ops`, which rewrites the model graph to replace 2D convolutions and linear ops with their prepacked counterparts.

-* ReLU and hardtanh fusion which rewrites graph by finding ReLU/hardtanh ops and fuses them together.
+* `ReLU and hardtanh fusion`, which rewrites the graph by finding ReLU/hardtanh ops and fusing them together.

-* Dropout removal which removes dropout nodes from this module when training is false.
+* `Dropout removal`, which removes dropout nodes from the module when training is false.

+* `Conv packed params hoisting`, which moves convolution packed params to the root module so that the convolution structs can be deleted. This decreases model size without impacting numerics.
+
+For the Vulkan backend, `optimize_for_mobile` performs the following type of optimization:
+
+* `Automatic GPU transfer`, which rewrites the graph so that moving input and output data to and from the GPU becomes part of the model.
+
+Optimization types can be disabled by passing an optimization blocklist as an argument to `optimize_for_mobile`.

 Learn More
 -----------------
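A minimal sketch of the blocklist mechanism described in the added lines. The commit names only ``VULKAN_AUTOMATIC_GPU_TRANSFER``; the enum values ``CONV_BN_FUSION`` and ``REMOVE_DROPOUT`` used below are assumed to correspond to the first and fourth CPU optimizations listed above:

::

    import torch
    import torchvision
    from torch.utils.mobile_optimizer import optimize_for_mobile, MobileOptimizerType

    model = torchvision.models.mobilenet_v2(pretrained=True).eval()
    script_model = torch.jit.script(model)

    # Block Conv2d-BatchNorm2d folding and dropout removal; the remaining
    # default CPU optimizations still run.
    optimized_model = optimize_for_mobile(
        script_model,
        optimization_blocklist={
            MobileOptimizerType.CONV_BN_FUSION,
            MobileOptimizerType.REMOVE_DROPOUT,
        },
    )
    optimized_model.save("optimized_torchscript_model.pth")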
