From 6474cf985dcd6467b8ad093e70ec41960636e87f Mon Sep 17 00:00:00 2001
From: Salil Desai
Date: Mon, 2 Jan 2023 16:11:58 -0500
Subject: [PATCH 1/5] Update script_optimized and vulkan_workflow docs with new
 optimization options

---
 prototype_source/vulkan_workflow.rst | 12 ++++++++----
 recipes_source/script_optimized.rst  | 17 ++++++++++++-----
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/prototype_source/vulkan_workflow.rst b/prototype_source/vulkan_workflow.rst
index c18f57ae201..255af36a808 100644
--- a/prototype_source/vulkan_workflow.rst
+++ b/prototype_source/vulkan_workflow.rst
@@ -102,7 +102,7 @@ Python script to save pretrained mobilenet_v2 to a file:
     script_model = torch.jit.script(model)
     torch.jit.save(script_model, "mobilenet2.pt")
 
-PyTorch 1.7 Vulkan backend supports only float 32bit operators. The default model needs additional step that will optimize operators fusing 
+The PyTorch 1.7 Vulkan backend supports only 32-bit float operators. The default model needs an additional optimization step that fuses operators:
 
 ::
 
@@ -112,6 +112,10 @@ The PyTorch 1.7 Vulkan backend supports only 32-bit float operators. The default
 
 The result model can be used only on Vulkan backend as it contains specific to the Vulkan backend operators.
 
+By default, ``optimize_for_mobile`` with ``backend='vulkan'`` rewrites the graph such that inputs are transferred to Vulkan backend, and outputs are transferred to CPU backend, so the model can be run on CPU inputs and produce CPU outputs. To disable this, add the argument ``optimization_blocklist={MobileOptimizerType.VULKAN_AUTOMATIC_GPU_TRANSFER}`` to ``optimize_for_mobile``. (``MobileOptimizerType`` can be imported from ``torch.utils.mobile_optimizer``)
+
+For more information, see the `torch.utils.mobile_optimizer` `API documentation `_.
+
 Using Vulkan backend in code
 ----------------------------
 
@@ -219,11 +223,11 @@ Or if you need only specific abi you can set it as an argument:
 
 Add prepared model ``mobilenet2-vulkan.pt`` to test applocation assets:
 
 ::
-    
+
     cp mobilenet2-vulkan.pt $PYTORCH_ROOT/android/test_app/app/src/main/assets/
 
-3. Build and Install test applocation to connected android device 
+3. Build and Install test application to connected Android device
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 ::
@@ -231,7 +235,7 @@ Add prepared model ``mobilenet2-vulkan.pt`` to test applocation assets:
 
     cd $PYTORCH_ROOT
     gradle -p android test_app:installMbvulkanLocalBaseDebug
 
-After successful installation, the application with the name 'MBQ' can be launched on the device. 
+After successful installation, the application with the name 'MBQ' can be launched on the device.
 
diff --git a/recipes_source/script_optimized.rst b/recipes_source/script_optimized.rst
index e1c0f19ad13..a1365076a34 100644
--- a/recipes_source/script_optimized.rst
+++ b/recipes_source/script_optimized.rst
@@ -194,16 +194,23 @@ The optimized model can then be saved and deployed in mobile apps:
 
     optimized_torchscript_model.save("optimized_torchscript_model.pth")
 
-By default, `optimize_for_mobile` will perform the following types of optimizations:
+For CPU backend, by default, `optimize_for_mobile` will perform the following types of optimizations:
 
-* Conv2D and BatchNorm fusion which folds Conv2d-BatchNorm2d into Conv2d;
+* `Conv2D and BatchNorm fusion` which folds Conv2d-BatchNorm2d into Conv2d;
 
-* Insert and fold prepacked ops which rewrites the model graph to replace 2D convolutions and linear ops with their prepacked counterparts.
+* `Insert and fold prepacked ops` which rewrites the model graph to replace 2D convolutions and linear ops with their prepacked counterparts.
 
-* ReLU and hardtanh fusion which rewrites graph by finding ReLU/hardtanh ops and fuses them together.
+* `ReLU and hardtanh fusion` which rewrites graph by finding ReLU/hardtanh ops and fuses them together.
 
-* Dropout removal which removes dropout nodes from this module when training is false.
+* `Dropout removal` which removes dropout nodes from this module when training is false.
 
+* `Conv packed params hoisting` which moves convolution packed params to the root module, so that the convolution structs can be deleted. This decreases model size without impacting numerics.
+
+For Vulkan backend, by default, `optimize_for_mobile` will perform the following type of optimization:
+
+* `Automatic GPU transfer` which rewrites the graph such that inputs are transferred to Vulkan backend, and outputs are transferred to CPU backend.
+
+Optimization types can be disabled by passing an optimization blocklist as an argument to `optimize_for_mobile`.
 
 Learn More
 -----------------

From e90229705c1d8091e58bfe4ea908ba1e298f4dae Mon Sep 17 00:00:00 2001
From: Svetlana Karslioglu
Date: Wed, 4 Jan 2023 08:31:43 -0800
Subject: [PATCH 2/5] grammar

---
 prototype_source/vulkan_workflow.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/prototype_source/vulkan_workflow.rst b/prototype_source/vulkan_workflow.rst
index 255af36a808..7cd3a5c9864 100644
--- a/prototype_source/vulkan_workflow.rst
+++ b/prototype_source/vulkan_workflow.rst
@@ -112,7 +112,7 @@ The PyTorch 1.7 Vulkan backend supports only 32-bit float operators. The default
 
 The result model can be used only on Vulkan backend as it contains specific to the Vulkan backend operators.
 
-By default, ``optimize_for_mobile`` with ``backend='vulkan'`` rewrites the graph such that inputs are transferred to Vulkan backend, and outputs are transferred to CPU backend, so the model can be run on CPU inputs and produce CPU outputs. To disable this, add the argument ``optimization_blocklist={MobileOptimizerType.VULKAN_AUTOMATIC_GPU_TRANSFER}`` to ``optimize_for_mobile``. (``MobileOptimizerType`` can be imported from ``torch.utils.mobile_optimizer``)
+By default, ``optimize_for_mobile`` with ``backend='vulkan'`` rewrites the graph so that inputs are transferred to the Vulkan backend, and outputs are transferred to the CPU backend; therefore, the model can be run on CPU inputs and produce CPU outputs. To disable this, add the argument ``optimization_blocklist={MobileOptimizerType.VULKAN_AUTOMATIC_GPU_TRANSFER}`` to ``optimize_for_mobile``. (``MobileOptimizerType`` can be imported from ``torch.utils.mobile_optimizer``)
 
 For more information, see the `torch.utils.mobile_optimizer` `API documentation `_.

From bd731cff20dcebcd67ce380d9c45a996d56793cb Mon Sep 17 00:00:00 2001
From: Svetlana Karslioglu
Date: Wed, 4 Jan 2023 08:31:52 -0800
Subject: [PATCH 3/5] grammar

---
 recipes_source/script_optimized.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/recipes_source/script_optimized.rst b/recipes_source/script_optimized.rst
index a1365076a34..db8c8f15181 100644
--- a/recipes_source/script_optimized.rst
+++ b/recipes_source/script_optimized.rst
@@ -194,7 +194,7 @@ The optimized model can then be saved and deployed in mobile apps:
 
     optimized_torchscript_model.save("optimized_torchscript_model.pth")
 
-For CPU backend, by default, `optimize_for_mobile` will perform the following types of optimizations:
+By default, for the CPU backend, `optimize_for_mobile` performs the following types of optimizations:
 
 * `Conv2D and BatchNorm fusion` which folds Conv2d-BatchNorm2d into Conv2d;

From 866262ecf93685ef7a49c75a1707a191b3f2c57d Mon Sep 17 00:00:00 2001
From: Svetlana Karslioglu
Date: Wed, 4 Jan 2023 08:32:06 -0800
Subject: [PATCH 4/5] grammar

---
 recipes_source/script_optimized.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/recipes_source/script_optimized.rst b/recipes_source/script_optimized.rst
index db8c8f15181..161076db179 100644
--- a/recipes_source/script_optimized.rst
+++ b/recipes_source/script_optimized.rst
@@ -206,7 +206,7 @@ By default, for the CPU backend, `optimize_for_mobile` performs the following ty
 
 * `Conv packed params hoisting` which moves convolution packed params to the root module, so that the convolution structs can be deleted. This decreases model size without impacting numerics.
 
-For Vulkan backend, by default, `optimize_for_mobile` will perform the following type of optimization:
+For the Vulkan backend, `optimize_for_mobile` performs the following type of optimization:
 
 * `Automatic GPU transfer` which rewrites the graph such that inputs are transferred to Vulkan backend, and outputs are transferred to CPU backend.

From c1ff94ae143bcc0a882d841741c73daa3da32a6d Mon Sep 17 00:00:00 2001
From: Svetlana Karslioglu
Date: Wed, 4 Jan 2023 08:32:14 -0800
Subject: [PATCH 5/5] grammar

Co-authored-by: Kimish Patel
---
 recipes_source/script_optimized.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/recipes_source/script_optimized.rst b/recipes_source/script_optimized.rst
index 161076db179..f4384b1a3ae 100644
--- a/recipes_source/script_optimized.rst
+++ b/recipes_source/script_optimized.rst
@@ -208,7 +208,7 @@ By default, for the CPU backend, `optimize_for_mobile` performs the following ty
 
 For the Vulkan backend, `optimize_for_mobile` performs the following type of optimization:
 
-* `Automatic GPU transfer` which rewrites the graph such that inputs are transferred to Vulkan backend, and outputs are transferred to CPU backend.
+* `Automatic GPU transfer` which rewrites the graph so that moving input and output data to and from the GPU becomes part of the model.
 
 Optimization types can be disabled by passing an optimization blocklist as an argument to `optimize_for_mobile`.
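The blocklist mechanism these patches document can be sketched as follows, using the public `torch.utils.mobile_optimizer` API. The toy `Net` module is illustrative only (not part of the patches), and `torch.jit.trace` stands in for scripting to keep the sketch self-contained:

```python
# Sketch: selectively disabling optimize_for_mobile passes via an
# optimization blocklist. The toy Net module is illustrative only.
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile, MobileOptimizerType


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.bn = torch.nn.BatchNorm2d(8)
        self.dropout = torch.nn.Dropout(0.5)

    def forward(self, x):
        return self.dropout(torch.relu(self.bn(self.conv(x))))


# Trace the eval-mode model to get a ScriptModule for optimization.
script_model = torch.jit.trace(Net().eval(), torch.randn(1, 3, 8, 8))

# Default CPU-backend pipeline: conv-bn fusion, prepacked-op insertion,
# ReLU/hardtanh fusion, dropout removal, conv packed-param hoisting.
optimized = optimize_for_mobile(script_model)

# Blocklist one pass: skip dropout removal, leaving the (eval-mode,
# hence no-op) dropout nodes in the graph.
kept_dropout = optimize_for_mobile(
    script_model,
    optimization_blocklist={MobileOptimizerType.REMOVE_DROPOUT},
)
```

The same mechanism applies to the Vulkan path with `backend='vulkan'` and `MobileOptimizerType.VULKAN_AUTOMATIC_GPU_TRANSFER`, as patch 2 describes; that variant needs a Vulkan-enabled PyTorch build, so it is not exercised here.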