Update script_optimized and vulkan_workflow docs with new optimization options #2163

Merged
12 changes: 8 additions & 4 deletions prototype_source/vulkan_workflow.rst
Python script to save pretrained mobilenet_v2 to a file:

::

    script_model = torch.jit.script(model)
    torch.jit.save(script_model, "mobilenet2.pt")
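As a quick sanity check of the scripting and saving steps above (a sketch using a tiny stand-in model rather than mobilenet_v2, to avoid downloading pretrained weights), a saved TorchScript module can be reloaded with ``torch.jit.load`` and run without the original Python class:

```python
import torch

# Tiny stand-in model (hypothetical; the tutorial itself uses mobilenet_v2)
model = torch.nn.Sequential(torch.nn.Linear(4, 2)).eval()

script_model = torch.jit.script(model)
torch.jit.save(script_model, "tiny_scripted.pt")

# Round trip: a saved TorchScript module reloads without the Python source
reloaded = torch.jit.load("tiny_scripted.pt")
out = reloaded(torch.randn(1, 4))
```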

The PyTorch 1.7 Vulkan backend supports only 32-bit float operators. The default model needs an additional optimization step that fuses operators:

::

    from torch.utils.mobile_optimizer import optimize_for_mobile
    script_model_vulkan = optimize_for_mobile(script_model, backend='vulkan')
    torch.jit.save(script_model_vulkan, "mobilenet2-vulkan.pt")

The resulting model can be used only on the Vulkan backend, as it contains operators specific to that backend.

By default, ``optimize_for_mobile`` with ``backend='vulkan'`` rewrites the graph so that inputs are transferred to the Vulkan backend and outputs are transferred to the CPU backend; as a result, the model can take CPU inputs and produce CPU outputs. To disable this rewrite, pass ``optimization_blocklist={MobileOptimizerType.VULKAN_AUTOMATIC_GPU_TRANSFER}`` to ``optimize_for_mobile``. (``MobileOptimizerType`` can be imported from ``torch.utils.mobile_optimizer``.)

For more information, see the `torch.utils.mobile_optimizer` `API documentation <https://pytorch.org/docs/stable/mobile_optimizer.html>`_.
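The transfer behavior described above can be sketched as a small helper (the function name ``optimize_for_vulkan`` and the ``keep_io_on_gpu`` flag are hypothetical illustration, not part of the PyTorch API; ``optimize_for_mobile`` and ``MobileOptimizerType`` are real):

```python
from torch.utils.mobile_optimizer import optimize_for_mobile, MobileOptimizerType


def optimize_for_vulkan(script_model, keep_io_on_gpu=False):
    """Hypothetical helper: optimize a scripted model for the Vulkan backend.

    With keep_io_on_gpu=True the automatic GPU transfer rewrite is blocked,
    so the caller must move inputs/outputs between CPU and Vulkan manually.
    """
    blocklist = (
        {MobileOptimizerType.VULKAN_AUTOMATIC_GPU_TRANSFER}
        if keep_io_on_gpu
        else None
    )
    return optimize_for_mobile(
        script_model,
        optimization_blocklist=blocklist,
        backend="vulkan",
    )
```

Note that actually running the optimized module still requires a PyTorch build with Vulkan support and a Vulkan-capable device.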

Using Vulkan backend in code
----------------------------

Add the prepared model ``mobilenet2-vulkan.pt`` to the test application assets:

::

cp mobilenet2-vulkan.pt $PYTORCH_ROOT/android/test_app/app/src/main/assets/


3. Build and install the test application on a connected Android device
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

::

cd $PYTORCH_ROOT
gradle -p android test_app:installMbvulkanLocalBaseDebug

After successful installation, the application named 'MBQ' can be launched on the device.



17 changes: 12 additions & 5 deletions recipes_source/script_optimized.rst
The optimized model can then be saved and deployed in mobile apps:

optimized_torchscript_model.save("optimized_torchscript_model.pth")

By default, for the CPU backend, `optimize_for_mobile` performs the following types of optimizations:

* `Conv2D and BatchNorm fusion` which folds Conv2d-BatchNorm2d into Conv2d;

* `Insert and fold prepacked ops` which rewrites the model graph to replace 2D convolutions and linear ops with their prepacked counterparts;

* `ReLU and hardtanh fusion` which rewrites the graph by finding ReLU/hardtanh ops and fusing them together;

* `Dropout removal` which removes dropout nodes from the module when training is false.

* `Conv packed params hoisting` which moves convolution packed params to the root module, so that the convolution structs can be deleted. This decreases model size without impacting numerics.

* `Add/ReLU fusion` which finds instances of relu ops that follow add ops and fuses them into a single `add_relu`.

For the Vulkan backend, `optimize_for_mobile` performs the following type of optimization:

* `Automatic GPU transfer` which rewrites the graph so that moving input and output data to and from the GPU becomes part of the model.

Optimization types can be disabled by passing an optimization blocklist as an argument to `optimize_for_mobile`.
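A minimal sketch of passing a blocklist (the `TinyNet` stand-in model is hypothetical; the blocklist here keeps dropout nodes in the graph by disabling the `Dropout removal` pass, while all other default CPU optimizations still run):

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile, MobileOptimizerType


class TinyNet(torch.nn.Module):
    # Hypothetical stand-in model with a dropout layer
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)
        self.dropout = torch.nn.Dropout(0.5)

    def forward(self, x):
        return self.dropout(self.fc(x))


model = TinyNet().eval()
script_model = torch.jit.script(model)

# Block the dropout-removal pass; the other default CPU passes still apply
optimized = optimize_for_mobile(
    script_model,
    optimization_blocklist={MobileOptimizerType.REMOVE_DROPOUT},
)
out = optimized(torch.randn(1, 4))
```

Since the module is in eval mode, the retained dropout op is an identity at run time; the example only illustrates how the blocklist argument is passed.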

Learn More
-----------------