pt mobile script and optimize recipe #1193
Fuse Modules Recipe
=====================================

This recipe demonstrates how to fuse a list of PyTorch modules into a single module, and how to run a performance test comparing the fused model with its non-fused version.

Introduction
------------

Before quantization is applied to a model to reduce its size and memory footprint (see the `Quantization Recipe <quantization.html>`_ for details on quantization), the list of modules in the model may first be fused into a single module. Fusion is optional, but it may save on memory access, make the model run faster, and improve its accuracy.

Pre-requisites
--------------

PyTorch 1.6.0 or 1.7.0
Steps
--------------

Follow the steps below to fuse an example model, quantize it, script it, optimize it for mobile, save it, and test it with the Android benchmark tool.

1. Define the Example Model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use the same example model defined in the `PyTorch Mobile Performance Recipes <https://pytorch.org/tutorials/recipes/mobile_perf.html>`_:

::

    import torch
    from torch.utils.mobile_optimizer import optimize_for_mobile

    class AnnotatedConvBnReLUModel(torch.nn.Module):
        def __init__(self):
            super(AnnotatedConvBnReLUModel, self).__init__()
            self.conv = torch.nn.Conv2d(3, 5, 3, bias=False).to(dtype=torch.float)
            self.bn = torch.nn.BatchNorm2d(5).to(dtype=torch.float)
            self.relu = torch.nn.ReLU(inplace=True)
            self.quant = torch.quantization.QuantStub()
            self.dequant = torch.quantization.DeQuantStub()

        def forward(self, x):
            # contiguous() is not in-place: assign its result back to x
            x = x.contiguous(memory_format=torch.channels_last)
            x = self.quant(x)
            x = self.conv(x)
            x = self.bn(x)
            x = self.relu(x)
            x = self.dequant(x)
            return x
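
Optionally, you can sanity-check the eager-mode model with a random input before fusing or quantizing anything (a quick check added here for illustration; it is not part of the original recipe):

::

    model = AnnotatedConvBnReLUModel()
    out = model(torch.rand(1, 3, 224, 224))
    print(out.shape)  # torch.Size([1, 5, 222, 222]): the 3x3 conv with no padding trims 2 pixels per dimension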

2. Generate Two Models with and without `fuse_modules`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Add the following code below the model definition above and run the script:

::

    model = AnnotatedConvBnReLUModel()

    def prepare_save(model, fused):
        model.qconfig = torch.quantization.get_default_qconfig('qnnpack')
        torch.quantization.prepare(model, inplace=True)
        torch.quantization.convert(model, inplace=True)
        torchscript_model = torch.jit.script(model)
        torchscript_model_optimized = optimize_for_mobile(torchscript_model)
        torch.jit.save(torchscript_model_optimized, "model.pt" if not fused else "model_fused.pt")

    prepare_save(model, False)
    model_fused = torch.quantization.fuse_modules(model, [['bn', 'relu']], inplace=False)
    prepare_save(model_fused, True)

    print(model)
    print(model_fused)

The graphs of the original model and its fused version will be printed as follows:

::

    AnnotatedConvBnReLUModel(
      (conv): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(5, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (quant): QuantStub()
      (dequant): DeQuantStub()
    )

    AnnotatedConvBnReLUModel(
      (conv): Conv2d(3, 5, kernel_size=(3, 3), stride=(1, 1), bias=False)
      (bn): BNReLU2d(
        (0): BatchNorm2d(5, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): ReLU(inplace=True)
      )
      (relu): Identity()
      (quant): QuantStub()
      (dequant): DeQuantStub()
    )

In the output for the fused model, the first item in the fused list, `bn`, is replaced with the fused module, and the remaining modules (`relu` in this example) are replaced with identity. In addition, the non-fused and fused versions of the model, `model.pt` and `model_fused.pt`, are generated.
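
Note that `prepare_save` above skips the calibration pass for brevity. In a real post-training static quantization flow, you would feed representative inputs through the prepared model before calling `convert`, so the observers can record activation ranges. A hedged sketch of that variant, where `calibration_batches` is a hypothetical iterable of sample input tensors:

::

    def prepare_calibrate_save(model, calibration_batches, fused):
        model.qconfig = torch.quantization.get_default_qconfig('qnnpack')
        torch.quantization.prepare(model, inplace=True)
        with torch.no_grad():
            for batch in calibration_batches:  # representative inputs, e.g. a few validation batches
                model(batch)
        torch.quantization.convert(model, inplace=True)
        torchscript_model_optimized = optimize_for_mobile(torch.jit.script(model))
        torch.jit.save(torchscript_model_optimized, "model_fused.pt" if fused else "model.pt")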

3. Build the Android Benchmark Tool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Get the PyTorch source and build the Android benchmark tool as follows:

::

    git clone --recursive https://github.com/pytorch/pytorch
    cd pytorch
    git submodule update --init --recursive
    BUILD_PYTORCH_MOBILE=1 ANDROID_ABI=arm64-v8a ./scripts/build_android.sh -DBUILD_BINARY=ON

This will generate the Android benchmark binary `speed_benchmark_torch` in the `build_android/bin` folder.

4. Test and Compare the Fused and Non-Fused Models
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Connect your Android device, copy `speed_benchmark_torch` and the model files to the device, and run the benchmark tool on them:

::

    adb push build_android/bin/speed_benchmark_torch /data/local/tmp
    adb push model.pt /data/local/tmp
    adb push model_fused.pt /data/local/tmp
    adb shell "/data/local/tmp/speed_benchmark_torch --model=/data/local/tmp/model.pt --input_dims=1,3,224,224 --input_type=float"
    adb shell "/data/local/tmp/speed_benchmark_torch --model=/data/local/tmp/model_fused.pt --input_dims=1,3,224,224 --input_type=float"

The results from the last two commands should look like:

::

    Main run finished. Microseconds per iter: 6189.07. Iters per second: 161.575

and

::

    Main run finished. Microseconds per iter: 6216.65. Iters per second: 160.858

For this example model, there is not much performance difference between the fused and non-fused models. But the same steps can be used to fuse and prepare a real deep model and test the performance improvement. Keep in mind that currently `torch.quantization.fuse_modules` only fuses the following sequences of modules:

* conv, bn
* conv, bn, relu
* conv, relu
* linear, relu
* bn, relu

If any other sequence is provided to the `fuse_modules` call, it will simply be ignored.
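
For instance, the full `conv`, `bn`, `relu` sequence of the example model can also be fused (a sketch for illustration; the strings must match the attribute names defined in `__init__`, and the model is put in eval mode so the batchnorm can be folded into the conv):

::

    model = AnnotatedConvBnReLUModel()
    model.eval()
    model_fused_all = torch.quantization.fuse_modules(model, [['conv', 'bn', 'relu']], inplace=False)
    print(model_fused_all)  # conv becomes a fused conv+relu module; bn and relu become Identity()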

Learn More
---------------

See `here <https://pytorch.org/docs/stable/quantization.html#preparing-model-for-quantization>`_ for the official documentation of `torch.quantization.fuse_modules`.
Model Preparation for Android Recipe
=====================================

This recipe demonstrates how to prepare a PyTorch MobileNet v2 image classification model for Android apps, and how to set up Android projects to use the mobile-ready model file.

Introduction
-----------------

After a PyTorch model is trained or a pre-trained model is made available, it is normally not yet ready to be used in mobile apps. It needs to be quantized (see the `Quantization Recipe <quantization.html>`_), converted to TorchScript so Android apps can load it, and optimized for mobile apps. Furthermore, Android apps need to be set up correctly to enable the use of the PyTorch Mobile libraries before they can load and use the model for inference.

Pre-requisites
-----------------

PyTorch 1.6.0 or 1.7.0

torchvision 0.6.0 or 0.7.0

Android Studio 3.5.1 or above with NDK installed

Steps
-----------------

1. Get Pretrained and Quantized MobileNet v2 Model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To get the MobileNet v2 quantized model, simply do:

::

    import torchvision

    model_quantized = torchvision.models.quantization.mobilenet_v2(pretrained=True, quantize=True)
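
If you are curious about the benefit of quantization here, you can compare the on-disk sizes of the float and quantized models (an optional check; the file names are placeholders and the sizes below are approximate):

::

    import os
    import torch

    model_float = torchvision.models.quantization.mobilenet_v2(pretrained=True, quantize=False)

    torch.jit.save(torch.jit.script(model_float), "mnv2_float.pt")
    torch.jit.save(torch.jit.script(model_quantized), "mnv2_quantized.pt")
    print(os.path.getsize("mnv2_float.pt") / 1e6)      # roughly 14 MB
    print(os.path.getsize("mnv2_quantized.pt") / 1e6)  # roughly 3.6 MB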

2. Script and Optimize the Model for Mobile Apps
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use either the `script` or `trace` method to convert the quantized model to the TorchScript format:

::

    import torch

    dummy_input = torch.rand(1, 3, 224, 224)
    torchscript_model = torch.jit.trace(model_quantized, dummy_input)

or

::

    torchscript_model = torch.jit.script(model_quantized)

.. warning::
    The `trace` method only scripts the code path executed during the trace, so it will not work properly for models that include decision branches. See the `Script and Optimize for Mobile Recipe <script_optimized.html>`_ for more details.
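
To make the pitfall concrete, here is a minimal sketch (the `MyDecisionGate` module is illustrative, not part of this recipe) showing how `trace` silently bakes in the branch taken by the example input, while `script` compiles both branches:

::

    import torch

    class MyDecisionGate(torch.nn.Module):
        def forward(self, x):
            if x.sum() > 0:  # data-dependent control flow
                return x
            return -x

    gate = MyDecisionGate()
    traced = torch.jit.trace(gate, torch.ones(2, 2))  # records only the x.sum() > 0 path
    scripted = torch.jit.script(gate)                 # compiles both branches

    x = -torch.ones(2, 2)
    print(traced(x))    # returns x unchanged: the traced path is always taken
    print(scripted(x))  # returns -x, matching the eager model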

Then optimize the TorchScript formatted model for mobile and save it:

::

    from torch.utils.mobile_optimizer import optimize_for_mobile
    torchscript_model_optimized = optimize_for_mobile(torchscript_model)
    torch.jit.save(torchscript_model_optimized, "mobilenetv2_quantized.pt")

In just seven or eight lines of code across the two steps above (depending on whether the `script` or `trace` method is used to get the TorchScript format of the model), we have a model ready to be added to mobile apps.
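
As an optional sanity check before moving the file into an app, you can reload the saved model in Python and run a forward pass:

::

    reloaded = torch.jit.load("mobilenetv2_quantized.pt")
    out = reloaded(torch.rand(1, 3, 224, 224))
    print(out.shape)  # expect torch.Size([1, 1000]) for MobileNet v2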

3. Add the Model and PyTorch Library on Android
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* In your current or a new Android Studio project, open the build.gradle file and add the following two lines (the second one is required only if you plan to use a TorchVision model):

::

    implementation 'org.pytorch:pytorch_android:1.6.0'
    implementation 'org.pytorch:pytorch_android_torchvision:1.6.0'

* Drag and drop the model file `mobilenetv2_quantized.pt` into your project's assets folder.

That's it! Now you can build your Android app with the PyTorch library and the model ready to use. To actually write code to use the model, refer to the PyTorch Mobile `Android Quickstart with a HelloWorld Example <https://pytorch.org/mobile/android/#quickstart-with-a-helloworld-example>`_ and `Android Hackathon Example <https://github.com/pytorch/workshops/tree/master/PTMobileWalkthruAndroid>`_.

Learn More
-----------------

1. `PyTorch Mobile site <https://pytorch.org/mobile>`_

2. `Introduction to TorchScript <https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html>`_
Model Preparation for iOS Recipe
=====================================

This recipe demonstrates how to prepare a PyTorch MobileNet v2 image classification model for iOS apps, and how to set up an iOS project to use the mobile-ready model file.

Introduction
-----------------

After a PyTorch model is trained or a pre-trained model is made available, it is normally not yet ready to be used in mobile apps. It needs to be quantized (see the `Quantization Recipe <quantization.html>`_ for more details), converted to TorchScript so iOS apps can load it, and optimized for mobile apps (see the `Script and Optimize for Mobile Recipe <script_optimized.html>`_). Furthermore, iOS apps need to be set up correctly to enable the use of the PyTorch Mobile libraries before they can load and use the model for inference.

Pre-requisites
-----------------

PyTorch 1.6.0 or 1.7.0

torchvision 0.6.0 or 0.7.0

Xcode 11 or 12

Steps
-----------------

1. Get Pretrained and Quantized MobileNet v2 Model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To get the MobileNet v2 quantized model, simply do:

::

    import torchvision

    model_quantized = torchvision.models.quantization.mobilenet_v2(pretrained=True, quantize=True)

2. Script and Optimize the Model for Mobile Apps
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use either the `script` or `trace` method to convert the quantized model to the TorchScript format:

::

    import torch

    dummy_input = torch.rand(1, 3, 224, 224)
    torchscript_model = torch.jit.trace(model_quantized, dummy_input)

or

::

    torchscript_model = torch.jit.script(model_quantized)

.. warning::
    The `trace` method only scripts the code path executed during the trace, so it will not work properly for models that include decision branches. See the `Script and Optimize for Mobile Recipe <script_optimized.html>`_ for more details.

Then optimize the TorchScript formatted model for mobile and save it:

::

    from torch.utils.mobile_optimizer import optimize_for_mobile
    torchscript_model_optimized = optimize_for_mobile(torchscript_model)
    torch.jit.save(torchscript_model_optimized, "mobilenetv2_quantized.pt")

In just seven or eight lines of code across the two steps above (depending on whether the `script` or `trace` method is used to get the TorchScript format of the model), we have a model ready to be added to mobile apps.

3. Add the Model and PyTorch Library on iOS
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To use the mobile-ready model `mobilenetv2_quantized.pt` in an iOS app, either create a new Xcode project or use your existing one, then follow the steps below:

* Open a Mac Terminal and cd to your iOS app's project folder.

* If your iOS app does not use CocoaPods yet, run `pod init` first to generate the `Podfile` file.

* Edit the `Podfile`, either from Xcode or any editor, and add the following line under the target:

::

    pod 'LibTorch', '~>1.6.1'

* Run `pod install` from the Terminal and then open your project's xcworkspace file.

* Drag and drop the two files `TorchModule.h` and `TorchModule.mm` into your project. If your project is Swift based, a message box with the title "Would you like to configure an Objective-C bridging header?" will show up; click the "Create Bridging Header" button to create a Swift to Objective-C bridging header file, and add `#import "TorchModule.h"` to the header file `<your_project_name>-Bridging-Header.h`. (If your project already contains a bridging header, you will not see this prompt; just add the import to the existing header.)

* Drag and drop the model file `mobilenetv2_quantized.pt` into the project.

After these steps, you can successfully build and run your Xcode project. To actually write code to use the model, refer to the PyTorch Mobile `iOS Code Walkthrough <https://pytorch.org/mobile/ios/#code-walkthrough>`_ and two complete ready-to-run sample iOS apps: `HelloWorld <https://github.com/pytorch/ios-demo-app/tree/master/HelloWorld>`_ and `iOS Hackathon Example <https://github.com/pytorch/workshops/tree/master/PTMobileWalkthruIOS>`_.

Learn More
-----------------

1. `PyTorch Mobile site <https://pytorch.org/mobile>`_

2. `Introduction to TorchScript <https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html>`_
Summary of PyTorch Mobile Recipes
=====================================

This summary provides a top-level overview of the recipes for PyTorch Mobile, to help developers choose which recipes to follow for their PyTorch-powered mobile app development.

Introduction
----------------

When a PyTorch model is trained or retrained, or when a pre-trained model is available for mobile deployment, follow the recipes outlined in this summary so mobile apps can successfully use the model.

Pre-requisites
----------------

PyTorch 1.6.0 or 1.7.0

(Optional) torchvision 0.6.0 or 0.7.0

For iOS development: Xcode 11 or 12

For Android development: Android Studio 3.5.1 or above (with NDK installed); or the Android SDK, NDK, Gradle, and JDK.

New Recipes for PyTorch Mobile
--------------------------------

* (Recommended) To fuse a list of PyTorch modules into a single module before quantization, which may make the model run faster and reduce memory access, read the `Fuse Modules recipe <fuse.html>`_.

* (Recommended) To reduce the model size and make it run faster without losing much accuracy, read the `Quantization Recipe <quantization.html>`_.

* (Must) To convert the model to TorchScript and (optionally) optimize it for mobile apps, read the `Script and Optimize for Mobile Recipe <script_optimized.html>`_.

* (Must for iOS development) To add the model to an iOS project and use the PyTorch pod for iOS, read the `Model Preparation for iOS Recipe <model_preparation_ios.html>`_.

* (Must for Android development) To add the model to an Android project and use the PyTorch library for Android, read the `Model Preparation for Android Recipe <model_preparation_android.html>`_.

Learn More
-----------------

1. `PyTorch Mobile site <https://pytorch.org/mobile>`_
2. `PyTorch Mobile Performance Recipes <https://pytorch.org/tutorials/recipes/mobile_perf.html>`_

Review comment: Does it make sense to demonstrate post-training quantization as a separate step? This would be valuable in the case that the user's existing model is not already quantized.

Author reply: Yes it does, but the Quantization recipe covers that information in detail.