
Commit 8c1d408

qihqi and Svetlana Karslioglu authored
Add a section to advertise use of flatbuffer format for mobile models. (#2286)
* Add a section to advertise use of flatbuffer format for mobile models.
* Apply suggestions from code review
* Add a section on caveats.
* Apply suggestions from code review

Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
1 parent 12c8be8 commit 8c1d408


recipes_source/mobile_perf.rst

Lines changed: 67 additions & 0 deletions
@@ -199,6 +199,73 @@ You can check how it looks in code in `pytorch android application example <http
Member fields ``mModule``, ``mInputTensorBuffer`` and ``mInputTensor`` are initialized only once
and the buffer is refilled using ``org.pytorch.torchvision.TensorImageUtils.imageYUV420CenterCropToFloatBuffer``.

6. Load time optimization
^^^^^^^^^^^^^^^^^^^^^^^^^

**Available since PyTorch 1.13**

PyTorch Mobile also supports a FlatBuffer-based file format that is faster
to load. Both FlatBuffer and pickle-based model files can be loaded with the
same ``_load_for_lite_interpreter`` (Python) or ``_load_for_mobile`` (C++) API.
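
For illustration, here is a minimal sketch of loading a saved model from Python;
the file path is a placeholder, and the call is the same one used in the
benchmark script below:

::

  from torch.jit import mobile

  # The same loader handles both formats: it detects whether the file on disk
  # is a pickle-based ZIP archive (.ptl) or a FlatBuffer file.
  lite_model = mobile._load_for_lite_interpreter('path/to/file.ptl')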

To use the FlatBuffer format, instead of creating the model file with
``model._save_for_lite_interpreter('path/to/file.ptl')``, run the following command:

::

  model._save_for_lite_interpreter('path/to/file.ptl', _use_flatbuffer=True)

The extra argument ``_use_flatbuffer`` creates a FlatBuffer file instead of a
ZIP file, and the resulting file is faster to load.

For example, to compare the two formats with a DeepLabV3 model that uses a
ResNet-50 backbone, run the following script:

::

  import torch
  import timeit
  from torch.jit import mobile

  # Export the same scripted model in both formats.
  model = torch.hub.load('pytorch/vision:v0.10.0', 'deeplabv3_resnet50', pretrained=True)
  model.eval()
  jit_model = torch.jit.script(model)

  jit_model._save_for_lite_interpreter('/tmp/jit_model.ptl')
  jit_model._save_for_lite_interpreter('/tmp/jit_model.ff', _use_flatbuffer=True)

  # Time 20 loads of each file.
  print('Load ptl file:')
  print(timeit.timeit('from torch.jit import mobile; mobile._load_for_lite_interpreter("/tmp/jit_model.ptl")',
                      number=20))
  print('Load flatbuffer file:')
  print(timeit.timeit('from torch.jit import mobile; mobile._load_for_lite_interpreter("/tmp/jit_model.ff")',
                      number=20))

you would get the following result:

::

  Load ptl file:
  0.5387594579999999
  Load flatbuffer file:
  0.038842832999999466

While speedups on actual mobile devices will be smaller, you can still expect
a 3x to 6x reduction in load time.

Reasons to avoid using a FlatBuffer-based mobile model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FlatBuffer format also has some limitations that you might want to consider:

* It is only available in PyTorch 1.13 or later. Therefore, client devices compiled
  with earlier PyTorch versions might not be able to load it.
* The FlatBuffer library imposes a 4GB limit on file sizes, so it is not suitable
  for large models; a quick size check is sketched after this list.
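
As a rough illustration of the size caveat, the following sketch (not part of
the original recipe; the threshold and file path are assumptions based on the
limits described above) checks whether an exported FlatBuffer file stays within
the limit:

::

  import os

  # 4GB file-size limit imposed by the FlatBuffer library (see the caveat above).
  FLATBUFFER_SIZE_LIMIT = 4 * 1024 ** 3

  # Path of the FlatBuffer file produced in the benchmark script above.
  size = os.path.getsize('/tmp/jit_model.ff')
  if size >= FLATBUFFER_SIZE_LIMIT:
      print('Model is too large for the FlatBuffer format; keep the pickle-based .ptl file.')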

Benchmarking
------------

 (0)