Commit b7c53a5

Add a section to advertise use of flatbuffer format for mobile models.


recipes_source/mobile_perf.rst

Lines changed: 62 additions & 0 deletions
@@ -199,6 +199,68 @@ You can check how it looks in code in `pytorch android application example <http
Member fields ``mModule``, ``mInputTensorBuffer`` and ``mInputTensor`` are initialized only once
and the buffer is refilled using ``org.pytorch.torchvision.TensorImageUtils.imageYUV420CenterCropToFloatBuffer``.

6. Load time optimization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**Available since PyTorch 1.13**

PyTorch Mobile also supports a flatbuffer-based file format that is faster
to load. Both flatbuffer- and pickle-based model files can be loaded with the
same ``_load_for_lite_interpreter`` (Python) or ``_load_for_mobile`` (C++) API.
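
For instance, both file formats load through the same call from Python. This is
a minimal sketch; the paths are hypothetical placeholders for files created as
shown below:

::

    from torch.jit import mobile

    # The same API call loads either format; the format is detected from the file.
    pickle_model = mobile._load_for_lite_interpreter('path/to/file.ptl')
    flatbuffer_model = mobile._load_for_lite_interpreter('path/to/file_fb.ptl')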

To use the flatbuffer format, instead of creating the model file with

::

    model._save_for_lite_interpreter('path/to/file.ptl')

one can save using

::

    model._save_for_lite_interpreter('path/to/file.ptl', _use_flatbuffer=True)

The extra kwarg ``_use_flatbuffer`` creates a flatbuffer file instead of a
zip file. The resulting file will be faster to load.
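
As a quick sanity check (a sketch, reusing the hypothetical paths from the
example above), one can confirm that the default file is a zip archive while
the flatbuffer file is not:

::

    import zipfile

    # Expected: True for the default (pickle/zip) format, False for a flatbuffer file.
    print(zipfile.is_zipfile('path/to/file.ptl'))
    print(zipfile.is_zipfile('path/to/file_fb.ptl'))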

For example, with a ResNet-50 based DeepLabV3 model, running the following script:

::

    import torch
    from torch.jit import mobile
    import timeit

    # Export a TorchScript model in both the default (zip/pickle) and flatbuffer formats.
    model = torch.hub.load('pytorch/vision:v0.10.0', 'deeplabv3_resnet50', pretrained=True)
    model.eval()
    jit_model = torch.jit.script(model)

    jit_model._save_for_lite_interpreter('/tmp/jit_model.ptl')
    jit_model._save_for_lite_interpreter('/tmp/jit_model.ff', _use_flatbuffer=True)

    # Time 20 loads of each file.
    print('Load ptl file:')
    print(timeit.timeit('from torch.jit import mobile; mobile._load_for_lite_interpreter("/tmp/jit_model.ptl")',
                        number=20))
    print('Load flatbuffer file:')
    print(timeit.timeit('from torch.jit import mobile; mobile._load_for_lite_interpreter("/tmp/jit_model.ff")',
                        number=20))

yields

::

    Load ptl file:
    0.5387594579999999
    Load flatbuffer file:
    0.038842832999999466

Speedups on actual mobile devices will be smaller, but one can still expect a
3x to 6x reduction in load time.

Benchmarking
------------
