From b7c53a57fb4644838344c6b7f9367e41ede7aa7f Mon Sep 17 00:00:00 2001
From: "Han Qi (qihqi)"
Date: Fri, 14 Apr 2023 15:25:19 -0700
Subject: [PATCH 1/4] Add a section to advertise use of flatbuffer format for mobile models.

---
 recipes_source/mobile_perf.rst | 62 ++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/recipes_source/mobile_perf.rst b/recipes_source/mobile_perf.rst
index ace505aac06..eba4c89e0bb 100644
--- a/recipes_source/mobile_perf.rst
+++ b/recipes_source/mobile_perf.rst
@@ -199,6 +199,68 @@ You can check how it looks in code in `pytorch android application example

Date: Mon, 17 Apr 2023 12:53:18 -0700
Subject: [PATCH 2/4] Apply suggestions from code review

Co-authored-by: Svetlana Karslioglu
---
 recipes_source/mobile_perf.rst | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/recipes_source/mobile_perf.rst b/recipes_source/mobile_perf.rst
index eba4c89e0bb..3c857e43512 100644
--- a/recipes_source/mobile_perf.rst
+++ b/recipes_source/mobile_perf.rst
@@ -203,11 +203,11 @@ and buffer is refilled using ``org.pytorch.torchvision.TensorImageUtils.imageYUV
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 **Available since Pytorch 1.13**

-Pytorch mobile also support a flatbuffer based file format that is faster
-to load. Both flatbuffer and pickle based model file can be load with the
-same `_load_for_lite_interpreter` (Python) or `_load_for_mobile`(C++) API.
+PyTorch Mobile also supports a FlatBuffer-based file format that is faster
+to load. Both flatbuffer and pickle-based model file can be load with the
+same ``_load_for_lite_interpreter`` (Python) or ``_load_for_mobile``(C++) API.

-To use flatbuffer format, instead of create model file with
+To use the FlatBuffer format, instead of creating the model file with ``model._save_for_lite_interpreter('path/to/file.ptl')``, you can run the following command:

 ::
@@ -221,10 +221,10 @@ One can save using

     model._save_for_lite_interpreter('path/to/file.ptl', _use_flatbuffer=True)

-The extra kwarg `_use_flatbuffer` makes a flatbuffer file instead of
+The extra arguemnt ``_use_flatbuffer`` makes a FlatBuffer file instead of a
 zip file. The created file will be faster to load.

-For example, using resnet-50, running the following script:
+For example, using ResNet-50 and running the following script:

 ::
@@ -248,7 +248,7 @@ For example, using resnet-50, running the following script:


-yields
+you would get the following result:

 ::
@@ -257,7 +257,7 @@ yields
     Load flatbuffer file: 0.038842832999999466


-Speed ups on actual mobile devices will be smaller. One can still expect
+While speed ups on actual mobile devices will be smaller, you can still expect
 3x - 6x load time reductions.

From b94a8224f7957f8739bfcfad19ef18987cad5a02 Mon Sep 17 00:00:00 2001
From: "Han Qi (qihqi)"
Date: Mon, 17 Apr 2023 13:02:41 -0700
Subject: [PATCH 3/4] Add a section on caveats.

---
 recipes_source/mobile_perf.rst | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/recipes_source/mobile_perf.rst b/recipes_source/mobile_perf.rst
index 3c857e43512..8160f1c6f0d 100644
--- a/recipes_source/mobile_perf.rst
+++ b/recipes_source/mobile_perf.rst
@@ -207,7 +207,8 @@ PyTorch Mobile also supports a FlatBuffer-based file format that is faster
 to load. Both flatbuffer and pickle-based model file can be load with the
 same ``_load_for_lite_interpreter`` (Python) or ``_load_for_mobile``(C++) API.

-To use the FlatBuffer format, instead of creating the model file with ``model._save_for_lite_interpreter('path/to/file.ptl')``, you can run the following command:
+To use the FlatBuffer format, instead of creating the model file with
+``model._save_for_lite_interpreter('path/to/file.ptl')``, you can run the following command:

 ::
@@ -221,7 +222,7 @@ One can save using

     model._save_for_lite_interpreter('path/to/file.ptl', _use_flatbuffer=True)

-The extra arguemnt ``_use_flatbuffer`` makes a FlatBuffer file instead of a
+The extra argument ``_use_flatbuffer`` makes a FlatBuffer file instead of a
 zip file. The created file will be faster to load.

 For example, using ResNet-50 and running the following script:
@@ -260,6 +261,15 @@ you would get the following result:

 While speed ups on actual mobile devices will be smaller, you can still expect
 3x - 6x load time reductions.

+### Reasons to not use flatbuffer based mobile model:
+
+However flatbuffer format also has some limitations that one should consider
+before using it. Namely:
+
+* It is only available since Pytorch 1.13. Therefore, client devices compiled
+  with earlier Pytorch versions might not be able to load it.
+* Flatbuffer library imposes a 4GB maximum for file sizes. So it is not suitable
+  for large models.

 Benchmarking
 ------------

From 5a75d24e134981f81f6ff13e58eeff1d1a522c98 Mon Sep 17 00:00:00 2001
From: qihqi
Date: Mon, 17 Apr 2023 13:59:55 -0700
Subject: [PATCH 4/4] Apply suggestions from code review

Co-authored-by: Svetlana Karslioglu
---
 recipes_source/mobile_perf.rst | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/recipes_source/mobile_perf.rst b/recipes_source/mobile_perf.rst
index 8160f1c6f0d..aae1447cbf8 100644
--- a/recipes_source/mobile_perf.rst
+++ b/recipes_source/mobile_perf.rst
@@ -210,10 +210,6 @@ same ``_load_for_lite_interpreter`` (Python) or ``_load_for_mobile``(C++) API.
 To use the FlatBuffer format, instead of creating the model file with
 ``model._save_for_lite_interpreter('path/to/file.ptl')``, you can run the following command:

-::
-
-    model._save_for_lite_interpreter('path/to/file.ptl')
-

 One can save using
@@ -261,14 +257,13 @@ you would get the following result:
 While speed ups on actual mobile devices will be smaller, you can still expect
 3x - 6x load time reductions.

-### Reasons to not use flatbuffer based mobile model:
+### Reasons to avoid using a FlatBuffer-based mobile model

-However flatbuffer format also has some limitations that one should consider
-before using it. Namely:
+However, FlatBuffer format also has some limitations that you might want to consider:

-* It is only available since Pytorch 1.13. Therefore, client devices compiled
-  with earlier Pytorch versions might not be able to load it.
-* Flatbuffer library imposes a 4GB maximum for file sizes. So it is not suitable
+* It is only available in PyTorch 1.13 or later. Therefore, client devices compiled
+  with earlier PyTorch versions might not be able to load it.
+* The Flatbuffer library imposes a 4GB limit for file sizes. So it is not suitable
   for large models.

 Benchmarking
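
To make the workflow described in these patches concrete, here is a minimal sketch that scripts a model, saves it once in the default pickle/zip-based format and once with ``_use_flatbuffer=True``, and loads each file back. The model choice, file names, and timing loop are illustrative assumptions (the original ResNet-50 benchmark script from patch 1 is not reproduced here); ``_save_for_lite_interpreter``, ``_load_for_lite_interpreter``, and the ``_use_flatbuffer`` argument are the APIs named in the patches and assume PyTorch 1.13 or later with TorchVision installed.

::

    import time

    import torch
    import torchvision
    from torch.jit.mobile import _load_for_lite_interpreter

    # Placeholder model; any scriptable nn.Module works the same way.
    model = torchvision.models.resnet50()
    model.eval()
    scripted = torch.jit.script(model)

    # Default pickle/zip-based lite-interpreter file.
    scripted._save_for_lite_interpreter('resnet50.ptl')
    # FlatBuffer-based file; requires PyTorch 1.13 or later.
    scripted._save_for_lite_interpreter('resnet50_ff.ptl', _use_flatbuffer=True)

    # The same loader reads both formats, so only the save call changes.
    for path in ('resnet50.ptl', 'resnet50_ff.ptl'):
        start = time.perf_counter()
        _load_for_lite_interpreter(path)
        print(f'{path} loaded in {time.perf_counter() - start:.4f}s')

On a desktop the FlatBuffer file should load noticeably faster, in line with the numbers quoted in the patch; absolute times vary by machine, and, as the patch notes, the gain on real devices is smaller.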
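
The caveats added in the last two patches can also be checked before picking a format. The helper below is only a sketch under stated assumptions: its name, the ``client_supports_flatbuffer`` flag, and the parameter-size heuristic are illustrative and not part of the patch.

::

    import torch

    FOUR_GB = 4 * 1024 ** 3

    def save_for_mobile(scripted_module, path, client_supports_flatbuffer):
        # Use FlatBuffer only if every deployed app ships a PyTorch Mobile
        # runtime built from 1.13 or later; older runtimes cannot read it.
        # Parameter bytes give a rough lower bound on file size; FlatBuffer
        # caps files at 4GB, so large models should stay on the zip format.
        param_bytes = sum(p.numel() * p.element_size()
                          for p in scripted_module.parameters())
        use_flatbuffer = client_supports_flatbuffer and param_bytes < FOUR_GB
        scripted_module._save_for_lite_interpreter(path,
                                                   _use_flatbuffer=use_flatbuffer)
        return use_flatbuffer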