From b7c53a57fb4644838344c6b7f9367e41ede7aa7f Mon Sep 17 00:00:00 2001
From: "Han Qi (qihqi)"
Date: Fri, 14 Apr 2023 15:25:19 -0700
Subject: [PATCH 1/4] Add a section to advertise use of flatbuffer format for mobile models.

---
 recipes_source/mobile_perf.rst | 62 ++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/recipes_source/mobile_perf.rst b/recipes_source/mobile_perf.rst
index ace505aac06..eba4c89e0bb 100644
--- a/recipes_source/mobile_perf.rst
+++ b/recipes_source/mobile_perf.rst
@@ -199,6 +199,68 @@ You can check how it looks in code in `pytorch android application example

Date: Mon, 17 Apr 2023 12:53:18 -0700
Subject: [PATCH 2/4] Apply suggestions from code review

Co-authored-by: Svetlana Karslioglu
---
 recipes_source/mobile_perf.rst | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/recipes_source/mobile_perf.rst b/recipes_source/mobile_perf.rst
index eba4c89e0bb..3c857e43512 100644
--- a/recipes_source/mobile_perf.rst
+++ b/recipes_source/mobile_perf.rst
@@ -203,11 +203,11 @@ and buffer is refilled using ``org.pytorch.torchvision.TensorImageUtils.imageYUV
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 **Available since Pytorch 1.13**

-Pytorch mobile also support a flatbuffer based file format that is faster
-to load. Both flatbuffer and pickle based model file can be load with the
-same `_load_for_lite_interpreter` (Python) or `_load_for_mobile`(C++) API.
+PyTorch Mobile also supports a FlatBuffer-based file format that is faster
+to load. Both flatbuffer and pickle-based model file can be load with the
+same ``_load_for_lite_interpreter`` (Python) or ``_load_for_mobile``(C++) API.

-To use flatbuffer format, instead of create model file with
+To use the FlatBuffer format, instead of creating the model file with ``model._save_for_lite_interpreter('path/to/file.ptl')``, you can run the following command:

 ::
@@ -221,10 +221,10 @@ One can save using

     model._save_for_lite_interpreter('path/to/file.ptl', _use_flatbuffer=True)

-The extra kwarg `_use_flatbuffer` makes a flatbuffer file instead of
+The extra arguemnt ``_use_flatbuffer`` makes a FlatBuffer file instead of a
 zip file. The created file will be faster to load.

-For example, using resnet-50, running the following script:
+For example, using ResNet-50 and running the following script:

 ::
@@ -248,7 +248,7 @@ For example, using resnet-50, running the following script:


-yields
+you would get the following result:

 ::
@@ -257,7 +257,7 @@ yields
     Load flatbuffer file: 0.038842832999999466


-Speed ups on actual mobile devices will be smaller. One can still expect
+While speed ups on actual mobile devices will be smaller, you can still expect
 3x - 6x load time reductions.

From b94a8224f7957f8739bfcfad19ef18987cad5a02 Mon Sep 17 00:00:00 2001
From: "Han Qi (qihqi)"
Date: Mon, 17 Apr 2023 13:02:41 -0700
Subject: [PATCH 3/4] Add a section on caveats.

---
 recipes_source/mobile_perf.rst | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/recipes_source/mobile_perf.rst b/recipes_source/mobile_perf.rst
index 3c857e43512..8160f1c6f0d 100644
--- a/recipes_source/mobile_perf.rst
+++ b/recipes_source/mobile_perf.rst
@@ -207,7 +207,8 @@ PyTorch Mobile also supports a FlatBuffer-based file format that is faster
 to load. Both flatbuffer and pickle-based model file can be load with the
 same ``_load_for_lite_interpreter`` (Python) or ``_load_for_mobile``(C++) API.

-To use the FlatBuffer format, instead of creating the model file with ``model._save_for_lite_interpreter('path/to/file.ptl')``, you can run the following command:
+To use the FlatBuffer format, instead of creating the model file with
+``model._save_for_lite_interpreter('path/to/file.ptl')``, you can run the following command:

 ::
@@ -221,7 +222,7 @@ One can save using

     model._save_for_lite_interpreter('path/to/file.ptl', _use_flatbuffer=True)

-The extra arguemnt ``_use_flatbuffer`` makes a FlatBuffer file instead of a
+The extra argument ``_use_flatbuffer`` makes a FlatBuffer file instead of a
 zip file. The created file will be faster to load.

 For example, using ResNet-50 and running the following script:
@@ -260,6 +261,15 @@ you would get the following result:

 While speed ups on actual mobile devices will be smaller, you can still expect
 3x - 6x load time reductions.

+### Reasons to not use flatbuffer based mobile model:
+
+However flatbuffer format also has some limitations that one should consider
+before using it. Namely:
+
+* It is only available since Pytorch 1.13. Therefore, client devices compiled
+  with earlier Pytorch versions might not be able to load it.
+* Flatbuffer library imposes a 4GB maximum for file sizes. So it is not suitable
+  for large models.

 Benchmarking
 ------------

From 5a75d24e134981f81f6ff13e58eeff1d1a522c98 Mon Sep 17 00:00:00 2001
From: qihqi
Date: Mon, 17 Apr 2023 13:59:55 -0700
Subject: [PATCH 4/4] Apply suggestions from code review

Co-authored-by: Svetlana Karslioglu
---
 recipes_source/mobile_perf.rst | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/recipes_source/mobile_perf.rst b/recipes_source/mobile_perf.rst
index 8160f1c6f0d..aae1447cbf8 100644
--- a/recipes_source/mobile_perf.rst
+++ b/recipes_source/mobile_perf.rst
@@ -210,10 +210,6 @@ same ``_load_for_lite_interpreter`` (Python) or ``_load_for_mobile``(C++) API.
 To use the FlatBuffer format, instead of creating the model file with
 ``model._save_for_lite_interpreter('path/to/file.ptl')``, you can run the following command:

-::
-
-    model._save_for_lite_interpreter('path/to/file.ptl')
-

 One can save using
@@ -261,14 +257,13 @@ you would get the following result:
 While speed ups on actual mobile devices will be smaller, you can still expect
 3x - 6x load time reductions.

-### Reasons to not use flatbuffer based mobile model:
+### Reasons to avoid using a FlatBuffer-based mobile model

-However flatbuffer format also has some limitations that one should consider
-before using it. Namely:
+However, FlatBuffer format also has some limitations that you might want to consider:

-* It is only available since Pytorch 1.13. Therefore, client devices compiled
-  with earlier Pytorch versions might not be able to load it.
-* Flatbuffer library imposes a 4GB maximum for file sizes. So it is not suitable
+* It is only available in PyTorch 1.13 or later. Therefore, client devices compiled
+  with earlier PyTorch versions might not be able to load it.
+* The Flatbuffer library imposes a 4GB limit for file sizes. So it is not suitable
   for large models.

 Benchmarking
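
To make the workflow described in these patches concrete, here is a minimal sketch that scripts a model, saves it once in the default pickle/zip-based format and once with ``_use_flatbuffer=True``, and loads each file back. The model choice, file names, and timing loop are illustrative assumptions (the original ResNet-50 benchmark script from patch 1 is not reproduced here); ``_save_for_lite_interpreter``, ``_load_for_lite_interpreter``, and the ``_use_flatbuffer`` argument are the APIs named in the patches and assume PyTorch 1.13 or later with TorchVision installed.

::

    import time

    import torch
    import torchvision
    from torch.jit.mobile import _load_for_lite_interpreter

    # Placeholder model; any scriptable nn.Module works the same way.
    model = torchvision.models.resnet50()
    model.eval()
    scripted = torch.jit.script(model)

    # Default pickle/zip-based lite-interpreter file.
    scripted._save_for_lite_interpreter('resnet50.ptl')
    # FlatBuffer-based file; requires PyTorch 1.13 or later.
    scripted._save_for_lite_interpreter('resnet50_ff.ptl', _use_flatbuffer=True)

    # The same loader reads both formats, so only the save call changes.
    for path in ('resnet50.ptl', 'resnet50_ff.ptl'):
        start = time.perf_counter()
        _load_for_lite_interpreter(path)
        print(f'{path} loaded in {time.perf_counter() - start:.4f}s')

On a desktop the FlatBuffer file should load noticeably faster, in line with the numbers quoted in the patch; absolute times vary by machine, and, as the patch notes, the gain on real devices is smaller.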
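
The caveats added in the last two patches can also be checked before picking a format. The helper below is only a sketch under stated assumptions: its name, the ``client_supports_flatbuffer`` flag, and the parameter-size heuristic are illustrative and not part of the patch.

::

    import torch

    FOUR_GB = 4 * 1024 ** 3

    def save_for_mobile(scripted_module, path, client_supports_flatbuffer):
        # Use FlatBuffer only if every deployed app ships a PyTorch Mobile
        # runtime built from 1.13 or later; older runtimes cannot read it.
        # Parameter bytes give a rough lower bound on file size; FlatBuffer
        # caps files at 4GB, so large models should stay on the zip format.
        param_bytes = sum(p.numel() * p.element_size()
                          for p in scripted_module.parameters())
        use_flatbuffer = client_supports_flatbuffer and param_bytes < FOUR_GB
        scripted_module._save_for_lite_interpreter(path,
                                                   _use_flatbuffer=use_flatbuffer)
        return use_flatbuffer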