
Force to remap to CPU for TorchScript model if GPU doesn't exist #167


Merged: gigony merged 1 commit into main from force_cpu_torchscript on Nov 24, 2021

Conversation

@gigony (Collaborator) commented Oct 9, 2021

The TorchScript model file is loaded onto the device on which the model was saved (torch.jit.load).

For that reason, if the model is trained/saved on a GPU and then loaded on a system without a GPU, loading fails with an error:

self._predictor = torch.jit.load(self.path).eval()

  File "/Users/Erik/opt/anaconda3/lib/python3.8/site-packages/monai/deploy/core/models/model.py", line 220, in __call__
    if self.predictor:
  File "/Users/Erik/opt/anaconda3/lib/python3.8/site-packages/monai/deploy/core/models/torch_model.py", line 50, in predictor
    self._predictor = torch.jit.load(self.path).eval()
  File "/Users/Erik/opt/anaconda3/lib/python3.8/site-packages/torch/jit/_serialization.py", line 161, in load
    cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
NotImplementedError: Could not run 'aten::empty_strided' with arguments from the 'CUDA' backend. 
This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). 
If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 
'aten::empty_strided' is only available for these backends: [CPU, Meta, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

This patch resolves the issue by remapping the storages to CPU when a GPU is not available.
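A minimal sketch of the fix pattern (here "model.pt" is a placeholder path for a TorchScript model that was saved on a GPU):

import torch

# Remap storages to CPU only when CUDA is unavailable; otherwise keep the
# device placement recorded in the saved model (map_location=None).
map_location = None if torch.cuda.is_available() else "cpu"
predictor = torch.jit.load("model.pt", map_location=map_location).eval()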

Resolves #166, which stems from #156.

A test package is available via:

pip install -i https://test.pypi.org/simple/ monai-deploy-app-sdk==0.0.167

Verified locally by disabling the GPU:

export CUDA_VISIBLE_DEVICES=""
python examples/apps/mednist_classifier_monaideploy/mednist_classifier_monaideploy.py -i input/AbdomenCT_007000.jpeg -o output -m classifier.zip

python examples/apps/ai_spleen_seg_app/app.py -i dcm/ -o output -m model.pt

@gigony gigony added the bug label Oct 9, 2021
@gigony gigony self-assigned this Oct 9, 2021
@gigony gigony requested a review from MMelQin October 9, 2021 06:21
@MMelQin (Collaborator) left a comment:

This PR addresses the issue that jit.load should target an available device (GPU preferred, but not forced when one is not present). It does not, however, or at least has not been proven to, address the issue of force-loading a model traced on a GPU onto the CPU.

@@ -47,7 +47,9 @@ def predictor(self) -> "torch.nn.Module":  # type: ignore
             torch.nn.Module: the model's predictor
         """
         if self._predictor is None:
-            self._predictor = torch.jit.load(self.path).eval()
+            # If GPU is not available, explicitly use "cpu" to dynamically remap storages to CPU.
+            map_location = None if torch.cuda.is_available() else "cpu"
@MMelQin (Collaborator) commented on the diff:
Wondering why not use a torch device here, to be explicit and for clarity (https://pytorch.org/docs/master/generated/torch.jit.load.html#torch.jit.load)?

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

@gigony (Collaborator, Author) replied:

Thanks for the suggestion!

I am afraid the behavior is different (map_location needs to be None, not torch.device("cuda")).

We wanted to remap to 'cpu' only if CUDA is not available. If CUDA is available, we wouldn't want to remap the location; the model should use the device specified within it.

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
self._predictor = torch.jit.load(self.path, map_location=device).eval()

vs

map_location = None if torch.cuda.is_available() else "cpu"
self._predictor = torch.jit.load(self.path, map_location=map_location).eval()

However, on second thought, it would be better to force the remap depending on CUDA availability (operator/application code usually uses the same condition to determine which device to use).

So, I will fix it as suggested.
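To illustrate the difference discussed above, a hedged sketch ("model.pt" is a placeholder for a TorchScript model saved on a GPU):

import torch

# map_location=None keeps the device placement recorded in the saved model;
# a GPU-saved model is restored onto CUDA (and raises on a CPU-only system,
# as in the traceback above).
kept = torch.jit.load("model.pt", map_location=None)

# An explicit device forces every storage onto that device, regardless of
# where the model was saved.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
forced = torch.jit.load("model.pt", map_location=device)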

@@ -165,7 +165,8 @@ def compute(self, op_input: InputContext, op_output: OutputContext, context: Exe
                 model = context.models.get()
             else:
                 print(f"Loading TorchScript model from: {MonaiSegInferenceOperator.MODEL_LOCAL_PATH}")
-                model = torch.jit.load(MonaiSegInferenceOperator.MODEL_LOCAL_PATH)
+                map_location = None if torch.cuda.is_available() else "cpu"
@MMelQin (Collaborator) commented on the diff:
I'd change this to use the device that is instantiated in this block of code. Also, this else block is a fallback for when the SDK model loader fails to load the model (the original line #172 below).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Also, forcing a model traced on a GPU onto the CPU will most likely fail, so this only shifts the failure point.

@gigony (Collaborator, Author) replied:

Thanks Ming!
Updated to use the device variable.
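A sketch of the updated fallback path, under the assumption that it mirrors the hunk above (the model path here is a placeholder standing in for MonaiSegInferenceOperator.MODEL_LOCAL_PATH):

import torch

model_path = "model.pt"  # placeholder for MonaiSegInferenceOperator.MODEL_LOCAL_PATH

# Instantiate one device up front and let torch.jit.load remap storages to it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load(model_path, map_location=device)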

@@ -432,12 +432,10 @@
     "        image_tensor = self.transform(img)  # (1, 64, 64), torch.float64\n",
     "        image_tensor = image_tensor[None].float()  # (1, 1, 64, 64), torch.float32\n",
     "\n",
-    "        # Comment below line if you want to do CPU inference\n",
-    "        image_tensor = image_tensor.cuda()\n",
+    "        if torch.cuda.is_available():\n",
@MMelQin (Collaborator) commented on the diff:
This conditional logic is repeated multiple times. Wondering why not instantiate a device and use tensor.to(device)?
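A minimal sketch of that suggestion (the tensor shape follows the tutorial's comments; the placeholder input is illustrative):

import torch

# Instantiate the device once, then move tensors with .to(device) instead of
# repeating the cuda-availability check at every call site.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
image_tensor = torch.zeros(1, 1, 64, 64, dtype=torch.float32)  # placeholder input
image_tensor = image_tensor.to(device)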

@gigony (Collaborator, Author) replied:

Updated the existing code and tutorial.
Thank you for the suggestion!

Signed-off-by: Gigon Bae <gbae@nvidia.com>
@gigony gigony force-pushed the force_cpu_torchscript branch from d81de7a to 14c1aae on November 23, 2021 19:50
@gigony gigony merged commit 2f83bc1 into main Nov 24, 2021
@gigony gigony mentioned this pull request Nov 24, 2021
Successfully merging this pull request may close these issues.

[BUG] ai_spleen_seg_app and mednist_classifier_monaideploy example doesn't work on CPU only system