
Enable Linear+ReLU fuse by OneDNNL #20


Merged: 2 commits merged into intel:master from the LRfuse branch on May 28, 2020

Conversation

zhuhaozhe (Contributor):

No description provided.

grad_input, grad_weight, grad_bias = core.linear_backward(input, grad_output, weight, output_mask)
return (grad_input, grad_weight, grad_bias)

class DNNLLRFuse(nn.Module):
Reviewer:

How about DNNLLinearFuseReLU? "LR" sounds vague to me.

zhuhaozhe (Contributor Author):

Done

import math
import _torch_ipex as core

class DNNLFC(Function):
Reviewer:

Are we going to move this into C++?

zhuhaozhe (Contributor Author):

Xiaobing is trying this, since the Python approach does not work with JIT.
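
For reference, a minimal sketch of what the Python-side autograd Function could look like. This is an assumed shape, not the PR's exact code: core.linear_fuse_relu is a hypothetical name for the fused forward binding (only core.linear_backward appears in this diff), and the ReLU gradient is applied explicitly here before calling the linear backward.

import torch
from torch.autograd import Function
import _torch_ipex as core  # IPEX C++ bindings, as imported in the diff above

class LinearFuseReluFC(Function):
    @staticmethod
    def forward(ctx, input, weight, bias):
        # Hypothetical fused binding: relu(input @ weight.t() + bias) in one
        # oneDNN inner-product call with a ReLU post-op.
        output = core.linear_fuse_relu(input, weight, bias)
        ctx.save_for_backward(input, weight, output)
        return output

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, output = ctx.saved_tensors
        # ReLU backward first: zero the gradient wherever the fused output
        # was clamped to zero.
        grad_output = grad_output * (output > 0).to(grad_output.dtype)
        output_mask = ctx.needs_input_grad[:3]
        grad_input, grad_weight, grad_bias = core.linear_backward(
            input, grad_output, weight, output_mask)
        return grad_input, grad_weight, grad_bias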

if (bias.has_value()) {
at::Tensor bias_vec = bias.value();
const dil::tensor b = dbl::comm::try_gen_dil_tensor(bias_vec);
dil::inner_product_forward::compute(x, w, b, y, true);
Reviewer:

Does true here mean enabling ReLU as a post-op? Can you add something like /* name_of_arg = */ true here for clarity?

zhuhaozhe (Contributor Author):

Done

zhuhaozhe requested a review from EikanWang on May 25, 2020 05:51
import math
import _torch_ipex as core

class dilLinearFuseReluFC(Function):
Reviewer (Contributor):

You can use LinearFuseRelu directly and not expose dil to the front end.

zhuhaozhe (Contributor Author):

Done

grad_input, grad_weight, grad_bias = core.linear_backward(input, grad_output, weight, output_mask)
return (grad_input, grad_weight, grad_bias)

class dilLinearFuseRelu(nn.Module):
Reviewer (Contributor):

Same as the Function above.

zhuhaozhe (Contributor Author):

Done
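
To illustrate the naming point, here is a minimal sketch of a front-end wrapper with dil kept out of the public API. The forward below uses plain F.linear + F.relu as stand-in reference semantics for a dispatch to the fused autograd Function; the nn.Linear-style parameter initialization is an assumption, not the PR's exact code.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearFuseRelu(nn.Module):
    def __init__(self, in_features, out_features, bias=True):
        super(LinearFuseRelu, self).__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        if bias:
            self.bias = nn.Parameter(torch.empty(out_features))
        else:
            self.register_parameter('bias', None)
        # Assumed: same init scheme as nn.Linear.
        nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            bound = 1 / math.sqrt(in_features)
            nn.init.uniform_(self.bias, -bound, bound)

    def forward(self, input):
        # Reference semantics only; the PR would dispatch to the fused
        # Function here instead of calling two separate ATen ops.
        return F.relu(F.linear(input, self.weight, self.bias))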

if (bias.has_value()) {
at::Tensor bias_vec = bias.value();
const dil::tensor b = dbl::comm::try_gen_dil_tensor(bias_vec);
dil::inner_product_forward::compute(x, w, b, y, /*fuse_relu=*/true);
Reviewer (Contributor):

Please reuse the attr parameter and remove this new parameter.

zhuhaozhe (Contributor Author):

Done

hongzhen1 (Contributor):

@zhuhaozhe could you add a UT to cover this integration?

EikanWang (Contributor) left a comment:

As per hongzhen's comments, please add unit test cases.

zhuhaozhe (Contributor Author) commented on May 26, 2020:

@hongzhen1 @EikanWang I have already added a unit test following test_mlp.py: 804110a
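
For context, a sketch of what such a unit test might check: the fused path should match an unfused linear + relu reference in both forward and backward. The fused_linear_relu placeholder below stands in for the real IPEX entry point, whose exact import path is not shown in this thread.

import unittest
import torch
import torch.nn.functional as F

def fused_linear_relu(x, weight, bias):
    # Placeholder for the fused kernel under test; the real UT would call
    # the IPEX fused path here (assumed, not the PR's exact code).
    return F.relu(F.linear(x, weight, bias))

class TestLinearFuseRelu(unittest.TestCase):
    def test_forward_backward_match_reference(self):
        torch.manual_seed(0)
        x = torch.randn(4, 16)
        weight = torch.randn(8, 16, requires_grad=True)
        bias = torch.randn(8, requires_grad=True)
        x_fused = x.clone().requires_grad_()
        x_ref = x.clone().requires_grad_()

        # Forward results must agree.
        y_fused = fused_linear_relu(x_fused, weight, bias)
        y_ref = F.relu(F.linear(x_ref, weight, bias))
        self.assertTrue(torch.allclose(y_fused, y_ref))

        # Input gradients must agree as well.
        y_fused.sum().backward()
        y_ref.sum().backward()
        self.assertTrue(torch.allclose(x_fused.grad, x_ref.grad))

if __name__ == '__main__':
    unittest.main()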

zhuhaozhe (Contributor Author):

@EikanWang OK, but this patch crashes on the new commit; I am trying to fix it now.

rename some FNs

hide dil from frontend and reuse attr args instead of fuse_relu

remove unused header files since the relu function body was moved to DevOPs.cpp

add unit test for linear fuse relu

move ut to test_lazy_reorder
@@ -12,7 +12,7 @@
 import sys
 import torch
 import _torch_ipex as ipex
-import intel_pytorch_extension
+import intel_pytorch_extension_py
Reviewer (Contributor):

intel_pytorch_extension_py => intel_pytorch_extension

EikanWang merged commit fc686f6 into intel:master on May 28, 2020
EikanWang added a commit to EikanWang/intel-extension-for-pytorch that referenced this pull request May 28, 2020
EikanWang added a commit that referenced this pull request May 28, 2020
zhuhaozhe deleted the LRfuse branch on August 18, 2020 09:27
EikanWang added a commit that referenced this pull request Oct 27, 2021
* Add AVX512 macro in CMake to enable AVX512

* Cannot use the input dil tensor to check is_public_format or not because it is out of scope

* Fix build issue of PR #20

* Increase precision tolerance for ut

* Update for new 'oneDNN' GitHub URL (#146)

* Update default IPEX version to 1.2.0

* fall back to CPU for LSTM training with dropout

* Parse PyTorch 1.8 RegistrationDeclarations.h to generate dense and sparse operator code

* git commit -m

* 1. Replace TensorList with c10::List
2. Replace tensor size and stride with SizesAndStrides
TODO:
Need to work around the RegXXX.h whose function signatures conflict with NativeFunctions.h

* remove autocast from master

* Pass build for PyTorch 1.8
TODO:
Add comments for gen-dense-cpu-ops.py
There might be potential issues with grad copying

* Enhance embedding bag last offset memory copy by using parallelized move_ker

* add UT for int8 LSTM

* add asymmetric quantization

* enable int8 for LSTM

* Port utils for ut from PyTorch 1.8

* Fix the issue that tensor lists wrapped in c10::List cannot fall back

* Enable upsample_bilinear2d to support a vector scale factor

* Update README to clarify the IPEX version and PyTorch
Update the IPEX version in setup.py to 1.2.0

* enable bf16 layernorm

* Enable native layer norm signature matching

* Pass all the test cases of the committed test file except layer_norm, because IPEX cannot capture layer_norm

* Capture layernorm on python side

* Replace ATen/Tensor.h with ATen/ATen.h to avoid undefined symbols

Conflicts:
	torch_ipex/csrc/utils.h

* Gen sparse operators

* Reorder to public format for slice in case of throwing an exception

* 1. Support NHWC
2. Remove recorder tensors to reduce pytorch profiler overhead

* 1. dependencies installation; 2. torch wheel file query and packaging; 3. doesn't require git anymore when compiling

* Added tutorial Performance Tuning.md in directory tutorials

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* update test_torch.py and align with common_utils.py

* bug fix in dockerfile (#164)

* Update Dockerfile to include pybind11-dev (#157)

As a fix for issue - #155. As suggested by @jingxu10, adding pybind11-dev allows for a successful build of the Docker container.

* fix pt-1.8's UT

* - Installation for IPEX 1.8: remove the recompilation of PT and add the installation of dependency packages.
- Add the supported customized ops & fusion patterns.

* tmp commit

* pass most UT

* modified _C.cpython.xxxx.so's rpath

* fix unexpected keyword argument 'prec' in test_torch.py

* Keep intel_pytorch_extension to ensure backward-compatibility

* fix test_int8.py's regression

* update the version to 1.8.0

* fix runtime undefined reference error caused by libstdc++ Dual ABI

* Updated README.md for v1.8.0

* Updated torch-ccl to fix libfabric.so not found issue

* setup.py: 1. fix include_paths and library_paths missing issue if torch is installed via setup.py. 2. solved libstdc++ dual ABI issue. 3. removed duplicated package imports. torch-ccl: 1. fixed oneCCL library path patching not taking effect issue

* Update README.md

* clean ipex installation folder structure

* clean ipex installation folder structure

* clean ipex installation folder structure

* Add a warning message of deprecation of intel_pytorch_extension

* fix rpath issue to libtorch_ccl.so after hierarchy adjustment

* 1. removed execute bit of libtorch_ipex.so permission 2. upgraded torch-ccl to make libtorch_ccl.so installed to torch_ccl folder

* Pass build for pytorch 1.9.0

* Enable batch_norm operator

* update ipex Dockerfile to use no-patch version (#170)

* update ipex Dockerfile to use no-patch version

* explicit pytorch version

* Exclude the operators that do not run into autograd

* Pass all test cases except test_torch

* Fix the issues
1. LSTM indents error
2. Check batch_normalization

* Fix the issue that the grad of nll_loss input is None

* update build version from 1.8.0.1 to 1.9.0 (along with pytorch version)

* fix dil_cat bug when concatenating empty tensors with customized shape

* 1. moved python codes out from libtorch_ipex.so to _C.so
2. removed pybind11 as a dependency library from the third_party folder
3. changed "import intel_pytorch_extension" to "import torch_ipex" in tests folder, Readme.md, torch_ipex/ops/embeddingbag.py and torch_ipex/launch.py
4. commented "core.enable_torch_ccl()" out in torch_ipex/__init__.py, to avoid the following error when "import torch_ipex"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/jingxu1/dl/pytorch/srcs/venv_test_py38/lib/python3.8/site-packages/torch_ipex/__init__.py", line 14, in <module>
    core.enable_torch_ccl()
RuntimeError: arg(): could not convert default argument into a Python object (type not registered yet?). Compile in debug mode for more information.

* 1. removed torch-ccl 2. added debug info into version.py 3. removed pytorch wheel file binding in debug mode

* updated dockerfile to 1.9.0

* removed core.enable_torch_ccl()

* updated README.md for 1.9.0

* updated README.md for 1.9.0

* updated .gitignore to delete torch_ipex/version.py when performing clean

* V1.8.0 whl release (#171)

* Added wheel file release info to README.md

* Added wheel file release info to README.md

* Exclude flatten.using_ints and cross_entropy_loss because the two operators do not generate backward functions

* Does not capture batch_norm and _batch_norm_impl_index

* Exclude reshape and where

* Exclude nll_loss2d

* added denormal numbers section to performance_tuning.md

* Add installation guide for 1.9.0

* Add installation guide for 1.9.0

* Update README.md

The default IPEX and PyTorch versions are v1.9.0

* added avx512 note

* updated launch.py

* added launcher doc

* added launcher doc

* Add python interface c++ source file

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update LICENSE.txt

* Update README.md

* Remove useless files

* Fix format issue

Co-authored-by: Abolfazl Shahbazi <abolfazl.shahbazi@intel.com>
Co-authored-by: chunyuan-w <chunyuan.wu@intel.com>
Co-authored-by: leslie-fang-intel <leslie.fang@intel.com>
Co-authored-by: Chen, Jian Ping <jian.ping.chen@intel.com>
Co-authored-by: jiayisun <jiayi.sun@intel.com>
Co-authored-by: Jing Xu <jing.xu@intel.com>
Co-authored-by: Zhu, Jewel <jewel.zhu@intel.com>
Co-authored-by: tangleintel <lei1.tang@intel.com>
Co-authored-by: Chaitanya Hazarey <C24IO@users.noreply.github.com>
Co-authored-by: Ashok Emani <ashok.emani@intel.com>
Co-authored-by: Wang, Eikan <root@JF5300-B11A316T.jf.intel.com>
Co-authored-by: jianangu <jianan.gu@intel.com>