Sync the strides and size of DNNL tensor to its aten::tensor wrapper #5
Conversation
… not align with the external aten tensor. TODO: we need to take the aten tensor as the metadata source for DNNL op computation.
I have not added the code for "Attach the shape meta info of the aten tensor to the DNNL tensor at the DNNL OP entry point". Let's have more discussion and check whether it can cover all cases.
#endif
auto pub_tensor = dil_tensor.to_public(nullptr, dil_tensor.get_data_type());

cpu::ShadeDataContext *new_shade_data_context = cpu::ShadeDataContext::allocShadeDataContext();
Is this useful? Supposing this tensor is temporary, the ShadeDataContext would be useless then, right?
The returned tensor is a DNNL tensor, so it should be the same as dil_tensor. @pinzhenx, is that correct?
From my personal view, the shade data context should only be attached in the upgrade-to-DPCPP related interfaces.
Agree with your point. In this case, the DNNL tensor is reordered from block format to plain format, and the buffer of the reordered DNNL tensor can be shared with the CPU. But DataPtr does not expose an interface to modify its "data" field, so we replace the old DataPtr to share the data between the CPU buffer and the DNNL buffer, and attach a ShadeDataContext that keeps the DNNL tensor alive and avoids premature resource release.
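To make that pattern concrete, here is a minimal, self-contained C++ sketch of the idea (not the actual ipex code): the CPU-visible pointer aliases the reordered buffer, and a context object captured by the deleter keeps the DNNL buffer alive. PlainBuffer and KeepAliveContext are hypothetical stand-ins for the reordered DNNL tensor and ShadeDataContext.

#include <cstdio>
#include <memory>
#include <vector>

// Hypothetical stand-in for the reordered (plain-format) DNNL tensor buffer.
struct PlainBuffer {
  std::vector<float> data;
  ~PlainBuffer() { std::puts("DNNL buffer released"); }
};

// Hypothetical stand-in for ShadeDataContext: it owns the DNNL buffer so the raw
// pointer handed to the CPU side stays valid until the last user releases it.
struct KeepAliveContext {
  std::shared_ptr<PlainBuffer> owner;
};

// Hand the reordered buffer to the CPU side by aliasing it instead of copying it;
// the deleter captures the context, so dropping the CPU pointer frees the buffer.
std::shared_ptr<float> share_with_cpu(std::shared_ptr<PlainBuffer> buf) {
  auto ctx = std::make_shared<KeepAliveContext>();
  ctx->owner = std::move(buf);
  float* raw = ctx->owner->data.data();
  return std::shared_ptr<float>(raw, [ctx](float*) { /* ctx (and the buffer) die here */ });
}

int main() {
  auto dnnl_buf = std::make_shared<PlainBuffer>();
  dnnl_buf->data = {1.0f, 2.0f, 3.0f};
  auto cpu_ptr = share_with_cpu(std::move(dnnl_buf));
  std::printf("cpu reads %.1f without a copy\n", cpu_ptr[0]);
  return 0;  // last reference dropped -> "DNNL buffer released"
}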
We should discuss "device exchange" and "data type conversion" further and make them simpler and clearer. The current implementation may cause data type conversion to attach a ShadeDataContext as well.
@@ -416,6 +416,9 @@ def test_view(self):
        self.assertRaises(RuntimeError, lambda: tensor.view(7, -1))
        self.assertRaises(RuntimeError, lambda: tensor.view(15, -1, -1))

        # TODO(Eikan): DNNL OP does not support >6 dim tensors, so we disable it temporarily. When we fix it, we will enable it again.
        old_dnnl_conf = ipex.get_auto_dnnl()
Consider doing this with a "with" context manager to avoid the complexity of saving/restoring the original conf.
Great! I will fix it.
It seems that "with" does not work for the native C++ API.
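For the native C++ path, an RAII guard can play the role of Python's "with". A minimal sketch follows, assuming hypothetical get_auto_dnnl()/set_auto_dnnl() accessors; the real ipex C++ entry points may differ.

// Hypothetical stand-ins for the real ipex C++ configuration accessors.
static bool g_auto_dnnl = true;
bool get_auto_dnnl() { return g_auto_dnnl; }
void set_auto_dnnl(bool enabled) { g_auto_dnnl = enabled; }

// RAII guard: saves the current auto_dnnl setting, applies a new one, and
// restores the saved value when it goes out of scope, even on early return
// or exception - the C++ counterpart of Python's "with".
class AutoDnnlGuard {
 public:
  explicit AutoDnnlGuard(bool enabled) : saved_(get_auto_dnnl()) { set_auto_dnnl(enabled); }
  ~AutoDnnlGuard() { set_auto_dnnl(saved_); }
  AutoDnnlGuard(const AutoDnnlGuard&) = delete;
  AutoDnnlGuard& operator=(const AutoDnnlGuard&) = delete;

 private:
  bool saved_;
};

// Usage in a test that must temporarily disable auto_dnnl.
void run_large_dim_test() {
  AutoDnnlGuard guard(/*enabled=*/false);
  // ... exercise the >6-dim tensor path here ...
}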
Two known issues here: 1. matmul does not support the broadcast operator; Pinzhen will refine the matmul DNNL op. 2. Not all data types are registered for the DPCPP backend; Eikan will fix it.
Passed most unit test cases with auto_dnnl enabled, except for two issues (test_copy_all_dtypes_and_devices, test_broadcast_batched_matmul). I recorded the two issues on GitHub, and we will fix them later.
Running BF16 using the INT8 AMX proxy.
We should just take the DNNL tensor as the buffer of the aten tensor. That means the shape information for OP computation should come from the aten tensor, not from its buffer (the DNNL tensor). Then we can refine the DNNL OPs as follows.
Besides that, if an OP (e.g. slice, view) can change the shape of a tensor, its input tensors will always be reordered to a public format, and its output tensors will then be in the plain format. A sketch of the entry-point idea is below.
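As a rough sketch of that entry-point convention (not the ipex implementation), the example below reads sizes/strides from the aten tensor when building the DNNL memory, assuming float tensors and the plain ATen and oneDNN C++ APIs. memory_from_aten and dnnl_unary_op are hypothetical names, and a simple reorder (copy) stands in for a real primitive.

#include <ATen/ATen.h>
#include <dnnl.hpp>

// Hypothetical helper for a DNNL op entry point: build the DNNL memory view from
// the aten tensor's own sizes/strides, so the aten wrapper - not the DNNL buffer -
// is the single source of truth for shape metadata.
dnnl::memory memory_from_aten(const at::Tensor& t, const dnnl::engine& eng) {
  dnnl::memory::dims sizes(t.sizes().begin(), t.sizes().end());
  dnnl::memory::dims strides(t.strides().begin(), t.strides().end());
  dnnl::memory::desc md(sizes, dnnl::memory::data_type::f32, strides);
  return dnnl::memory(md, eng, t.data_ptr());  // shares the buffer, no copy
}

// Sketch of an op entry point following the convention above.
at::Tensor dnnl_unary_op(const at::Tensor& input) {
  dnnl::engine eng(dnnl::engine::kind::cpu, 0);
  dnnl::stream strm(eng);
  dnnl::memory src = memory_from_aten(input, eng);

  at::Tensor output = at::empty_like(input);   // aten owns the output metadata too
  dnnl::memory dst = memory_from_aten(output, eng);

  // A real op would run the matching DNNL primitive here; a plain copy (reorder)
  // is used only to keep the sketch version-agnostic.
  dnnl::reorder(src, dst).execute(strm, src, dst);
  strm.wait();
  return output;
}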