Sync the strides and size of DNNL tensor to its aten::tensor wrapper #5
Conversation
… not align with the external aten tensor. TODO: we need to take the aten tensor as the metadata source for DNNL op computation.
I have not added the code for "Attach the shape meta info of the aten tensor to the DNNL tensor at the DNNL OP entry point". Let's have more discussion and check whether it can cover all cases.
#endif
auto pub_tensor = dil_tensor.to_public(nullptr, dil_tensor.get_data_type());

cpu::ShadeDataContext *new_shade_data_context = cpu::ShadeDataContext::allocShadeDataContext();
Is this useful? Supposing this tensor is temporary, the ShadeDataContext would be useless then, right?
The returned tensor is a DNNL tensor, so it should be the same as dil_tensor. @pinzhenx, is that correct?
From my personal view, the shade data context should only be attached in the upgrade-to-DPCPP related interfaces.
Agree with your point. In this case, the DNNL tensor is reordered from block format to plain format, and the buffer of the reordered DNNL tensor can be shared with the CPU. But DataPtr does not expose an interface to modify its "data" field, so we replace the old DataPtr to share the data between the CPU buffer and the DNNL buffer, and attach a ShadeDataContext that keeps the DNNL tensor alive and avoids premature resource release.
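To make that pattern concrete, here is a minimal, self-contained C++ sketch of the idea (not the actual ipex code): the CPU-visible pointer aliases the reordered buffer, and a context object captured by the deleter keeps the DNNL buffer alive. PlainBuffer and KeepAliveContext are hypothetical stand-ins for the reordered DNNL tensor and ShadeDataContext.

#include <cstdio>
#include <memory>
#include <vector>

// Hypothetical stand-in for the reordered (plain-format) DNNL tensor buffer.
struct PlainBuffer {
  std::vector<float> data;
  ~PlainBuffer() { std::puts("DNNL buffer released"); }
};

// Hypothetical stand-in for ShadeDataContext: it owns the DNNL buffer so the raw
// pointer handed to the CPU side stays valid until the last user releases it.
struct KeepAliveContext {
  std::shared_ptr<PlainBuffer> owner;
};

// Hand the reordered buffer to the CPU side by aliasing it instead of copying it;
// the deleter captures the context, so dropping the CPU pointer frees the buffer.
std::shared_ptr<float> share_with_cpu(std::shared_ptr<PlainBuffer> buf) {
  auto ctx = std::make_shared<KeepAliveContext>();
  ctx->owner = std::move(buf);
  float* raw = ctx->owner->data.data();
  return std::shared_ptr<float>(raw, [ctx](float*) { /* ctx (and the buffer) die here */ });
}

int main() {
  auto dnnl_buf = std::make_shared<PlainBuffer>();
  dnnl_buf->data = {1.0f, 2.0f, 3.0f};
  auto cpu_ptr = share_with_cpu(std::move(dnnl_buf));
  std::printf("cpu reads %.1f without a copy\n", cpu_ptr[0]);
  return 0;  // last reference dropped -> "DNNL buffer released"
}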
We should discuss "device exchange" and "data type conversion" further and make them simpler and clearer. The current implementation may cause data type conversion to attach a ShadeDataContext as well.
@@ -416,6 +416,9 @@ def test_view(self):
        self.assertRaises(RuntimeError, lambda: tensor.view(7, -1))
        self.assertRaises(RuntimeError, lambda: tensor.view(15, -1, -1))

        # TODO(Eikan): DNNL OP does not support >6 dim tensors, so we disable it temporarily. When we fix it, we will enable it again.
        old_dnnl_conf = ipex.get_auto_dnnl()
Consider doing this with a "with" context manager to avoid the complexity of saving/restoring the original conf.
Great! I will fix it.
It seems that "with" does not work for the native C++ API.
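For the native C++ path, an RAII guard can play the role of Python's "with". A minimal sketch follows, assuming hypothetical get_auto_dnnl()/set_auto_dnnl() accessors; the real ipex C++ entry points may differ.

// Hypothetical stand-ins for the real ipex C++ configuration accessors.
static bool g_auto_dnnl = true;
bool get_auto_dnnl() { return g_auto_dnnl; }
void set_auto_dnnl(bool enabled) { g_auto_dnnl = enabled; }

// RAII guard: saves the current auto_dnnl setting, applies a new one, and
// restores the saved value when it goes out of scope, even on early return
// or exception - the C++ counterpart of Python's "with".
class AutoDnnlGuard {
 public:
  explicit AutoDnnlGuard(bool enabled) : saved_(get_auto_dnnl()) { set_auto_dnnl(enabled); }
  ~AutoDnnlGuard() { set_auto_dnnl(saved_); }
  AutoDnnlGuard(const AutoDnnlGuard&) = delete;
  AutoDnnlGuard& operator=(const AutoDnnlGuard&) = delete;

 private:
  bool saved_;
};

// Usage in a test that must temporarily disable auto_dnnl.
void run_large_dim_test() {
  AutoDnnlGuard guard(/*enabled=*/false);
  // ... exercise the >6-dim tensor path here ...
}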
Two known issues here: 1. matmul does not support the broadcast operator; Pinzhen will refine the matmul DNNL op. 2. Not all data types are registered for the DPCPP backend; Eikan will fix it.
Passed most unit test cases with auto_dnnl enabled, except for two issues (test_copy_all_dtypes_and_devices, test_broadcast_batched_matmul). I recorded the two issues on GitHub, and we will fix them later.
Running BF16 using the INT8 AMX proxy.
We should just take the DNNL tensor as the buffer of the aten tensor. That means the shape information for OP computation should come from the aten tensor, not from its buffer (the DNNL tensor). Then we can refine the DNNL OPs as follows.
Besides that, if an OP (e.g. slice, view) can change the shape of a tensor, its input tensors will always be reordered to a public format, and its output tensors will then be in the plain format. A sketch of the entry-point idea is below.
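As a rough sketch of that entry-point convention (not the ipex implementation), the example below reads sizes/strides from the aten tensor when building the DNNL memory, assuming float tensors and the plain ATen and oneDNN C++ APIs. memory_from_aten and dnnl_unary_op are hypothetical names, and a simple reorder (copy) stands in for a real primitive.

#include <ATen/ATen.h>
#include <dnnl.hpp>

// Hypothetical helper for a DNNL op entry point: build the DNNL memory view from
// the aten tensor's own sizes/strides, so the aten wrapper - not the DNNL buffer -
// is the single source of truth for shape metadata.
dnnl::memory memory_from_aten(const at::Tensor& t, const dnnl::engine& eng) {
  dnnl::memory::dims sizes(t.sizes().begin(), t.sizes().end());
  dnnl::memory::dims strides(t.strides().begin(), t.strides().end());
  dnnl::memory::desc md(sizes, dnnl::memory::data_type::f32, strides);
  return dnnl::memory(md, eng, t.data_ptr());  // shares the buffer, no copy
}

// Sketch of an op entry point following the convention above.
at::Tensor dnnl_unary_op(const at::Tensor& input) {
  dnnl::engine eng(dnnl::engine::kind::cpu, 0);
  dnnl::stream strm(eng);
  dnnl::memory src = memory_from_aten(input, eng);

  at::Tensor output = at::empty_like(input);   // aten owns the output metadata too
  dnnl::memory dst = memory_from_aten(output, eng);

  // A real op would run the matching DNNL primitive here; a plain copy (reorder)
  // is used only to keep the sketch version-agnostic.
  dnnl::reorder(src, dst).execute(strm, src, dst);
  strm.wait();
  return output;
}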