Skip to content

Enhance from_dlpack to support imported kDLCPU data to kDLOneAPI #1789

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 13, 2024

Conversation

oleksandr-pavlyk
Copy link
Contributor

This PR contributes to gh-1781 to support importing objects resident on other devices to oneAPI devices by copying via host.

In [1]: import dpctl.tensor as dpt, numpy as np

In [2]: x = dpt.ones((2, 3))

In [3]: x.device
Out[3]: Device(level_zero:gpu:0)

In [4]: y = dpt.from_dlpack(x, device=(1,0))

In [5]: type(y)
Out[5]: numpy.ndarray

In [6]: w1 = dpt.from_dlpack(y, device="opencl:gpu")

In [7]: w1.device
Out[7]: Device(opencl:gpu:0)

In [8]: w2 = dpt.from_dlpack(y, device="gpu")

In [9]: w2.device
Out[9]: Device(level_zero:gpu:0)

In [10]: w3 = dpt.from_dlpack(y, device="cpu")

In [11]: w3.device, type(w3)
Out[11]: (Device(opencl:cpu:0), dpctl.tensor._usmarray.usm_ndarray)
  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you opening the PR as a draft?

Copy link

github-actions bot commented Aug 7, 2024

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

Copy link

github-actions bot commented Aug 7, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_246 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Copy link

github-actions bot commented Aug 7, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_248 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Copy link

github-actions bot commented Aug 7, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_250 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Copy link

github-actions bot commented Aug 8, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_251 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

@ndgrigorian
Copy link
Collaborator

ndgrigorian commented Aug 8, 2024

@oleksandr-pavlyk
A small (very, very niche) problem I've found:
from_dlpack_capsule itself can raise BufferError. Under the right conditions it's possible to circumvent some otherwise unacceptable operations by from_dlpack because we try-except BufferError.

For example:

In [1]: import dpctl.tensor as dpt, dpctl, numpy as np, dpctl.tensor._dlpack as _dlp

In [2]: x_np = np.ones((10, 10), dtype="i4")

In [3]: x_np.strides = (x_np.strides[0] - 1, x_np.strides[1]) # strides are no longer multiple of itemsize

In [4]: dpt.from_dlpack(x_np)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/repos/dpctl/dpctl/tensor/_dlpack.pyx:1047, in dpctl.tensor._dlpack.from_dlpack()
   1046         dl_device = (device_OneAPI, get_parent_device_ordinal_id(<c_dpctl.SyclDevice>d))
-> 1047 dlpack_capsule = dlpack_attr(max_version=get_build_dlpack_version(), dl_device=dl_device, copy=copy)
   1048 return from_dlpack_capsule(dlpack_capsule)

TypeError: __dlpack__() got an unexpected keyword argument 'max_version'

During handling of the above exception, another exception occurred:

BufferError                               Traceback (most recent call last)
Cell In[4], line 1
----> 1 dpt.from_dlpack(x_np)

File ~/repos/dpctl/dpctl/tensor/_dlpack.pyx:1053, in dpctl.tensor._dlpack.from_dlpack()
   1051 x_dldev = dlpack_dev_attr()
   1052 if (dl_device is None) or (dl_device == x_dldev):
-> 1053     dlpack_capsule = dlpack_attr()
   1054     return from_dlpack_capsule(dlpack_capsule)
   1055 # must copy via host

BufferError: DLPack only supports strides which are a multiple of itemsize.

In [5]: dpt.from_dlpack(x_np, device=(14, 1))
Out[5]:
usm_ndarray([[       1,        1,        1,        1,        1,        1,
                     1,        1,        1,        1],
             [     256,      256,      256,      256,      256,      256,
                   256,      256,      256,      256],
             [   65536,    65536,    65536,    65536,    65536,    65536,
                 65536,    65536,    65536,    65536],
             [16777216, 16777216, 16777216, 16777216, 16777216, 16777216,
              16777216, 16777216, 16777216, 16777216],
             [       1,        1,        1,        1,        1,        1,
                     1,        1,        1,        1],
             [     256,      256,      256,      256,      256,      256,
                   256,      256,      256,      256],
             [   65536,    65536,    65536,    65536,    65536,    65536,
                 65536,    65536,    65536,    65536],
             [16777216, 16777216, 16777216, 16777216, 16777216, 16777216,
              16777216, 16777216, 16777216, 16777216],
             [       1,        1,        1,        1,        1,        1,
                     1,        1,        1,        1],
             [     256,      256,      256,      256,      256,      256,
                   256,      256,      256,      256]], dtype=int32)

This may not really be a problem, because the usm_ndarray copy itself is importable (because it has normal strides) but it came up when I was looking for corner cases.

@oleksandr-pavlyk
Copy link
Contributor Author

Thanks for the example @ndgrigorian. I have pushed the fix, by replacing host_blob = x with host_blob = result_of_roundtripping_through_dlpack(x). Added a test based in your example.

Copy link

github-actions bot commented Aug 8, 2024

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_291 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Copy link

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_295 ran successfully.
Passed: 895
Failed: 0
Skipped: 119

@oleksandr-pavlyk
Copy link
Contributor Author

@ndgrigorian Once this is ready to go, I intend to squash all changes into few commits. Please rereview

Copy link

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_296 ran successfully.
Passed: 895
Failed: 0
Skipped: 119

Copy link

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_298 ran successfully.
Passed: 895
Failed: 0
Skipped: 119

For arr that supports DLPack, version (1, 0),
or legacy, support

```
from_dlpack(arr, device=target_dev)
```

where target_dev is `(kDLCPU, 0)` for transfer to
host, or a value recognized by device keywords
in dpctl.tensor for other functions, or `(kDLOneAPI, dev_id)`.

To support transfer via host, `arr` must support
`__dlpack__(max_version=(1,0), dl_device=(1, 0))`.

For array objects with legacy `__dlpack__` support only,
supported inputs are those residing on kDLCPU device, or
those from kDLOneAPI device only.

---

This is a combination of 17 commits squashed into one:

Combine two validation checks into one, improving coverage

Only fall-back to __dlpack__() if requested device does not change

Simplify branching, only fall-back to no-arg call to __dlpack__ is
dl_device is None or same as reported for the input

Changed from_dlpack to copy via host is needed

This enables dpt.from_dlpack(numpy_array, device="opencl:cpu")

Add a test to exercise copy via host

Handle possibilities for TypeError and BufferError

These may be hard to test

Change exception raised by __dlpack__ if dl_device is unsupported

It used to raise NotImplementedError, not raises BufferError

Add case of dlpack test to expand coverage

Removed comment, add NotImplementedError to the except clause

To ensure same validation across branches, compute host_blob by roundtripping it through dlpack

Test from_dlpack on numpy input with strides not multiple of elementsize

Refined from_dlpack docstrings, reorged impl of from_dlpack

Used try/except/else/finally to avoid raising an exception when
another one is in flight (confusing UX).

device keyword is only allowed to be (kDLCPU, 0) or
(kDLOneAPI, num). Device keyword value is used to create output
array, rather than device_id deduced from it.

Adjusted test per change in implementation

Expand applicability of fall-back behavior

When `from_dlpack(arr, device=dev)` is called, for `arr` object
that supports legacy DLPack interface (max_version, dl_device, copy
are not supported), we now support arr being device on host, that
is (kDLCPU, 0), and (kDLOneAPI, different_device_id). Support for
this last case is being added in this commit, as per review comment.

Add symmetric support for containers with legacy DLPack support

For legacy containers, support device=(kDLCPU, 0) as well as
oneAPI device.

Add tests for importing generic legacy and generic modern containers

Fix typos in comments

Add test for legacy container holding numpy's array.
Copy link

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_300 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

Copy link

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_284 ran successfully.
Passed: 895
Failed: 0
Skipped: 119

Copy link
Collaborator

@ndgrigorian ndgrigorian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, thank you @oleksandr-pavlyk !

@oleksandr-pavlyk oleksandr-pavlyk merged commit 9a17afc into feature/dlpack-kdlcpu-support Aug 13, 2024
43 of 49 checks passed
@oleksandr-pavlyk oleksandr-pavlyk deleted the enhance-from_dlpack branch August 13, 2024 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants