@@ -39,8 +39,11 @@ The interchange mechanism must offer the following:
C/C++, and are released independently from each other. Hence a stable C
ABI is required for packages to work well together.*

+ DLPack: An in-memory tensor structure
+ -------------------------------------
+
The best candidate for this protocol is
- `DLPack <https://github.com/dmlc/dlpack>`_, and hence that is what this
+ `DLPack <https://dmlc.github.io/dlpack/latest/>`_, and hence that is what this
standard has chosen as the primary/recommended protocol. Note that the
``asarray`` function also supports the Python buffer protocol (CPU-only) to
support libraries that already implement buffer protocol support.
@@ -70,116 +73,14 @@ support libraries that already implement buffer protocol support.
See the `RFC to adopt DLPack <https://github.com/data-apis/consortium-feedback/issues/1>`_
for discussion that preceded the adoption of DLPack.

+ DLPack's documentation can be found at: https://dmlc.github.io/dlpack/latest/.

- DLPack support
- --------------
+ The `Python specification of DLPack <https://dmlc.github.io/dlpack/latest/python_spec.html>`__
+ page gives a high-level specification for data exchange in Python using DLPack.

.. note::
DLPack is a standalone protocol/project and can therefore be used outside of
this standard. Python libraries that want to implement only DLPack support
are recommended to do so using the same syntax and semantics as outlined
below. They are not required to return an array object from ``from_dlpack``
which conforms to this standard.
-
- DLPack itself has no documentation currently outside of the inline comments in
- `dlpack.h <https://github.com/dmlc/dlpack/blob/main/include/dlpack/dlpack.h>`_.
- In the future, the below content may be migrated to the (to-be-written) DLPack docs.
-
-
- Syntax for data interchange with DLPack
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- The array API will offer the following syntax for data interchange:
-
- 1. A ``from_dlpack(x)`` function, which accepts (array) objects with a
- ``__dlpack__`` method and uses that method to construct a new array
- containing the data from ``x``.
- 2. ``__dlpack__(self, stream=None)`` and ``__dlpack_device__`` methods on the
- array object, which will be called from within ``from_dlpack``, to query
- what device the array is on (may be needed to pass in the correct
- stream, e.g. in the case of multiple GPUs) and to access the data.
-
-
- Semantics
- ~~~~~~~~~
-
- DLPack describe the memory layout of strided, n-dimensional arrays.
- When a user calls ``y = from_dlpack(x)``, the library implementing ``x`` (the
- "producer") will provide access to the data from ``x`` to the library
- containing ``from_dlpack`` (the "consumer"). If possible, this must be
- zero-copy (i.e. ``y`` will be a *view* on ``x``). If not possible, that library
- may make a copy of the data. In both cases:
-
- - the producer keeps owning the memory
- - ``y`` may or may not be a view, therefore the user must keep the recommendation to avoid mutating ``y`` in mind - see :ref:`copyview-mutability`.
- - Both ``x`` and ``y`` may continue to be used just like arrays created in other ways.
-
- If an array that is accessed via the interchange protocol lives on a
- device that the requesting library does not support, it is recommended to
- raise a ``TypeError``.
-
- Stream handling through the ``stream`` keyword applies to CUDA and ROCm (perhaps
- to other devices that have a stream concept as well, however those haven't been
- considered in detail). The consumer must pass the stream it will use to the
- producer; the producer must synchronize or wait on the stream when necessary.
- In the common case of the default stream being used, synchronization will be
- unnecessary so asynchronous execution is enabled.
-
-
- Implementation
- ~~~~~~~~~~~~~~
-
- *Note that while this API standard largely tries to avoid discussing
- implementation details, some discussion and requirements are needed
- here because data interchange requires coordination between
- implementers on, e.g., memory management.*
-
- .. image:: /_static/images/DLPack_diagram.png
- :alt: Diagram of DLPack structs
-
- *DLPack diagram. Dark blue are the structs it defines, light blue
- struct members, gray text enum values of supported devices and data
- types.*
-
- The ``__dlpack__`` method will produce a ``PyCapsule`` containing a
- ``DLManagedTensor``, which will be consumed immediately within
- ``from_dlpack`` - therefore it is consumed exactly once, and it will not be
- visible to users of the Python API.
-
- The producer must set the ``PyCapsule`` name to ``"dltensor"`` so that
- it can be inspected by name, and set ``PyCapsule_Destructor`` that calls
- the ``deleter`` of the ``DLManagedTensor`` when the ``"dltensor"``-named
- capsule is no longer needed.
-
- The consumer must transer ownership of the ``DLManangedTensor`` from the
- capsule to its own object. It does so by renaming the capsule to
- ``"used_dltensor"`` to ensure that ``PyCapsule_Destructor`` will not get
- called (ensured if ``PyCapsule_Destructor`` calls ``deleter`` only for
- capsules whose name is ``"dltensor"``), but the ``deleter`` of the
- ``DLManagedTensor`` will be called by the destructor of the consumer
- library object created to own the ``DLManagerTensor`` obtained from the
- capsule.
-
- Note: the capsule names ``"dltensor"`` and ``"used_dltensor"`` must be
- statically allocated.
-
- When the ``strides`` field in the ``DLTensor`` struct is ``NULL``, it indicates a
- row-major compact array. If the array is of size zero, the data pointer in
- ``DLTensor`` should be set to either ``NULL`` or ``0``.
-
- DLPack version used must be ``0.2 <= DLPACK_VERSION < 1.0``. For further
- details on DLPack design and how to implement support for it,
- refer to `github.com/dmlc/dlpack <https://github.com/dmlc/dlpack>`_.
-
- .. warning::
- DLPack contains a ``device_id``, which will be the device
- ID (an integer, ``0, 1, ...``) which the producer library uses. In
- practice this will likely be the same numbering as that of the
- consumer, however that is not guaranteed. Depending on the hardware
- type, it may be possible for the consumer library implementation to
- look up the actual device from the pointer to the data - this is
- possible for example for CUDA device pointers.
-
- It is recommended that implementers of this array API consider and document
- whether the ``.device`` attribute of the array returned from ``from_dlpack`` is
- guaranteed to be in a certain order or not.
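
For readers of this change, the handshake described in the text moved out of this file (the consumer queries ``__dlpack_device__``, then calls ``__dlpack__``, and raises ``TypeError`` for a device it does not support) can be sketched in pure Python. This is only an illustration under stated assumptions: ``ToyArray``, ``SUPPORTED_DEVICE_TYPES`` and this ``from_dlpack`` are hypothetical names, not part of any library, and a ``memoryview`` stands in for the ``PyCapsule`` that a real producer would return.

```python
from enum import IntEnum


class DLDeviceType(IntEnum):
    # Values copied from the DLDeviceType enum in dlpack.h.
    kDLCPU = 1
    kDLCUDA = 2


class ToyArray:
    """Hypothetical array type used only for this sketch."""

    def __init__(self, buffer, device=(DLDeviceType.kDLCPU, 0)):
        self.buffer = buffer
        self._device = device

    def __dlpack_device__(self):
        # Report (device_type, device_id) so the consumer can decide what to do.
        return self._device

    def __dlpack__(self, stream=None):
        # A real producer returns a PyCapsule named "dltensor".
        # This toy hands out a memoryview, which shares memory with the buffer.
        return memoryview(self.buffer)


# Assumption for the sketch: the consumer only supports CPU memory.
SUPPORTED_DEVICE_TYPES = {DLDeviceType.kDLCPU}


def from_dlpack(x):
    """Toy consumer: check the device first, then take a view of the data."""
    device_type, device_id = x.__dlpack_device__()
    if device_type not in SUPPORTED_DEVICE_TYPES:
        raise TypeError(f"unsupported device type: {device_type!r}")
    return ToyArray(x.__dlpack__(), device=(device_type, device_id))


x = ToyArray(bytearray([1, 2, 3]))
y = from_dlpack(x)
x.buffer[0] = 42
print(y.buffer[0])  # y is a view on x, so the write through x is visible
```

Because ``y`` shares memory with ``x``, mutating either one affects the other, which is why the removed text pointed users at the copy/view mutability recommendation.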