
Commit a0e31be

Fixed tutorial to be clearer and to recommend other path

1 parent 7d89c10

File tree

1 file changed: +125 -45 lines

advanced_source/cpp_custom_ops.rst

Lines changed: 125 additions & 45 deletions
@@ -64,39 +64,76 @@ Using ``cpp_extension`` is as simple as writing the following ``setup.py``:
       ext_modules=[
          cpp_extension.CppExtension(
             "extension_cpp",
-            ["muladd.cpp"]
-            py_limited_api=True)],
+            ["muladd.cpp"],
+            # define Py_LIMITED_API with min version 3.9 to expose only the stable
+            # limited API subset from Python.h
+            extra_compile_args={"cxx": ["-DPy_LIMITED_API=0x03090000"]},
+            py_limited_api=True)],  # Build 1 wheel across multiple Python versions
       cmdclass={'build_ext': cpp_extension.BuildExtension},
-      options={"bdist_wheel": {"py_limited_api": "cp39"}}
+      options={"bdist_wheel": {"py_limited_api": "cp39"}}  # 3.9 is minimum supported Python version
 )

 If you need to compile CUDA code (for example, ``.cu`` files), then instead use
 `torch.utils.cpp_extension.CUDAExtension <https://pytorch.org/docs/stable/cpp_extension.html#torch.utils.cpp_extension.CUDAExtension>`_.
 Please see `extension-cpp <https://github.com/pytorch/extension-cpp>`_ for an
 example of how this is set up.

-Note that you can build a single wheel for multiple CPython versions (similar to
-what you would do for pure python packages) starting with PyTorch 2.6. Specifically,
-if your custom library adheres to the `CPython Stable Limited API
-<https://docs.python.org/3/c-api/stable.html>`_ or avoids CPython entirely, you
-can build one Python agnostic wheel against a minimum supported CPython version
-through setuptools' ``py_limited_api`` flag.
+The above example represents what we refer to as a CPython agnostic wheel, meaning
+we are building a single wheel that can run across multiple CPython versions (similar
+to pure Python packages). CPython agnosticism is desirable for minimizing the number
+of wheels your custom library needs to support and release. To achieve this, there
+are three key lines to note.

-It is necessary to specify ``py_limited_api=True`` both within ``setup``
-and also as an option to the ``"bdist_wheel"`` command with the minimal supported
-Python version (in this case, 3.9). This ``setup`` would build one wheel that could
-be installed across multiple Python versions ``python>=3.9``.
+The first is the specification of ``Py_LIMITED_API`` in ``extra_compile_args`` as the
+minimum CPython version you would like to support:

-.. note::
+.. code-block:: python
+
+   extra_compile_args={"cxx": ["-DPy_LIMITED_API=0x03090000"]},
+
+Defining the ``Py_LIMITED_API`` flag helps guarantee that the extension in fact
+uses only the `CPython Stable Limited API <https://docs.python.org/3/c-api/stable.html>`_,
+which is a requirement for building a CPython agnostic wheel. If this requirement
+is not met, it is possible to build a wheel that looks CPython agnostic but will crash,
+or worse, be silently incorrect, in another CPython environment. Take care to avoid
+using unstable CPython APIs (for example, APIs from libtorch_python, in particular
+the pytorch/python bindings) and to use only APIs from libtorch (ATen objects,
+operators, and the dispatcher). We strongly recommend defining the ``Py_LIMITED_API``
+flag to ensure the extension is compliant and safe as a CPython agnostic wheel.
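As an aside on the flag's value: ``0x03090000`` follows CPython's ``PY_VERSION_HEX`` layout, with the major and minor versions packed into the top two bytes. The helper below is a hypothetical illustration (not part of any library) of how such a value is derived:

```python
def limited_api_hex(major: int, minor: int) -> str:
    """Pack a CPython version into the PY_VERSION_HEX-style layout
    used by Py_LIMITED_API: 0xMMmm0000 (major byte, then minor byte)."""
    return f"0x{(major << 24) | (minor << 16):08X}"

# Minimum supported version 3.9 yields the flag used in the setup.py above.
flag = f"-DPy_LIMITED_API={limited_api_hex(3, 9)}"
print(flag)  # -DPy_LIMITED_API=0x03090000
```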
+
+The second and third lines inform setuptools that you intend to build a CPython
+agnostic wheel, and they will influence the naming of the wheel accordingly. It is
+necessary to specify ``py_limited_api=True`` as an argument to
+``CppExtension``/``CUDAExtension`` and also as an option to the ``"bdist_wheel"``
+command with the minimal supported CPython version (in this case, 3.9):
+
+.. code-block:: python
+
+   setup(name="extension_cpp",
+         ext_modules=[
+            cpp_extension.CppExtension(
+               ...,
+               py_limited_api=True)],  # Build 1 wheel across multiple Python versions
+         ...,
+         options={"bdist_wheel": {"py_limited_api": "cp39"}}  # 3.9 is minimum supported Python version
+   )
+
+This ``setup`` would build one wheel that could be installed across multiple CPython
+versions ``>=3.9``.
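The resulting wheel's filename makes this visible: a limited-API build carries the ``abi3`` ABI tag rather than a single-interpreter tag. As an illustration (the filenames below are hypothetical), the tags can be read off the wheel name directly:

```python
# Hypothetical wheel filenames: an abi3 (CPython agnostic) build vs. a
# build tied to one interpreter version.
agnostic = "extension_cpp-0.0.1-cp39-abi3-linux_x86_64.whl"
versioned = "extension_cpp-0.0.1-cp312-cp312-linux_x86_64.whl"

def wheel_tags(filename: str) -> tuple[str, str, str]:
    """Split a wheel filename into its (python_tag, abi_tag, platform_tag)."""
    name, version, py_tag, abi_tag, plat_tag = filename[:-len(".whl")].split("-")
    return py_tag, abi_tag, plat_tag

print(wheel_tags(agnostic))   # ('cp39', 'abi3', 'linux_x86_64')
print(wheel_tags(versioned))  # ('cp312', 'cp312', 'linux_x86_64')
```

The ``cp39-abi3`` pair is what lets installers accept the one wheel on any CPython ``>=3.9``.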
+
+If your extension uses CPython APIs outside the stable limited set, then you should build
+a wheel per CPython version instead, like so:
+
+.. code-block:: python
+
+   from setuptools import setup, Extension
+   from torch.utils import cpp_extension
+
+   setup(name="extension_cpp",
+         ext_modules=[
+            cpp_extension.CppExtension(
+               "extension_cpp",
+               ["muladd.cpp"])],
+         cmdclass={'build_ext': cpp_extension.BuildExtension},
+   )

-   You must verify independently that the built wheel is truly Python agnostic.
-   Specifying ``py_limited_api`` does not check for any guarantees, so it is possible
-   to build a wheel that looks Python agnostic but will crash, or worse, be silently
-   incorrect, in another Python environment. Take care to avoid using unstable CPython
-   APIs, for example APIs from libtorch_python (in particular pytorch/python bindings,)
-   and to only use APIs from libtorch (ATen objects, operators and the dispatcher).
-   For example, to give access to custom ops from Python, the library should register
-   the ops through the dispatcher (covered below!).

 Defining the custom op and adding backend implementations
 ---------------------------------------------------------
@@ -241,15 +278,74 @@ matters (importing in the wrong order will lead to an error).

 To use the custom operator with hybrid Python/C++ registrations, we must
 first load the C++ library that holds the custom operator definition
-and then call the ``torch.library`` registration APIs. This can happen in one
-of two ways:
+and then call the ``torch.library`` registration APIs. This can happen in one
+of three ways:
+
+1. The first way to load the C++ library that holds the custom operator definition
+   is to define a dummy Python module for ``_C``. Then, in Python, when you import
+   the module with ``import _C``, the ``.so`` files corresponding to the extension
+   will be loaded and the ``TORCH_LIBRARY`` and ``TORCH_LIBRARY_IMPL`` static
+   initializers will run. One can create a dummy Python module with
+   ``PYBIND11_MODULE`` like below, but this will not compile with
+   ``Py_LIMITED_API``, because ``pybind11`` does not promise to use only the
+   stable limited CPython API! With the code below, you sadly cannot build a
+   CPython agnostic wheel for your extension. (Foreshadowing: I wonder what the
+   second way is ;)).
+
+   .. code-block:: cpp
+
+      // in, say, not_agnostic/csrc/extension_BAD.cpp
+      #include <pybind11/pybind11.h>
+
+      PYBIND11_MODULE(_C, m) {}

-1. In this tutorial, our C++ custom operator is located in a shared library object,
-   and we use ``torch.ops.load_library("/path/to/library.so")`` to load it. This
-   is the blessed path for Python agnosticism, and you will not have a Python C
-   extension module to import. See our `extension_cpp/__init__.py <https://github.com/pytorch/extension-cpp/blob/e4c4eb822889ea67f191071fa627d750e04bf047/extension_cpp/__init__.py>`_
-   for an example:
+   .. code-block:: python
+
+      # in, say, extension/__init__.py
+      from . import _C
+
+2. In this tutorial, because we value being able to build a single wheel across
+   multiple CPython versions, we will replace the unstable ``PYBIND11_MODULE`` call
+   with stable API calls. The code below compiles with ``-DPy_LIMITED_API=0x03090000``
+   and successfully creates a dummy Python module for our ``_C`` extension so that
+   it can be imported from Python. See `extension_cpp/__init__.py <https://github.com/pytorch/extension-cpp/blob/master/extension_cpp/__init__.py>`_
+   and `extension_cpp/csrc/muladd.cpp <https://github.com/pytorch/extension-cpp/blob/master/extension_cpp/csrc/muladd.cpp>`_
+   for more details:
+
+   .. code-block:: cpp
+
+      #include <Python.h>
+
+      extern "C" {
+         /* Creates a dummy empty _C module that can be imported from Python.
+            The import from Python will load the .so consisting of this file
+            in this extension, so that the TORCH_LIBRARY static initializers
+            below are run. */
+         PyObject* PyInit__C(void)
+         {
+            static struct PyModuleDef module_def = {
+               PyModuleDef_HEAD_INIT,
+               "_C",  /* name of module */
+               NULL,  /* module documentation, may be NULL */
+               -1,    /* size of per-interpreter state of the module,
+                         or -1 if the module keeps state in global variables. */
+               NULL,  /* methods */
+            };
+            return PyModule_Create(&module_def);
+         }
+      }
+
+   .. code-block:: python
+
+      # in, say, extension/__init__.py
+      from . import _C
+
+3. If you want to avoid ``Python.h`` entirely in your C++ custom operator, you may
+   use ``torch.ops.load_library("/path/to/library.so")`` in Python to load the ``.so``
+   file(s) compiled from the extension. Note that, with this method, there is no ``_C``
+   Python module created for the extension, so you cannot call ``import _C`` from
+   Python. Instead of relying on the import statement to trigger registration of the
+   custom operators, ``torch.ops.load_library("/path/to/library.so")`` will do the
+   trick. The challenge then shifts towards understanding where the ``.so`` files are
+   located so that you can load them, which is not always trivial:

    .. code-block:: python

@@ -265,22 +361,6 @@ of two ways:
       from . import ops
267363
268-
2. You may also see other custom extensions importing the Python C extension module.
269-
The module would be created in C++ and then imported in Python, like the code below.
270-
This code is not guaranteed to use the stable limited CPython API and would block
271-
your extension from building a Python-agnostic wheel! AVOID the following:
272-
273-
.. code-block:: cpp
274-
275-
// in, say, not_agnostic/csrc/extension_BAD.cpp
276-
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {}
277-
278-
.. code-block:: python
279-
280-
# in, say, extension_BAD/__init__.py
281-
from . import _C
282-
283-

 Adding training (autograd) support for an operator
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Use ``torch.library.register_autograd`` to add training support for an operator. Prefer
