
Commit 6d8492d
Merge branch 'main' into fsdp-tutorial-update
2 parents: 3e4b84b + a0a9e3b
24 files changed: +767 -69 lines

.jenkins/build.sh

Lines changed: 3 additions & 3 deletions
@@ -21,9 +21,9 @@ sudo apt-get install -y pandoc
 
 #Install PyTorch Nightly for test.
 # Nightly - pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html
-# Install 2.2 for testing - uncomment to install nightly binaries (update the version as needed).
-# pip uninstall -y torch torchvision torchaudio torchtext torchdata
-# pip3 install torch==2.3.0 torchvision torchaudio --no-cache-dir --index-url https://download.pytorch.org/whl/test/cu121
+# Install 2.4 to merge all 2.4 PRs - uncomment to install nightly binaries (update the version as needed).
+pip uninstall -y torch torchvision torchaudio torchtext torchdata
+pip3 install torch==2.4.0 torchvision torchaudio --no-cache-dir --index-url https://download.pytorch.org/whl/test/cu124
 
 # Install two language tokenizers for Translation with TorchText tutorial
 python -m spacy download en_core_web_sm
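
A quick way to sanity-check that the pinned test wheel above is what actually gets imported (a minimal sketch, not part of the commit; the exact version/CUDA suffix depends on the pin and index URL used in the script):

import torch

# The script above pins torch==2.4.0 from the cu124 test channel; adjust the
# expected prefix if the pin changes.
print(torch.__version__)   # e.g. "2.4.0+cu124"
print(torch.version.cuda)  # e.g. "12.4"
assert torch.__version__.startswith("2.4"), "unexpected torch version in the CI environment"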

.jenkins/validate_tutorials_built.py

Lines changed: 0 additions & 3 deletions
@@ -29,7 +29,6 @@
     "intermediate_source/fx_conv_bn_fuser",
     "intermediate_source/_torch_export_nightly_tutorial",  # does not work on release
     "advanced_source/super_resolution_with_onnxruntime",
-    "advanced_source/python_custom_ops",  # https://github.com/pytorch/pytorch/issues/127443
     "advanced_source/usb_semisup_learn",  # fails with CUDA OOM error, should try on a different worker
     "prototype_source/fx_graph_mode_ptq_dynamic",
     "prototype_source/vmap_recipe",
@@ -54,8 +53,6 @@
     "intermediate_source/flask_rest_api_tutorial",
     "intermediate_source/text_to_speech_with_torchaudio",
     "intermediate_source/tensorboard_profiler_tutorial",  # reenable after 2.0 release.
-    "intermediate_source/inductor_debug_cpu",  # reenable after 2942
-    "beginner_source/onnx/onnx_registry_tutorial",  # reenable after 2941 is fixed.
     "intermediate_source/torch_export_tutorial"  # reenable after 2940 is fixed.
 ]
 

Binary image files changed (423 KB, 3.52 KB, -22.1 KB, 19 KB); previews not shown.

advanced_source/cpp_custom_ops.rst

Lines changed: 1 addition & 1 deletion
@@ -417,4 +417,4 @@ Conclusion
 In this tutorial, we went over the recommended approach to integrating Custom C++
 and CUDA operators with PyTorch. The ``TORCH_LIBRARY/torch.library`` APIs are fairly
 low-level. For more information about how to use the API, see
-`The Custom Operators Manual <https://pytorch.org/docs/main/notes/custom_operators.html>`_.
+`The Custom Operators Manual <https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html#the-custom-operators-manual>`_.

advanced_source/cpp_extension.rst

Lines changed: 5 additions & 1 deletion
@@ -2,6 +2,10 @@ Custom C++ and CUDA Extensions
 ==============================
 **Author**: `Peter Goldsborough <https://www.goldsborough.me/>`_
 
+.. warning::
+
+   This tutorial is deprecated as of PyTorch 2.4. Please see :ref:`custom-ops-landing-page`
+   for the newest up-to-date guides on extending PyTorch with Custom C++/CUDA Extensions.
 
 PyTorch provides a plethora of operations related to neural networks, arbitrary
 tensor algebra, data wrangling and other purposes. However, you may still find
@@ -225,7 +229,7 @@ Instead of:
 Currently open issue for nvcc bug `here
 <https://github.com/pytorch/pytorch/issues/69460>`_.
 Complete workaround code example `here
-<https://github.com/facebookresearch/pytorch3d/commit/cb170ac024a949f1f9614ffe6af1c38d972f7d48>`_.
+<https://github.com/facebookresearch/pytorch3d/commit/cb170ac024a949f1f9614ffe6af1c38d972f7d48>`_.
 
 Forward Pass
 ************

advanced_source/custom_ops_landing_page.rst

Lines changed: 7 additions & 6 deletions
@@ -1,7 +1,7 @@
 .. _custom-ops-landing-page:
 
-PyTorch Custom Operators Landing Page
-=====================================
+PyTorch Custom Operators
+===========================
 
 PyTorch offers a large library of operators that work on Tensors (e.g. ``torch.add``,
 ``torch.sum``, etc). However, you may wish to bring a new custom operation to PyTorch
@@ -10,26 +10,27 @@ In order to do so, you must register the custom operation with PyTorch via the P
 `torch.library docs <https://pytorch.org/docs/stable/library.html>`_ or C++ ``TORCH_LIBRARY``
 APIs.
 
-TL;DR
------
+
 
 Authoring a custom operator from Python
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 Please see :ref:`python-custom-ops-tutorial`.
 
 You may wish to author a custom operator from Python (as opposed to C++) if:
+
 - you have a Python function you want PyTorch to treat as an opaque callable, especially with
-respect to ``torch.compile`` and ``torch.export``.
+  respect to ``torch.compile`` and ``torch.export``.
 - you have some Python bindings to C++/CUDA kernels and want those to compose with PyTorch
-subsystems (like ``torch.compile`` or ``torch.autograd``)
+  subsystems (like ``torch.compile`` or ``torch.autograd``)
 
 Integrating custom C++ and/or CUDA code with PyTorch
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 Please see :ref:`cpp-custom-ops-tutorial`.
 
 You may wish to author a custom operator from C++ (as opposed to Python) if:
+
 - you have custom C++ and/or CUDA code.
 - you plan to use this code with ``AOTInductor`` to do Python-less inference.
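
For reference, the Python path described by this landing page maps onto the ``torch.library.custom_op`` API that ships with PyTorch 2.4. The sketch below is illustrative only and is not part of the diff; the ``mylib::numpy_sin`` name and the NumPy kernel are made up for the example:

import numpy as np
import torch
from torch import Tensor

# Wrap an opaque NumPy kernel as a PyTorch custom operator so torch.compile /
# torch.export treat it as a single node instead of tracing into it.
@torch.library.custom_op("mylib::numpy_sin", mutates_args=())
def numpy_sin(x: Tensor) -> Tensor:
    x_np = x.detach().cpu().numpy()
    return torch.from_numpy(np.sin(x_np)).to(x.device)

# A fake (meta) implementation tells the compiler stack the output shape/dtype
# without running the real kernel.
@numpy_sin.register_fake
def _(x: Tensor) -> Tensor:
    return torch.empty_like(x)

compiled = torch.compile(lambda t: numpy_sin(t) * 2)
print(compiled(torch.randn(4)))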

advanced_source/dispatcher.rst

Lines changed: 5 additions & 0 deletions
@@ -1,6 +1,11 @@
 Registering a Dispatched Operator in C++
 ========================================
 
+.. warning::
+
+   This tutorial is deprecated as of PyTorch 2.4. Please see :ref:`custom-ops-landing-page`
+   for the newest up-to-date guides on extending PyTorch with Custom Operators.
+
 The dispatcher is an internal component of PyTorch which is responsible for
 figuring out what code should actually get run when you call a function like
 ``torch::add``. This can be nontrivial, because PyTorch operations need
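
To make the dispatching idea in the context above concrete from the Python side, here is a minimal sketch (a hypothetical ``mylib::scale`` operator, not taken from this tutorial) where one kernel is registered per dispatch key and the dispatcher picks the kernel based on the inputs:

import torch

# Define a schema for a hypothetical operator; the dispatcher routes calls to it.
lib = torch.library.Library("mylib", "DEF")
lib.define("scale(Tensor x, float factor) -> Tensor")

def scale_cpu(x, factor):
    return x * factor   # CPU kernel

def scale_cuda(x, factor):
    return x * factor   # a real extension would launch a CUDA kernel here

# One registration per backend; TORCH_LIBRARY_IMPL plays the same role in C++.
lib.impl("scale", scale_cpu, "CPU")
lib.impl("scale", scale_cuda, "CUDA")

x = torch.randn(3)
print(torch.ops.mylib.scale(x, 2.0))  # a CPU tensor dispatches to scale_cpu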

advanced_source/python_custom_ops.py

Lines changed: 1 addition & 1 deletion
@@ -260,5 +260,5 @@ def f(x):
 # For more detailed information, see:
 #
 # - `the torch.library documentation <https://pytorch.org/docs/stable/library.html>`_
-# - `the Custom Operators Manual <https://pytorch.org/docs/main/notes/custom_operators.html>`_
+# - `the Custom Operators Manual <https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html#the-custom-operators-manual>`_
 #

advanced_source/torch_script_custom_ops.rst

Lines changed: 5 additions & 0 deletions
@@ -1,6 +1,11 @@
 Extending TorchScript with Custom C++ Operators
 ===============================================
 
+.. warning::
+
+   This tutorial is deprecated as of PyTorch 2.4. Please see :ref:`custom-ops-landing-page`
+   for the newest up-to-date guides on PyTorch Custom Operators.
+
 The PyTorch 1.0 release introduced a new programming model to PyTorch called
 `TorchScript <https://pytorch.org/docs/master/jit.html>`_. TorchScript is a
 subset of the Python programming language which can be parsed, compiled and

beginner_source/onnx/onnx_registry_tutorial.py

Lines changed: 11 additions & 20 deletions
@@ -99,7 +99,6 @@ def forward(self, input_x, input_y):
 # NOTE: All attributes must be annotated with type hints.
 @onnxscript.script(custom_aten)
 def custom_aten_add(input_x, input_y, alpha: float = 1.0):
-    alpha = opset18.CastLike(alpha, input_y)
     input_y = opset18.Mul(input_y, alpha)
     return opset18.Add(input_x, input_y)
 
@@ -130,9 +129,9 @@ def custom_aten_add(input_x, input_y, alpha: float = 1.0):
 # graph node name is the function name
 assert onnx_program.model_proto.graph.node[0].op_type == "custom_aten_add"
 # function node domain is empty because we use standard ONNX operators
-assert onnx_program.model_proto.functions[0].node[3].domain == ""
+assert {node.domain for node in onnx_program.model_proto.functions[0].node} == {""}
 # function node name is the standard ONNX operator name
-assert onnx_program.model_proto.functions[0].node[3].op_type == "Add"
+assert {node.op_type for node in onnx_program.model_proto.functions[0].node} == {"Add", "Mul", "Constant"}
 
 
 ######################################################################
@@ -231,33 +230,25 @@ def custom_aten_gelu(input_x, approximate: str = "none"):
 
 
 ######################################################################
-# Let's inspect the model and verify the model uses :func:`custom_aten_gelu` instead of
-# :class:`aten::gelu`. Note the graph has one graph nodes for
-# ``custom_aten_gelu``, and inside ``custom_aten_gelu``, there is a function
-# node for ``Gelu`` with namespace ``com.microsoft``.
+# Let's inspect the model and verify the model uses op_type ``Gelu``
+# from namespace ``com.microsoft``.
+#
+# .. note::
+#     :func:`custom_aten_gelu` does not exist in the graph because
+#     functions with fewer than three operators are inlined automatically.
 #
 
 # graph node domain is the custom domain we registered
 assert onnx_program.model_proto.graph.node[0].domain == "com.microsoft"
 # graph node name is the function name
-assert onnx_program.model_proto.graph.node[0].op_type == "custom_aten_gelu"
-# function node domain is the custom domain we registered
-assert onnx_program.model_proto.functions[0].node[0].domain == "com.microsoft"
-# function node name is the node name used in the function
-assert onnx_program.model_proto.functions[0].node[0].op_type == "Gelu"
+assert onnx_program.model_proto.graph.node[0].op_type == "Gelu"
 
 
 ######################################################################
-# The following diagram shows ``custom_aten_gelu_model`` ONNX graph using Netron:
+# The following diagram shows ``custom_aten_gelu_model`` ONNX graph using Netron,
+# we can see the ``Gelu`` node from module ``com.microsoft`` used in the function:
 #
 # .. image:: /_static/img/onnx/custom_aten_gelu_model.png
-#    :width: 70%
-#    :align: center
-#
-# Inside the ``custom_aten_gelu`` function, we can see the ``Gelu`` node from module
-# ``com.microsoft`` used in the function:
-#
-# .. image:: /_static/img/onnx/custom_aten_gelu_function.png
 #
 # That is all we need to do. As an additional step, we can use ONNX Runtime to run the model,
 # and compare the results with PyTorch.
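
For context, the snippet below reconstructs the registration flow that the updated assertions above are checking. It is a sketch based on the surrounding tutorial rather than part of this diff; the ``AddModel`` wrapper and the tensor shapes are chosen for illustration:

import torch
import onnxscript
from onnxscript import opset18

class AddModel(torch.nn.Module):
    def forward(self, input_x, input_y):
        return torch.ops.aten.add(input_x, input_y)

# Custom domain holding the replacement decomposition of aten::add.
custom_aten = onnxscript.values.Opset(domain="custom.aten", version=1)

@onnxscript.script(custom_aten)
def custom_aten_add(input_x, input_y, alpha: float = 1.0):
    # Mirrors aten::add(Tensor, Tensor, *, Scalar alpha) semantics.
    input_y = opset18.Mul(input_y, alpha)
    return opset18.Add(input_x, input_y)

onnx_registry = torch.onnx.OnnxRegistry()
onnx_registry.register_op(
    namespace="aten", op_name="add", overload="Tensor", function=custom_aten_add
)
export_options = torch.onnx.ExportOptions(onnx_registry=onnx_registry)
onnx_program = torch.onnx.dynamo_export(
    AddModel(), torch.randn(2, 3), torch.randn(2, 3), export_options=export_options
)

# Same checks as the updated tutorial: the graph calls the function by name, and
# every node inside the function body lives in the default ONNX domain.
assert onnx_program.model_proto.graph.node[0].op_type == "custom_aten_add"
assert {node.domain for node in onnx_program.model_proto.functions[0].node} == {""}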

index.rst

Lines changed: 13 additions & 5 deletions
@@ -3,11 +3,11 @@ Welcome to PyTorch Tutorials
 
 **What's new in PyTorch tutorials?**
 
-* `Using User-Defined Triton Kernels with torch.compile <https://pytorch.org/tutorials/recipes/torch_compile_user_defined_triton_kernel_tutorial.html>`__
-* `Large Scale Transformer model training with Tensor Parallel (TP) <https://pytorch.org/tutorials/intermediate/TP_tutorial.html>`__
-* `Accelerating BERT with semi-structured (2:4) sparsity <https://pytorch.org/tutorials/advanced/semi_structured_sparse.html>`__
-* `torch.export Tutorial with torch.export.Dim <https://pytorch.org/tutorials/intermediate/torch_export_tutorial.html>`__
-* `Extension points in nn.Module for load_state_dict and tensor subclasses <https://pytorch.org/tutorials/recipes/recipes/swap_tensors.html>`__
+* `Introduction to Distributed Pipeline Parallelism <https://pytorch.org/tutorials/intermediate/pipelining_tutorial.html>`__
+* `Introduction to Libuv TCPStore Backend <https://pytorch.org/tutorials/intermediate/TCPStore_libuv_backend.html>`__
+* `Asynchronous Saving with Distributed Checkpoint (DCP) <https://pytorch.org/tutorials/recipes/distributed_async_checkpoint_recipe.html>`__
+* `Python Custom Operators <https://pytorch.org/tutorials/advanced/python_custom_ops.html>`__
+* Updated `Getting Started with DeviceMesh <https://pytorch.org/tutorials/recipes/distributed_device_mesh.html>`__
 
 .. raw:: html
 
@@ -779,6 +779,13 @@ Welcome to PyTorch Tutorials
    :link: intermediate/FSDP_adavnced_tutorial.html
    :tags: Parallel-and-Distributed-Training
 
+.. customcarditem::
+   :header: Introduction to Libuv TCPStore Backend
+   :card_description: TCPStore now uses a new server backend for faster connection and better scalability.
+   :image: _static/img/thumbnails/cropped/Introduction-to-Libuv-Backend-TCPStore.png
+   :link: intermediate/TCPStore_libuv_backend.html
+   :tags: Parallel-and-Distributed-Training
+
 .. Edge
 
 .. customcarditem::
@@ -1134,6 +1141,7 @@ Additional Resources
    intermediate/dist_tuto
    intermediate/FSDP_tutorial
    intermediate/FSDP_adavnced_tutorial
+   intermediate/TCPStore_libuv_backend
    intermediate/TP_tutorial
    intermediate/pipelining_tutorial
    intermediate/process_group_cpp_extension_tutorial
