
Commit cf0aca5

Merge branch 'main' into angelayi/aoti_fix
2 parents (805a1b2, d4c1e74), commit cf0aca5

30 files changed: +867 additions, -407 deletions

.ci/docker/requirements.txt

Lines changed: 2 additions & 2 deletions
@@ -28,8 +28,8 @@ tensorboard
 jinja2==3.1.3
 pytorch-lightning
 torchx
-torchrl==0.5.0
-tensordict==0.5.0
+torchrl==0.6.0
+tensordict==0.6.0
 ax-platform>=0.4.0
 nbformat>=5.9.2
 datasets

.jenkins/build.sh

Lines changed: 4 additions & 2 deletions
@@ -22,8 +22,10 @@ sudo apt-get install -y pandoc
 #Install PyTorch Nightly for test.
 # Nightly - pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html
 # Install 2.5 to merge all 2.4 PRs - uncomment to install nightly binaries (update the version as needed).
-# pip uninstall -y torch torchvision torchaudio torchtext torchdata
-# pip3 install torch==2.5.0 torchvision torchaudio --no-cache-dir --index-url https://download.pytorch.org/whl/test/cu124
+sudo pip uninstall -y torch torchvision torchaudio torchtext torchdata
+sudo pip3 install torch==2.6.0 torchvision --no-cache-dir --index-url https://download.pytorch.org/whl/test/cu124
+sudo pip uninstall -y fbgemm-gpu torchrec
+sudo pip3 install fbgemm-gpu==1.1.0 torchrec==1.0.0 --no-cache-dir --index-url https://download.pytorch.org/whl/test/cu124

 # Install two language tokenizers for Translation with TorchText tutorial
 python -m spacy download en_core_web_sm

.jenkins/validate_tutorials_built.py

Lines changed: 4 additions & 1 deletion
@@ -51,7 +51,10 @@
     "intermediate_source/flask_rest_api_tutorial",
     "intermediate_source/text_to_speech_with_torchaudio",
     "intermediate_source/tensorboard_profiler_tutorial", # reenable after 2.0 release.
-    "intermediate_source/torch_export_tutorial" # reenable after 2940 is fixed.
+    "intermediate_source/torch_export_tutorial", # reenable after 2940 is fixed.
+    "advanced_source/pendulum",
+    "beginner_source/onnx/export_simple_model_to_onnx_tutorial",
+    "beginner_source/onnx/onnx_registry_tutorial"
 ]

 def tutorial_source_dirs() -> List[Path]:

.lycheeignore

Lines changed: 6 additions & 0 deletions
@@ -12,3 +12,9 @@ https://pytorch.org/tutorials/beginner/colab/n

 # Ignore local host link from intermediate_source/tensorboard_tutorial.rst
 http://localhost:6006
+
+# Ignore local host link from recipes_source/deployment_with_flask.rst
+http://localhost:5000/predict
+
+# Ignore local host link from advanced_source/cpp_frontend.rst
+https://www.uber.com/blog/deep-neuroevolution/

CONTRIBUTING.md

Lines changed: 2 additions & 3 deletions
@@ -218,9 +218,8 @@ described in the preceding sections:
 - [NLP From Scratch: Generating Names with a Character-Level RNN
   Tutorial](https://pytorch.org/tutorials/intermediate/char_rnn_generation_tutorial.html)

-If you are creating a recipe, we recommend that you use [this
-template](https://github.com/pytorch/tutorials/blob/tutorials_refresh/recipes_source/recipes/example_recipe.py)
-as a guide.
+If you are creating a recipe, [this is a good
+example.](https://github.com/pytorch/tutorials/blob/main/recipes_source/recipes/what_is_state_dict.py)


 # Submission Process #

advanced_source/cpp_autograd.rst

Lines changed: 4 additions & 4 deletions
@@ -255,9 +255,9 @@ Out:
 [ CPUFloatType{3,4} ]

 Please see the documentation for ``torch::autograd::backward``
-(`link <https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1afa9b5d4329085df4b6b3d4b4be48914b.html>`_)
+(`link <https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1a1403bf65b1f4f8c8506a9e6e5312d030.html>`_)
 and ``torch::autograd::grad``
-(`link <https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1a1e03c42b14b40c306f9eb947ef842d9c.html>`_)
+(`link <https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1ab9fa15dc09a8891c26525fb61d33401a.html>`_)
 for more information on how to use them.

 Using custom autograd function in C++
@@ -394,9 +394,9 @@ C++ using the following table:
 +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 | ``torch.autograd.backward``    | ``torch::autograd::backward``                                                                                                                                          |
 +================================+========================================================================================================================================================================+
-| ``torch.autograd.backward``    | ``torch::autograd::backward`` (`link <https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1afa9b5d4329085df4b6b3d4b4be48914b.html>`_)                 |
+| ``torch.autograd.backward``    | ``torch::autograd::backward`` (`link <https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1a1403bf65b1f4f8c8506a9e6e5312d030.html>`_)                 |
 +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| ``torch.autograd.grad``        | ``torch::autograd::grad`` (`link <https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1a1e03c42b14b40c306f9eb947ef842d9c.html>`_)                     |
+| ``torch.autograd.grad``        | ``torch::autograd::grad`` (`link <https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1ab9fa15dc09a8891c26525fb61d33401a.html>`_)                     |
 +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 | ``torch.Tensor.detach``        | ``torch::Tensor::detach`` (`link <https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor6detachEv>`_)                                             |
 +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

advanced_source/cpp_custom_ops.rst

Lines changed: 4 additions & 0 deletions
@@ -19,6 +19,10 @@ Custom C++ and CUDA Operators
 * PyTorch 2.4 or later
 * Basic understanding of C++ and CUDA programming

+.. note::
+
+   This tutorial will also work on AMD ROCm with no additional modifications.
+
 PyTorch offers a large library of operators that work on Tensors (e.g. torch.add, torch.sum, etc).
 However, you may wish to bring a new custom operator to PyTorch. This tutorial demonstrates the
 blessed path to authoring a custom operator written in C++/CUDA.

advanced_source/cpp_frontend.rst

Lines changed: 2 additions & 2 deletions
@@ -57,7 +57,7 @@ the right tool for the job. Examples for such environments include:
   Multiprocessing is an alternative, but not as scalable and has significant
   shortcomings. C++ has no such constraints and threads are easy to use and
   create. Models requiring heavy parallelization, like those used in `Deep
-  Neuroevolution <https://eng.uber.com/deep-neuroevolution/>`_, can benefit from
+  Neuroevolution <https://www.uber.com/blog/deep-neuroevolution/>`_, can benefit from
   this.
 - **Existing C++ Codebases**: You may be the owner of an existing C++
   application doing anything from serving web pages in a backend server to
@@ -662,7 +662,7 @@ Defining the DCGAN Modules
 We now have the necessary background and introduction to define the modules for
 the machine learning task we want to solve in this post. To recap: our task is
 to generate images of digits from the `MNIST dataset
-<http://yann.lecun.com/exdb/mnist/>`_. We want to use a `generative adversarial
+<https://huggingface.co/datasets/ylecun/mnist>`_. We want to use a `generative adversarial
 network (GAN)
 <https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf>`_ to solve
 this task. In particular, we'll use a `DCGAN architecture

advanced_source/custom_ops_landing_page.rst

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ You may wish to author a custom operator from Python (as opposed to C++) if:
   respect to ``torch.compile`` and ``torch.export``.
 - you have some Python bindings to C++/CUDA kernels and want those to compose with PyTorch
   subsystems (like ``torch.compile`` or ``torch.autograd``)
+- you are using Python (and not a C++-only environment like AOTInductor).

 Integrating custom C++ and/or CUDA code with PyTorch
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

advanced_source/pendulum.py

Lines changed: 4 additions & 4 deletions
@@ -33,9 +33,9 @@

 In the process, we will touch three crucial components of TorchRL:

-* `environments <https://pytorch.org/rl/reference/envs.html>`__
-* `transforms <https://pytorch.org/rl/reference/envs.html#transforms>`__
-* `models (policy and value function) <https://pytorch.org/rl/reference/modules.html>`__
+* `environments <https://pytorch.org/rl/stable/reference/envs.html>`__
+* `transforms <https://pytorch.org/rl/stable/reference/envs.html#transforms>`__
+* `models (policy and value function) <https://pytorch.org/rl/stable/reference/modules.html>`__

 """
@@ -384,7 +384,7 @@ def _reset(self, tensordict):
 # convenient shortcuts to the content of the output and input spec containers.
 #
 # TorchRL offers multiple :class:`~torchrl.data.TensorSpec`
-# `subclasses <https://pytorch.org/rl/reference/data.html#tensorspec>`_ to
+# `subclasses <https://pytorch.org/rl/stable/reference/data.html#tensorspec>`_ to
 # encode the environment's input and output characteristics.
 #
 # Specs shape

advanced_source/python_custom_ops.py

Lines changed: 6 additions & 0 deletions
@@ -30,6 +30,12 @@
   into the function).
 - Adding training support to an arbitrary Python function

+Use :func:`torch.library.custom_op` to create Python custom operators.
+Use the C++ ``TORCH_LIBRARY`` APIs to create C++ custom operators (these
+work in Python-less environments).
+See the `Custom Operators Landing Page <https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html>`_
+for more details.
+
 Please note that if your operation can be expressed as a composition of
 existing PyTorch operators, then there is usually no need to use the custom operator
 API -- everything (for example ``torch.compile``, training support) should
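
For reference on the paragraph added above, a minimal sketch of what a ``torch.library.custom_op`` operator looks like; the ``mylib::numpy_sin`` name and its NumPy body are assumptions for illustration, not part of this commit:

    import numpy as np
    import torch

    # Wrap a NumPy computation as a PyTorch custom operator (PyTorch 2.4+).
    @torch.library.custom_op("mylib::numpy_sin", mutates_args=())
    def numpy_sin(x: torch.Tensor) -> torch.Tensor:
        # The body runs outside the dispatcher, so torch.compile treats it as opaque.
        return torch.from_numpy(np.sin(x.cpu().numpy())).to(x.device)

    # A fake (meta) implementation lets torch.compile / torch.export trace shapes.
    @numpy_sin.register_fake
    def _(x):
        return torch.empty_like(x)

    print(numpy_sin(torch.randn(3)))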

beginner_source/blitz/autograd_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -191,15 +191,15 @@
 # .. math::
 #
 #
-#   J^{T}\cdot \vec{v} = m \cdot \left(\begin{array}{ccc}
+#   J^{T}\cdot \vec{v} = \left(\begin{array}{ccc}
 #    \frac{\partial y_{1}}{\partial x_{1}} & \cdots & \frac{\partial y_{m}}{\partial x_{1}}\\
 #    \vdots & \ddots & \vdots\\
 #    \frac{\partial y_{1}}{\partial x_{n}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}}
 #    \end{array}\right)\left(\begin{array}{c}
 #    \frac{\partial l}{\partial y_{1}}\\
 #    \vdots\\
 #    \frac{\partial l}{\partial y_{m}}
-#    \end{array}\right) = m \cdot \left(\begin{array}{c}
+#    \end{array}\right) = \left(\begin{array}{c}
 #    \frac{\partial l}{\partial x_{1}}\\
 #    \vdots\\
 #    \frac{\partial l}{\partial x_{n}}
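
The corrected formula is the vector-Jacobian product that ``backward(gradient=v)`` computes; a small sketch of the same idea in code, assuming the toy function ``y = 2x`` so the Jacobian is ``2I``:

    import torch

    x = torch.randn(3, requires_grad=True)
    y = x * 2                              # y(x) with Jacobian J = 2*I
    v = torch.tensor([0.1, 1.0, 0.0001])   # plays the role of dl/dy
    y.backward(gradient=v)                 # accumulates J^T . v into x.grad
    print(x.grad)                          # equals 2 * v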

beginner_source/onnx/README.txt

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ ONNX

 1. intro_onnx.py
     Introduction to ONNX
-    https://pytorch.org/tutorials/onnx/intro_onnx.html
+    https://pytorch.org/tutorials/beginner/onnx/intro_onnx.html

 2. export_simple_model_to_onnx_tutorial.py
     Exporting a PyTorch model to ONNX

beginner_source/pytorch_with_examples.rst

Lines changed: 1 addition & 1 deletion
@@ -149,7 +149,7 @@ which will be optimized during learning.

 In TensorFlow, packages like
 `Keras <https://github.com/fchollet/keras>`__,
-`TensorFlow-Slim <https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim>`__,
+`TensorFlow-Slim <https://github.com/google-research/tf-slim>`__,
 and `TFLearn <http://tflearn.org/>`__ provide higher-level abstractions
 over raw computational graphs that are useful for building neural
 networks.

en-wordlist.txt

Lines changed: 7 additions & 1 deletion
@@ -81,6 +81,8 @@ FX
 FX's
 FairSeq
 Fastpath
+FakeTensor
+FakeTensors
 FFN
 FloydHub
 FloydHub's
@@ -368,6 +370,8 @@ downsample
 downsamples
 dropdown
 dtensor
+dtype
+dtypes
 duration
 elementwise
 embeddings
@@ -392,6 +396,8 @@ FlexAttention
 fp
 frontend
 functionalized
+functionalizes
+functionalization
 functorch
 fuser
 geomean
@@ -613,6 +619,7 @@ triton
 uint
 UX
 umap
+unbacked
 uncomment
 uncommented
 underflowing
@@ -649,7 +656,6 @@ RecSys
 TorchRec
 sharding
 TBE
-dtype
 EBC
 sharder
 hyperoptimized

intermediate_source/FSDP_tutorial.rst

Lines changed: 2 additions & 1 deletion
@@ -11,7 +11,7 @@ It also comes with considerable engineering complexity to handle the training of
 `PyTorch FSDP <https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/>`__, released in PyTorch 1.11 makes this easier.

 In this tutorial, we show how to use `FSDP APIs <https://pytorch.org/docs/stable/fsdp.html>`__, for simple MNIST models that can be extended to other larger models such as `HuggingFace BERT models <https://huggingface.co/blog/zero-deepspeed-fairscale>`__,
-`GPT 3 models up to 1T parameters <https://pytorch.medium.com/training-a-1-trillion-parameter-model-with-pytorch-fully-sharded-data-parallel-on-aws-3ac13aa96cff>`__ . The sample DDP MNIST code has been borrowed from `here <https://github.com/yqhu/mnist_examples>`__.
+`GPT 3 models up to 1T parameters <https://pytorch.medium.com/training-a-1-trillion-parameter-model-with-pytorch-fully-sharded-data-parallel-on-aws-3ac13aa96cff>`__ . The sample DDP MNIST code courtesy of `Patrick Hu <https://github.com/yqhu/>`_.


 How FSDP works
@@ -251,6 +251,7 @@ We add the following code snippets to a python script “FSDP_mnist.py”.
     init_end_event.record()

     if rank == 0:
+        init_end_event.synchronize()
         print(f"CUDA event elapsed time: {init_start_event.elapsed_time(init_end_event) / 1000}sec")
         print(f"{model}")
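
The added ``init_end_event.synchronize()`` matters because ``elapsed_time`` is only valid once both events have completed on the GPU. A standalone sketch of the same timing pattern, assumed for illustration rather than taken from the tutorial:

    import torch

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)

    start.record()
    a = torch.randn(1024, 1024, device="cuda")
    b = torch.randn(1024, 1024, device="cuda")
    c = a @ b                      # some GPU work to be timed
    end.record()

    end.synchronize()              # wait until the end event is actually recorded
    print(f"CUDA event elapsed time: {start.elapsed_time(end) / 1000} sec")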

intermediate_source/ddp_series_minGPT.rst

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ training <ddp_series_multinode.html>`__ \|\| **minGPT Training**
 Training “real-world” models with DDP
 =====================================

-Authors: `Suraj Subramanian <https://github.com/suraj813>`__
+Authors: `Suraj Subramanian <https://github.com/subramen>`__

 .. grid:: 2

intermediate_source/ddp_series_multinode.rst

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ training** \|\| `minGPT Training <ddp_series_minGPT.html>`__
 Multinode Training
 ==================

-Authors: `Suraj Subramanian <https://github.com/suraj813>`__
+Authors: `Suraj Subramanian <https://github.com/subramen>`__

 .. grid:: 2

intermediate_source/dist_tuto.rst

Lines changed: 9 additions & 3 deletions
@@ -47,6 +47,7 @@ the following template.
 """run.py:"""
 #!/usr/bin/env python
 import os
+import sys
 import torch
 import torch.distributed as dist
 import torch.multiprocessing as mp
@@ -66,8 +67,12 @@ the following template.
 if __name__ == "__main__":
     world_size = 2
     processes = []
-    mp.set_start_method("spawn")
-    for rank in range(world_size):
+    if "google.colab" in sys.modules:
+        print("Running in Google Colab")
+        mp.get_context("spawn")
+    else:
+        mp.set_start_method("spawn")
+    for rank in range(size):
         p = mp.Process(target=init_process, args=(rank, world_size, run))
         p.start()
         processes.append(p)
@@ -156,7 +161,8 @@ we should not modify the sent tensor nor access the received tensor before ``req
 In other words,

 - writing to ``tensor`` after ``dist.isend()`` will result in undefined behaviour.
-- reading from ``tensor`` after ``dist.irecv()`` will result in undefined behaviour.
+- reading from ``tensor`` after ``dist.irecv()`` will result in undefined
+  behaviour, until ``req.wait()`` has been executed.

 However, after ``req.wait()``
 has been executed we are guaranteed that the communication took place,
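
The reworded bullet describes the usual non-blocking point-to-point pattern; a minimal sketch, assuming the same ``run(rank, size)`` scaffolding the tutorial template uses:

    import torch
    import torch.distributed as dist

    def run(rank, size):
        # Non-blocking send/receive between ranks 0 and 1.
        tensor = torch.zeros(1)
        if rank == 0:
            tensor += 1
            req = dist.isend(tensor=tensor, dst=1)   # do not write to `tensor` yet
        else:
            req = dist.irecv(tensor=tensor, src=0)   # do not read from `tensor` yet
        req.wait()   # after this, the transfer is guaranteed to have completed
        print(f"Rank {rank} has data {tensor[0]}")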

intermediate_source/dynamic_quantization_bert_tutorial.rst

Lines changed: 3 additions & 3 deletions
@@ -138,7 +138,7 @@ the following helper functions: one for converting the text examples
 into the feature vectors; The other one for measuring the F1 score of
 the predicted result.

-The `glue_convert_examples_to_features <https://github.com/huggingface/transformers/blob/master/transformers/data/processors/glue.py>`_ function converts the texts into input features:
+The `glue_convert_examples_to_features <https://github.com/huggingface/transformers/blob/main/src/transformers/data/datasets/glue.py>`_ function converts the texts into input features:

 - Tokenize the input sequences;
 - Insert [CLS] in the beginning;
@@ -147,7 +147,7 @@ The `glue_convert_examples_to_features <https://github.com/huggingface/transform
 - Generate token type ids to indicate whether a token belongs to the
   first sequence or the second sequence.

-The `glue_compute_metrics <https://github.com/huggingface/transformers/blob/master/transformers/data/processors/glue.py>`_ function has the compute metrics with
+The `glue_compute_metrics <https://github.com/huggingface/transformers/blob/main/src/transformers/data/metrics/__init__.py#L60>`_ function has the compute metrics with
 the `F1 score <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html>`_, which
 can be interpreted as a weighted average of the precision and recall,
 where an F1 score reaches its best value at 1 and worst score at 0. The
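
The F1 score linked above can be computed directly with scikit-learn; a small illustrative example with made-up labels:

    from sklearn.metrics import f1_score

    y_true = [0, 1, 1, 0, 1]
    y_pred = [0, 1, 0, 0, 1]
    # Harmonic mean of precision and recall; best value 1.0, worst 0.0.
    print(f1_score(y_true, y_pred))  # 0.8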
@@ -273,7 +273,7 @@ We load the tokenizer and fine-tuned BERT sequence classifier model
 2.3 Define the tokenize and evaluation function
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-We reuse the tokenize and evaluation function from `HuggingFace <https://github.com/huggingface/transformers/blob/master/examples/run_glue.py>`_.
+We reuse the tokenize and evaluation function from `HuggingFace <https://github.com/huggingface/transformers/blob/main/examples/legacy/pytorch-lightning/run_glue.py>`_.

 .. code:: python

intermediate_source/inductor_debug_cpu.py

Lines changed: 3 additions & 3 deletions
@@ -19,8 +19,8 @@
 #
 # Meanwhile, you may also find related tutorials about ``torch.compile``
 # around `basic usage <https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html>`_,
-# comprehensive `troubleshooting <https://pytorch.org/docs/stable/dynamo/troubleshooting.html>`_
-# and GPU-specific knowledge like `GPU performance profiling <https://github.com/pytorch/pytorch/blob/main/docs/source/compile/profiling_torch_compile.rst>`_.
+# comprehensive `troubleshooting <https://pytorch.org/docs/stable/torch.compiler_troubleshooting.html>`_
+# and GPU-specific knowledge like `GPU performance profiling <https://pytorch.org/docs/stable/torch.compiler_inductor_profiling.html>`_.
 #
 # We will start debugging with a motivating example that triggers compilation issues and accuracy problems
 # by demonstrating the process of debugging to pinpoint the problems.
@@ -343,7 +343,7 @@ def forward2(self, arg0_1):
     return (neg,)

 ######################################################################
-# For more usage details about Minifier, please refer to `Troubleshooting <https://pytorch.org/docs/stable/dynamo/troubleshooting.html>`_.
+# For more usage details about Minifier, please refer to `Troubleshooting <https://pytorch.org/docs/stable/torch.compiler_troubleshooting.html>`_.


 ######################################################################

intermediate_source/reinforcement_ppo.py

Lines changed: 1 addition & 1 deletion
@@ -639,7 +639,7 @@
 # number of steps (1000, which is our ``env`` horizon).
 # The ``rollout`` method of the ``env`` can take a policy as argument:
 # it will then execute this policy at each step.
-with set_exploration_type(ExplorationType.MEAN), torch.no_grad():
+with set_exploration_type(ExplorationType.DETERMINISTIC), torch.no_grad():
     # execute a rollout with the trained policy
     eval_rollout = env.rollout(1000, policy_module)
     logs["eval reward"].append(eval_rollout["next", "reward"].mean().item())
