
Commit 14c7499

Merge branch 'main' into xmfan/ca_tutorial

2 parents: 83d4665 + 19fffda

15 files changed: +1550 −269 lines

.ci/docker/requirements.txt

Lines changed: 3 additions & 1 deletion

@@ -68,5 +68,7 @@ iopath
 pygame==2.6.0
 pycocotools
 semilearn==0.3.2
-torchao==0.0.3
+torchao==0.5.0
 segment_anything==1.0
+torchrec==0.8.0
+fbgemm-gpu==0.8.0

.jenkins/metadata.json

Lines changed: 3 additions & 0 deletions

@@ -28,6 +28,9 @@
     "intermediate_source/model_parallel_tutorial.py": {
         "needs": "linux.16xlarge.nvidia.gpu"
     },
+    "intermediate_source/torchrec_intro_tutorial.py": {
+        "needs": "linux.g5.4xlarge.nvidia.gpu"
+    },
     "recipes_source/torch_export_aoti_python.py": {
         "needs": "linux.g5.4xlarge.nvidia.gpu"
     },

beginner_source/dist_overview.rst

Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@ Sharding primitives

 ``DTensor`` and ``DeviceMesh`` are primitives used to build parallelism in terms of sharded or replicated tensors on N-dimensional process groups.

-- `DTensor <https://github.com/pytorch/pytorch/blob/main/torch/distributed/_tensor/README.md>`__ represents a tensor that is sharded and/or replicated, and communicates automatically to reshard tensors as needed by operations.
+- `DTensor <https://github.com/pytorch/pytorch/blob/main/torch/distributed/tensor/README.md>`__ represents a tensor that is sharded and/or replicated, and communicates automatically to reshard tensors as needed by operations.
 - `DeviceMesh <https://pytorch.org/docs/stable/distributed.html#devicemesh>`__ abstracts the accelerator device communicators into a multi-dimensional array, which manages the underlying ``ProcessGroup`` instances for collective communications in multi-dimensional parallelisms. Try out our `Device Mesh Recipe <https://pytorch.org/tutorials/recipes/distributed_device_mesh.html>`__ to learn more.

 Communications APIs
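
For context on the primitives this hunk documents, here is a minimal DTensor/DeviceMesh sketch, not part of this commit. It assumes a 2-process launch via torchrun and the public ``torch.distributed.tensor`` import path that the updated README link points at (older releases expose it as ``torch.distributed._tensor``); the filename is illustrative.

    # Minimal sketch: shard a tensor across a 1-D DeviceMesh.
    # Launch with: torchrun --nproc-per-node=2 dtensor_sketch.py
    import torch
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import distribute_tensor, Shard

    # A 1-D mesh over 2 processes; DeviceMesh manages the ProcessGroup.
    mesh = init_device_mesh("cpu", (2,))

    # Shard the global tensor along dim 0; each rank holds a 4x8 shard.
    global_tensor = torch.randn(8, 8)
    dtensor = distribute_tensor(global_tensor, mesh, placements=[Shard(0)])

    # Operations communicate automatically; redistribute reshards explicitly.
    resharded = (dtensor + dtensor).redistribute(mesh, placements=[Shard(1)])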

en-wordlist.txt

Lines changed: 29 additions & 0 deletions

@@ -619,3 +619,32 @@ warmup
 webp
 wsi
 wsis
+Meta's
+RecSys
+TorchRec
+sharding
+TBE
+dtype
+EBC
+sharder
+hyperoptimized
+DMP
+unsharded
+lookups
+KJTs
+amongst
+async
+everytime
+prototyped
+GBs
+HBM
+gloo
+nccl
+Localhost
+gpu
+torchmetrics
+url
+colab
+sharders
+Criteo
+torchrec

index.rst

Lines changed: 2 additions & 2 deletions

@@ -853,7 +853,7 @@ Welcome to PyTorch Tutorials
    :header: Introduction to TorchRec
    :card_description: TorchRec is a PyTorch domain library built to provide common sparsity & parallelism primitives needed for large-scale recommender systems.
    :image: _static/img/thumbnails/torchrec.png
-   :link: intermediate/torchrec_tutorial.html
+   :link: intermediate/torchrec_intro_tutorial.html
    :tags: TorchRec,Recommender

 .. customcarditem::

@@ -1188,7 +1188,7 @@ Additional Resources
    :hidden:
    :caption: Recommendation Systems

-   intermediate/torchrec_tutorial
+   intermediate/torchrec_intro_tutorial
    advanced/sharding

 .. toctree::

intermediate_source/scaled_dot_product_attention_tutorial.py

Lines changed: 5 additions & 5 deletions

@@ -244,7 +244,7 @@ def generate_rand_batch(

 ######################################################################
 # Using SDPA with ``torch.compile``
-# =================================
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 #
 # With the release of PyTorch 2.0, a new feature called
 # ``torch.compile()`` has been introduced, which can provide
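
The section retitled above demonstrates ``torch.compile`` with SDPA. As a minimal sketch of that pattern, not taken from the tutorial (module name and shapes are illustrative):

    # Illustrative: compile a module whose forward calls SDPA.
    import torch
    import torch.nn.functional as F

    class SDPABlock(torch.nn.Module):
        def forward(self, query, key, value):
            # Dispatches to a fused attention kernel where one is available.
            return F.scaled_dot_product_attention(query, key, value)

    compiled = torch.compile(SDPABlock())
    q = k = v = torch.randn(2, 8, 128, 64)  # (batch, heads, seq_len, head_dim)
    out = compiled(q, k, v)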
@@ -324,9 +324,9 @@ def generate_rand_batch(
 #

 ######################################################################
-# Using SDPA with attn_bias subclasses`
-# ==========================================
-#
+# Using SDPA with attn_bias subclasses
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 # As of PyTorch 2.3, we have added a new submodule that contains tensor subclasses.
 # Designed to be used with ``torch.nn.functional.scaled_dot_product_attention``.
 # The module is named ``torch.nn.attention.bias`` and contains the following two
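
For reference, a minimal sketch of the ``torch.nn.attention.bias`` utilities this section introduces (requires PyTorch 2.3+; shapes are illustrative, not from the tutorial):

    # Illustrative: causal bias subclasses for mismatched seq lengths.
    import torch
    import torch.nn.functional as F
    from torch.nn.attention.bias import causal_lower_right

    q = torch.randn(2, 8, 4, 64)      # 4 query tokens
    k = v = torch.randn(2, 8, 6, 64)  # 6 key/value tokens
    # Lower-right aligned causal masking for q_len != kv_len.
    bias = causal_lower_right(q.size(-2), k.size(-2))
    out = F.scaled_dot_product_attention(q, k, v, attn_mask=bias)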
@@ -394,7 +394,7 @@ def generate_rand_batch(

 ######################################################################
 # Conclusion
-# ==========
+# ~~~~~~~~~~~
 #
 # In this tutorial, we have demonstrated the basic usage of
 # ``torch.nn.functional.scaled_dot_product_attention``. We have shown how
