
Fix the Title: Underline too short warnings #2731


Merged 2 commits on Jan 16, 2024
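
This PR fixes Sphinx "Title underline too short" warnings across the tutorials: in reStructuredText, a section title's adornment line must be at least as long as the title text, and most of the commits below simply lengthen the underline to match. As a rough sketch of the rule being enforced, a hypothetical checker (not part of this PR) could flag offending headings like this:

    import re
    import sys

    # An RST adornment line: a run of one punctuation character such as =, -, ~, ^, * or #.
    ADORNMENT = re.compile(r"^([=\-~^*#])\1+\s*$")

    def find_short_underlines(lines):
        """Yield (line_no, title) for each title whose underline is shorter than the title."""
        for i in range(1, len(lines)):
            title = lines[i - 1].rstrip()
            under = lines[i].rstrip()
            # A real title is non-empty and not itself an adornment (that would be a transition).
            if title and not ADORNMENT.match(title) and ADORNMENT.match(under) and len(under) < len(title):
                yield i + 1, title

    if __name__ == "__main__":
        for path in sys.argv[1:]:
            with open(path) as f:
                for line_no, title in find_short_underlines(f.read().splitlines()):
                    print(f"{path}:{line_no}: title underline too short for {title!r}")

The same rule applies to the sphinx-gallery ``.py`` tutorials in this diff, whose headings live inside ``#`` comment blocks, one ``# `` prefix deep.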
2 changes: 1 addition & 1 deletion advanced_source/ddp_pipeline.py
@@ -439,7 +439,7 @@ def evaluate(eval_model, data_source):

######################################################################
# Evaluate the model with the test dataset
-# -------------------------------------
+# ----------------------------------------
#
# Apply the best model to check the result with the test dataset.

2 changes: 1 addition & 1 deletion advanced_source/dispatcher.rst
@@ -129,7 +129,7 @@ for debugging in larger models where previously it can be hard to pin-point
exactly where the ``requires_grad``-ness is lost during the forward pass.

In-place or view ops
-^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^

To ensure correctness and best possible performance, if your op mutates an input
in-place or returns a tensor that aliases with one of the inputs, two additional
2 changes: 1 addition & 1 deletion advanced_source/usb_semisup_learn.py
@@ -157,7 +157,7 @@

######################################################################
# Use USB to Train ``SoftMatch`` with specific imbalanced algorithm on imbalanced CIFAR-10
-# ------------------------------------------------------------------------------------
+# ----------------------------------------------------------------------------------------
#
# Now let's say we have imbalanced labeled set and unlabeled set of CIFAR-10,
# and we want to train a ``SoftMatch`` model on it.
4 changes: 2 additions & 2 deletions beginner_source/basics/autogradqs_tutorial.py
@@ -10,7 +10,7 @@
`Save & Load Model <saveloadrun_tutorial.html>`_

Automatic Differentiation with ``torch.autograd``
-=======================================
+=================================================

When training neural networks, the most frequently used algorithm is
**back propagation**. In this algorithm, parameters (model weights) are
@@ -170,7 +170,7 @@

######################################################################
# Optional Reading: Tensor Gradients and Jacobian Products
-# --------------------------------------
+# --------------------------------------------------------
#
# In many cases, we have a scalar loss function, and we need to compute
# the gradient with respect to some parameters. However, there are cases
4 changes: 2 additions & 2 deletions beginner_source/basics/buildmodel_tutorial.py
@@ -10,7 +10,7 @@
`Save & Load Model <saveloadrun_tutorial.html>`_

Build the Neural Network
-===================
+========================

Neural networks comprise of layers/modules that perform operations on data.
The `torch.nn <https://pytorch.org/docs/stable/nn.html>`_ namespace provides all the building blocks you need to
@@ -197,5 +197,5 @@ def forward(self, x):

#################################################################
# Further Reading
-# --------------
+# -----------------
# - `torch.nn API <https://pytorch.org/docs/stable/nn.html>`_
14 changes: 7 additions & 7 deletions beginner_source/basics/data_tutorial.py
@@ -10,7 +10,7 @@
`Save & Load Model <saveloadrun_tutorial.html>`_

Datasets & DataLoaders
-===================
+======================

"""

@@ -69,7 +69,7 @@

#################################################################
# Iterating and Visualizing the Dataset
-# -----------------
+# -------------------------------------
#
# We can index ``Datasets`` manually like a list: ``training_data[index]``.
# We use ``matplotlib`` to visualize some samples in our training data.
@@ -144,7 +144,7 @@ def __getitem__(self, idx):


#################################################################
-# __init__
+# ``__init__``
# ^^^^^^^^^^^^^^^^^^^^
#
# The __init__ function is run once when instantiating the Dataset object. We initialize
@@ -167,7 +167,7 @@ def __init__(self, annotations_file, img_dir, transform=None, target_transform=N


#################################################################
-# __len__
+# ``__len__``
# ^^^^^^^^^^^^^^^^^^^^
#
# The __len__ function returns the number of samples in our dataset.
@@ -180,7 +180,7 @@ def __len__(self):


#################################################################
-# __getitem__
+# ``__getitem__``
# ^^^^^^^^^^^^^^^^^^^^
#
# The __getitem__ function loads and returns a sample from the dataset at the given index ``idx``.
@@ -220,7 +220,7 @@ def __getitem__(self, idx):

###########################
# Iterate through the DataLoader
-# --------------------------
+# -------------------------------
#
# We have loaded that dataset into the ``DataLoader`` and can iterate through the dataset as needed.
# Each iteration below returns a batch of ``train_features`` and ``train_labels`` (containing ``batch_size=64`` features and labels respectively).
@@ -243,5 +243,5 @@ def __getitem__(self, idx):

#################################################################
# Further Reading
-# --------------
+# ----------------
# - `torch.utils.data API <https://pytorch.org/docs/stable/data.html>`_
4 changes: 2 additions & 2 deletions beginner_source/basics/intro.py
@@ -31,15 +31,15 @@


Running the Tutorial Code
-------------------
+-------------------------
You can run this tutorial in a couple of ways:

- **In the cloud**: This is the easiest way to get started! Each section has a "Run in Microsoft Learn" and "Run in Google Colab" link at the top, which opens an integrated notebook in Microsoft Learn or Google Colab, respectively, with the code in a fully-hosted environment.
- **Locally**: This option requires you to setup PyTorch and TorchVision first on your local machine (`installation instructions <https://pytorch.org/get-started/locally/>`_). Download the notebook or copy the code into your favorite IDE.


How to Use this Guide
------------------
+---------------------
If you're familiar with other deep learning frameworks, check out the `0. Quickstart <quickstart_tutorial.html>`_ first
to quickly familiarize yourself with PyTorch's API.

4 changes: 2 additions & 2 deletions beginner_source/basics/tensorqs_tutorial.py
@@ -80,7 +80,7 @@

######################################################################
# Attributes of a Tensor
-# ~~~~~~~~~~~~~~~~~
+# ~~~~~~~~~~~~~~~~~~~~~~
#
# Tensor attributes describe their shape, datatype, and the device on which they are stored.

@@ -97,7 +97,7 @@

######################################################################
# Operations on Tensors
-# ~~~~~~~~~~~~~~~~~
+# ~~~~~~~~~~~~~~~~~~~~~~~
#
# Over 100 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing,
# indexing, slicing), sampling and more are
4 changes: 2 additions & 2 deletions beginner_source/blitz/autograd_tutorial.py
@@ -1,7 +1,7 @@
# -*- coding: utf-8 -*-
"""
A Gentle Introduction to ``torch.autograd``
----------------------------------
+===========================================

``torch.autograd`` is PyTorch’s automatic differentiation engine that powers
neural network training. In this section, you will get a conceptual
@@ -149,7 +149,7 @@

######################################################################
# Optional Reading - Vector Calculus using ``autograd``
-# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# Mathematically, if you have a vector valued function
# :math:`\vec{y}=f(\vec{x})`, then the gradient of :math:`\vec{y}` with
2 changes: 1 addition & 1 deletion beginner_source/blitz/cifar10_tutorial.py
@@ -115,7 +115,7 @@ def imshow(img):

########################################################################
# 2. Define a Convolutional Neural Network
-# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Copy the neural network from the Neural Networks section before and modify it to
# take 3-channel images (instead of 1-channel images as it was defined).

2 changes: 1 addition & 1 deletion beginner_source/blitz/tensor_tutorial.py
@@ -1,6 +1,6 @@
"""
Tensors
---------------------------------------------
+========

Tensors are a specialized data structure that are very similar to arrays
and matrices. In PyTorch, we use tensors to encode the inputs and
4 changes: 0 additions & 4 deletions beginner_source/ddp_series_fault_tolerance.rst
@@ -93,11 +93,7 @@ In elastic training, whenever there are any membership changes (adding or removi
on available devices. Having this structure ensures your training job can continue without manual intervention.


-
-
-
Diff for `multigpu.py <https://github.com/pytorch/examples/blob/main/distributed/ddp-tutorial-series/multigpu.py>`__ v/s `multigpu_torchrun.py <https://github.com/pytorch/examples/blob/main/distributed/ddp-tutorial-series/multigpu_torchrun.py>`__
------------------------------------------------------------

Process group initialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 change: 0 additions & 1 deletion beginner_source/ddp_series_multigpu.rst
@@ -52,7 +52,6 @@ Along the way, we will talk through important concepts in distributed training w


Diff for `single_gpu.py <https://github.com/pytorch/examples/blob/main/distributed/ddp-tutorial-series/single_gpu.py>`__ v/s `multigpu.py <https://github.com/pytorch/examples/blob/main/distributed/ddp-tutorial-series/multigpu.py>`__
-----------------------------------------------------

These are the changes you typically make to a single-GPU training script to enable DDP.

2 changes: 1 addition & 1 deletion beginner_source/dist_overview.rst
@@ -150,7 +150,7 @@ throws an exception, it is likely to lead to desynchronization (mismatched
adds fault tolerance and the ability to make use of a dynamic pool of machines (elasticity).

RPC-Based Distributed Training
-----------------------------
+------------------------------

Many training paradigms do not fit into data parallelism, e.g.,
parameter server paradigm, distributed pipeline parallelism, reinforcement
2 changes: 1 addition & 1 deletion beginner_source/knowledge_distillation_tutorial.py
@@ -25,7 +25,7 @@
# - How to improve the performance of lightweight models by using more complex models as teachers
#
# Prerequisites
-# ~~~~~~~~~~~
+# ~~~~~~~~~~~~~
#
# * 1 GPU, 4GB of memory
# * PyTorch v2.0 or later
2 changes: 1 addition & 1 deletion beginner_source/nn_tutorial.py
@@ -98,7 +98,7 @@

###############################################################################
# Neural net from scratch (without ``torch.nn``)
-# ---------------------------------------------
+# -----------------------------------------------
#
# Let's first create a model using nothing but PyTorch tensor operations. We're assuming
# you're already familiar with the basics of neural networks. (If you're not, you can
3 changes: 2 additions & 1 deletion beginner_source/profiler.py
@@ -1,6 +1,7 @@
"""
Profiling your PyTorch Module
-------------
+-----------------------------

**Author:** `Suraj Subramanian <https://github.com/suraj813>`_

PyTorch includes a profiler API that is useful to identify the time and
13 changes: 7 additions & 6 deletions beginner_source/pytorch_with_examples.rst
@@ -1,5 +1,6 @@
Learning PyTorch with Examples
-******************************
+==============================

**Author**: `Justin Johnson <https://github.com/jcjohnson/pytorch-examples>`_

.. note::
@@ -29,7 +30,7 @@ between the network output and the true output.
:local:

Tensors
-=======
+~~~~~~~

Warm-up: numpy
--------------
@@ -74,7 +75,7 @@ and backward passes through the network:


Autograd
-========
+~~~~~~~~

PyTorch: Tensors and autograd
-------------------------------
@@ -133,7 +134,7 @@ our model:
.. includenodoc:: /beginner/examples_autograd/polynomial_custom_function.py

``nn`` module
-===========
+~~~~~~~~~~~~~

PyTorch: ``nn``
---------------
@@ -219,7 +220,7 @@ We can easily implement this model as a Module subclass:
.. _examples-download:

Examples
-========
+~~~~~~~~

You can browse the above examples here.

@@ -261,7 +262,7 @@ Autograd
<div style='clear:both'></div>

``nn`` module
------------
+--------------

.. toctree::
:maxdepth: 2
8 changes: 5 additions & 3 deletions beginner_source/t5_tutorial.py
@@ -223,8 +223,10 @@ def process_labels(labels, x):


#######################################################################
-# Summarization Output (Might vary since we shuffle the dataloader)
+# Summarization Output
# --------------------
#
+# Summarization output might vary since we shuffle the dataloader.
+#
# .. code-block::
#
@@ -315,7 +317,7 @@ def process_labels(labels, x):
# Sentiment Output
# ----------------
#
-# ::
+# .. code-block:: bash
#
# Example 1:
#
@@ -408,7 +410,7 @@ def process_labels(labels, x):
# Translation Output
# ------------------
#
-# ::
+# .. code-block:: bash
#
# Example 1:
#
2 changes: 1 addition & 1 deletion beginner_source/vt_tutorial.py
@@ -1,6 +1,6 @@
"""
Optimizing Vision Transformer Model for Deployment
-===========================
+==================================================

`Jeff Tang <https://github.com/jeffxtang>`_,
`Geeta Chauhan <https://github.com/gchauhan/>`_
6 changes: 3 additions & 3 deletions intermediate_source/FSDP_tutorial.rst
@@ -1,5 +1,5 @@
Getting Started with Fully Sharded Data Parallel(FSDP)
-=====================================================
+======================================================

**Author**: `Hamid Shojanazeri <https://github.com/HamidShojanazeri>`__, `Yanli Zhao <https://github.com/zhaojuanmao>`__, `Shen Li <https://mrshenli.github.io/>`__

@@ -56,7 +56,7 @@ One way to view FSDP's sharding is to decompose the DDP gradient all-reduce into
FSDP Allreduce

How to use FSDP
---------------
+---------------
Here we use a toy model to run training on the MNIST dataset for demonstration purposes. The APIs and logic can be applied to training larger models as well.

*Setup*
@@ -267,7 +267,7 @@ We add the following code snippets to a python script “FSDP_mnist.py”.



-2.5 Finally parse the arguments and set the main function
+2.5 Finally, parse the arguments and set the main function

.. code-block:: python

4 changes: 2 additions & 2 deletions intermediate_source/ddp_tutorial.rst
@@ -236,7 +236,7 @@ and elasticity support, please refer to `TorchElastic <https://pytorch.org/elast
cleanup()

Combining DDP with Model Parallelism
-----------------------------------
+------------------------------------

DDP also works with multi-GPU models. DDP wrapping multi-GPU models is especially
helpful when training large models with a huge amount of data.
@@ -297,7 +297,7 @@ either the application or the model ``forward()`` method.
run_demo(demo_model_parallel, world_size)

Initialize DDP with torch.distributed.run/torchrun
-----------------------------------
+---------------------------------------------------

We can leverage PyTorch Elastic to simplify the DDP code and initialize the job more easily.
Let's still use the Toymodel example and create a file named ``elastic_ddp.py``.
2 changes: 1 addition & 1 deletion intermediate_source/dynamic_quantization_bert_tutorial.rst
@@ -414,7 +414,7 @@ We reuse the tokenize and evaluation function from `Huggingface <https://github.


3. Apply the dynamic quantization
--------------------------------
+---------------------------------

We call ``torch.quantization.quantize_dynamic`` on the model to apply
the dynamic quantization on the HuggingFace BERT model. Specifically,