
Commit ec341ff

Merge branch 'master' into patch-1
2 parents 1e1cf20 + ff0cfa1 commit ec341ff


47 files changed (+1221 −658 lines)

.circleci/scripts/build_for_windows.sh

Lines changed: 4 additions & 2 deletions
@@ -49,8 +49,10 @@ if [[ "${CIRCLE_JOB}" == *worker_* ]]; then
     python $DIR/remove_runnable_code.py advanced_source/static_quantization_tutorial.py advanced_source/static_quantization_tutorial.py || true
     python $DIR/remove_runnable_code.py beginner_source/hyperparameter_tuning_tutorial.py beginner_source/hyperparameter_tuning_tutorial.py || true
     python $DIR/remove_runnable_code.py beginner_source/audio_preprocessing_tutorial.py beginner_source/audio_preprocessing_tutorial.py || true
-    # Temp remove for mnist download issue.
-    python $DIR/remove_runnable_code.py beginner_source/fgsm_tutorial.py beginner_source/fgsm_tutorial.py || true
+    python $DIR/remove_runnable_code.py beginner_source/dcgan_faces_tutorial.py beginner_source/dcgan_faces_tutorial.py || true
+    python $DIR/remove_runnable_code.py intermediate_source/tensorboard_profiler_tutorial.py intermediate_source/tensorboard_profiler_tutorial.py || true
+    # Temp remove for mnist download issue. (Re-enabled for 1.8.1)
+    # python $DIR/remove_runnable_code.py beginner_source/fgsm_tutorial.py beginner_source/fgsm_tutorial.py || true

 export WORKER_ID=$(echo "${CIRCLE_JOB}" | tr -dc '0-9')
 count=0

README.md

Lines changed: 2 additions & 2 deletions
@@ -28,10 +28,10 @@ In case you prefer to write your tutorial in jupyter, you can use [this script](
 - Then you can build using `make docs`. This will download the data, execute the tutorials and build the documentation to `docs/` directory. This will take about 60-120 min for systems with GPUs. If you do not have a GPU installed on your system, then see next step.
 - You can skip the computationally intensive graph generation by running `make html-noplot` to build basic html documentation to `_build/html`. This way, you can quickly preview your tutorial.

-> If you get **ModuleNotFoundError: No module named 'pytorch_sphinx_theme' make: *** [html-noplot] Error 2**, from /tutorials/src/pytorch-sphinx-theme run `python setup.py install`.
+> If you get **ModuleNotFoundError: No module named 'pytorch_sphinx_theme' make: *** [html-noplot] Error 2** from /tutorials/src/pytorch-sphinx-theme or /venv/src/pytorch-sphinx-theme (while using virtualenv), run `python setup.py install`.


 ## About contributing to PyTorch Documentation and Tutorials
 * You can find information about contributing to PyTorch documentation in the
   PyTorch Repo [README.md](https://github.com/pytorch/pytorch/blob/master/README.md) file.
-* Additional information can be found in [PyTorch CONTRIBUTING.md](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md).
+* Additional information can be found in [PyTorch CONTRIBUTING.md](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md).

_static/img/profiler_overview1.png

133 KB

_static/img/profiler_overview2.png

77.3 KB

_static/img/profiler_trace_view1.png

128 KB

_static/img/profiler_trace_view2.png

133 KB

_static/img/profiler_views_list.png

67.8 KB

_static/img/tensorboard_pr_curves.png

-190 KB

_templates/layout.html

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@
 </noscript>

 <script type="text/javascript">
-  var collapsedSections = ['PyTorch Recipes', 'Image and Video', 'Audio', 'Text', 'Reinforcement Learning', 'Deploying PyTorch Models in Production', 'Code Transforms with FX', 'Frontend APIs', 'Extending PyTorch', 'Model Optimization', 'Parallel and Distributed Training', 'Mobile'];
+  var collapsedSections = ['PyTorch Recipes', 'Learning PyTorch', 'Image and Video', 'Audio', 'Text', 'Reinforcement Learning', 'Deploying PyTorch Models in Production', 'Code Transforms with FX', 'Frontend APIs', 'Extending PyTorch', 'Model Optimization', 'Parallel and Distributed Training', 'Mobile'];
 </script>

 <img height="1" width="1" style="border-style:none;" alt="" src="https://www.googleadservices.com/pagead/conversion/795629140/?label=txkmCPmdtosBENSssfsC&amp;guid=ON&amp;script=0"/>

advanced_source/cpp_export.rst

Lines changed: 2 additions & 2 deletions
@@ -115,7 +115,7 @@ If you need to exclude some methods in your ``nn.Module``
 because they use Python features that TorchScript doesn't support yet,
 you could annotate those with ``@torch.jit.ignore``

-``my_module`` is an instance of
+``sm`` is an instance of
 ``ScriptModule`` that is ready for serialization.

 Step 2: Serializing Your Script Module to a File
@@ -132,7 +132,7 @@ on the module and pass it a filename::
   traced_script_module.save("traced_resnet_model.pt")

 This will produce a ``traced_resnet_model.pt`` file in your working directory.
-If you also would like to serialize ``my_module``, call ``my_module.save("my_module_model.pt")``
+If you also would like to serialize ``sm``, call ``sm.save("my_module_model.pt")``
 We have now officially left the realm of Python and are ready to cross over to the sphere
 of C++.
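For context, ``sm`` is the ``ScriptModule`` created earlier in that tutorial via ``torch.jit.script``. A minimal sketch of the pattern (the model here is illustrative; the tutorial scripts its own ``MyModule``):

    import torch
    import torchvision

    # Script a model into a ScriptModule, then serialize it to disk.
    model = torchvision.models.resnet18()
    sm = torch.jit.script(model)     # sm is a ScriptModule, ready for serialization
    sm.save("my_module_model.pt")    # the saved file can then be loaded from C++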

advanced_source/ddp_pipeline.py

Lines changed: 8 additions & 14 deletions
@@ -89,7 +89,6 @@ def forward(self, x):
 class Encoder(nn.Module):
     def __init__(self, ntoken, ninp, dropout=0.5):
         super(Encoder, self).__init__()
-        self.src_mask = None
         self.pos_encoder = PositionalEncoding(ninp, dropout)
         self.encoder = nn.Embedding(ntoken, ninp)
         self.ninp = ninp
@@ -99,17 +98,9 @@ def init_weights(self):
         initrange = 0.1
         self.encoder.weight.data.uniform_(-initrange, initrange)

-    def _generate_square_subsequent_mask(self, sz):
-        mask = (torch.triu(torch.ones(sz, sz)) == 1).transpose(0, 1)
-        mask = mask.float().masked_fill(mask == 0, float('-inf')).masked_fill(mask == 1, float(0.0))
-        return mask
-
     def forward(self, src):
-        if self.src_mask is None or self.src_mask.size(0) != src.size(0):
-            device = src.device
-            mask = self._generate_square_subsequent_mask(src.size(0)).to(device)
-            self.src_mask = mask
-
+        # Need (S, N) format for encoder.
+        src = src.t()
         src = self.encoder(src) * math.sqrt(self.ninp)
         return self.pos_encoder(src)
@@ -125,7 +116,8 @@ def init_weights(self):
         self.decoder.weight.data.uniform_(-initrange, initrange)

     def forward(self, inp):
-        return self.decoder(inp)
+        # Need batch dimension first for output of pipeline.
+        return self.decoder(inp).permute(1, 0, 2)

 ######################################################################
 # Start multiple processes for training
@@ -245,7 +237,8 @@ def get_batch(source, i):
     seq_len = min(bptt, len(source) - 1 - i)
     data = source[i:i+seq_len]
     target = source[i+1:i+1+seq_len].view(-1)
-    return data, target
+    # Need batch dimension first for pipeline parallelism.
+    return data.t(), target

 ######################################################################
 # Model scale and Pipe initialization
@@ -318,8 +311,9 @@ def get_batch(source, i):
 # Need to use 'checkpoint=never' since as of PyTorch 1.8, Pipe checkpointing
 # doesn't work with DDP.
 from torch.distributed.pipeline.sync import Pipe
+chunks = 8
 model = Pipe(torch.nn.Sequential(
-    *module_list), chunks = 8, checkpoint="never")
+    *module_list), chunks = chunks, checkpoint="never")

 # Initialize process group and wrap model in DDP.
 from torch.nn.parallel import DistributedDataParallel
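The recurring theme in this diff is moving the batch dimension first so tensors can flow through ``Pipe``. A minimal sketch of the shape round-trip (sizes are illustrative):

    import torch

    bptt, batch_size = 35, 20
    data = torch.zeros(bptt, batch_size, dtype=torch.long)  # (seq_len, batch), as sliced from the corpus
    batch_first = data.t()      # (batch, seq_len): batch dimension first, as get_batch now returns
    print(batch_first.shape)    # torch.Size([20, 35])
    restored = batch_first.t()  # (seq_len, batch): the (S, N) format the encoder needs again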

beginner_source/PyTorch Cheat.md

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,7 @@ See [onnx](https://pytorch.org/docs/stable/onnx.html)
 from torchvision import datasets, models, transforms # vision datasets, architectures & transforms
 import torchvision.transforms as transforms # composable transforms
 ```
-See [torchvision](https://pytorch.org/docs/stable/torchvision/index.html)
+See [torchvision](https://pytorch.org/vision/stable/index.html)

 ### Distributed Training

beginner_source/basics/autogradqs_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -47,7 +47,7 @@
 #
 # In this network, ``w`` and ``b`` are **parameters**, which we need to
 # optimize. Thus, we need to be able to compute the gradients of loss
-# function with respect to those variables. In orded to do that, we set
+# function with respect to those variables. In order to do that, we set
 # the ``requires_grad`` property of those tensors.

 #######################################################################
@@ -58,7 +58,7 @@
 # A function that we apply to tensors to construct computational graph is
 # in fact an object of class ``Function``. This object knows how to
 # compute the function in the *forward* direction, and also how to compute
-# it's derivative during the *backward propagation* step. A reference to
+# its derivative during the *backward propagation* step. A reference to
 # the backward propagation function is stored in ``grad_fn`` property of a
 # tensor. You can find more information of ``Function`` `in the
 # documentation <https://pytorch.org/docs/stable/autograd.html#function>`__.
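The corrected comments describe the pattern used in that tutorial; a minimal runnable sketch:

    import torch

    x = torch.ones(5)   # input tensor
    y = torch.zeros(3)  # expected output
    w = torch.randn(5, 3, requires_grad=True)  # parameters we want gradients for
    b = torch.randn(3, requires_grad=True)
    z = torch.matmul(x, w) + b
    loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
    print(z.grad_fn)  # reference to the backward-propagation Function stored on the tensor
    loss.backward()
    print(w.grad)     # gradient of the loss w.r.t. w, accumulated into .grad
    print(b.grad)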

beginner_source/basics/buildmodel_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -67,7 +67,7 @@ def forward(self, x):

 ##############################################
 # We create an instance of ``NeuralNetwork``, and move it to the ``device``, and print
-# it's structure.
+# its structure.

 model = NeuralNetwork().to(device)
 print(model)
@@ -119,7 +119,7 @@ def forward(self, x):
 # nn.Linear
 # ^^^^^^^^^^^^^^^^^^^^^^
 # The `linear layer <https://pytorch.org/docs/stable/generated/torch.nn.Linear.html>`_
-# is a module that applies a linear transformation on the input using it's stored weights and biases.
+# is a module that applies a linear transformation on the input using its stored weights and biases.
 #
 layer1 = nn.Linear(in_features=28*28, out_features=20)
 hidden1 = layer1(flat_image)
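As the corrected comment says, ``nn.Linear`` applies a linear transformation using its stored weights and biases; a self-contained sketch of the surrounding tutorial code:

    import torch
    from torch import nn

    flat_image = torch.rand(3, 28*28)                        # a batch of 3 flattened 28x28 images
    layer1 = nn.Linear(in_features=28*28, out_features=20)
    hidden1 = layer1(flat_image)                             # x @ W.T + b with the layer's own W and b
    print(hidden1.size())                                    # torch.Size([3, 20])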

beginner_source/basics/data_tutorial.py

Lines changed: 3 additions & 3 deletions
@@ -25,7 +25,7 @@
 # PyTorch domain libraries provide a number of pre-loaded datasets (such as FashionMNIST) that
 # subclass ``torch.utils.data.Dataset`` and implement functions specific to the particular data.
 # They can be used to prototype and benchmark your model. You can find them
-# here: `Image Datasets <https://pytorch.org/docs/stable/torchvision/datasets.html>`_,
+# here: `Image Datasets <https://pytorch.org/vision/stable/datasets.html>`_,
 # `Text Datasets <https://pytorch.org/text/stable/datasets.html>`_, and
 # `Audio Datasets <https://pytorch.org/audio/stable/datasets.html>`_
 #
@@ -38,7 +38,7 @@
 # Fashion-MNIST is a dataset of Zalando’s article images consisting of of 60,000 training examples and 10,000 test examples.
 # Each example comprises a 28×28 grayscale image and an associated label from one of 10 classes.
 #
-# We load the `FashionMNIST Dataset <https://pytorch.org/docs/stable/torchvision/datasets.html#fashion-mnist>`_ with the following parameters:
+# We load the `FashionMNIST Dataset <https://pytorch.org/vision/stable/datasets.html#fashion-mnist>`_ with the following parameters:
 #  - ``root`` is the path where the train/test data is stored,
 #  - ``train`` specifies training or test dataset,
 #  - ``download=True`` downloads the data from the internet if it's not available at ``root``.
@@ -225,7 +225,7 @@ def __getitem__(self, idx):
 # --------------------------
 #
 # We have loaded that dataset into the ``Dataloader`` and can iterate through the dataset as needed.
-# Each iteration below returns a batch of ``train_features`` and ``train_labels``(containing ``batch_size=64`` features and labels respectively).
+# Each iteration below returns a batch of ``train_features`` and ``train_labels`` (containing ``batch_size=64`` features and labels respectively).
 # Because we specified ``shuffle=True``, after we iterate over all batches the data is shuffled (for finer-grained control over
 # the data loading order, take a look at `Samplers <https://pytorch.org/docs/stable/data.html#data-loading-order-and-sampler>`_).
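The parameters named in the diff are easiest to see together; a minimal sketch of loading FashionMNIST and pulling one ``DataLoader`` batch:

    import torch
    from torch.utils.data import DataLoader
    from torchvision import datasets
    from torchvision.transforms import ToTensor

    training_data = datasets.FashionMNIST(
        root="data",           # where the train/test data is stored
        train=True,            # training split rather than test
        download=True,         # fetch from the internet if not available at root
        transform=ToTensor(),
    )
    train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
    train_features, train_labels = next(iter(train_dataloader))
    print(train_features.size())  # torch.Size([64, 1, 28, 28])
    print(train_labels.size())    # torch.Size([64])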

beginner_source/basics/optimization_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -12,13 +12,13 @@
 Optimizing Model Parameters
 ===========================

-Now that we have a model and data it's time to train, validate and test our model by optimizing it's parameters on
+Now that we have a model and data it's time to train, validate and test our model by optimizing its parameters on
 our data. Training a model is an iterative process; in each iteration (called an *epoch*) the model makes a guess about the output, calculates
 the error in its guess (*loss*), collects the derivatives of the error with respect to its parameters (as we saw in
 the `previous section <autograd_tutorial.html>`_), and **optimizes** these parameters using gradient descent. For a more
 detailed walkthrough of this process, check out this video on `backpropagation from 3Blue1Brown <https://www.youtube.com/watch?v=tIeHLnjs5U8>`__.

-Pre-requisite Code
+Prerequisite Code
 -----------------
 We load the code from the previous sections on `Datasets & DataLoaders <data_tutorial.html>`_
 and `Build Model <buildmodel_tutorial.html>`_.
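The iterative process the intro describes (guess, loss, derivatives, update) maps onto a few lines of PyTorch; a minimal sketch of one optimization step (names are illustrative):

    def train_step(model, loss_fn, optimizer, X, y):
        pred = model(X)          # the model makes a guess about the output
        loss = loss_fn(pred, y)  # calculate the error in its guess
        optimizer.zero_grad()    # clear gradients left over from the previous step
        loss.backward()          # collect derivatives of the error w.r.t. the parameters
        optimizer.step()         # adjust the parameters by gradient descent
        return loss.item()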

beginner_source/basics/quickstart_tutorial.py

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@
 # all of which include datasets. For this tutorial, we will be using a TorchVision dataset.
 #
 # The ``torchvision.datasets`` module contains ``Dataset`` objects for many real-world vision data like
-# CIFAR, COCO (`full list here <https://pytorch.org/docs/stable/torchvision/datasets.html>`_). In this tutorial, we
+# CIFAR, COCO (`full list here <https://pytorch.org/vision/stable/datasets.html>`_). In this tutorial, we
 # use the FashionMNIST dataset. Every TorchVision ``Dataset`` includes two arguments: ``transform`` and
 # ``target_transform`` to modify the samples and labels respectively.

beginner_source/basics/transforms_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@

 All TorchVision datasets have two parameters -``transform`` to modify the features and
 ``target_transform`` to modify the labels - that accept callables containing the transformation logic.
-The `torchvision.transforms <https://pytorch.org/docs/stable/torchvision/transforms.html>`_ module offers
+The `torchvision.transforms <https://pytorch.org/vision/stable/transforms.html>`_ module offers
 several commonly-used transforms out of the box.

 The FashionMNIST features are in PIL Image format, and the labels are integers.
@@ -41,7 +41,7 @@
 # ToTensor()
 # -------------------------------
 #
-# `ToTensor <https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.ToTensor>`_
+# `ToTensor <https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ToTensor>`_
 # converts a PIL image or NumPy ``ndarray`` into a ``FloatTensor``. and scales
 # the image's pixel intensity values in the range [0., 1.]
 #
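The ``ToTensor`` behavior described in that comment can be verified directly; a small sketch using a NumPy stand-in for a PIL image:

    import numpy as np
    from torchvision.transforms import ToTensor

    img = np.random.randint(0, 256, size=(28, 28, 1), dtype=np.uint8)  # H x W x C, like a grayscale PIL image
    tensor = ToTensor()(img)  # FloatTensor of shape C x H x W, pixel values scaled into [0., 1.]
    print(tensor.dtype, tensor.shape)                 # torch.float32 torch.Size([1, 28, 28])
    print(float(tensor.min()), float(tensor.max()))   # both within [0.0, 1.0]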

beginner_source/blitz/README.txt

Lines changed: 4 additions & 5 deletions
@@ -13,12 +13,11 @@ Deep Learning with PyTorch: A 60 Minute Blitz
     Neural Networks
     https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#

-4. autograd_tutorial.py
-    Automatic Differentiation
-    https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html
-
-5. cifar10_tutorial.py
+4. cifar10_tutorial.py
     Training a Classifier
     https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

+5. data_parallel_tutorial.py
+    Optional: Data Parallelism
+    https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html
beginner_source/blitz/cifar10_tutorial.py

Lines changed: 3 additions & 3 deletions
@@ -43,15 +43,15 @@

 We will do the following steps in order:

-1. Load and normalizing the CIFAR10 training and test datasets using
+1. Load and normalize the CIFAR10 training and test datasets using
    ``torchvision``
 2. Define a Convolutional Neural Network
 3. Define a loss function
 4. Train the network on the training data
 5. Test the network on the test data

-1. Loading and normalizing CIFAR10
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+1. Load and normalize CIFAR10
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Using ``torchvision``, it’s extremely easy to load CIFAR10.
 """

beginner_source/blitz/neural_networks_tutorial.py

Lines changed: 4 additions & 3 deletions
@@ -58,7 +58,7 @@ def __init__(self):
     def forward(self, x):
         # Max pooling over a (2, 2) window
         x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
-        # If the size is a square you can only specify a single number
+        # If the size is a square, you can specify with a single number
         x = F.max_pool2d(F.relu(self.conv2(x)), 2)
         x = x.view(-1, self.num_flat_features(x))
         x = F.relu(self.fc1(x))
@@ -176,8 +176,9 @@ def num_flat_features(self, x):
 # -> loss
 #
 # So, when we call ``loss.backward()``, the whole graph is differentiated
-# w.r.t. the loss, and all Tensors in the graph that have ``requires_grad=True``
-# will have their ``.grad`` Tensor accumulated with the gradient.
+# w.r.t. the neural net parameters, and all Tensors in the graph that have
+# ``requires_grad=True`` will have their ``.grad`` Tensor accumulated with the
+# gradient.
 #
 # For illustration, let us follow a few steps backward:
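The reworded pooling comment is easy to confirm: a single number is shorthand for a square window.

    import torch
    import torch.nn.functional as F

    x = torch.randn(1, 6, 28, 28)
    a = F.max_pool2d(x, (2, 2))  # explicit (height, width) window
    b = F.max_pool2d(x, 2)       # single number: same square (2, 2) window
    print(torch.equal(a, b))     # True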

beginner_source/chatbot_tutorial.py

Lines changed: 5 additions & 5 deletions
@@ -471,7 +471,7 @@ def trimRareWords(voc, pairs, MIN_COUNT):
 # with mini-batches.
 #
 # Using mini-batches also means that we must be mindful of the variation
-# of sentence length in our batches. To accomodate sentences of different
+# of sentence length in our batches. To accommodate sentences of different
 # sizes in the same batch, we will make our batched input tensor of shape
 # *(max_length, batch_size)*, where sentences shorter than the
 # *max_length* are zero padded after an *EOS_token*.
@@ -615,7 +615,7 @@ def batch2TrainData(voc, pair_batch):
 # in normal sequential order, and one that is fed the input sequence in
 # reverse order. The outputs of each network are summed at each time step.
 # Using a bidirectional GRU will give us the advantage of encoding both
-# past and future context.
+# past and future contexts.
 #
 # Bidirectional RNN:
 #
@@ -700,7 +700,7 @@ def forward(self, input_seq, input_lengths, hidden=None):
 # states to generate the next word in the sequence. It continues
 # generating words until it outputs an *EOS_token*, representing the end
 # of the sentence. A common problem with a vanilla seq2seq decoder is that
-# if we rely soley on the context vector to encode the entire input
+# if we rely solely on the context vector to encode the entire input
 # sequence’s meaning, it is likely that we will have information loss.
 # This is especially the case when dealing with long input sequences,
 # greatly limiting the capability of our decoder.
@@ -950,7 +950,7 @@ def maskNLLLoss(inp, target, mask):
 # sequence (or batch of sequences). We use the ``GRU`` layer like this in
 # the ``encoder``. The reality is that under the hood, there is an
 # iterative process looping over each time step calculating hidden states.
-# Alternatively, you ran run these modules one time-step at a time. In
+# Alternatively, you can run these modules one time-step at a time. In
 # this case, we manually loop over the sequences during the training
 # process like we must do for the ``decoder`` model. As long as you
 # maintain the correct conceptual model of these modules, implementing
@@ -1115,7 +1115,7 @@ def trainIters(model_name, voc, pairs, encoder, decoder, encoder_optimizer, deco
 # softmax value. This decoding method is optimal on a single time-step
 # level.
 #
-# To facilite the greedy decoding operation, we define a
+# To facilitate the greedy decoding operation, we define a
 # ``GreedySearchDecoder`` class. When run, an object of this class takes
 # an input sequence (``input_seq``) of shape *(input_seq length, 1)*, a
 # scalar input length (``input_length``) tensor, and a ``max_length`` to
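The *(max_length, batch_size)* padding scheme described in the first hunk can be sketched in a few lines (token ids are illustrative; the tutorial wraps this in its ``zeroPadding`` helper):

    import itertools
    import torch

    PAD_token = 0
    batch = [[5, 8, 2], [7, 2], [9, 4, 6, 3, 2]]  # indexed sentences, each ending in an EOS token
    padded = list(itertools.zip_longest(*batch, fillvalue=PAD_token))  # zero-pad and transpose
    input_tensor = torch.LongTensor(padded)
    print(input_tensor.shape)  # torch.Size([5, 3]) = (max_length, batch_size)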

beginner_source/colab.rst

Lines changed: 4 additions & 4 deletions
@@ -20,7 +20,7 @@ At the top of the page click **Run in Google Colab**.

 The file will open in Colab.

-If you choose, **Runtime** then **Run All**, you'll get an error as the
+If you select **Runtime**, and then **Run All**, you'll get an error as the
 file can't be found.

 To fix this, we'll copy the required file into our Google Drive account.
@@ -30,7 +30,7 @@ To fix this, we'll copy the required file into our Google Drive account.
    **cornell**.
 3. Visit the Cornell Movie Dialogs Corpus and download the ZIP file.
 4. Unzip the file on your local machine.
-5. Copy the file **movie\_lines.txt** to **data/cornell** folder you
+5. Copy the files **movie\_lines.txt** and **movie\_conversations.txt** to the **data/cornell** folder that you
    created in Google Drive.

 Now we'll need to edit the file in\_ \_Colab to point to the file on
@@ -55,12 +55,12 @@ Change the two lines that follow:

 We're now pointing to the file we uploaded to Drive.

-Now when you click on the **Run cell** button for the code section,
+Now when you click the **Run cell** button for the code section,
 you'll be prompted to authorize Google Drive and you'll get an
 authorization code. Paste the code into the prompt in Colab and you
 should be set.

-Rerun the notebook from **Runtime** / **Run All** menu command and
+Rerun the notebook from the **Runtime** / **Run All** menu command and
 you'll see it process. (Note that this tutorial takes a long time to
 run.)
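A small sketch of the Drive-authorization step the text describes, assuming the standard Colab helper (the exact corpus path depends on where you created the folders):

    from google.colab import drive

    drive.mount('/content/gdrive')  # prompts for the Google Drive authorization code
    corpus = '/content/gdrive/My Drive/data/cornell'  # illustrative path to movie_lines.txt etc.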
