Commit 7081a2b

Merge branch 'master' into brianjo-speech-fix
2 parents ecdea63 + 760455d commit 7081a2b

16 files changed: 49 additions and 52 deletions

.circleci/scripts/build_for_windows.sh

Lines changed: 1 addition & 0 deletions
@@ -49,6 +49,7 @@ if [[ "${CIRCLE_JOB}" == *worker_* ]]; then
 python $DIR/remove_runnable_code.py advanced_source/static_quantization_tutorial.py advanced_source/static_quantization_tutorial.py || true
 python $DIR/remove_runnable_code.py beginner_source/hyperparameter_tuning_tutorial.py beginner_source/hyperparameter_tuning_tutorial.py || true
 python $DIR/remove_runnable_code.py beginner_source/audio_preprocessing_tutorial.py beginner_source/audio_preprocessing_tutorial.py || true
+python $DIR/remove_runnable_code.py beginner_source/dcgan_faces_tutorial.py beginner_source/dcgan_faces_tutorial.py || true
 python $DIR/remove_runnable_code.py intermediate_source/tensorboard_profiler_tutorial.py intermediate_source/tensorboard_profiler_tutorial.py || true
 # Temp remove for mnist download issue. (Re-enabled for 1.8.1)
 # python $DIR/remove_runnable_code.py beginner_source/fgsm_tutorial.py beginner_source/fgsm_tutorial.py || true

README.md

Lines changed: 2 additions & 2 deletions
@@ -28,10 +28,10 @@ In case you prefer to write your tutorial in jupyter, you can use [this script](
 - Then you can build using `make docs`. This will download the data, execute the tutorials and build the documentation to `docs/` directory. This will take about 60-120 min for systems with GPUs. If you do not have a GPU installed on your system, then see next step.
 - You can skip the computationally intensive graph generation by running `make html-noplot` to build basic html documentation to `_build/html`. This way, you can quickly preview your tutorial.

-> If you get **ModuleNotFoundError: No module named 'pytorch_sphinx_theme' make: *** [html-noplot] Error 2**, from /tutorials/src/pytorch-sphinx-theme run `python setup.py install`.
+> If you get **ModuleNotFoundError: No module named 'pytorch_sphinx_theme' make: *** [html-noplot] Error 2** from /tutorials/src/pytorch-sphinx-theme or /venv/src/pytorch-sphinx-theme (while using virtualenv), run `python setup.py install`.


 ## About contributing to PyTorch Documentation and Tutorials
 * You can find information about contributing to PyTorch documentation in the
 PyTorch Repo [README.md](https://github.com/pytorch/pytorch/blob/master/README.md) file.
-* Additional information can be found in [PyTorch CONTRIBUTING.md](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md).
+* Additional information can be found in [PyTorch CONTRIBUTING.md](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md).

beginner_source/basics/autogradqs_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -47,7 +47,7 @@
 #
 # In this network, ``w`` and ``b`` are **parameters**, which we need to
 # optimize. Thus, we need to be able to compute the gradients of loss
-# function with respect to those variables. In orded to do that, we set
+# function with respect to those variables. In order to do that, we set
 # the ``requires_grad`` property of those tensors.

 #######################################################################
@@ -58,7 +58,7 @@
 # A function that we apply to tensors to construct computational graph is
 # in fact an object of class ``Function``. This object knows how to
 # compute the function in the *forward* direction, and also how to compute
-# it's derivative during the *backward propagation* step. A reference to
+# its derivative during the *backward propagation* step. A reference to
 # the backward propagation function is stored in ``grad_fn`` property of a
 # tensor. You can find more information of ``Function`` `in the
 # documentation <https://pytorch.org/docs/stable/autograd.html#function>`__.

beginner_source/basics/buildmodel_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -67,7 +67,7 @@ def forward(self, x):

 ##############################################
 # We create an instance of ``NeuralNetwork``, and move it to the ``device``, and print
-# it's structure.
+# its structure.

 model = NeuralNetwork().to(device)
 print(model)
@@ -119,7 +119,7 @@ def forward(self, x):
 # nn.Linear
 # ^^^^^^^^^^^^^^^^^^^^^^
 # The `linear layer <https://pytorch.org/docs/stable/generated/torch.nn.Linear.html>`_
-# is a module that applies a linear transformation on the input using it's stored weights and biases.
+# is a module that applies a linear transformation on the input using its stored weights and biases.
 #
 layer1 = nn.Linear(in_features=28*28, out_features=20)
 hidden1 = layer1(flat_image)

beginner_source/basics/data_tutorial.py

Lines changed: 1 addition & 1 deletion
@@ -225,7 +225,7 @@ def __getitem__(self, idx):
 # --------------------------
 #
 # We have loaded that dataset into the ``Dataloader`` and can iterate through the dataset as needed.
-# Each iteration below returns a batch of ``train_features`` and ``train_labels``(containing ``batch_size=64`` features and labels respectively).
+# Each iteration below returns a batch of ``train_features`` and ``train_labels`` (containing ``batch_size=64`` features and labels respectively).
 # Because we specified ``shuffle=True``, after we iterate over all batches the data is shuffled (for finer-grained control over
 # the data loading order, take a look at `Samplers <https://pytorch.org/docs/stable/data.html#data-loading-order-and-sampler>`_).

beginner_source/blitz/README.txt

Lines changed: 4 additions & 5 deletions
@@ -13,12 +13,11 @@ Deep Learning with PyTorch: A 60 Minute Blitz
 Neural Networks
 https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#

-4. autograd_tutorial.py
-Automatic Differentiation
-https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html
-
-5. cifar10_tutorial.py
+4. cifar10_tutorial.py
 Training a Classifier
 https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

+5. data_parallel_tutorial.py
+Optional: Data Parallelism
+https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html

beginner_source/blitz/neural_networks_tutorial.py

Lines changed: 3 additions & 2 deletions
@@ -176,8 +176,9 @@ def num_flat_features(self, x):
 # -> loss
 #
 # So, when we call ``loss.backward()``, the whole graph is differentiated
-# w.r.t. the loss, and all Tensors in the graph that have ``requires_grad=True``
-# will have their ``.grad`` Tensor accumulated with the gradient.
+# w.r.t. the neural net parameters, and all Tensors in the graph that have
+# ``requires_grad=True`` will have their ``.grad`` Tensor accumulated with the
+# gradient.
 #
 # For illustration, let us follow a few steps backward:

beginner_source/chatbot_tutorial.py

Lines changed: 5 additions & 5 deletions
@@ -471,7 +471,7 @@ def trimRareWords(voc, pairs, MIN_COUNT):
 # with mini-batches.
 #
 # Using mini-batches also means that we must be mindful of the variation
-# of sentence length in our batches. To accomodate sentences of different
+# of sentence length in our batches. To accommodate sentences of different
 # sizes in the same batch, we will make our batched input tensor of shape
 # *(max_length, batch_size)*, where sentences shorter than the
 # *max_length* are zero padded after an *EOS_token*.
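
The comment in this hunk describes padding variable-length sentences into a batch of shape *(max_length, batch_size)*. A minimal sketch of that layout in plain Python follows; the helper name `zero_padding` and the index values chosen for `PAD_TOKEN` and `EOS_TOKEN` are assumptions made for illustration and are not taken from this diff.

```python
from itertools import zip_longest

PAD_TOKEN = 0  # assumed padding index (illustrative)
EOS_TOKEN = 2  # assumed end-of-sentence index (illustrative)


def zero_padding(sentences, fillvalue=PAD_TOKEN):
    """Transpose a batch of index lists to (max_length, batch_size), padding short ones."""
    return [list(column) for column in zip_longest(*sentences, fillvalue=fillvalue)]


# Three sentences of different lengths, each already terminated by EOS_TOKEN.
batch = [[5, 6, 7, EOS_TOKEN], [8, EOS_TOKEN], [9, 10, EOS_TOKEN]]
padded = zero_padding(batch)

max_length, batch_size = len(padded), len(padded[0])
print(max_length, batch_size)  # 4 3
print(padded)  # [[5, 8, 9], [6, 2, 10], [7, 0, 2], [2, 0, 0]]
```

Each inner list is one time step across the batch, so indexing along the first dimension yields the tokens for that step in every sentence, with zeros following the *EOS_token* of the shorter sentences.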
@@ -615,7 +615,7 @@ def batch2TrainData(voc, pair_batch):
 # in normal sequential order, and one that is fed the input sequence in
 # reverse order. The outputs of each network are summed at each time step.
 # Using a bidirectional GRU will give us the advantage of encoding both
-# past and future context.
+# past and future contexts.
 #
 # Bidirectional RNN:
 #
@@ -700,7 +700,7 @@ def forward(self, input_seq, input_lengths, hidden=None):
 # states to generate the next word in the sequence. It continues
 # generating words until it outputs an *EOS_token*, representing the end
 # of the sentence. A common problem with a vanilla seq2seq decoder is that
-# if we rely soley on the context vector to encode the entire input
+# if we rely solely on the context vector to encode the entire input
 # sequence’s meaning, it is likely that we will have information loss.
 # This is especially the case when dealing with long input sequences,
 # greatly limiting the capability of our decoder.
@@ -950,7 +950,7 @@ def maskNLLLoss(inp, target, mask):
 # sequence (or batch of sequences). We use the ``GRU`` layer like this in
 # the ``encoder``. The reality is that under the hood, there is an
 # iterative process looping over each time step calculating hidden states.
-# Alternatively, you ran run these modules one time-step at a time. In
+# Alternatively, you can run these modules one time-step at a time. In
 # this case, we manually loop over the sequences during the training
 # process like we must do for the ``decoder`` model. As long as you
 # maintain the correct conceptual model of these modules, implementing
@@ -1115,7 +1115,7 @@ def trainIters(model_name, voc, pairs, encoder, decoder, encoder_optimizer, deco
 # softmax value. This decoding method is optimal on a single time-step
 # level.
 #
-# To facilite the greedy decoding operation, we define a
+# To facilitate the greedy decoding operation, we define a
 # ``GreedySearchDecoder`` class. When run, an object of this class takes
 # an input sequence (``input_seq``) of shape *(input_seq length, 1)*, a
 # scalar input length (``input_length``) tensor, and a ``max_length`` to
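
The last hunk refers to greedy decoding, where the token with the highest softmax value is chosen at every time step. The idea can be sketched without PyTorch as below. This is not the tutorial's ``GreedySearchDecoder``: the function `greedy_decode`, the `step_fn` callback, the toy probability table and the token indices are all hypothetical, and stopping early at the end-of-sentence token is a simplification chosen for the sketch.

```python
def greedy_decode(step_fn, sos_token, eos_token, max_length):
    """Pick the highest-probability token at each step until EOS or max_length."""
    tokens, scores = [], []
    current = sos_token
    for _ in range(max_length):
        probs = step_fn(current)  # probabilities over the vocabulary
        best = max(range(len(probs)), key=probs.__getitem__)  # argmax
        tokens.append(best)
        scores.append(probs[best])
        if best == eos_token:
            break
        current = best
    return tokens, scores


# Toy "decoder": maps the previous token to next-token probabilities.
# Vocabulary indices (assumed): 0 = PAD, 1 = SOS, 2 = EOS, 3 and 4 = words.
TABLE = {
    1: [0.0, 0.0, 0.1, 0.6, 0.3],
    3: [0.0, 0.0, 0.2, 0.1, 0.7],
    4: [0.0, 0.0, 0.8, 0.1, 0.1],
}

tokens, scores = greedy_decode(TABLE.__getitem__, sos_token=1, eos_token=2, max_length=10)
print(tokens)  # [3, 4, 2]
print(scores)  # [0.6, 0.7, 0.8]
```

Because each choice only looks at the current step, the sequence is locally optimal per step but is not guaranteed to be the most probable sequence overall.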

beginner_source/dcgan_faces_tutorial.py

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@
 # :math:`D` and :math:`G` play a minimax game in which :math:`D` tries to
 # maximize the probability it correctly classifies reals and fakes
 # (:math:`logD(x)`), and :math:`G` tries to minimize the probability that
-# :math:`D` will predict its outputs are fake (:math:`log(1-D(G(x)))`).
+# :math:`D` will predict its outputs are fake (:math:`log(1-D(G(z)))`).
 # From the paper, the GAN loss function is
 #
 # .. math:: \underset{G}{\text{min}} \underset{D}{\text{max}}V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}\big[logD(x)\big] + \mathbb{E}_{z\sim p_{z}(z)}\big[log(1-D(G(z)))\big]
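
To make the corrected term concrete, the value function above can be estimated on a toy batch as a rough sketch. The function name `gan_value` and the discriminator outputs below are made-up numbers for illustration; `d_real` stands for values of D(x) and `d_fake` for values of D(G(z)).

```python
import math


def gan_value(d_real, d_fake):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# D scores real images high and generated images low: good for D.
confident_d = gan_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
# D is fooled into scoring generated images high: good for G.
fooled_d = gan_value(d_real=[0.9, 0.8], d_fake=[0.9, 0.8])

print(confident_d > fooled_d)  # True
```

The comparison matches the minimax reading of the formula: D pushes V up by separating reals from fakes, while G pushes V down by raising D(G(z)).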

beginner_source/nlp/pytorch_tutorial.py

Lines changed: 4 additions & 4 deletions
@@ -9,7 +9,7 @@
 All of deep learning is computations on tensors, which are
 generalizations of a matrix that can be indexed in more than 2
 dimensions. We will see exactly what this means in-depth later. First,
-lets look what we can do with tensors.
+let's look what we can do with tensors.
 """
 # Author: Robert Guthrie

@@ -162,7 +162,7 @@
 # other operation, etc.)
 #
 # If ``requires_grad=True``, the Tensor object keeps track of how it was
-# created. Lets see it in action.
+# created. Let's see it in action.
 #

 # Tensor factory methods have a ``requires_grad`` flag
@@ -187,7 +187,7 @@
 # But how does that help us compute a gradient?
 #

-# Lets sum up all the entries in z
+# Let's sum up all the entries in z
 s = z.sum()
 print(s)
 print(s.grad_fn)
@@ -222,7 +222,7 @@


 ######################################################################
-# Lets have Pytorch compute the gradient, and see that we were right:
+# Let's have Pytorch compute the gradient, and see that we were right:
 # (note if you run this block multiple times, the gradient will increment.
 # That is because Pytorch *accumulates* the gradient into the .grad
 # property, since for many models this is very convenient.)

beginner_source/nn_tutorial.py

Lines changed: 0 additions & 1 deletion
@@ -85,7 +85,6 @@
     torch.tensor, (x_train, y_train, x_valid, y_valid)
 )
 n, c = x_train.shape
-x_train, x_train.shape, y_train.min(), y_train.max()
 print(x_train, y_train)
 print(x_train.shape)
 print(y_train.min(), y_train.max())

beginner_source/transformer_tutorial.py

Lines changed: 4 additions & 2 deletions
@@ -45,15 +45,16 @@
 #

 import math
+
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
+from torch.nn import TransformerEncoder, TransformerEncoderLayer

 class TransformerModel(nn.Module):

     def __init__(self, ntoken, ninp, nhead, nhid, nlayers, dropout=0.5):
         super(TransformerModel, self).__init__()
-        from torch.nn import TransformerEncoder, TransformerEncoderLayer
         self.model_type = 'Transformer'
         self.pos_encoder = PositionalEncoding(ninp, dropout)
         encoder_layers = TransformerEncoderLayer(ninp, nhead, nhid, dropout)
@@ -251,12 +252,13 @@ def get_batch(source, i):
 # function to scale all the gradient together to prevent exploding.
 #

+import time
+
 criterion = nn.CrossEntropyLoss()
 lr = 5.0 # learning rate
 optimizer = torch.optim.SGD(model.parameters(), lr=lr)
 scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 1.0, gamma=0.95)

-import time
 def train():
     model.train() # Turn on the train mode
     total_loss = 0.
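
The context lines of the second hunk configure ``StepLR`` with a step size of 1 and ``gamma=0.95`` on an initial learning rate of 5.0. As a rough illustration of what that schedule amounts to, the sketch below reproduces the step-decay arithmetic in plain Python. It assumes the scheduler is stepped once per epoch and that the decay follows `initial_lr * gamma ** (epoch // step_size)`; the helper `step_lr` is hypothetical and does not call PyTorch.

```python
def step_lr(initial_lr, gamma, step_size, epoch):
    """Learning rate after `epoch` scheduler steps under a step-decay rule."""
    return initial_lr * gamma ** (epoch // step_size)


# Values from the hunk: lr = 5.0, step_size = 1, gamma = 0.95.
schedule = [step_lr(5.0, 0.95, 1, epoch) for epoch in range(4)]
print([round(lr, 4) for lr in schedule])  # [5.0, 4.75, 4.5125, 4.2869]
```

With a step size of 1 the rate simply shrinks by 5% after every scheduler step.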

index.rst

Lines changed: 2 additions & 2 deletions
@@ -497,7 +497,7 @@ Additional Resources
 :header: PyTorch Cheat Sheet
 :description: Quick overview to essential PyTorch elements.
 :button_link: beginner/ptcheat.html
-:button_text: Download
+:button_text: Open

 .. customcalloutitem::
 :header: Tutorials on GitHub
@@ -509,7 +509,7 @@ Additional Resources
 :header: Run Tutorials on Google Colab
 :description: Learn how to copy tutorial data into Google Drive so that you can run tutorials on Google Colab.
 :button_link: beginner/colab.html
-:button_text: Download
+:button_text: Open

 .. End of callout section

intermediate_source/reinforcement_q_learning.py

Lines changed: 4 additions & 9 deletions
@@ -63,7 +63,7 @@
 import numpy as np
 import matplotlib
 import matplotlib.pyplot as plt
-from collections import namedtuple
+from collections import namedtuple, deque
 from itertools import count
 from PIL import Image

@@ -115,16 +115,11 @@
 class ReplayMemory(object):

     def __init__(self, capacity):
-        self.capacity = capacity
-        self.memory = []
-        self.position = 0
+        self.memory = deque([],maxlen=capacity)

     def push(self, *args):
-        """Saves a transition."""
-        if len(self.memory) < self.capacity:
-            self.memory.append(None)
-        self.memory[self.position] = Transition(*args)
-        self.position = (self.position + 1) % self.capacity
+        """Save a transition"""
+        self.memory.append(Transition(*args))

     def sample(self, batch_size):
         return random.sample(self.memory, batch_size)
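
This hunk replaces a hand-rolled ring buffer with ``deque(maxlen=capacity)``. The standalone sketch below shows why the two behave alike: once the deque is full, each append silently drops the oldest entry. The `Transition` field names and the `__len__` method are assumptions added so the example runs on its own; they do not appear in this diff.

```python
import random
from collections import deque, namedtuple

# Field names are assumed for illustration.
Transition = namedtuple("Transition", ("state", "action", "next_state", "reward"))


class ReplayMemory:

    def __init__(self, capacity):
        # A bounded deque discards items from the left once maxlen is reached.
        self.memory = deque([], maxlen=capacity)

    def push(self, *args):
        """Save a transition, evicting the oldest one when full."""
        self.memory.append(Transition(*args))

    def sample(self, batch_size):
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)


memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.push(step, 0, step + 1, 1.0)

print(len(memory))  # 3
print([t.state for t in memory.memory])  # [2, 3, 4]
```

The first two transitions (states 0 and 1) were evicted, which is the same overwrite-oldest behavior the removed `position` bookkeeping implemented by hand.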

intermediate_source/rpc_tutorial.rst

Lines changed: 13 additions & 13 deletions
@@ -9,8 +9,8 @@ Prerequisites:
 - `RPC API documents <https://pytorch.org/docs/master/rpc.html>`__

 This tutorial uses two simple examples to demonstrate how to build distributed
-training with the `torch.distributed.rpc <https://pytorch.org/docs/master/rpc.html>`__
-package which is first introduced as a prototype feature in PyTorch v1.4.
+training with the `torch.distributed.rpc <https://pytorch.org/docs/stable/rpc.html>`__
+package which was first introduced as an experimental feature in PyTorch v1.4.
 Source code of the two examples can be found in
 `PyTorch examples <https://github.com/pytorch/examples>`__.

@@ -36,19 +36,19 @@ paradigms. For example:
 machines.


-The `torch.distributed.rpc <https://pytorch.org/docs/master/rpc.html>`__ package
-can help with the above scenarios. In case 1, `RPC <https://pytorch.org/docs/master/rpc.html#rpc>`__
-and `RRef <https://pytorch.org/docs/master/rpc.html#rref>`__ allow sending data
+The `torch.distributed.rpc <https://pytorch.org/docs/stable/rpc.html>`__ package
+can help with the above scenarios. In case 1, `RPC <https://pytorch.org/docs/stable/rpc.html#rpc>`__
+and `RRef <https://pytorch.org/docs/stable/rpc.html#rref>`__ allow sending data
 from one worker to another while easily referencing remote data objects. In
-case 2, `distributed autograd <https://pytorch.org/docs/master/rpc.html#distributed-autograd-framework>`__
-and `distributed optimizer <https://pytorch.org/docs/master/rpc.html#module-torch.distributed.optim>`__
+case 2, `distributed autograd <https://pytorch.org/docs/stable/rpc.html#distributed-autograd-framework>`__
+and `distributed optimizer <https://pytorch.org/docs/stable/rpc.html#module-torch.distributed.optim>`__
 make executing backward pass and optimizer step as if it is local training. In
 the next two sections, we will demonstrate APIs of
-`torch.distributed.rpc <https://pytorch.org/docs/master/rpc.html>`__ using a
+`torch.distributed.rpc <https://pytorch.org/docs/stable/rpc.html>`__ using a
 reinforcement learning example and a language model example. Please note, this
 tutorial does not aim at building the most accurate or efficient models to
 solve given problems, instead, the main goal here is to show how to use the
-`torch.distributed.rpc <https://pytorch.org/docs/master/rpc.html>`__ package to
+`torch.distributed.rpc <https://pytorch.org/docs/stable/rpc.html>`__ package to
 build distributed training applications.


@@ -289,10 +289,10 @@ observers. The agent serves as master by repeatedly calling ``run_episode`` and
 ``finish_episode`` until the running reward surpasses the reward threshold
 specified by the environment. All observers passively waiting for commands
 from the agent. The code is wrapped by
-`rpc.init_rpc <https://pytorch.org/docs/master/rpc.html#torch.distributed.rpc.init_rpc>`__ and
-`rpc.shutdown <https://pytorch.org/docs/master/rpc.html#torch.distributed.rpc.shutdown>`__,
+`rpc.init_rpc <https://pytorch.org/docs/stable/rpc.html#torch.distributed.rpc.init_rpc>`__ and
+`rpc.shutdown <https://pytorch.org/docs/stable/rpc.html#torch.distributed.rpc.shutdown>`__,
 which initializes and terminates RPC instances respectively. More details are
-available in the `API page <https://pytorch.org/docs/master/rpc.html>`__.
+available in the `API page <https://pytorch.org/docs/stable/rpc.html>`__.


 .. code:: python
@@ -442,7 +442,7 @@ takes a GPU tensor, you need to move it to the proper device explicitly.
 With the above sub-modules, we can now piece them together using RPC to
 create an RNN model. In the code below ``ps`` represents a parameter server,
 which hosts parameters of the embedding table and the decoder. The constructor
-uses the `remote <https://pytorch.org/docs/master/rpc.html#torch.distributed.rpc.remote>`__
+uses the `remote <https://pytorch.org/docs/stable/rpc.html#torch.distributed.rpc.remote>`__
 API to create an ``EmbeddingTable`` object and a ``Decoder`` object on the
 parameter server, and locally creates the ``LSTM`` sub-module. During the
 forward pass, the trainer uses the ``EmbeddingTable`` ``RRef`` to find the

intermediate_source/torchvision_tutorial.rst

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ training and evaluation, and will use the evaluation scripts from
 One note on the ``labels``. The model considers class ``0`` as background. If your dataset does not contain the background class, you should not have ``0`` in your ``labels``. For example, assuming you have just two classes, *cat* and *dog*, you can define ``1`` (not ``0``) to represent *cats* and ``2`` to represent *dogs*. So, for instance, if one of the images has both classes, your ``labels`` tensor should look like ``[1,2]``.

 Additionally, if you want to use aspect ratio grouping during training
-(so that each batch only contains images with similar aspect ratio),
+(so that each batch only contains images with similar aspect ratios),
 then it is recommended to also implement a ``get_height_and_width``
 method, which returns the height and the width of the image. If this
 method is not provided, we query all elements of the dataset via
