Commit a0adb7a

authored
Merge branch 'master' into link
2 parents 1ab8b36 + e478586 commit a0adb7a

7 files changed, 18 additions and 15 deletions

.circleci/scripts/build_for_windows.sh

Lines changed: 1 addition & 0 deletions
@@ -49,6 +49,7 @@ if [[ "${CIRCLE_JOB}" == *worker_* ]]; then
 python $DIR/remove_runnable_code.py advanced_source/static_quantization_tutorial.py advanced_source/static_quantization_tutorial.py || true
 python $DIR/remove_runnable_code.py beginner_source/hyperparameter_tuning_tutorial.py beginner_source/hyperparameter_tuning_tutorial.py || true
 python $DIR/remove_runnable_code.py beginner_source/audio_preprocessing_tutorial.py beginner_source/audio_preprocessing_tutorial.py || true
+python $DIR/remove_runnable_code.py beginner_source/dcgan_faces_tutorial.py beginner_source/dcgan_faces_tutorial.py || true
 python $DIR/remove_runnable_code.py intermediate_source/tensorboard_profiler_tutorial.py intermediate_source/tensorboard_profiler_tutorial.py || true
 # Temp remove for mnist download issue. (Re-enabled for 1.8.1)
 # python $DIR/remove_runnable_code.py beginner_source/fgsm_tutorial.py beginner_source/fgsm_tutorial.py || true

beginner_source/basics/autogradqs_tutorial.py

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@
 #
 # In this network, ``w`` and ``b`` are **parameters**, which we need to
 # optimize. Thus, we need to be able to compute the gradients of loss
-# function with respect to those variables. In orded to do that, we set
+# function with respect to those variables. In order to do that, we set
 # the ``requires_grad`` property of those tensors.

 #######################################################################
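
Note on the context lines in this hunk: the sketch below illustrates what setting ``requires_grad`` on ``w`` and ``b`` makes possible. The tensor shapes, the input ``x``, the target ``y``, and the choice of loss function are assumptions made for illustration; none of them appear in this diff.

```python
import torch

# Illustrative shapes (assumptions, not taken from the diff): 5 inputs, 3 outputs.
x = torch.ones(5)
y = torch.zeros(3)

# ``w`` and ``b`` are the parameters to optimize, so they must track gradients.
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

# Backpropagation fills w.grad and b.grad with d(loss)/d(parameter).
loss.backward()
print(w.grad.shape, b.grad.shape)
```

Tensors created without ``requires_grad=True`` (here ``x`` and ``y``) receive no gradient, so their ``.grad`` stays ``None``.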

beginner_source/chatbot_tutorial.py

Lines changed: 5 additions & 5 deletions
@@ -471,7 +471,7 @@ def trimRareWords(voc, pairs, MIN_COUNT):
 # with mini-batches.
 #
 # Using mini-batches also means that we must be mindful of the variation
-# of sentence length in our batches. To accomodate sentences of different
+# of sentence length in our batches. To accommodate sentences of different
 # sizes in the same batch, we will make our batched input tensor of shape
 # *(max_length, batch_size)*, where sentences shorter than the
 # *max_length* are zero padded after an *EOS_token*.
@@ -615,7 +615,7 @@ def batch2TrainData(voc, pair_batch):
 # in normal sequential order, and one that is fed the input sequence in
 # reverse order. The outputs of each network are summed at each time step.
 # Using a bidirectional GRU will give us the advantage of encoding both
-# past and future context.
+# past and future contexts.
 #
 # Bidirectional RNN:
 #
@@ -700,7 +700,7 @@ def forward(self, input_seq, input_lengths, hidden=None):
 # states to generate the next word in the sequence. It continues
 # generating words until it outputs an *EOS_token*, representing the end
 # of the sentence. A common problem with a vanilla seq2seq decoder is that
-# if we rely soley on the context vector to encode the entire input
+# if we rely solely on the context vector to encode the entire input
 # sequence’s meaning, it is likely that we will have information loss.
 # This is especially the case when dealing with long input sequences,
 # greatly limiting the capability of our decoder.
@@ -950,7 +950,7 @@ def maskNLLLoss(inp, target, mask):
 # sequence (or batch of sequences). We use the ``GRU`` layer like this in
 # the ``encoder``. The reality is that under the hood, there is an
 # iterative process looping over each time step calculating hidden states.
-# Alternatively, you ran run these modules one time-step at a time. In
+# Alternatively, you can run these modules one time-step at a time. In
 # this case, we manually loop over the sequences during the training
 # process like we must do for the ``decoder`` model. As long as you
 # maintain the correct conceptual model of these modules, implementing
@@ -1115,7 +1115,7 @@ def trainIters(model_name, voc, pairs, encoder, decoder, encoder_optimizer, deco
 # softmax value. This decoding method is optimal on a single time-step
 # level.
 #
-# To facilite the greedy decoding operation, we define a
+# To facilitate the greedy decoding operation, we define a
 # ``GreedySearchDecoder`` class. When run, an object of this class takes
 # an input sequence (``input_seq``) of shape *(input_seq length, 1)*, a
 # scalar input length (``input_length``) tensor, and a ``max_length`` to
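
Note on the first hunk of this file: the padding scheme described in the context lines (a batch of shape *(max_length, batch_size)*, zero padded after the *EOS_token*) can be sketched in plain Python as follows. The helper name ``zero_padding``, the token indices, and the sample sentences are assumptions for illustration and are not taken from this diff.

```python
from itertools import zip_longest

PAD_token = 0  # assumed padding index
EOS_token = 2  # assumed end-of-sentence index


def zero_padding(indexed_sentences, fillvalue=PAD_token):
    """Transpose a batch to (max_length, batch_size), padding short sentences."""
    return [list(column) for column in zip_longest(*indexed_sentences, fillvalue=fillvalue)]


# Three sentences of different lengths, each already terminated by EOS_token.
batch = [
    [5, 6, 7, EOS_token],
    [8, EOS_token],
    [9, 10, EOS_token],
]

padded = zero_padding(batch)
for row in padded:
    print(row)
```

Each row of ``padded`` holds one time step across the whole batch, which is the layout a time-major recurrent layer consumes.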

beginner_source/nlp/pytorch_tutorial.py

Lines changed: 4 additions & 4 deletions
@@ -9,7 +9,7 @@
 All of deep learning is computations on tensors, which are
 generalizations of a matrix that can be indexed in more than 2
 dimensions. We will see exactly what this means in-depth later. First,
-lets look what we can do with tensors.
+let's look what we can do with tensors.
 """
 # Author: Robert Guthrie

@@ -162,7 +162,7 @@
 # other operation, etc.)
 #
 # If ``requires_grad=True``, the Tensor object keeps track of how it was
-# created. Lets see it in action.
+# created. Let's see it in action.
 #

 # Tensor factory methods have a ``requires_grad`` flag
@@ -187,7 +187,7 @@
 # But how does that help us compute a gradient?
 #

-# Lets sum up all the entries in z
+# Let's sum up all the entries in z
 s = z.sum()
 print(s)
 print(s.grad_fn)
@@ -222,7 +222,7 @@


 ######################################################################
-# Lets have Pytorch compute the gradient, and see that we were right:
+# Let's have Pytorch compute the gradient, and see that we were right:
 # (note if you run this block multiple times, the gradient will increment.
 # That is because Pytorch *accumulates* the gradient into the .grad
 # property, since for many models this is very convenient.)
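
Note on the last hunk of this file: the accumulation behaviour mentioned in the context lines can be seen in a small sketch like the one below. The tensor values and variable names are assumptions for illustration and do not come from the tutorial.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# First backward pass: the gradient of sum(2 * x) with respect to x is 2 everywhere.
(2 * x).sum().backward()
first = x.grad.clone()

# Second backward pass without resetting: the new gradient is added to .grad.
(2 * x).sum().backward()
second = x.grad.clone()

# Reset the accumulated gradient before the next independent computation.
x.grad.zero_()

print(first, second, x.grad)
```

This is why training loops usually clear gradients (for example with ``optimizer.zero_grad()``) before each backward pass.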

beginner_source/transformer_tutorial.py

Lines changed: 4 additions & 2 deletions
@@ -45,15 +45,16 @@
 #

 import math
+
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
+from torch.nn import TransformerEncoder, TransformerEncoderLayer

 class TransformerModel(nn.Module):

     def __init__(self, ntoken, ninp, nhead, nhid, nlayers, dropout=0.5):
         super(TransformerModel, self).__init__()
-        from torch.nn import TransformerEncoder, TransformerEncoderLayer
         self.model_type = 'Transformer'
         self.pos_encoder = PositionalEncoding(ninp, dropout)
         encoder_layers = TransformerEncoderLayer(ninp, nhead, nhid, dropout)
@@ -251,12 +252,13 @@ def get_batch(source, i):
 # function to scale all the gradient together to prevent exploding.
 #

+import time
+
 criterion = nn.CrossEntropyLoss()
 lr = 5.0 # learning rate
 optimizer = torch.optim.SGD(model.parameters(), lr=lr)
 scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 1.0, gamma=0.95)

-import time
 def train():
     model.train() # Turn on the train mode
     total_loss = 0.
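
Note on the second hunk of this file: the context lines set up SGD with a ``StepLR`` schedule and mention a function that scales all gradients together. The sketch below shows one training step with that setup. The stand-in ``nn.Linear`` model, the random input, the clipping threshold of 0.5, and the assumption that the function in question is ``nn.utils.clip_grad_norm_`` are illustrative choices, not content of this diff.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in model (assumption)
lr = 5.0
optimizer = torch.optim.SGD(model.parameters(), lr=lr)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.95)

# One illustrative training step on random data.
loss = model(torch.randn(8, 4)).pow(2).sum()
optimizer.zero_grad()
loss.backward()

# Rescale all gradients together so their combined norm is at most 0.5.
nn.utils.clip_grad_norm_(model.parameters(), 0.5)
clipped_norm = torch.norm(
    torch.stack([p.grad.norm() for p in model.parameters()])
).item()

optimizer.step()
scheduler.step()  # multiplies the learning rate by gamma after each step
new_lr = optimizer.param_groups[0]["lr"]
print(clipped_norm, new_lr)
```

With ``step_size=1`` the learning rate decays by the factor ``gamma`` every time ``scheduler.step()`` is called, which in the tutorial's setting would be once per epoch.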

index.rst

Lines changed: 2 additions & 2 deletions
@@ -497,7 +497,7 @@ Additional Resources
    :header: PyTorch Cheat Sheet
    :description: Quick overview to essential PyTorch elements.
    :button_link: beginner/ptcheat.html
-   :button_text: Download
+   :button_text: Open

 .. customcalloutitem::
    :header: Tutorials on GitHub
@@ -509,7 +509,7 @@ Additional Resources
    :header: Run Tutorials on Google Colab
    :description: Learn how to copy tutorial data into Google Drive so that you can run tutorials on Google Colab.
    :button_link: beginner/colab.html
-   :button_text: Download
+   :button_text: Open

 .. End of callout section


intermediate_source/torchvision_tutorial.rst

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ training and evaluation, and will use the evaluation scripts from
 One note on the ``labels``. The model considers class ``0`` as background. If your dataset does not contain the background class, you should not have ``0`` in your ``labels``. For example, assuming you have just two classes, *cat* and *dog*, you can define ``1`` (not ``0``) to represent *cats* and ``2`` to represent *dogs*. So, for instance, if one of the images has both classes, your ``labels`` tensor should look like ``[1,2]``.

 Additionally, if you want to use aspect ratio grouping during training
-(so that each batch only contains images with similar aspect ratio),
+(so that each batch only contains images with similar aspect ratios),
 then it is recommended to also implement a ``get_height_and_width``
 method, which returns the height and the width of the image. If this
 method is not provided, we query all elements of the dataset via
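
Note on the context lines in this hunk: the label convention and the ``get_height_and_width`` method can be sketched with a toy dataset. The class name ``ToyDetectionDataset``, the ``idx`` parameter, and the stored sizes are hypothetical; a real dataset would also return images and would hold ``labels`` as a tensor.

```python
class ToyDetectionDataset:
    """Hypothetical dataset that stores only image sizes and per-image class ids."""

    def __init__(self, sizes, labels):
        self.sizes = sizes    # list of (height, width) tuples
        self.labels = labels  # per-image label lists; 0 is reserved for background

    def __len__(self):
        return len(self.sizes)

    def get_height_and_width(self, idx):
        # Return the size without loading the image itself.
        return self.sizes[idx]


CAT, DOG = 1, 2  # class ids start at 1 because 0 means background

dataset = ToyDetectionDataset(
    sizes=[(480, 640), (600, 400)],
    labels=[[CAT, DOG], [DOG]],
)

height, width = dataset.get_height_and_width(0)
aspect_ratios = [
    w / h for h, w in (dataset.get_height_and_width(i) for i in range(len(dataset)))
]
print(height, width, aspect_ratios)
```

Because the sizes are available without decoding any image, aspect ratios for grouping can be computed cheaply up front.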
