Commit 3f3b399

Merge branch 'master' into chatbot_tutorial_typofix
2 parents: 2d2880d + df551a8

57 files changed: +1372 -723 lines changed

.circleci/scripts/build_for_windows.sh

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -49,8 +49,10 @@ if [[ "${CIRCLE_JOB}" == *worker_* ]]; then
4949
python $DIR/remove_runnable_code.py advanced_source/static_quantization_tutorial.py advanced_source/static_quantization_tutorial.py || true
5050
python $DIR/remove_runnable_code.py beginner_source/hyperparameter_tuning_tutorial.py beginner_source/hyperparameter_tuning_tutorial.py || true
5151
python $DIR/remove_runnable_code.py beginner_source/audio_preprocessing_tutorial.py beginner_source/audio_preprocessing_tutorial.py || true
52-
# Temp remove for mnist download issue.
53-
python $DIR/remove_runnable_code.py beginner_source/fgsm_tutorial.py beginner_source/fgsm_tutorial.py || true
52+
python $DIR/remove_runnable_code.py beginner_source/dcgan_faces_tutorial.py beginner_source/dcgan_faces_tutorial.py || true
53+
python $DIR/remove_runnable_code.py intermediate_source/tensorboard_profiler_tutorial.py intermediate_source/tensorboard_profiler_tutorial.py || true
54+
# Temp remove for mnist download issue. (Re-enabled for 1.8.1)
55+
# python $DIR/remove_runnable_code.py beginner_source/fgsm_tutorial.py beginner_source/fgsm_tutorial.py || true
5456

5557
export WORKER_ID=$(echo "${CIRCLE_JOB}" | tr -dc '0-9')
5658
count=0

_static/img/profiler_overview1.png

133 KB

_static/img/profiler_overview2.png

77.3 KB

_static/img/profiler_trace_view1.png

128 KB

_static/img/profiler_trace_view2.png

133 KB

_static/img/profiler_views_list.png

67.8 KB

_static/img/tensorboard_pr_curves.png

-190 KB

_templates/layout.html

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@
7575
</noscript>
7676

7777
<script type="text/javascript">
78-
var collapsedSections = ['PyTorch Recipes', 'Image and Video', 'Audio', 'Text', 'Reinforcement Learning', 'Deploying PyTorch Models in Production', 'Code Transforms with FX', 'Frontend APIs', 'Extending PyTorch', 'Model Optimization', 'Parallel and Distributed Training', 'Mobile'];
78+
var collapsedSections = ['PyTorch Recipes', 'Learning PyTorch', 'Image and Video', 'Audio', 'Text', 'Reinforcement Learning', 'Deploying PyTorch Models in Production', 'Code Transforms with FX', 'Frontend APIs', 'Extending PyTorch', 'Model Optimization', 'Parallel and Distributed Training', 'Mobile'];
7979
</script>
8080

8181
<img height="1" width="1" style="border-style:none;" alt="" src="https://www.googleadservices.com/pagead/conversion/795629140/?label=txkmCPmdtosBENSssfsC&amp;guid=ON&amp;script=0"/>

advanced_source/cpp_export.rst

Lines changed: 2 additions & 2 deletions
@@ -115,7 +115,7 @@ If you need to exclude some methods in your ``nn.Module``
115115
because they use Python features that TorchScript doesn't support yet,
116116
you could annotate those with ``@torch.jit.ignore``
117117

118-
``my_module`` is an instance of
118+
``sm`` is an instance of
119119
``ScriptModule`` that is ready for serialization.
120120

121121
Step 2: Serializing Your Script Module to a File
@@ -132,7 +132,7 @@ on the module and pass it a filename::
132132
traced_script_module.save("traced_resnet_model.pt")
133133

134134
This will produce a ``traced_resnet_model.pt`` file in your working directory.
135-
If you also would like to serialize ``my_module``, call ``my_module.save("my_module_model.pt")``
135+
If you also would like to serialize ``sm``, call ``sm.save("my_module_model.pt")``
136136
We have now officially left the realm of Python and are ready to cross over to the sphere
137137
of C++.
138138

advanced_source/cpp_extension.rst

Lines changed: 15 additions & 15 deletions
@@ -115,13 +115,13 @@ PyTorch has no knowledge of the *algorithm* you are implementing. It knows only
115115
of the individual operations you use to compose your algorithm. As such, PyTorch
116116
must execute your operations individually, one after the other. Since each
117117
individual call to the implementation (or *kernel*) of an operation, which may
118-
involve launch of a CUDA kernel, has a certain amount of overhead, this overhead
119-
may become significant across many function calls. Furthermore, the Python
120-
interpreter that is running our code can itself slow down our program.
118+
involve the launch of a CUDA kernel, has a certain amount of overhead, this
119+
overhead may become significant across many function calls. Furthermore, the
120+
Python interpreter that is running our code can itself slow down our program.
121121

122122
A definite method of speeding things up is therefore to rewrite parts in C++ (or
123123
CUDA) and *fuse* particular groups of operations. Fusing means combining the
124-
implementations of many functions into a single functions, which profits from
124+
implementations of many functions into a single function, which profits from
125125
fewer kernel launches as well as other optimizations we can perform with
126126
increased visibility of the global flow of data.
127127

@@ -509,12 +509,12 @@ and with our new C++ version::
509509
Forward: 349.335 us | Backward 443.523 us
510510

511511
We can already see a significant speedup for the forward function (more than
512-
30%). For the backward function a speedup is visible, albeit not major one. The
513-
backward pass I wrote above was not particularly optimized and could definitely
514-
be improved. Also, PyTorch's automatic differentiation engine can automatically
515-
parallelize computation graphs, may use a more efficient flow of operations
516-
overall, and is also implemented in C++, so it's expected to be fast.
517-
Nevertheless, this is a good start.
512+
30%). For the backward function, a speedup is visible, albeit not a major one.
513+
The backward pass I wrote above was not particularly optimized and could
514+
definitely be improved. Also, PyTorch's automatic differentiation engine can
515+
automatically parallelize computation graphs, may use a more efficient flow of
516+
operations overall, and is also implemented in C++, so it's expected to be
517+
fast. Nevertheless, this is a good start.
518518

519519
Performance on GPU Devices
520520
**************************
@@ -571,7 +571,7 @@ And C++/ATen::
571571

572572
That's a great overall speedup compared to non-CUDA code. However, we can pull
573573
even more performance out of our C++ code by writing custom CUDA kernels, which
574-
we'll dive into soon. Before that, let's dicuss another way of building your C++
574+
we'll dive into soon. Before that, let's discuss another way of building your C++
575575
extensions.
576576

577577
JIT Compiling Extensions
@@ -851,7 +851,7 @@ and ``Double``), you can use ``AT_DISPATCH_ALL_TYPES``.
851851

852852
Note that we perform some operations with plain ATen. These operations will
853853
still run on the GPU, but using ATen's default implementations. This makes
854-
sense, because ATen will use highly optimized routines for things like matrix
854+
sense because ATen will use highly optimized routines for things like matrix
855855
multiplies (e.g. ``addmm``) or convolutions which would be much harder to
856856
implement and improve ourselves.
857857

@@ -903,7 +903,7 @@ You can see in the CUDA kernel that we work directly on pointers with the right
903903
type. Indeed, working directly with high level type agnostic tensors inside cuda
904904
kernels would be very inefficient.
905905

906-
However, this comes at a cost of ease of use and readibility, especially for
906+
However, this comes at a cost of ease of use and readability, especially for
907907
highly dimensional data. In our example, we know for example that the contiguous
908908
``gates`` tensor has 3 dimensions:
909909

@@ -920,7 +920,7 @@ arithmetic.
920920
gates.data<scalar_t>()[n*3*state_size + row*state_size + column]
921921
922922
923-
In addition to being verbose, this expression needs stride to be explicitely
923+
In addition to being verbose, this expression needs stride to be explicitly
924924
known, and thus passed to the kernel function within its arguments. You can see
925925
that in the case of kernel functions accepting multiple tensors with different
926926
sizes you will end up with a very long list of arguments.
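Reviewer note on the hunk above: the flat-index expression only works because `gates` is contiguous with shape `(batch_size, 3, state_size)`. The same row-major arithmetic can be sketched in plain Python. The function name `flat_index` and the sizes used are illustrative assumptions, not part of the tutorial.

```python
def flat_index(n, row, column, state_size, rows=3):
    """Row-major offset of element (n, row, column) in a contiguous
    tensor of shape (batch_size, rows, state_size)."""
    return n * rows * state_size + row * state_size + column


# Illustrative sizes only; the tutorial does not fix these values.
batch_size, state_size = 2, 4

# Walking the indices in row-major order should visit offsets 0, 1, 2, ...
offsets = [
    flat_index(n, row, column, state_size)
    for n in range(batch_size)
    for row in range(3)
    for column in range(state_size)
]
```

Because every stride (`3 * state_size` and `state_size`) appears in the expression, each size has to be passed to the kernel, which is the verbosity the surrounding text complains about.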
@@ -1101,7 +1101,7 @@ on it:
11011101
const int threads = 1024;
11021102
const dim3 blocks((state_size + threads - 1) / threads, batch_size);
11031103
1104-
AT_DISPATCH_FLOATING_TYPES(X.type(), "lltm_forward_cuda", ([&] {
1104+
AT_DISPATCH_FLOATING_TYPES(X.type(), "lltm_backward_cuda", ([&] {
11051105
lltm_cuda_backward_kernel<scalar_t><<<blocks, threads>>>(
11061106
d_old_cell.packed_accessor32<scalar_t,2,torch::RestrictPtrTraits>(),
11071107
d_gates.packed_accessor32<scalar_t,3,torch::RestrictPtrTraits>(),
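Reviewer note on the launch configuration in this hunk: `(state_size + threads - 1) / threads` is integer ceiling division, so the grid has just enough blocks to cover every column even when `state_size` is not a multiple of `threads`. A minimal Python sketch of that arithmetic follows; `num_blocks` is a hypothetical helper name, and Python's `//` stands in for C++ integer division on positive operands.

```python
def num_blocks(state_size, threads):
    """Smallest block count such that blocks * threads >= state_size."""
    return (state_size + threads - 1) // threads


threads = 1024  # value used in the hunk above

# A few illustrative state sizes (assumed, not from the tutorial).
blocks_for = {size: num_blocks(size, threads) for size in (1, 1024, 1025, 2048)}
```

Any threads past `state_size` in the last block are surplus, which is why kernels launched this way usually guard on the column index before touching memory.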

advanced_source/ddp_pipeline.py

Lines changed: 8 additions & 14 deletions
@@ -89,7 +89,6 @@ def forward(self, x):
8989
class Encoder(nn.Module):
9090
def __init__(self, ntoken, ninp, dropout=0.5):
9191
super(Encoder, self).__init__()
92-
self.src_mask = None
9392
self.pos_encoder = PositionalEncoding(ninp, dropout)
9493
self.encoder = nn.Embedding(ntoken, ninp)
9594
self.ninp = ninp
@@ -99,17 +98,9 @@ def init_weights(self):
9998
initrange = 0.1
10099
self.encoder.weight.data.uniform_(-initrange, initrange)
101100

102-
def _generate_square_subsequent_mask(self, sz):
103-
mask = (torch.triu(torch.ones(sz, sz)) == 1).transpose(0, 1)
104-
mask = mask.float().masked_fill(mask == 0, float('-inf')).masked_fill(mask == 1, float(0.0))
105-
return mask
106-
107101
def forward(self, src):
108-
if self.src_mask is None or self.src_mask.size(0) != src.size(0):
109-
device = src.device
110-
mask = self._generate_square_subsequent_mask(src.size(0)).to(device)
111-
self.src_mask = mask
112-
102+
# Need (S, N) format for encoder.
103+
src = src.t()
113104
src = self.encoder(src) * math.sqrt(self.ninp)
114105
return self.pos_encoder(src)
115106

@@ -125,7 +116,8 @@ def init_weights(self):
125116
self.decoder.weight.data.uniform_(-initrange, initrange)
126117

127118
def forward(self, inp):
128-
return self.decoder(inp)
119+
# Need batch dimension first for output of pipeline.
120+
return self.decoder(inp).permute(1, 0, 2)
129121

130122
######################################################################
131123
# Start multiple processes for training
@@ -245,7 +237,8 @@ def get_batch(source, i):
245237
seq_len = min(bptt, len(source) - 1 - i)
246238
data = source[i:i+seq_len]
247239
target = source[i+1:i+1+seq_len].view(-1)
248-
return data, target
240+
# Need batch dimension first for pipeline parallelism.
241+
return data.t(), target
249242

250243
######################################################################
251244
# Model scale and Pipe initialization
@@ -318,8 +311,9 @@ def get_batch(source, i):
318311
# Need to use 'checkpoint=never' since as of PyTorch 1.8, Pipe checkpointing
319312
# doesn't work with DDP.
320313
from torch.distributed.pipeline.sync import Pipe
314+
chunks = 8
321315
model = Pipe(torch.nn.Sequential(
322-
*module_list), chunks = 8, checkpoint="never")
316+
*module_list), chunks = chunks, checkpoint="never")
323317

324318
# Initialize process group and wrap model in DDP.
325319
from torch.nn.parallel import DistributedDataParallel
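Reviewer note on the layout changes in this file: the new comments say the pipeline needs the batch dimension first, while the encoder still expects `(S, N)`. The sketch below tracks only the tensor shapes through the changed lines, in plain Python. The sizes, the helper names `transpose` and `permute`, and the assumptions that `source` slices are `(S, N)` and that the decoder maps the last dimension to the vocabulary size are all illustrative, not confirmed by this diff.

```python
def transpose(shape):
    """Shape after a 2-D transpose, as with Tensor.t()."""
    a, b = shape
    return (b, a)


def permute(shape, order):
    """Shape after reordering dimensions, as with Tensor.permute(*order)."""
    return tuple(shape[i] for i in order)


# Illustrative sizes (assumptions): sequence length, batch size,
# embedding size, vocabulary size.
S, N, E, V = 25, 16, 64, 1000

source_slice = (S, N)                         # assumed shape of source[i:i+seq_len]
pipeline_input = transpose(source_slice)      # get_batch: data.t() -> (N, S)
encoder_tokens = transpose(pipeline_input)    # Encoder.forward: src.t() -> (S, N)
encoder_output = encoder_tokens + (E,)        # embedding adds a feature dim -> (S, N, E)
decoder_logits = encoder_output[:2] + (V,)    # assumed linear decoder -> (S, N, V)
pipeline_output = permute(decoder_logits, (1, 0, 2))  # -> (N, S, V)
```

Under these assumptions the two transposes cancel inside the model, so only the pipeline's input and output are batch-first.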

advanced_source/super_resolution_with_onnxruntime.py

Lines changed: 1 addition & 1 deletion
@@ -145,7 +145,7 @@ def _initialize_weights(self):
145145
do_constant_folding=True, # whether to execute constant folding for optimization
146146
input_names = ['input'], # the model's input names
147147
output_names = ['output'], # the model's output names
148-
dynamic_axes={'input' : {0 : 'batch_size'}, # variable lenght axes
148+
dynamic_axes={'input' : {0 : 'batch_size'}, # variable length axes
149149
'output' : {0 : 'batch_size'}})
150150

151151
######################################################################

beginner_source/Intro_to_TorchScript_tutorial.py

Lines changed: 7 additions & 3 deletions
@@ -77,7 +77,7 @@ def forward(self, x, h):
7777
# cell <https://colah.github.io/posts/2015-08-Understanding-LSTMs/>`__–that
7878
# is–it’s a function that is applied on a loop.
7979
#
80-
# We instantiated the module, and made ``x`` and ``y``, which are just 3x4
80+
# We instantiated the module, and made ``x`` and ``h``, which are just 3x4
8181
# matrices of random values. Then we invoked the cell with
8282
# ``my_cell(x, h)``. This in turn calls our ``forward`` function.
8383
#
@@ -274,6 +274,8 @@ def forward(self, x, h):
274274

275275
my_cell = MyCell(MyDecisionGate())
276276
traced_cell = torch.jit.trace(my_cell, (x, h))
277+
278+
print(traced_cell.dg.code)
277279
print(traced_cell.code)
278280

279281

@@ -293,8 +295,10 @@ def forward(self, x, h):
293295
scripted_gate = torch.jit.script(MyDecisionGate())
294296

295297
my_cell = MyCell(scripted_gate)
296-
traced_cell = torch.jit.script(my_cell)
297-
print(traced_cell.code)
298+
scripted_cell = torch.jit.script(my_cell)
299+
300+
print(scripted_gate.code)
301+
print(scripted_cell.code)
298302

299303

300304
######################################################################

beginner_source/PyTorch Cheat.md

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,7 @@ See [onnx](https://pytorch.org/docs/stable/onnx.html)
5050
from torchvision import datasets, models, transforms # vision datasets, architectures & transforms
5151
import torchvision.transforms as transforms # composable transforms
5252
```
53-
See [torchvision](https://pytorch.org/docs/stable/torchvision/index.html)
53+
See [torchvision](https://pytorch.org/vision/stable/index.html)
5454

5555
### Distributed Training
5656

beginner_source/README.txt

Lines changed: 1 addition & 1 deletion
@@ -23,4 +23,4 @@ Beginner Tutorials
2323

2424
6. transformer_translation.py
2525
Language Translation with Transformers
26-
https://pytorch.org/tutorials/beginner/transformer_translation.html
26+
https://pytorch.org/tutorials/beginner/transformer_tutorial.html

beginner_source/basics/autogradqs_tutorial.py

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@
4747
#
4848
# In this network, ``w`` and ``b`` are **parameters**, which we need to
4949
# optimize. Thus, we need to be able to compute the gradients of loss
50-
# function with respect to those variables. In orded to do that, we set
50+
# function with respect to those variables. In order to do that, we set
5151
# the ``requires_grad`` property of those tensors.
5252

5353
#######################################################################

beginner_source/basics/data_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -25,7 +25,7 @@
2525
# PyTorch domain libraries provide a number of pre-loaded datasets (such as FashionMNIST) that
2626
# subclass ``torch.utils.data.Dataset`` and implement functions specific to the particular data.
2727
# They can be used to prototype and benchmark your model. You can find them
28-
# here: `Image Datasets <https://pytorch.org/docs/stable/torchvision/datasets.html>`_,
28+
# here: `Image Datasets <https://pytorch.org/vision/stable/datasets.html>`_,
2929
# `Text Datasets <https://pytorch.org/text/stable/datasets.html>`_, and
3030
# `Audio Datasets <https://pytorch.org/audio/stable/datasets.html>`_
3131
#
@@ -38,7 +38,7 @@
3838
# Fashion-MNIST is a dataset of Zalando’s article images consisting of of 60,000 training examples and 10,000 test examples.
3939
# Each example comprises a 28×28 grayscale image and an associated label from one of 10 classes.
4040
#
41-
# We load the `FashionMNIST Dataset <https://pytorch.org/docs/stable/torchvision/datasets.html#fashion-mnist>`_ with the following parameters:
41+
# We load the `FashionMNIST Dataset <https://pytorch.org/vision/stable/datasets.html#fashion-mnist>`_ with the following parameters:
4242
# - ``root`` is the path where the train/test data is stored,
4343
# - ``train`` specifies training or test dataset,
4444
# - ``download=True`` downloads the data from the internet if it's not available at ``root``.

beginner_source/basics/optimization_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -12,13 +12,13 @@
1212
Optimizing Model Parameters
1313
===========================
1414
15-
Now that we have a model and data it's time to train, validate and test our model by optimizing it's parameters on
15+
Now that we have a model and data it's time to train, validate and test our model by optimizing its parameters on
1616
our data. Training a model is an iterative process; in each iteration (called an *epoch*) the model makes a guess about the output, calculates
1717
the error in its guess (*loss*), collects the derivatives of the error with respect to its parameters (as we saw in
1818
the `previous section <autograd_tutorial.html>`_), and **optimizes** these parameters using gradient descent. For a more
1919
detailed walkthrough of this process, check out this video on `backpropagation from 3Blue1Brown <https://www.youtube.com/watch?v=tIeHLnjs5U8>`__.
2020
21-
Pre-requisite Code
21+
Prerequisite Code
2222
-----------------
2323
We load the code from the previous sections on `Datasets & DataLoaders <data_tutorial.html>`_
2424
and `Build Model <buildmodel_tutorial.html>`_.

beginner_source/basics/quickstart_tutorial.py

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@
3535
# all of which include datasets. For this tutorial, we will be using a TorchVision dataset.
3636
#
3737
# The ``torchvision.datasets`` module contains ``Dataset`` objects for many real-world vision data like
38-
# CIFAR, COCO (`full list here <https://pytorch.org/docs/stable/torchvision/datasets.html>`_). In this tutorial, we
38+
# CIFAR, COCO (`full list here <https://pytorch.org/vision/stable/datasets.html>`_). In this tutorial, we
3939
# use the FashionMNIST dataset. Every TorchVision ``Dataset`` includes two arguments: ``transform`` and
4040
# ``target_transform`` to modify the samples and labels respectively.
4141

beginner_source/basics/transforms_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@
1818
1919
All TorchVision datasets have two parameters -``transform`` to modify the features and
2020
``target_transform`` to modify the labels - that accept callables containing the transformation logic.
21-
The `torchvision.transforms <https://pytorch.org/docs/stable/torchvision/transforms.html>`_ module offers
21+
The `torchvision.transforms <https://pytorch.org/vision/stable/transforms.html>`_ module offers
2222
several commonly-used transforms out of the box.
2323
2424
The FashionMNIST features are in PIL Image format, and the labels are integers.
@@ -41,7 +41,7 @@
4141
# ToTensor()
4242
# -------------------------------
4343
#
44-
# `ToTensor <https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.ToTensor>`_
44+
# `ToTensor <https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ToTensor>`_
4545
# converts a PIL image or NumPy ``ndarray`` into a ``FloatTensor``. and scales
4646
# the image's pixel intensity values in the range [0., 1.]
4747
#

beginner_source/blitz/autograd_tutorial.py

Lines changed: 2 additions & 4 deletions
@@ -170,14 +170,12 @@
170170
# vector-Jacobian product. That is, given any vector :math:`\vec{v}`, compute the product
171171
# :math:`J^{T}\cdot \vec{v}`
172172
#
173-
# If :math:`v` happens to be the gradient of a scalar function
173+
# If :math:`\vec{v}` happens to be the gradient of a scalar function :math:`l=g\left(\vec{y}\right)`:
174174
#
175175
# .. math::
176176
#
177177
#
178-
# l
179-
# =
180-
# g\left(\vec{y}\right)
178+
# \vec{v}
181179
# =
182180
# \left(\begin{array}{ccc}\frac{\partial l}{\partial y_{1}} & \cdots & \frac{\partial l}{\partial y_{m}}\end{array}\right)^{T}
183181
#
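Reviewer note on the corrected math in this hunk: the fix makes `v` the gradient of the scalar `l = g(y)`, so that `J^T · v` is the gradient of `l` with respect to `x`. A small worked instance in plain Python may help check the notation. The functions `f` and `g` and the input values are chosen only for illustration and do not come from the tutorial.

```python
def f(x):
    """Vector-valued function y = f(x), chosen only for illustration."""
    x1, x2 = x
    return [x1 * x2, x1 + x2 ** 2]


def jacobian(x):
    """Analytic Jacobian of f, with J[i][j] = d y_i / d x_j."""
    x1, x2 = x
    return [[x2, x1],
            [1.0, 2.0 * x2]]


def vjp(J, v):
    """Compute J^T . v without building the transpose explicitly."""
    rows, cols = len(J), len(J[0])
    return [sum(J[i][j] * v[i] for i in range(rows)) for j in range(cols)]


# Scalar function l = g(y) = y1 + 2 * y2, so v = dl/dy = (1, 2).
v = [1.0, 2.0]
x = [3.0, 5.0]
grad_l_wrt_x = vjp(jacobian(x), v)
```

Here `l = x1*x2 + 2*x1 + 2*x2**2`, so by hand `dl/dx1 = x2 + 2 = 7` and `dl/dx2 = x1 + 4*x2 = 23`, which matches `grad_l_wrt_x == [7.0, 23.0]`.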

beginner_source/blitz/cifar10_tutorial.py

Lines changed: 7 additions & 5 deletions
@@ -43,15 +43,15 @@
4343
4444
We will do the following steps in order:
4545
46-
1. Load and normalizing the CIFAR10 training and test datasets using
46+
1. Load and normalize the CIFAR10 training and test datasets using
4747
``torchvision``
4848
2. Define a Convolutional Neural Network
4949
3. Define a loss function
5050
4. Train the network on the training data
5151
5. Test the network on the test data
5252
53-
1. Loading and normalizing CIFAR10
54-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
53+
1. Load and normalize CIFAR10
54+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5555
5656
Using ``torchvision``, it’s extremely easy to load CIFAR10.
5757
"""
@@ -62,6 +62,8 @@
6262
########################################################################
6363
# The output of torchvision datasets are PILImage images of range [0, 1].
6464
# We transform them to Tensors of normalized range [-1, 1].
65+
66+
########################################################################
6567
# .. note::
6668
# If running on Windows and you get a BrokenPipeError, try setting
6769
# the num_worker of torch.utils.data.DataLoader() to 0.
@@ -123,7 +125,7 @@ def imshow(img):
123125

124126
class Net(nn.Module):
125127
def __init__(self):
126-
super(Net, self).__init__()
128+
super().__init__()
127129
self.conv1 = nn.Conv2d(3, 6, 5)
128130
self.pool = nn.MaxPool2d(2, 2)
129131
self.conv2 = nn.Conv2d(6, 16, 5)
@@ -318,7 +320,7 @@ def forward(self, x):
318320
#
319321
# inputs, labels = data[0].to(device), data[1].to(device)
320322
#
321-
# Why dont I notice MASSIVE speedup compared to CPU? Because your network
323+
# Why don't I notice MASSIVE speedup compared to CPU? Because your network
322324
# is really small.
323325
#
324326
# **Exercise:** Try increasing the width of your network (argument 2 of
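Reviewer note on the `super().__init__()` change in this file: in Python 3 the zero-argument form resolves to the same parent initializer as the explicit `super(Net, self)` form, so the edit is cosmetic. The sketch below uses stand-in classes (`Base`, `ExplicitNet`, `ImplicitNet`, all hypothetical) instead of `nn.Module` to keep it dependency-free.

```python
class Base:
    def __init__(self):
        self.initialized = True


class ExplicitNet(Base):
    def __init__(self):
        # Older spelling: name the class and the instance explicitly.
        super(ExplicitNet, self).__init__()


class ImplicitNet(Base):
    def __init__(self):
        # Python 3 spelling: the class and instance are inferred.
        super().__init__()


explicit, implicit = ExplicitNet(), ImplicitNet()
```

Both instances end up with `initialized` set by `Base.__init__`. The shorter form also avoids having to update the class name inside `super(...)` if the class is renamed.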

beginner_source/blitz/neural_networks_tutorial.py

Lines changed: 2 additions & 2 deletions
@@ -58,7 +58,7 @@ def __init__(self):
5858
def forward(self, x):
5959
# Max pooling over a (2, 2) window
6060
x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
61-
# If the size is a square you can only specify a single number
61+
# If the size is a square, you can specify with a single number
6262
x = F.max_pool2d(F.relu(self.conv2(x)), 2)
6363
x = x.view(-1, self.num_flat_features(x))
6464
x = F.relu(self.fc1(x))
@@ -176,7 +176,7 @@ def num_flat_features(self, x):
176176
# -> loss
177177
#
178178
# So, when we call ``loss.backward()``, the whole graph is differentiated
179-
# w.r.t. the loss, and all Tensors in the graph that has ``requires_grad=True``
179+
# w.r.t. the loss, and all Tensors in the graph that have ``requires_grad=True``
180180
# will have their ``.grad`` Tensor accumulated with the gradient.
181181
#
182182
# For illustration, let us follow a few steps backward:
