
Commit dd6a310

Build e28cf87
1 parent 1a85a1d commit dd6a310

71 files changed, with 10505 line additions and 5153 line deletions.


_downloads/advanced_tutorial.ipynb

Lines changed: 122 additions & 0 deletions
Large diffs are not rendered by default.

_downloads/advanced_tutorial.py

Lines changed: 373 additions & 0 deletions
Large diffs are not rendered by default.

_downloads/deep_learning_tutorial.ipynb

Lines changed: 183 additions & 0 deletions
Large diffs are not rendered by default.

_downloads/deep_learning_tutorial.py

Lines changed: 390 additions & 0 deletions
Large diffs are not rendered by default.

_downloads/dynamic_net.ipynb

Lines changed: 27 additions & 27 deletions
@@ -1,54 +1,54 @@
 {
+"metadata": {
+"kernelspec": {
+"language": "python",
+"display_name": "Python 3",
+"name": "python3"
+},
+"language_info": {
+"version": "3.5.2",
+"pygments_lexer": "ipython3",
+"name": "python",
+"nbconvert_exporter": "python",
+"mimetype": "text/x-python",
+"file_extension": ".py",
+"codemirror_mode": {
+"version": 3,
+"name": "ipython"
+}
+}
+},
+"nbformat": 4,
 "cells": [
 {
-"cell_type": "code",
+"outputs": [],
 "metadata": {
 "collapsed": false
 },
 "source": [
 "%matplotlib inline"
 ],
 "execution_count": null,
-"outputs": []
+"cell_type": "code"
 },
 {
-"cell_type": "markdown",
+"metadata": {},
 "source": [
 "\nPyTorch: Control Flow + Weight Sharing\n--------------------------------------\n\nTo showcase the power of PyTorch dynamic graphs, we will implement a very strange\nmodel: a fully-connected ReLU network that on each forward pass randomly chooses\na number between 1 and 4 and has that many hidden layers, reusing the same\nweights multiple times to compute the innermost hidden layers.\n\n"
 ],
-"metadata": {}
+"cell_type": "markdown"
 },
 {
-"cell_type": "code",
+"outputs": [],
 "metadata": {
 "collapsed": false
 },
 "source": [
 "import random\nimport torch\nfrom torch.autograd import Variable\n\nclass DynamicNet(torch.nn.Module):\n def __init__(self, D_in, H, D_out):\n \"\"\"\n In the constructor we construct three nn.Linear instances that we will use\n in the forward pass.\n \"\"\"\n super(DynamicNet, self).__init__()\n self.input_linear = torch.nn.Linear(D_in, H)\n self.middle_linear = torch.nn.Linear(H, H)\n self.output_linear = torch.nn.Linear(H, D_out)\n\n def forward(self, x):\n \"\"\"\n For the forward pass of the model, we randomly choose either 0, 1, 2, or 3\n and reuse the middle_linear Module that many times to compute hidden layer\n representations.\n\n Since each forward pass builds a dynamic computation graph, we can use normal\n Python control-flow operators like loops or conditional statements when\n defining the forward pass of the model.\n\n Here we also see that it is perfectly safe to reuse the same Module many\n times when defining a computational graph. This is a big improvement from Lua\n Torch, where each Module could be used only once.\n \"\"\"\n h_relu = self.input_linear(x).clamp(min=0)\n for _ in range(random.randint(0, 3)):\n h_relu = self.middle_linear(h_relu).clamp(min=0)\n y_pred = self.output_linear(h_relu)\n return y_pred\n\n\n# N is batch size; D_in is input dimension;\n# H is hidden dimension; D_out is output dimension.\nN, D_in, H, D_out = 64, 1000, 100, 10\n\n# Create random Tensors to hold inputs and outputs, and wrap them in Variables\nx = Variable(torch.randn(N, D_in))\ny = Variable(torch.randn(N, D_out), requires_grad=False)\n\n# Construct our model by instantiating the class defined above\nmodel = DynamicNet(D_in, H, D_out)\n\n# Construct our loss function and an Optimizer. Training this strange model with\n# vanilla stochastic gradient descent is tough, so we use momentum\ncriterion = torch.nn.MSELoss(size_average=False)\noptimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)\nfor t in range(500):\n # Forward pass: Compute predicted y by passing x to the model\n y_pred = model(x)\n\n # Compute and print loss\n loss = criterion(y_pred, y)\n print(t, loss.data[0])\n\n # Zero gradients, perform a backward pass, and update the weights.\n optimizer.zero_grad()\n loss.backward()\n optimizer.step()"
 ],
 "execution_count": null,
-"outputs": []
+"cell_type": "code"
 }
 ],
-"metadata": {
-"kernelspec": {
-"name": "python3",
-"language": "python",
-"display_name": "Python 3"
-},
-"language_info": {
-"codemirror_mode": {
-"name": "ipython",
-"version": 3
-},
-"name": "python",
-"pygments_lexer": "ipython3",
-"nbconvert_exporter": "python",
-"mimetype": "text/x-python",
-"file_extension": ".py",
-"version": "3.5.2"
-}
-},
-"nbformat_minor": 0,
-"nbformat": 4
+"nbformat_minor": 0
 }
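
For readers skimming the diff, the core of the dynamic_net notebook is the forward pass that reuses a single middle layer a random number of times. Below is a minimal runnable sketch of that idea. It assumes a current PyTorch release, so it uses plain tensors, MSELoss(reduction='sum'), and loss.item() in place of the notebook's 0.x-era Variable, size_average=False, and loss.data[0]; treat it as an illustration alongside the notebook code, not a replacement for it.

import random
import torch

class DynamicNet(torch.nn.Module):
    # Same structure as the notebook: input and output layers, plus one
    # middle layer whose weights are shared across a random number of uses.
    def __init__(self, D_in, H, D_out):
        super().__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)

    def forward(self, x):
        h_relu = self.input_linear(x).clamp(min=0)
        # Ordinary Python control flow: the graph is rebuilt on every call,
        # so applying middle_linear 0 to 3 times is perfectly legal.
        for _ in range(random.randint(0, 3)):
            h_relu = self.middle_linear(h_relu).clamp(min=0)
        return self.output_linear(h_relu)

N, D_in, H, D_out = 64, 1000, 100, 10
x, y = torch.randn(N, D_in), torch.randn(N, D_out)
model = DynamicNet(D_in, H, D_out)
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
for t in range(500):
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(t, loss.item())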

_downloads/pytorch_tutorial.ipynb

Lines changed: 255 additions & 0 deletions
@@ -0,0 +1,255 @@
+{
+"nbformat": 4,
+"cells": [
+{
+"source": [
+"%matplotlib inline"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"\nIntroduction to PyTorch\n***********************\n\nIntroduction to Torch's tensor library\n======================================\n\nAll of deep learning is computations on tensors, which are\ngeneralizations of a matrix that can be indexed in more than 2\ndimensions. We will see exactly what this means in-depth later. First,\nlets look what we can do with tensors.\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"# Author: Robert Guthrie\n\nimport torch\nimport torch.autograd as autograd\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.optim as optim\n\ntorch.manual_seed(1)"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"Creating Tensors\n~~~~~~~~~~~~~~~~\n\nTensors can be created from Python lists with the torch.Tensor()\nfunction.\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"# Create a torch.Tensor object with the given data. It is a 1D vector\nV_data = [1., 2., 3.]\nV = torch.Tensor(V_data)\nprint(V)\n\n# Creates a matrix\nM_data = [[1., 2., 3.], [4., 5., 6]]\nM = torch.Tensor(M_data)\nprint(M)\n\n# Create a 3D tensor of size 2x2x2.\nT_data = [[[1., 2.], [3., 4.]],\n [[5., 6.], [7., 8.]]]\nT = torch.Tensor(T_data)\nprint(T)"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"What is a 3D tensor anyway? Think about it like this. If you have a\nvector, indexing into the vector gives you a scalar. If you have a\nmatrix, indexing into the matrix gives you a vector. If you have a 3D\ntensor, then indexing into the tensor gives you a matrix!\n\nA note on terminology:\nwhen I say \"tensor\" in this tutorial, it refers\nto any torch.Tensor object. Matrices and vectors are special cases of\ntorch.Tensors, where their dimension is 1 and 2 respectively. When I am\ntalking about 3D tensors, I will explicitly use the term \"3D tensor\".\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"# Index into V and get a scalar\nprint(V[0])\n\n# Index into M and get a vector\nprint(M[0])\n\n# Index into T and get a matrix\nprint(T[0])"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"You can also create tensors of other datatypes. The default, as you can\nsee, is Float. To create a tensor of integer types, try\ntorch.LongTensor(). Check the documentation for more data types, but\nFloat and Long will be the most common.\n\n\n"
+],
+"metadata": {}
+},
+{
+"cell_type": "markdown",
+"source": [
+"You can create a tensor with random data and the supplied dimensionality\nwith torch.randn()\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"x = torch.randn((3, 4, 5))\nprint(x)"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"Operations with Tensors\n~~~~~~~~~~~~~~~~~~~~~~~\n\nYou can operate on tensors in the ways you would expect.\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"x = torch.Tensor([1., 2., 3.])\ny = torch.Tensor([4., 5., 6.])\nz = x + y\nprint(z)"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"See `the documentation <http://pytorch.org/docs/torch.html>`__ for a\ncomplete list of the massive number of operations available to you. They\nexpand beyond just mathematical operations.\n\nOne helpful operation that we will make use of later is concatenation.\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"# By default, it concatenates along the first axis (concatenates rows)\nx_1 = torch.randn(2, 5)\ny_1 = torch.randn(3, 5)\nz_1 = torch.cat([x_1, y_1])\nprint(z_1)\n\n# Concatenate columns:\nx_2 = torch.randn(2, 3)\ny_2 = torch.randn(2, 5)\n# second arg specifies which axis to concat along\nz_2 = torch.cat([x_2, y_2], 1)\nprint(z_2)\n\n# If your tensors are not compatible, torch will complain. Uncomment to see the error\n# torch.cat([x_1, x_2])"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"Reshaping Tensors\n~~~~~~~~~~~~~~~~~\n\nUse the .view() method to reshape a tensor. This method receives heavy\nuse, because many neural network components expect their inputs to have\na certain shape. Often you will need to reshape before passing your data\nto the component.\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"x = torch.randn(2, 3, 4)\nprint(x)\nprint(x.view(2, 12)) # Reshape to 2 rows, 12 columns\n# Same as above. If one of the dimensions is -1, its size can be inferred\nprint(x.view(2, -1))"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"Computation Graphs and Automatic Differentiation\n================================================\n\nThe concept of a computation graph is essential to efficient deep\nlearning programming, because it allows you to not have to write the\nback propagation gradients yourself. A computation graph is simply a\nspecification of how your data is combined to give you the output. Since\nthe graph totally specifies what parameters were involved with which\noperations, it contains enough information to compute derivatives. This\nprobably sounds vague, so lets see what is going on using the\nfundamental class of Pytorch: autograd.Variable.\n\nFirst, think from a programmers perspective. What is stored in the\ntorch.Tensor objects we were creating above? Obviously the data and the\nshape, and maybe a few other things. But when we added two tensors\ntogether, we got an output tensor. All this output tensor knows is its\ndata and shape. It has no idea that it was the sum of two other tensors\n(it could have been read in from a file, it could be the result of some\nother operation, etc.)\n\nThe Variable class keeps track of how it was created. Lets see it in\naction.\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"# Variables wrap tensor objects\nx = autograd.Variable(torch.Tensor([1., 2., 3]), requires_grad=True)\n# You can access the data with the .data attribute\nprint(x.data)\n\n# You can also do all the same operations you did with tensors with Variables.\ny = autograd.Variable(torch.Tensor([4., 5., 6]), requires_grad=True)\nz = x + y\nprint(z.data)\n\n# BUT z knows something extra.\nprint(z.creator)"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"So Variables know what created them. z knows that it wasn't read in from\na file, it wasn't the result of a multiplication or exponential or\nwhatever. And if you keep following z.creator, you will find yourself at\nx and y.\n\nBut how does that help us compute a gradient?\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"# Lets sum up all the entries in z\ns = z.sum()\nprint(s)\nprint(s.creator)"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"So now, what is the derivative of this sum with respect to the first\ncomponent of x? In math, we want\n\n\\begin{align}\\frac{\\partial s}{\\partial x_0}\\end{align}\n\n\n\nWell, s knows that it was created as a sum of the tensor z. z knows\nthat it was the sum x + y. So\n\n\\begin{align}s = \\overbrace{x_0 + y_0}^\\text{$z_0$} + \\overbrace{x_1 + y_1}^\\text{$z_1$} + \\overbrace{x_2 + y_2}^\\text{$z_2$}\\end{align}\n\nAnd so s contains enough information to determine that the derivative\nwe want is 1!\n\nOf course this glosses over the challenge of how to actually compute\nthat derivative. The point here is that s is carrying along enough\ninformation that it is possible to compute it. In reality, the\ndevelopers of Pytorch program the sum() and + operations to know how to\ncompute their gradients, and run the back propagation algorithm. An\nin-depth discussion of that algorithm is beyond the scope of this\ntutorial.\n\n\n"
+],
+"metadata": {}
+},
+{
+"cell_type": "markdown",
+"source": [
+"Lets have Pytorch compute the gradient, and see that we were right:\n(note if you run this block multiple times, the gradient will increment.\nThat is because Pytorch *accumulates* the gradient into the .grad\nproperty, since for many models this is very convenient.)\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"# calling .backward() on any variable will run backprop, starting from it.\ns.backward()\nprint(x.grad)"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"Understanding what is going on in the block below is crucial for being a\nsuccessful programmer in deep learning.\n\n\n"
+],
+"metadata": {}
+},
+{
+"source": [
+"x = torch.randn((2, 2))\ny = torch.randn((2, 2))\nz = x + y # These are Tensor types, and backprop would not be possible\n\nvar_x = autograd.Variable(x)\nvar_y = autograd.Variable(y)\n# var_z contains enough information to compute gradients, as we saw above\nvar_z = var_x + var_y\nprint(var_z.creator)\n\nvar_z_data = var_z.data # Get the wrapped Tensor object out of var_z...\n# Re-wrap the tensor in a new variable\nnew_var_z = autograd.Variable(var_z_data)\n\n# ... does new_var_z have information to backprop to x and y?\n# NO!\nprint(new_var_z.creator)\n# And how could it? We yanked the tensor out of var_z (that is \n# what var_z.data is). This tensor doesn't know anything about\n# how it was computed. We pass it into new_var_z, and this is all the\n# information new_var_z gets. If var_z_data doesn't know how it was \n# computed, theres no way new_var_z will.\n# In essence, we have broken the variable away from its past history"
+],
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "markdown",
+"source": [
+"Here is the basic, extremely important rule for computing with\nautograd.Variables (note this is more general than Pytorch. There is an\nequivalent object in every major deep learning toolkit):\n\n**If you want the error from your loss function to backpropogate to a\ncomponent of your network, you MUST NOT break the Variable chain from\nthat component to your loss Variable. If you do, the loss will have no\nidea your component exists, and its parameters can't be updated.**\n\nI say this in bold, because this error can creep up on you in very\nsubtle ways (I will show some such ways below), and it will not cause\nyour code to crash or complain, so you must be careful.\n\n\n"
+],
+"metadata": {}
+}
+],
+"nbformat_minor": 0,
+"metadata": {
+"kernelspec": {
+"display_name": "Python 3",
+"language": "python",
+"name": "python3"
+},
+"language_info": {
+"nbconvert_exporter": "python",
+"codemirror_mode": {
+"version": 3,
+"name": "ipython"
+},
+"version": "3.5.2",
+"pygments_lexer": "ipython3",
+"name": "python",
+"file_extension": ".py",
+"mimetype": "text/x-python"
+}
+}
+}
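
As a quick companion to the tensor-manipulation cells added above, the following sketch exercises the torch.cat and .view calls the notebook walks through, using the same shapes; it assumes any reasonably recent PyTorch and only prints resulting shapes.

import torch

torch.manual_seed(1)

# torch.cat joins along dim 0 (rows) by default; the remaining dims must match.
x_1 = torch.randn(2, 5)
y_1 = torch.randn(3, 5)
print(torch.cat([x_1, y_1]).shape)      # torch.Size([5, 5])

# Passing 1 as the second argument concatenates columns instead.
x_2 = torch.randn(2, 3)
y_2 = torch.randn(2, 5)
print(torch.cat([x_2, y_2], 1).shape)   # torch.Size([2, 8])

# .view() reshapes without copying; a -1 dimension is inferred from the element count.
x = torch.randn(2, 3, 4)
print(x.view(2, 12).shape)              # torch.Size([2, 12])
print(x.view(2, -1).shape)              # same shape, with the 12 inferred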
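
The notebook's closing rule, that the Variable chain from a component to the loss must never be broken, can also be demonstrated with the current API, where Tensors created with requires_grad=True carry autograd state directly and the .creator attribute used in the notebook is exposed as .grad_fn in later releases. A minimal sketch, under that assumption:

import torch

x = torch.randn(2, 2, requires_grad=True)
y = torch.randn(2, 2, requires_grad=True)
z = x + y
print(z.grad_fn)           # an AddBackward node: z remembers how it was created

s = z.sum()
s.backward()
print(x.grad)              # all ones, matching the derivation in the notebook

# Pulling the raw values out severs the history, exactly as the notebook warns:
z_detached = z.detach()
print(z_detached.grad_fn)  # None, so gradients can no longer flow back to x or y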
