# This tutorial will give an introduction to DCGANs through an example. We
# will train a generative adversarial network (GAN) to generate new
# celebrities after showing it pictures of many real celebrities. Most of
# the code here is from the DCGAN implementation in
# `pytorch/examples <https://github.com/pytorch/examples>`__, and this
# document will give a thorough explanation of the implementation and shed
# light on how and why this model works. But don’t worry, no prior
# What is a GAN?
# ~~~~~~~~~~~~~~
#
# GANs are a framework for teaching a deep learning model to capture the training
# data distribution so we can generate new data from that same
# distribution. GANs were invented by Ian Goodfellow in 2014 and first
# described in the paper `Generative Adversarial
# Nets <https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf>`__.
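#
# As a quick reference for the training discussion later on, the two networks
# play the minimax game from that paper: the discriminator :math:`D` tries to
# correctly classify real images :math:`x` and generated images :math:`G(z)`,
# while the generator :math:`G` tries to fool it. In Goodfellow’s notation:
#
# .. math:: \min_G \max_D V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}\big[\log D(x)\big] + \mathbb{E}_{z\sim p_{z}(z)}\big[\log (1-D(G(z)))\big]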
#
# Let’s define some inputs for the run:
#
# -  ``dataroot`` - the path to the root of the dataset folder. We will
#    talk more about the dataset in the next section.
# -  ``workers`` - the number of worker threads for loading the data with
#    the ``DataLoader``.
# -  ``batch_size`` - the batch size used in training. The DCGAN paper
#    uses a batch size of 128.
# -  ``image_size`` - the spatial size of the images used for training.
#    This implementation defaults to 64x64. If another size is desired,
#    the structures of D and G must be changed. See
#    `here <https://github.com/pytorch/examples/issues/70>`__ for more
#    details.
# -  ``nc`` - number of color channels in the input images. For color
#    images this is 3.
# -  ``nz`` - length of the latent vector.
# -  ``ngf`` - relates to the depth of feature maps carried through the
#    generator.
# -  ``ndf`` - sets the depth of feature maps propagated through the
#    discriminator.
# -  ``num_epochs`` - number of training epochs to run. Training for
#    longer will probably lead to better results but will also take much
#    longer.
# -  ``lr`` - learning rate for training. As described in the DCGAN paper,
#    this number should be 0.0002.
# -  ``beta1`` - beta1 hyperparameter for Adam optimizers. As described in
#    the paper, this number should be 0.5.
# -  ``ngpu`` - number of GPUs available. If this is 0, code will run in
#    CPU mode. If this number is greater than 0 it will run on that number
#    of GPUs.
#

# Root directory for dataset
dataroot = "data/celeba"

# Learning rate for optimizers
lr = 0.0002

# Beta1 hyperparameter for Adam optimizers
beta1 = 0.5

# Number of GPUs available. Use 0 for CPU mode.
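ngpu = 1

# The remaining inputs described in the list above are set the same way.
# The assignments below (and ``ngpu = 1``) are illustrative defaults rather
# than values prescribed by the text, except where the text states them
# (a batch size of 128, 64x64 images, 3 color channels):
workers = 2        # DataLoader worker threads
batch_size = 128   # the DCGAN paper uses a batch size of 128
image_size = 64    # this implementation defaults to 64x64
nc = 3             # color images have 3 channels
nz = 100           # length of the latent vector
ngf = 64           # generator feature-map depth
ndf = 64           # discriminator feature-map depth
num_epochs = 5     # more epochs usually help but also take longer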
#
# For this tutorial we will use the `Celeb-A Faces
# dataset <http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html>`__ which can
# be downloaded at the linked site, or in `Google
# Drive <https://drive.google.com/drive/folders/0B7EVK8r0v71pTUZsaXdaSnZBZzg>`__.
# The dataset will download as a file named ``img_align_celeba.zip``. Once
# downloaded, create a directory named ``celeba`` and extract the zip file
# into that directory. Then, set the ``dataroot`` input for this notebook to
# the ``celeba`` directory you just created. The resulting directory
# structure should be:
#
# ::
#
#            -> 537394.jpg
#               ...
#
# This is an important step because we will be using the ``ImageFolder``
# dataset class, which requires there to be subdirectories in the
# dataset’s root folder. Now, we can create the dataset, create the
# dataloader, set the device to run on, and finally visualize some of the
# training data.
#
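
# A minimal sketch of those steps follows. The resize/crop/normalize pipeline
# and the normalization constants follow common DCGAN practice rather than
# anything stated above, and the imports are assumed to sit at the top of the
# script in the full tutorial.
import torch
import torch.nn as nn
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms

# Build the dataset with ImageFolder, resizing to 64x64 and normalizing to
# [-1, 1] so the images match the range of the generator's Tanh output.
dataset = dset.ImageFolder(root=dataroot,
                           transform=transforms.Compose([
                               transforms.Resize(image_size),
                               transforms.CenterCrop(image_size),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))

# Wrap the dataset in a DataLoader so training can draw shuffled mini-batches
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)

# Decide which device to run on, based on the ngpu input defined earlier
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")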
# Weight Initialization
# ~~~~~~~~~~~~~~~~~~~~~
#
# From the DCGAN paper, the authors specify that all model weights shall
# be randomly initialized from a Normal distribution with ``mean=0``,
# ``stdev=0.02``. The ``weights_init`` function takes an initialized model as
# input and reinitializes all convolutional, convolutional-transpose, and
# batch normalization layers to meet this criterion. This function is
# applied to the models immediately after initialization.
#

# custom weights initialization called on ``netG`` and ``netD``
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)
# .. figure:: /_static/img/dcgan_generator.png
#    :alt: dcgan_generator
#
# Notice how the inputs we set in the input section (``nz``, ``ngf``, and
# ``nc``) influence the generator architecture in code. ``nz`` is the length
# of the z input vector, ``ngf`` relates to the size of the feature maps
# that are propagated through the generator, and ``nc`` is the number of
# channels in the output image (set to 3 for RGB images). Below is the
# code for the generator.
#
class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is the latent vector Z, going into a transposed convolution
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. ``(ngf*8) x 4 x 4``
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. ``(ngf*4) x 8 x 8``
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. ``(ngf*2) x 16 x 16``
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. ``(ngf) x 32 x 32``
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. ``(nc) x 64 x 64``
        )

    def forward(self, input):
        return self.main(input)

# Create the generator
netG = Generator(ngpu).to(device)

# Handle multi-GPU if desired
if (device.type == 'cuda') and (ngpu > 1):
    netG = nn.DataParallel(netG, list(range(ngpu)))

# Apply the ``weights_init`` function to randomly initialize all weights
# to ``mean=0``, ``stdev=0.02``.
netG.apply(weights_init)

# Print the model
print(netG)
class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is ``(nc) x 64 x 64``
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. ``(ndf) x 32 x 32``
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. ``(ndf*2) x 16 x 16``
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. ``(ndf*4) x 8 x 8``
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. ``(ndf*8) x 4 x 4``
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.main(input)

# Create the Discriminator
netD = Discriminator(ngpu).to(device)

# Handle multi-GPU if desired
if (device.type == 'cuda') and (ngpu > 1):
    netD = nn.DataParallel(netD, list(range(ngpu)))

# Apply the ``weights_init`` function to randomly initialize all weights
# to ``mean=0``, ``stdev=0.02``.
netD.apply(weights_init)

# Print the model
print(netD)
# images form out of the noise.
#

# Initialize the ``BCELoss`` function
criterion = nn.BCELoss()

# Create batch of latent vectors that we will use to visualize
# the progression of the generator
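# A sketch of that setup follows. The fixed-noise batch size of 64, the 0/1
# label convention, and the Adam betas of (beta1, 0.999) are assumptions in
# line with common DCGAN practice; ``lr`` and ``beta1`` come from the inputs
# defined earlier.
fixed_noise = torch.randn(64, nz, 1, 1, device=device)

# Establish a convention for real and fake labels during training
real_label = 1.
fake_label = 0.

# Set up Adam optimizers for both G and D
optimizerD = torch.optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = torch.optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))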
# Finally, now that we have all of the parts of the GAN framework defined,
# we can train it. Be mindful that training GANs is somewhat of an art
# form, as incorrect hyperparameter settings lead to mode collapse with
# little explanation of what went wrong. Here, we will closely follow
# Algorithm 1 from `Goodfellow’s paper <https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf>`__,
# while abiding by some of the best
# practices shown in `ganhacks <https://github.com/soumith/ganhacks>`__.
# Namely, we will “construct different mini-batches for real and fake”
# images, and also adjust G’s objective function to maximize
# :math:`log(D(G(z)))`.
# In terms of Goodfellow, we wish to “update the discriminator by ascending
# its stochastic gradient”. Practically, we want to maximize
# :math:`log(D(x)) + log(1-D(G(z)))`. Due to the separate mini-batch
# suggestion from `ganhacks <https://github.com/soumith/ganhacks>`__,
# we will calculate this in two steps. First, we
# will construct a batch of real samples from the training set, forward
# pass through :math:`D`, calculate the loss (:math:`log(D(x))`), then
# calculate the gradients in a backward pass. Secondly, we will construct
# a batch of fake samples with the current generator, forward pass that
# batch through :math:`D`, calculate the loss (:math:`log(1-D(G(z)))`),
# and accumulate the gradients with another backward pass before stepping
# D’s optimizer.
# G’s gradients in a backward pass, and finally updating G’s parameters
# with an optimizer step. It may seem counter-intuitive to use the real
# labels as GT labels for the loss function, but this allows us to use the
# :math:`log(x)` part of the ``BCELoss`` (rather than the :math:`log(1-x)`
# part) which is exactly what we want.
#
# Finally, we will do some statistic reporting, and at the end of each
# epoch we will push our fixed_noise batch through the generator to
# visually track the progress of G’s training.
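
# To make the two-part update concrete, here is a sketch of the inner loop of
# one epoch, following the scheme described above. It is illustrative rather
# than the full training loop (the statistic reporting and the fixed_noise
# visualization are omitted), and it relies on the objects sketched earlier
# (``dataloader``, the label constants, ``criterion``, and the optimizers).
for i, data in enumerate(dataloader, 0):

    ############################
    # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
    ############################
    # Train with an all-real batch
    netD.zero_grad()
    real_cpu = data[0].to(device)
    b_size = real_cpu.size(0)
    label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
    output = netD(real_cpu).view(-1)
    errD_real = criterion(output, label)   # the log(D(x)) term
    errD_real.backward()

    # Train with an all-fake batch
    noise = torch.randn(b_size, nz, 1, 1, device=device)
    fake = netG(noise)
    label.fill_(fake_label)
    output = netD(fake.detach()).view(-1)  # detach so G is not updated in this step
    errD_fake = criterion(output, label)   # the log(1 - D(G(z))) term
    errD_fake.backward()
    optimizerD.step()

    ############################
    # (2) Update G network: maximize log(D(G(z)))
    ############################
    netG.zero_grad()
    label.fill_(real_label)                # real labels are the targets for G's loss
    output = netD(fake).view(-1)           # fresh forward pass, since D was just updated
    errG = criterion(output, label)        # uses the log(x) part of BCELoss, as described above
    errG.backward()
    optimizerG.step()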