
Commit b65fcc6

Update dcgan_faces_tutorial.py
The previous comment might have been slightly misleading. It is a bit easier to understand when made explicit that the gradients of errD_real and errD_fake w.r.t. the parameters of netD get added up/accumulated because of the successive backward() calls without a zero_grad() in between.
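To make that accumulation behaviour concrete, here is a minimal standalone sketch (a toy example of my own, not part of the tutorial): two backward() calls without an intervening zero_grad() leave the .grad buffers holding the sum of both gradients, i.e. the gradient of errD_real + errD_fake.

import torch
import torch.nn as nn

# Toy stand-in for netD: a single linear layer
netD = nn.Linear(4, 1)
criterion = nn.BCEWithLogitsLoss()

real_batch, fake_batch = torch.randn(8, 4), torch.randn(8, 4)
real_label, fake_label = torch.ones(8, 1), torch.zeros(8, 1)

netD.zero_grad()

# First backward pass: writes d(errD_real)/d(weights) into .grad
errD_real = criterion(netD(real_batch), real_label)
errD_real.backward()
grad_real = netD.weight.grad.clone()

# Second backward pass, no zero_grad() in between:
# d(errD_fake)/d(weights) is ADDED to the existing .grad buffers
errD_fake = criterion(netD(fake_batch), fake_label)
errD_fake.backward()
grad_accumulated = netD.weight.grad.clone()

# Same gradient as backpropagating the summed loss in one go
netD.zero_grad()
(criterion(netD(real_batch), real_label)
 + criterion(netD(fake_batch), fake_label)).backward()

print(torch.allclose(grad_accumulated, netD.weight.grad))  # True
print(torch.allclose(grad_accumulated, grad_real))         # False: fake-batch grads were added on top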
1 parent 16e8be2 commit b65fcc6

File tree: 1 file changed (+4, -1 lines changed)


beginner_source/dcgan_faces_tutorial.py

Lines changed: 4 additions & 1 deletion
@@ -611,9 +611,12 @@ def forward(self, input):
 # Calculate D's loss on the all-fake batch
 errD_fake = criterion(output, label)
 # Calculate the gradients for this batch
+# Without a zero_grad() call, this accumulates (sums)
+# the gradients from this call with the previously computed
+# gradients from errD_real
 errD_fake.backward()
 D_G_z1 = output.mean().item()
-# Add the gradients from the all-real and all-fake batches
+# Compute error of D as sum over the fake and the real batches
 errD = errD_real + errD_fake
 # Update D
 optimizerD.step()
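Reading the resulting lines together: optimizerD.step() updates netD using the accumulated gradient of errD_real + errD_fake, while the new errD = errD_real + errD_fake line is only a scalar used for reporting. Below is a hedged, self-contained paraphrase of that discriminator step; the variable names follow the tutorial, but the tiny linear netD/netG, BCEWithLogitsLoss, and the random placeholder batches are stand-ins of my own so the snippet runs on its own, not the tutorial's exact code.

import torch
import torch.nn as nn
import torch.optim as optim

# Placeholder stand-ins (the real tutorial uses convolutional netD/netG and a DataLoader)
netD = nn.Sequential(nn.Linear(16, 1))
netG = nn.Sequential(nn.Linear(8, 16))
criterion = nn.BCEWithLogitsLoss()
optimizerD = optim.Adam(netD.parameters(), lr=2e-4, betas=(0.5, 0.999))

real = torch.randn(32, 16)   # a batch of "real" samples
noise = torch.randn(32, 8)

# (1) Update D: maximize log(D(x)) + log(1 - D(G(z)))
netD.zero_grad()

# Train with the all-real batch
label = torch.ones(32, 1)
output = netD(real)
errD_real = criterion(output, label)
errD_real.backward()                  # grads of errD_real fill netD's .grad buffers
D_x = torch.sigmoid(output).mean().item()

# Train with the all-fake batch
fake = netG(noise)
label.fill_(0.)
output = netD(fake.detach())          # detach so no gradients flow into netG here
errD_fake = criterion(output, label)
errD_fake.backward()                  # accumulated on top of the errD_real gradients
D_G_z1 = torch.sigmoid(output).mean().item()

# Compute error of D as sum over the fake and the real batches (for logging only)
errD = errD_real + errD_fake
# Update D: steps on the accumulated gradient of errD_real + errD_fake
optimizerD.step()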
