beginner_source/knowledge_distillation_tutorial.py
10 additions & 9 deletions
@@ -29,7 +29,7 @@
 #
 # * 1 GPU, 4GB of memory
 # * PyTorch v2.0 or later
-# * CIFAR-10 dataset (downloaded by the script and saved it in a directory called ``/data``)
+# * CIFAR-10 dataset (downloaded by the script and saved in a directory called ``/data``)

 import torch
 import torch.nn as nn
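As a side note on the CIFAR-10 prerequisite above, here is a minimal sketch of how such a download step typically looks with ``torchvision``. The normalization constants, batch size, and worker count are illustrative assumptions, not values taken from this diff.

```python
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# Assumed per-channel mean/std for CIFAR-10; the tutorial may use different values.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

# download=True fetches CIFAR-10 into ./data, matching the ``/data`` directory noted above.
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)
```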
@@ -156,7 +156,7 @@ def forward(self, x):
 # One function is called ``train`` and takes the following arguments:
 #
 # - ``model``: A model instance to train (update its weights) via this function.
-# - ``train_loader``: we defined our ``train_loader`` above, and its job is to feed the data into the model.
+# - ``train_loader``: We defined our ``train_loader`` above, and its job is to feed the data into the model.
 # - ``epochs``: How many times we loop over the dataset.
 # - ``learning_rate``: The learning rate determines how large our steps towards convergence should be. Too large or too small steps can be detrimental.
 # - ``device``: Determines the device to run the workload on. Can be either CPU or GPU depending on availability.
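To illustrate the argument list in the hunk above, here is a minimal sketch of a ``train`` function with that signature. The loss function and optimizer choices (cross-entropy, Adam) are assumptions and may differ from the tutorial's actual implementation.

```python
import torch.nn as nn
import torch.optim as optim

def train(model, train_loader, epochs, learning_rate, device):
    # Assumed choices: cross-entropy loss and the Adam optimizer.
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    model.to(device)
    model.train()
    for epoch in range(epochs):
        running_loss = 0.0
        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)

            optimizer.zero_grad()
            outputs = model(inputs)            # forward pass
            loss = criterion(outputs, labels)
            loss.backward()                    # backward pass
            optimizer.step()                   # update the model's weights

            running_loss += loss.item()
        print(f"Epoch {epoch + 1}/{epochs}, Loss: {running_loss / len(train_loader):.4f}")
```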
 # but keep in mind, if you change the number of neurons / filters chances are a shape mismatch might occur.
 #
 # For more information, see:
+#
 # * `Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. In: Neural Information Processing System Deep Learning Workshop (2015) <https://arxiv.org/abs/1503.02531>`_
 #
 # * `Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., Bengio, Y.: Fitnets: Hints for thin deep nets. In: Proceedings of the International Conference on Learning Representations (2015) <https://arxiv.org/abs/1412.6550>`_
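For readers skimming this diff, here is a minimal sketch of the soft-target distillation loss described in the Hinton et al. reference above. The temperature and weighting values are illustrative assumptions, not taken from the tutorial.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, soft_target_weight=0.25):
    # Soft targets: KL divergence between temperature-softened teacher and student
    # distributions, scaled by T*T as in Hinton et al. (2015).
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    # The 0.25 / 0.75 weighting is an assumption; the tutorial may use other values.
    return soft_target_weight * soft_loss + (1.0 - soft_target_weight) * hard_loss

# Example usage with random logits (shapes assumed: batch of 8, 10 classes):
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```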