# Feel free to play around with the temperature parameter that controls the softness of the softmax function and the loss coefficients.
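# As a rough sketch of how these knobs interact (the names below, such as teacher_logits and T,
# are illustrative placeholders rather than the definitions used earlier), dividing the logits by
# the temperature before the softmax softens the distributions, and the two coefficients weight
# the distillation term against the ordinary cross-entropy term:

import torch
import torch.nn.functional as F

T = 2.0                   # temperature: larger values produce softer distributions
distill_weight = 0.25     # coefficient on the soft-target (distillation) term
ce_weight = 0.75          # coefficient on the ordinary cross-entropy term

# Dummy logits and labels, only to keep the snippet self-contained.
teacher_logits = torch.randn(8, 10)
student_logits = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))

soft_targets = F.softmax(teacher_logits / T, dim=-1)
student_log_probs = F.log_softmax(student_logits / T, dim=-1)
# KL divergence between the softened distributions, scaled by T**2 to keep
# gradient magnitudes comparable across temperatures.
distillation_loss = F.kl_div(student_log_probs, soft_targets, reduction="batchmean") * (T ** 2)
ce_loss = F.cross_entropy(student_logits, labels)
loss = distill_weight * distillation_loss + ce_weight * ce_loss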
# In neural networks, it is easy to include additional loss functions alongside the main objectives to achieve goals like better generalization.
# Let's try including an objective for the student, but now let's focus on their hidden states rather than their output layers.
# In the previous example, the teacher's representation after the convolutional layers and the student's had the same shape,
# with the only exception being the number of filters, where the student has fewer.
# Our goal is to convey information from the teacher's representation to the student by including a naive loss function,
# whose minimization implies that the flattened vectors that are subsequently passed to the classifiers become more *similar* as the loss decreases.
# Of course, the teacher does not update its weights, so the minimization depends only on the student's weights.
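# A minimal sketch of such a hidden-state objective, under assumed shapes (the names below,
# such as teacher_feats and hidden_weight, are illustrative rather than the definitions used
# earlier). Because the teacher has more filters, its flattened vector is longer, so here it is
# average-pooled down to the student's size before the comparison; detaching it ensures that
# only the student's weights receive gradients.

import torch
import torch.nn.functional as F

batch_size = 8
# Hypothetical flattened representations after the convolutional layers,
# e.g. teacher: 32 filters on an 8x8 map -> 2048 values, student: 16 filters -> 1024 values.
teacher_feats = torch.randn(batch_size, 2048)
student_feats = torch.randn(batch_size, 1024, requires_grad=True)

# Pool the teacher's flattened vector down to the student's dimensionality.
teacher_pooled = F.avg_pool1d(teacher_feats.unsqueeze(1), kernel_size=2).squeeze(1)

# Naive similarity objective: mean squared error between the flattened vectors.
# The teacher's features are detached, so minimizing this term can only move
# the student's weights.
hidden_rep_loss = F.mse_loss(student_feats, teacher_pooled.detach())

# During training this term would be added to the existing objectives with its own
# coefficient, e.g. loss = ce_weight * ce_loss + hidden_weight * hidden_rep_loss.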