1 parent 9e00157 commit 03b8905
beginner_source/transformer_tutorial.py
@@ -41,7 +41,10 @@
# the earlier positions in the sequence. For the language modeling task, any
# tokens on the future positions should be masked. To produce a probability
# distribution over output words, the output of the ``nn.TransformerEncoder``
-# model is passed through a linear layer followed by a log-softmax function.
+# model is passed through a linear layer to output unnormalized logits.
+# The log-softmax function isn't applied here due to the later use of
+# `CrossEntropyLoss <https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html>`__,
+# which requires the inputs to be unnormalized logits.
#

import math
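
A minimal sketch of the point behind this change, with hypothetical tensor shapes chosen purely for illustration: ``nn.CrossEntropyLoss`` applies log-softmax internally (it is equivalent to ``log_softmax`` followed by ``nn.NLLLoss``), so the model's final linear layer should emit raw logits rather than log-probabilities.

```python
import torch
import torch.nn as nn

# Hypothetical shapes for illustration: a batch of 2 positions over a vocab of 5 tokens.
logits = torch.randn(2, 5)       # unnormalized scores straight from the final linear layer
targets = torch.tensor([1, 3])   # ground-truth token indices

# CrossEntropyLoss expects raw logits; it applies log-softmax internally.
loss = nn.CrossEntropyLoss()(logits, targets)

# Equivalent computation via explicit log-softmax + NLLLoss, showing why the
# tutorial no longer applies log-softmax inside the model itself.
log_probs = nn.functional.log_softmax(logits, dim=-1)
loss_manual = nn.NLLLoss()(log_probs, targets)

assert torch.allclose(loss, loss_manual)
```

Applying log-softmax in the model and then passing the result to ``CrossEntropyLoss`` would effectively normalize twice, which is why the commit moves the model to plain logits.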