index.rst: 2 additions & 2 deletions
@@ -528,7 +528,7 @@ What's new in PyTorch tutorials?
 :header: (beta) Implement High-Performance Transformers with SCALED DOT PRODUCT ATTENTION
 :card_description: This tutorial explores the new torch.nn.functional.scaled_dot_product_attention and how it can be used to construct Transformer components.
intermediate_source/scaled_dot_product_attention_tutorial.py: 17 additions & 12 deletions
@@ -12,7 +12,7 @@
 # In this tutorial, we want to highlight a new ``torch.nn.functional`` function
 # that can be helpful for implementing transformer architectures. The
 # function is named ``torch.nn.functional.scaled_dot_product_attention``.
-# For detailed description of the function, see the `PyTorch# documentation <https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html#torch.nn.functional.scaled_dot_product_attention>`__.
+# For detailed description of the function, see the `PyTorch documentation <https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html#torch.nn.functional.scaled_dot_product_attention>`__.
 # This function has already been incorporated into ``torch.nn.MultiheadAttention`` and ``torch.nn.TransformerEncoderLayer``.
 #
 # Overview
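
For context on the function this tutorial introduces (not part of the diff itself), a minimal usage sketch of torch.nn.functional.scaled_dot_product_attention; the tensor shapes below are illustrative placeholders, not values from the tutorial:

import torch
import torch.nn.functional as F

# Query, key, and value tensors of shape (batch, num_heads, seq_len, head_dim);
# the sizes are arbitrary examples chosen for illustration.
device = "cuda" if torch.cuda.is_available() else "cpu"
query = torch.rand(2, 8, 16, 64, device=device)
key = torch.rand(2, 8, 16, 64, device=device)
value = torch.rand(2, 8, 16, 64, device=device)

# Computes softmax(Q @ K^T / sqrt(head_dim)) @ V; for GPU tensors this call
# may dispatch to a fused, optimized kernel when one is available.
out = F.scaled_dot_product_attention(query, key, value)
print(out.shape)  # torch.Size([2, 8, 16, 64])
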
@@ -22,10 +22,7 @@
 # the definition found in the paper `Attention is all you
 # need <https://arxiv.org/abs/1706.03762>`__. While this function can be
 # written in PyTorch using existing functions, for GPU tensors this
-# function will implicitly dispatch to an optimized implementation. The
-# function is also highly modular and can be used to implement other
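
The tutorial text above notes that the same computation can be written in PyTorch using existing functions. As a hedged sketch of what that means (the helper name sdpa_reference is hypothetical, not from the tutorial), following the definition in `Attention is all you need`:

import math
import torch
import torch.nn.functional as F

def sdpa_reference(query, key, value):
    # Plain-PyTorch scaled dot-product attention:
    # softmax(Q @ K^T / sqrt(d_k)) @ V, with no masking or dropout.
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)
    return weights @ value

q = k = v = torch.rand(2, 8, 16, 64)
# The built-in function should agree with this reference up to floating-point error.
print((sdpa_reference(q, k, v) - F.scaled_dot_product_attention(q, k, v)).abs().max())
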