https://github.com/pytorch/tutorials/blob/master/beginner_source/transformer_tutorial.py#L300
`batch_size` is a bad variable name because it does not hold the dimension usually referred to as the batch size. Instead, it holds the sequence length of a particular batch. I suggest renaming it to `batch_seq_len` for clarity.
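To illustrate the distinction, here is a minimal sketch. It is not the tutorial's code: it assumes the data is laid out as `[seq_len, batch_size]`, uses nested lists in place of tensors so it runs without PyTorch, and the helpers `batchify` and `get_batch` are simplified, hypothetical stand-ins.

```python
# Hypothetical stand-in for a [seq_len, batch_size] tensor, built from nested
# lists so the sketch runs without PyTorch. All names here are illustrative.

def batchify(tokens, batch_size):
    """Arrange a flat token list into rows, giving shape [seq_len, batch_size]."""
    seq_len = len(tokens) // batch_size
    tokens = tokens[: seq_len * batch_size]  # drop tokens that do not fit evenly
    # Column j holds the j-th contiguous slice of the token stream.
    return [[tokens[j * seq_len + i] for j in range(batch_size)]
            for i in range(seq_len)]


def get_batch(source, i, bptt):
    """Take up to `bptt` rows starting at row i; the last chunk may be shorter."""
    return source[i : i + bptt]


source = batchify(list(range(26)), batch_size=4)  # 6 rows x 4 columns
bptt = 4  # assumed chunk length along the sequence dimension

lengths = []
for i in range(0, len(source), bptt):
    data = get_batch(source, i, bptt)
    batch_seq_len = len(data)     # dimension 0: sequence length of this chunk
    num_sequences = len(data[0])  # dimension 1: what is usually called batch size
    lengths.append((batch_seq_len, num_sequences))

print(lengths)
```

Under these assumptions, the size of dimension 0 varies from chunk to chunk (the last one is shorter), while the number of sequences stays fixed, which is why a name like `batch_seq_len` describes the value better than `batch_size`.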
cc @suraj813