[BUG] - Accelerating PyTorch Transformers by replacing nn.Transformer with Nested Tensors and torch.compile() #3176

### Add Link

https://pytorch.org/tutorials/intermediate/transformer_building_blocks.html

### Describe the bug

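The tutorial contains an unfinished sentence. The second sentence of the passage quoted here breaks off after `scaled_dot_product_attention`; the gap is marked in bold: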

"Thanks to [this PR](https://github.com/pytorch/pytorch/pull/133882) this is no longer the case. Instead, fully masked rows in scaled_dot_product_attention **[missing text]**. For cases where nn.MHA does not employ the “fast-path”, this will also apply."
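For anyone triaging this, here is a rough pure-Python sketch of why a fully masked attention row is a special case at all. It is not PyTorch code and does not claim to reproduce what `scaled_dot_product_attention` does internally. `masked_softmax` and its `zero_fully_masked` flag are hypothetical names made up for illustration. The sketch only shows that a naive masked softmax yields NaN for a row where every position is masked, and that a guarded variant can return zeros instead. What the tutorial sentence was actually meant to say should be confirmed against the linked PR.

```python
import math


def masked_softmax(scores, mask, zero_fully_masked=False):
    """Softmax over one attention row.

    mask[i] is True where attention is allowed and False where it is masked.
    zero_fully_masked is a hypothetical flag for this sketch only.
    """
    # Optional guard: a row with no allowed positions returns all zeros.
    if zero_fully_masked and not any(mask):
        return [0.0] * len(scores)

    # Masked positions get -inf so they receive zero probability.
    masked = [s if keep else float("-inf") for s, keep in zip(scores, mask)]

    # Standard max-subtraction for numerical stability.
    # If every entry is -inf, then (-inf) - (-inf) is NaN and the row is NaN.
    row_max = max(masked)
    exps = [math.exp(s - row_max) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.0, 3.0]
print(masked_softmax(scores, [True, False, True]))     # partially masked row
print(masked_softmax(scores, [False, False, False]))   # fully masked, naive
print(masked_softmax(scores, [False, False, False], zero_fully_masked=True))
```

The partially masked row sums to 1 with a zero at the masked position, the naive fully masked row is all NaN, and the guarded call returns zeros.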

### Describe your environment

Brave Browser.
