We look forward to leveraging the approach we used here to more applications in the future!

- We are working to improve the performance of FlexAttention to match FlashAttention3 on H100 GPUs.
- FlexAttention requires that all sequence lengths be a multiple of 128 - this will be addressed soon (a padding workaround is sketched after this list).
- We plan on adding GQA support soon - for now, you can just replicate the kv heads (see the second sketch after this list).
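The 128-multiple restriction can be worked around by padding. This is not from the post itself; it is a minimal sketch assuming the `flex_attention` and `create_block_mask` APIs shown earlier in the post, with hypothetical shapes and a hypothetical `pad_mask` helper: zero-pad the sequence dimension up to the next multiple of 128, mask the padded keys out, and slice the padded query rows off the output.

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

B, H, S, D = 4, 8, 1000, 64       # hypothetical shapes; S is not a multiple of 128
S_PAD = ((S + 127) // 128) * 128  # round the sequence length up to 1024

q = torch.randn(B, H, S, D, device="cuda")
k = torch.randn(B, H, S, D, device="cuda")
v = torch.randn(B, H, S, D, device="cuda")

# Zero-pad the sequence dimension (second-to-last) of q, k, and v.
pad = (0, 0, 0, S_PAD - S)
q, k, v = [torch.nn.functional.pad(t, pad) for t in (q, k, v)]

# Hypothetical mask_mod: padded key/value positions never receive attention.
def pad_mask(b, h, q_idx, kv_idx):
    return kv_idx < S

block_mask = create_block_mask(pad_mask, B=None, H=None, Q_LEN=S_PAD, KV_LEN=S_PAD)

# Padded query rows still attend only to real keys, so they stay finite;
# slicing them off recovers the original sequence length.
out = flex_attention(q, k, v, block_mask=block_mask)[:, :, :S]
```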
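For the GQA workaround, here is a minimal sketch (again with hypothetical shapes) of replicating the kv heads with `repeat_interleave`, which keeps each group of query heads paired with its original kv head:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

B, Hq, Hkv, S, D = 2, 32, 8, 2048, 64  # hypothetical GQA shapes: 32 query heads, 8 kv heads
q = torch.randn(B, Hq, S, D, device="cuda")
k = torch.randn(B, Hkv, S, D, device="cuda")
v = torch.randn(B, Hkv, S, D, device="cuda")

# Expand each kv head across its group of query heads so the head
# dimensions line up: query head i ends up paired with kv head
# i // (Hq // Hkv), which matches GQA semantics.
group_size = Hq // Hkv
k = k.repeat_interleave(group_size, dim=1)
v = v.repeat_interleave(group_size, dim=1)

out = flex_attention(q, k, v)  # shape (B, Hq, S, D)
```

This trades memory for compatibility, since the replicated k and v tensors are fully materialized.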

### Acknowledgements

We want to highlight some prior work (and people) that have inspired FlexAttention.

- Tri Dao's work on FlashAttention
- Francisco Massa and the Xformers team for BlockSparseAttention in Triton
- The Jax team's work on SplashAttention
- Philippe Tillet and Keren Zhou for helping us with Triton
- Ali Hassani for discussions on neighborhood attention
- Everybody who's complained about attention kernels not supporting their favorite attention variant :)