_posts/2024-12-02-hadacore.md
+4 -4 (4 additions & 4 deletions)
@@ -45,11 +45,11 @@ We process fragments of 256 elements in parallel using warp-level Tensor Core op
We benchmark HadaCore against the[ Dao AI Lab Hadamard Kernel](https://github.com/Dao-AILab) on both NVIDIA H100 and A100 GPUs across varying Hadamard and input tensor sizes.
-{:style="width:100%"}
+{:style="width:100%"}
-*Figure 5: HadaCore Kernel Speedup on NVIDIA A100 over Dao AI Lab Fast Hadamard Kernel*
+*Figure 4: HadaCore Kernel Speedup on NVIDIA A100 over Dao AI Lab Fast Hadamard Kernel*
{:style="width:100%; margin-top: 35px;"}
@@ -58,10 +58,10 @@ We benchmark HadaCore against the[ Dao AI Lab Hadamard Kernel](https://github.co
*Color coded Speedup Table for NVIDIA A100, Green = Speedup over Baseline*
-{:style="width:100%; margin-top: 35px;"}
+{:style="width:100%; margin-top: 35px;"}
-*Figure 4: HadaCore Kernel Speedup on NVIDIA H100 over Dao AI Lab Fast Hadamard Kernel*
+*Figure 5: HadaCore Kernel Speedup on NVIDIA H100 over Dao AI Lab Fast Hadamard Kernel*
{:style="width:100%; margin-top: 35px;"}