Description
On x86_64-pc-windows-msvc, the system ABI treats xmm0-xmm5 as volatile (caller-saved) registers and xmm6-xmm15 as non-volatile (callee-saved) registers. However, `clobber_abi("C")` incorrectly marks xmm6-xmm15 as clobbered, which produces a series of wasteful register saves and restores around the asm block.
This behavior matches the documentation (which specifies that xmm6-xmm15 will be clobbered), but it is almost certainly not what the developer wanted: a well-behaved callee already preserves these registers, so the extra clobbers only cost performance. The workaround is to avoid using `clobber_abi` on Windows and to clobber the volatile registers manually; this is a pain and may cause problems down the line if more volatile registers are added to the ABI.
So a breaking change seems warranted here.
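As a sketch of the manual workaround described above, the asm block can list only the registers the Windows x64 calling convention actually treats as volatile (rax, rcx, rdx, r8-r11, xmm0-xmm5) instead of using `clobber_abi("C")`. So that the snippet is self-contained and linkable, the `call bar` is replaced with a placeholder `mov`; the register list is an assumption based on the Windows x64 ABI:

```rust
use std::arch::asm;

#[cfg(target_arch = "x86_64")]
pub fn call_with_manual_clobbers() -> u64 {
    let result: u64;
    unsafe {
        asm!(
            "mov {out}, 42", // placeholder for the real `call bar`
            out = out(reg) result,
            // Volatile general-purpose registers under the Windows x64 ABI.
            out("rax") _, out("rcx") _, out("rdx") _,
            out("r8") _, out("r9") _, out("r10") _, out("r11") _,
            // Volatile XMM registers only: xmm6-xmm15 are callee-saved,
            // so omitting them avoids the unnecessary saves/restores.
            out("xmm0") _, out("xmm1") _, out("xmm2") _,
            out("xmm3") _, out("xmm4") _, out("xmm5") _,
        );
    }
    result
}

// Fallback so the example still compiles on non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn call_with_manual_clobbers() -> u64 {
    42
}
```

Note that a full replacement for `clobber_abi` would also need to cover any other volatile state the ABI defines, which is exactly why maintaining such a list by hand is fragile.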
```rust
#[no_mangle]
pub fn foo() {
    unsafe {
        std::arch::asm! {
            "call bar",
            clobber_abi("C"),
        }
    }
}
```
Asm:

```asm
foo:
    sub     rsp, 168
    movaps  xmmword ptr [rsp + 144], xmm15
    movaps  xmmword ptr [rsp + 128], xmm14
    movaps  xmmword ptr [rsp + 112], xmm13
    movaps  xmmword ptr [rsp + 96], xmm12
    movaps  xmmword ptr [rsp + 80], xmm11
    movaps  xmmword ptr [rsp + 64], xmm10
    movaps  xmmword ptr [rsp + 48], xmm9
    movaps  xmmword ptr [rsp + 32], xmm8
    movaps  xmmword ptr [rsp + 16], xmm7
    movaps  xmmword ptr [rsp], xmm6
    call    bar
    movaps  xmm6, xmmword ptr [rsp]
    movaps  xmm7, xmmword ptr [rsp + 16]
    movaps  xmm8, xmmword ptr [rsp + 32]
    movaps  xmm9, xmmword ptr [rsp + 48]
    movaps  xmm10, xmmword ptr [rsp + 64]
    movaps  xmm11, xmmword ptr [rsp + 80]
    movaps  xmm12, xmmword ptr [rsp + 96]
    movaps  xmm13, xmmword ptr [rsp + 112]
    movaps  xmm14, xmmword ptr [rsp + 128]
    movaps  xmm15, xmmword ptr [rsp + 144]
    add     rsp, 168
    ret
```
I expected that `foo` would just establish a frame and call `bar`, without saving and restoring the XMM registers.
Meta
`rustc --version --verbose`:

```
rustc 1.78.0 (9b00956e5 2024-04-29)
binary: rustc
commit-hash: 9b00956e56009bab2aa15d7bff10916599e3d6d6
commit-date: 2024-04-29
host: x86_64-unknown-linux-gnu
release: 1.78.0
LLVM version: 18.1.2
```
This also reproduces on nightly.