Description
The generated code for passing arguments larger than a machine word looks inefficient.
Test case:
#[inline(never)]
pub fn bar(x: &str) { println!("{}", x) }
pub fn foo(x: &str) { bar(x); bar(x); }
On x86_64-unknown-linux-gnu, compiling with rustc test.rs -O -C no-stack-check --crate-type dylib --emit asm, I see this code for foo:
.section .text._ZN3foo20hb6f131ac36a30532PaaE,"ax",@progbits
.globl _ZN3foo20hb6f131ac36a30532PaaE
.align 16, 0x90
.type _ZN3foo20hb6f131ac36a30532PaaE,@function
_ZN3foo20hb6f131ac36a30532PaaE:
.cfi_startproc
pushq %rbx
.Ltmp4:
.cfi_def_cfa_offset 16
subq $16, %rsp
.Ltmp5:
.cfi_def_cfa_offset 32
.Ltmp6:
.cfi_offset %rbx, -16
movq %rdi, %rbx
movups (%rbx), %xmm0
movaps %xmm0, (%rsp)
leaq (%rsp), %rdi
callq _ZN3bar20hf21270c370b3427feaaE@PLT
movups (%rbx), %xmm0
movaps %xmm0, (%rsp)
leaq (%rsp), %rdi
callq _ZN3bar20hf21270c370b3427feaaE@PLT
addq $16, %rsp
popq %rbx
retq
.Ltmp7:
.size _ZN3foo20hb6f131ac36a30532PaaE, .Ltmp7-_ZN3foo20hb6f131ac36a30532PaaE
.cfi_endproc
foo receives the address of the &str in %rdi. It copies the &str into a new stack location for each call, then passes the address of that location to bar.
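As a small aid to reading the assembly: the 16-byte movups/movaps pairs match the size of a &str, which is a pointer plus a length. The sketch below only reports sizes for whatever target it is compiled on; the function name slice_ref_sizes is hypothetical and is not part of the test case.

```rust
use std::mem::size_of;

// Hypothetical helper (not part of the original test case).
// Returns (size of &str, size of &&str) in bytes for the current target.
// A &str is a (pointer, length) pair, so it occupies two machine words;
// a &&str is a plain pointer to such a pair, so it occupies one.
pub fn slice_ref_sizes() -> (usize, usize) {
    (size_of::<&str>(), size_of::<&&str>())
}
```

On x86_64 this should give 16 and 8 bytes, which is why a &str argument does not fit in a single register while a &&str does.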
Could foo forward the address of the &str along without making stack copies?
If I remove one of the bar calls from foo, then the remaining call ought to become a tail call, but it doesn't. Tail call optimization does occur if I replace the &str types with &&str.
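For reference, the two single-call variants described above might look like the following. The suffixed names (foo_str, bar_str, foo_ref, bar_ref) are hypothetical and exist only so both variants fit in one file; the original experiment presumably reused foo and bar. The comments restate what is reported here for this nightly and are not a claim about other compiler versions.

```rust
// Variant 1: &str argument, single call.
// Reported above: the call to bar_str is NOT emitted as a tail call.
#[inline(never)]
pub fn bar_str(x: &str) { println!("{}", x) }
pub fn foo_str(x: &str) { bar_str(x) }

// Variant 2: &&str argument, single call.
// Reported above: the call to bar_ref IS emitted as a tail call.
#[inline(never)]
pub fn bar_ref(x: &&str) { println!("{}", x) }
pub fn foo_ref(x: &&str) { bar_ref(x) }
```

Compiling this with the same rustc command as above and comparing the bodies of foo_str and foo_ref should show whether the difference reproduces.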
The calling convention for passing &str (and other arguments larger than a machine word?) seems to be:
- Make a copy of the argument on the stack.
- Pass the address of the copy in the conventional manner (in a register or on the stack).
- The callee may modify the copy.
In other words, we seem to be passing values both by value and by reference.
With the current convention, I think we could get smaller code by eliding some of the copies. If the copies were instead immutable, I think we could elide more copies.
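The convention described above can be modeled in source-level Rust. This is a sketch based on my reading of the assembly, not actual compiler output: the names bar_lowered and foo_lowered are hypothetical, and the functions return a length instead of printing so the behavior is easy to check.

```rust
// Hypothetical model of the callee: it receives the address of a copy
// that it is allowed to modify.
pub fn bar_lowered(x: &mut &str) -> usize {
    let n = x.len();
    *x = ""; // the callee may clobber its copy of the argument
    n
}

// Hypothetical model of the caller: it receives the address of a &str
// and makes a fresh 16-byte stack copy before each call.
pub fn foo_lowered(x: &&str) -> usize {
    let mut tmp1: &str = *x; // first copy (movups/movaps in the assembly)
    let a = bar_lowered(&mut tmp1);
    let mut tmp2: &str = *x; // second copy, needed because tmp1 may have changed
    let b = bar_lowered(&mut tmp2);
    a + b
}
```

Because bar_lowered may overwrite its argument, foo_lowered cannot reuse tmp1 for the second call or forward x directly. If the callee instead promised not to modify the copy, both calls could pass x along unchanged, which is the extra elision suggested above.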
Compiler version:
rustc 1.0.0-nightly (b47aebe3f 2015-02-26) (built 2015-02-27)
binary: rustc
commit-hash: b47aebe3fc2da06c760fd8ea19f84cbc41d34831
commit-date: 2015-02-26
build-date: 2015-02-27
host: x86_64-unknown-linux-gnu
release: 1.0.0-nightly