Missed optimization: catch_unwind not removed when closure never unwinds #64222

Closed
@gnzlbg

Consider this C++ code (https://gcc.godbolt.org/z/iaSj6g):

extern "C" void foo() noexcept;

int bar() {
    try {
        foo();
        return 42;
    } catch(...) {
        return 13;
    }
}

which gets optimized to:

bar():                                # @bar()
        push    rax
        call    foo
        mov     eax, 42
        pop     rcx
        ret

Now consider the semantically identical Rust code (https://gcc.godbolt.org/z/2tXtOM):

#![feature(unwind_attributes)]

extern "C" {
    // never unwinds:
    #[unwind(abort)] fn foo(); 
}

pub unsafe fn bar() -> i32 {
    std::panic::catch_unwind(|| { foo(); 42 }).unwrap_or(13)
}
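For readers less familiar with catch_unwind, here is a minimal stable-Rust sketch of the two outcomes that the expression in bar can produce. The helper name run_or_13 is hypothetical and not part of the issue, and the sketch assumes the default panic=unwind strategy:

```rust
use std::panic;

/// Hypothetical helper mirroring the shape of `bar`: run `f` and map any
/// caught panic to the fallback value 13.
fn run_or_13<F>(f: F) -> i32
where
    F: FnOnce() -> i32 + panic::UnwindSafe,
{
    panic::catch_unwind(f).unwrap_or(13)
}

fn main() {
    // The closure returns normally, so the Ok value is passed through.
    assert_eq!(run_or_13(|| 42), 42);
    // The closure panics, so the panic is caught and replaced by the fallback.
    // (The default panic hook still prints a message to stderr.)
    assert_eq!(run_or_13(|| panic!("boom")), 13);
}
```

If the closure can never unwind, only the first outcome is reachable, which is why the fallback path is dead code in bar.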

which generates the following (https://gcc.godbolt.org/z/fzSBl7). Note: to work around #[unwind(abort)] currently not emitting nounwind, I added the attributes manually to the Rust output and ran it through opt -O3 before passing it to llc -O3:

;; this is only a small part of what it generates
example::bar: # @example::bar
  push rbp
  push r14
  push rbx
  sub rsp, 32
  mov qword ptr [rsp + 16], 0
  mov qword ptr [rsp + 24], 0
  lea rsi, [rsp + 12]
  lea rdx, [rsp + 16]
  lea rcx, [rsp + 24]
  mov edi, offset std::panicking::try::do_call
  call qword ptr [rip + __rust_maybe_catch_panic@GOTPCREL]
  test eax, eax
  je .LBB2_1
  mov rdi, -1
  call qword ptr [rip + std::panicking::update_panic_count@GOTPCREL]
  mov r14, qword ptr [rsp + 16]
  mov rbx, qword ptr [rsp + 24]
  mov rdi, r14
  call qword ptr [rbx]
  mov rsi, qword ptr [rbx + 8]
  mov ebp, 13
  test rsi, rsi
  je .LBB2_5
  mov rdx, qword ptr [rbx + 16]
  mov rdi, r14
  call qword ptr [rip + __rust_dealloc@GOTPCREL]
  jmp .LBB2_5
.LBB2_1: # %_ZN3std5panic12catch_unwind17h19eaab96d48c9410E.exit
  mov ebp, dword ptr [rsp + 12]
.LBB2_5: # %"_ZN4core6result19Result$LT$T$C$E$GT$9unwrap_or17hd512dca21f83b98fE.exit"
  mov eax, ebp
  add rsp, 32
  pop rbx
  pop r14
  pop rbp
  ret
  mov rbp, rax
  mov rdi, r14
  mov rsi, rbx
  call alloc::alloc::box_free
  mov rdi, rbp
  call _Unwind_Resume

After removing the unnecessary catch_unwind, I get instead:

example::bar:
  push rax
  call qword ptr [rip + foo@GOTPCREL]
  mov eax, 42
  pop rcx
  ret

which is the same code that clang generates and what I expect to be generated when the closure passed to catch_unwind never unwinds.
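As a rough sketch of the equivalence being asked for, the two functions below should be indistinguishable whenever foo cannot unwind. Here foo is a stand-in defined in Rust so that the example links on its own (an assumption: in the issue it is an external symbol), and the names bar_with_catch and bar_direct are hypothetical:

```rust
use std::panic;

// Stand-in for the external `foo`; assumed never to unwind.
extern "C" fn foo() {}

// Shape of `bar` from the issue: the catch is redundant if `foo` never unwinds.
pub fn bar_with_catch() -> i32 {
    panic::catch_unwind(|| {
        foo();
        42
    })
    .unwrap_or(13)
}

// Hand-written form with the catch removed; this is what the optimizer
// would ideally reduce `bar_with_catch` to.
pub fn bar_direct() -> i32 {
    foo();
    42
}

fn main() {
    // Both forms return the same value because the fallback is unreachable.
    assert_eq!(bar_with_catch(), bar_direct());
}
```

This only checks observable behaviour; whether the generated machine code matches has to be inspected separately, for example in the Compiler Explorer links above.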

For catch_unwind to be a zero-cost abstraction, it needs to generate the same machine code as C++ in this case when optimizations are turned on.

Labels: C-bug (Category: This is a bug.), I-slow (Issue: Problems and improvements with respect to performance of generated code.), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
