[x86] default codegen should be more branchy #32360

Open
@rotateright

Description

Bugzilla Link: 33013
Version: trunk
OS: All
CC: @RKSimon, @michaeljclark, @TNorthover, @ZviRackover

Extended Description

The default x86 target is something like an Intel big core (i.e., it's good at absorbing the cost of correctly predicted branches, good at predicting those branches, and good at speculating execution past those branches).

Therefore, we shouldn't favor cmov codegen for IR select as much as we currently do. Example:

int foo(float x) {
  if (x < 42.0f)
    return x;
  return 12;
}

define i32 @foo(float %x) {
  %cmp = fcmp olt float %x, 4.200000e+01
  %conv = fptosi float %x to i32
  %ret = select i1 %cmp, i32 %conv, i32 12
  ret i32 %ret
}

$ clang -O2 cmovfp.c -S -o -
    movss     LCPI0_0(%rip), %xmm1   ## xmm1 = mem[0],zero,zero,zero
    ucomiss   %xmm0, %xmm1
    cvttss2si %xmm0, %ecx
    movl      $12, %eax
    cmoval    %ecx, %eax
    retq

Note that gcc and icc will use compare and branch on this example.
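
To make the difference concrete, here is a minimal C sketch of the two shapes being compared. The names foo_branch and foo_select are hypothetical and only for illustration. foo_select mirrors the IR above, where the conversion always runs and the result is picked afterwards; foo_branch converts only on the taken path. One assumption to flag: in C the hoisted conversion is only defined when x is finite and fits in an int, whereas in LLVM IR an out-of-range fptosi yields poison that the select can discard. A compiler is free to lower both functions to the same code, so this shows the intent, not guaranteed codegen.

```c
/* Branchy shape: the float-to-int conversion runs only when x < 42. */
int foo_branch(float x) {
    if (x < 42.0f)
        return (int)x;
    return 12;
}

/* Select shape (mirrors the IR): the conversion is executed unconditionally
 * and the result is chosen afterwards.
 * Assumption: the caller passes a finite x that fits in an int; otherwise
 * the hoisted (int)x is undefined behavior in C. */
int foo_select(float x) {
    int conv = (int)x;               /* always executed */
    return (x < 42.0f) ? conv : 12;  /* pick the result */
}
```

For in-range inputs the two agree, e.g. both return 1 for 1.5f and 12 for 100.0f. With a well-predicted branch, the branchy shape can skip the cvttss2si entirely when the comparison fails, while the cmov shape always pays for it and makes the result depend on both the compare and the convert.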
