long double return miscompiled on Solaris/sparcv9 #47073

Closed
@rorth

Description

Bugzilla Link 47729
Version trunk
OS Solaris
CC @efriedma-quic,@jrtc27,@jyknight,@jfbastien

Extended Description

Several tests FAIL on Solaris/sparcv9 where long double is 128 bits:

Builtins-sparcv9-sunos :: addtf3_test.c
Builtins-sparcv9-sunos :: divtf3_test.c
Builtins-sparcv9-sunos :: extenddftf2_test.c
Builtins-sparcv9-sunos :: extendsftf2_test.c
Builtins-sparcv9-sunos :: floatditf_test.c
Builtins-sparcv9-sunos :: floatsitf_test.c
Builtins-sparcv9-sunos :: floattitf_test.c
Builtins-sparcv9-sunos :: floatunditf_test.c
Builtins-sparcv9-sunos :: floatunsitf_test.c
Builtins-sparcv9-sunos :: floatuntitf_test.c
Builtins-sparcv9-sunos :: multf3_test.c
Builtins-sparcv9-sunos :: subtf3_test.c

E.g. addtf3_test.c FAILs with

error in test__addtf3(36.40888825164657541977, 0.96444431369742592240) = 37.37333256534401470898, expected 37.37333256534400134216

The error doesn't happen in a 1-stage build with gcc or in a Debug build.

Via side-by-side debugging with addtf3.c.o compiled with clang -O vs. gcc -O
(everything else from a regular 2-stage clang build), it turned out that both
compilers produce the same result until the very end of __addtf3. The only
difference is in the final fromRep call, which can be seen with this testcase:

$ cat fr.c
typedef long double fp_t;
typedef __uint128_t rep_t;

fp_t fromRep(rep_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.i = x};
  return rep.f;
}

gcc -m64 -O produces

fromRep:
        add     %sp, -144, %sp
        stx     %o0, [%sp+2175]
        stx     %o1, [%sp+2183]
        ldd     [%sp+2175], %f0
        ldd     [%sp+2183], %f2
        jmp     %o7+8
         add    %sp, 144, %sp

while clang yields

fromRep:                                ! @fromRep
! %bb.0:                                ! %entry
        save    %sp, -144, %sp
        add     %fp, 2031, %i2
        or      %i2, 8, %i2
        stx     %i0, [%fp+2031]
        ldd     [%fp+2031], %f0
        ldd     [%i2], %f2
        stx     %i1, [%i2]
        ret
        restore

The long double return value is supposed to be in %f0 and %f2. gcc handles
this just fine, and clang gets it right for %f0, too. However, clang loads
%f2 from a stack slot that has not been written yet (ldd [%i2], %f2), and
only afterwards stores the second half of the argument (%i1) to that slot.
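
A minimal sketch of a standalone check for this behaviour, not taken from the
compiler-rt test suite: it converts a few long double values to their bit
patterns, passes them through fromRep, and counts results that differ from
the input. The helpers toRep and count_fromRep_mismatches are hypothetical
names introduced here, and the noinline attribute is an assumption meant to
keep the value travelling through the return registers rather than being
folded away.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <string.h>

typedef long double fp_t;
typedef __uint128_t rep_t;

static_assert(sizeof(fp_t) <= sizeof(rep_t), "fp_t must fit in rep_t");

/* Same body as the testcase in the report; noinline is an addition so that
 * the result is really returned in registers (%f0/%f2 on sparcv9). */
__attribute__((noinline)) fp_t fromRep(rep_t x) {
    const union {
        fp_t f;
        rep_t i;
    } rep = {.i = x};
    return rep.f;
}

/* Hypothetical helper: copy the bytes of a long double into a rep_t. */
static rep_t toRep(fp_t v) {
    rep_t r = 0;
    memcpy(&r, &v, sizeof v);
    return r;
}

/* Hypothetical helper: number of sample values that fromRep fails to
 * reproduce exactly. The samples are the operands and the expected sum
 * from the addtf3_test.c failure quoted above. */
int count_fromRep_mismatches(void) {
    static const fp_t samples[] = {
        36.40888825164657541977L,
        0.96444431369742592240L,
        37.37333256534400134216L,
    };
    int bad = 0;
    for (size_t k = 0; k < sizeof samples / sizeof samples[0]; ++k) {
        if (fromRep(toRep(samples[k])) != samples[k])
            ++bad;
    }
    return bad;
}
```

With a correct compiler count_fromRep_mismatches() should return 0. A nonzero
result from a clang -O build on sparcv9 would be consistent with the %f2 load
being scheduled before the store, although the stale stack slot could by
chance hold matching data.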

I don't have the slightest idea how to fix this codegen bug, but I have a
workaround patch (to be posted for reference shortly) that wraps the affected
functions in #pragma clang optimize off/on (nothing more than a hack to show
that this fixes all the failures above).
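
For illustration only, the workaround could look roughly like the sketch
below when applied to the reduced testcase. This is not the patch mentioned
above, which is not shown here. The __clang__ guard and the helper
fromRep_preserves are assumptions added so that the sketch also builds with
other compilers and can be checked.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>

typedef long double fp_t;
typedef __uint128_t rep_t;

static_assert(sizeof(fp_t) <= sizeof(rep_t), "fp_t must fit in rep_t");

/* Disable optimization for the affected function only (clang-specific). */
#if defined(__clang__)
#pragma clang optimize off
#endif

fp_t fromRep(rep_t x) {
    const union {
        fp_t f;
        rep_t i;
    } rep = {.i = x};
    return rep.f;
}

#if defined(__clang__)
#pragma clang optimize on
#endif

/* Hypothetical helper: returns 1 if v survives a trip through fromRep. */
int fromRep_preserves(fp_t v) {
    rep_t r = 0;
    memcpy(&r, &v, sizeof v);
    return fromRep(r) == v;
}
```

This only sidesteps the miscompilation for the wrapped functions. The
underlying codegen bug remains for any other code that returns a long double
the same way.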
