long double return miscompiled on Solaris/sparcv9

|  |  |
| --- | --- |
| Bugzilla Link | [47729](https://llvm.org/bz47729) |
| Version | trunk |
| OS | Solaris |
| CC | @efriedma-quic,@jrtc27,@jyknight,@jfbastien |

## Extended Description 
Several tests FAIL on Solaris/sparcv9 where long double is 128 bits:

  Builtins-sparcv9-sunos :: addtf3_test.c
  Builtins-sparcv9-sunos :: divtf3_test.c
  Builtins-sparcv9-sunos :: extenddftf2_test.c
  Builtins-sparcv9-sunos :: extendsftf2_test.c
  Builtins-sparcv9-sunos :: floatditf_test.c
  Builtins-sparcv9-sunos :: floatsitf_test.c
  Builtins-sparcv9-sunos :: floattitf_test.c
  Builtins-sparcv9-sunos :: floatunditf_test.c
  Builtins-sparcv9-sunos :: floatunsitf_test.c
  Builtins-sparcv9-sunos :: floatuntitf_test.c
  Builtins-sparcv9-sunos :: multf3_test.c
  Builtins-sparcv9-sunos :: subtf3_test.c

E.g. addtf3_test.c FAILs with

error in test__addtf3(36.40888825164657541977, 0.96444431369742592240) = 37.37333256534401470898, expected 37.37333256534400134216

The error doesn't happen in a 1-stage build with gcc or in a Debug build.

Via side-by-side debugging with addtf3.c.o compiled with clang -O vs. gcc -O
(everything else from a regular 2-stage clang build), it turned out that both
compilers produce the same result until the very end of __addtf3.  The only
difference is in the final fromRep call, which can be seen with this testcase:

$ cat fr.c
typedef long double fp_t;
typedef __uint128_t rep_t;

fp_t fromRep(rep_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.i = x};
  return rep.f;
}

gcc -m64 -O produces

fromRep:
	add	%sp, -144, %sp
	stx	%o0, [%sp+2175]
	stx	%o1, [%sp+2183]
	ldd	[%sp+2175], %f0
	ldd	[%sp+2183], %f2
	jmp	%o7+8
	 add	%sp, 144, %sp

while clang yields

fromRep:                                ! @fromRep
! %bb.0:                                ! %entry
	save %sp, -144, %sp
	add %fp, 2031, %i2
	or %i2, 8, %i2
	stx %i0, [%fp+2031]
	ldd [%fp+2031], %f0
	ldd [%i2], %f2
	stx %i1, [%i2]
	ret
	restore

The long double return value is supposed to be in %f0 and %f2.  gcc handles
this just fine, and clang gets it right for %f0, too.  However, clang loads %f2
from a stack slot that is still uninitialized (ldd [%i2], %f2) and only
afterwards stores the second half of the argument (%i1) there (stx %i1, [%i2]).
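
The load/store ordering problem can in principle be checked without the full builtins test suite. The following is a minimal sketch and not part of the original report: `toRep`, `roundtrips`, `check_roundtrip` and the `noinline` attributes are additions made here for illustration, and it assumes a 64-bit target where `__uint128_t` is available and the value passed in is not a NaN.

```c
#include <assert.h>
#include <stdbool.h>

typedef long double fp_t;
typedef __uint128_t rep_t;

/* The sketch assumes the integer type is wide enough to hold a long double. */
static_assert(sizeof(fp_t) <= sizeof(rep_t), "rep_t too small for fp_t");

/* Same body as fr.c. noinline (an addition for this sketch) keeps the call
   out of line, so the result really travels through the return registers. */
__attribute__((noinline)) fp_t fromRep(rep_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.i = x};
  return rep.f;
}

/* Hypothetical inverse helper: reinterpret a long double as its bit pattern. */
__attribute__((noinline)) rep_t toRep(fp_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.f = x};
  return rep.i;
}

/* True if converting to bits and back returns the same (non-NaN) value. */
bool roundtrips(fp_t v) {
  return fromRep(toRep(v)) == v;
}

/* Convenience wrapper that aborts on a mismatch. */
void check_roundtrip(fp_t v) {
  assert(roundtrips(v));
}
```

Calling `check_roundtrip(36.40888825164657541977L)` from a small `main` compiled with the compiler under test would be one way to use it. With a correct compiler the round trip should be exact; whether it reliably fails on an affected clang depends on what happens to be in the stale stack slot, so a passing run is not proof of a fix.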

I don't have the slightest idea how to fix this codegen bug, but I have a
workaround patch (to be posted for reference shortly) that wraps the affected
functions in #pragma clang optimize off/on (nothing more than a hack to show
that this fixes all the failures above).
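
For reference, the kind of wrapping described above might look like the following when applied to the `fromRep` testcase. This is only a sketch of the idea, not the actual workaround patch, which targets the affected compiler-rt functions; the `__clang__` guard and the `static_assert` are assumptions added here so that other compilers skip the pragma and the size assumption is explicit.

```c
#include <assert.h>

typedef long double fp_t;
typedef __uint128_t rep_t;

/* Size assumption made explicit for this sketch. */
static_assert(sizeof(rep_t) == 16, "expected a 128-bit integer type");

/* Disable optimization for the functions defined until 'optimize on'. */
#if defined(__clang__)
#pragma clang optimize off
#endif

fp_t fromRep(rep_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.i = x};
  return rep.f;
}

/* Restore the optimization level given on the command line. */
#if defined(__clang__)
#pragma clang optimize on
#endif
```

This only hides the problem for the wrapped functions; the underlying codegen bug would still affect any other code that returns a long double built the same way.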
