[MIPS][float] Fixed SingleFloat codegen on N32/N64 targets #140575

Open
wants to merge 1 commit into
base: main
26 changes: 17 additions & 9 deletions llvm/lib/Target/Mips/MipsCallingConv.td
@@ -196,7 +196,8 @@ def RetCC_MipsN : CallingConv<[
//
// f128 should only occur for the N64 ABI where long double is 128-bit. On
// N32, long double is equivalent to double.
CCIfType<[i64], CCIfOrigArgWasF128<CCDelegateTo<RetCC_F128>>>,
CCIfSubtargetNot<"isSingleFloat()",
CCIfType<[i64], CCIfOrigArgWasF128<CCDelegateTo<RetCC_F128>>>>,

// Aggregate returns are positioned at the lowest address in the slot for
// both little and big-endian targets. When passing in registers, this
@@ -333,9 +334,10 @@ def CC_Mips_FixedArg : CallingConv<[
//
// f128 should only occur for the N64 ABI where long double is 128-bit. On
// N32, long double is equivalent to double.
CCIfType<[i64],
CCIfSubtargetNot<"useSoftFloat()",
CCIfOrigArgWasF128<CCBitConvertToType<f64>>>>,
CCIfType<[i64],
CCIfSubtargetNot<"isSingleFloat()",
CCIfSubtargetNot<"useSoftFloat()",
CCIfOrigArgWasF128<CCBitConvertToType<f64>>>>>,

CCIfCC<"CallingConv::Fast", CCDelegateTo<CC_Mips_FastCC>>,

@@ -359,8 +361,8 @@ def CC_Mips : CallingConv<[
// Callee-saved register lists.
//===----------------------------------------------------------------------===//

def CSR_SingleFloatOnly : CalleeSavedRegs<(add (sequence "F%u", 31, 20), RA, FP,
(sequence "S%u", 7, 0))>;
def CSR_O32_SingleFloat : CalleeSavedRegs<(add(sequence "F%u", 31, 20), RA, FP,
(sequence "S%u", 7, 0))>;

def CSR_O32_FPXX : CalleeSavedRegs<(add (sequence "D%u", 15, 10), RA, FP,
(sequence "S%u", 7, 0))> {
@@ -374,13 +376,19 @@ def CSR_O32_FP64 :
CalleeSavedRegs<(add (decimate (sequence "D%u_64", 30, 20), 2), RA, FP,
(sequence "S%u", 7, 0))>;

def CSR_N32 : CalleeSavedRegs<(add D20_64, D22_64, D24_64, D26_64, D28_64,
D30_64, RA_64, FP_64, GP_64,
(sequence "S%u_64", 7, 0))>;
def CSR_N32 : CalleeSavedRegs<(add(decimate(sequence "D%u_64", 30, 20), 2),
RA_64, FP_64, GP_64, (sequence "S%u_64", 7, 0))>;

def CSR_N32_SingleFloat
: CalleeSavedRegs<(add(decimate(sequence "F%u", 30, 20), 2), RA_64, FP_64,
GP_64, (sequence "S%u_64", 7, 0))>;

def CSR_N64 : CalleeSavedRegs<(add (sequence "D%u_64", 31, 24), RA_64, FP_64,
GP_64, (sequence "S%u_64", 7, 0))>;

def CSR_N64_SingleFloat : CalleeSavedRegs<(add(sequence "F%u", 31, 24), RA_64,
FP_64, GP_64, (sequence "S%u_64", 7, 0))>;

def CSR_Mips16RetHelper :
CalleeSavedRegs<(add V0, V1, FP,
(sequence "A%u", 3, 0), (sequence "S%u", 7, 0),
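The rewritten `CSR_N32` definition relies on the TableGen set operators `sequence` and `decimate` rather than listing each register by hand. As a rough illustration of what those operators expand to, here is a minimal C++ sketch. The helper functions are hypothetical stand-ins written for this note and are not part of LLVM; they only mimic the documented behaviour (an inclusive range that counts down when the first bound is larger, and a filter that keeps every n-th element starting with the first).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper mimicking (sequence "<prefix>%u<suffix>", from, to):
// produces names for every index from `from` to `to` inclusive, counting
// down when from > to.
std::vector<std::string> sequence(const std::string &prefix,
                                  const std::string &suffix,
                                  unsigned from, unsigned to) {
  std::vector<std::string> out;
  const int step = from <= to ? 1 : -1;
  for (int i = static_cast<int>(from);; i += step) {
    out.push_back(prefix + std::to_string(i) + suffix);
    if (i == static_cast<int>(to))
      break;
  }
  return out;
}

// Hypothetical helper mimicking (decimate S, n): keeps every n-th element
// of S, starting with the first one.
std::vector<std::string> decimate(const std::vector<std::string> &s,
                                  unsigned n) {
  assert(n > 0 && "stride must be positive");
  std::vector<std::string> out;
  for (std::size_t i = 0; i < s.size(); i += n)
    out.push_back(s[i]);
  return out;
}
```

Under these assumptions, `decimate(sequence("D", "_64", 30, 20), 2)` yields the same six even-numbered registers as the hand-written list it replaces, only in descending order.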
20 changes: 14 additions & 6 deletions llvm/lib/Target/Mips/MipsISelLowering.cpp
@@ -4295,10 +4295,16 @@ parseRegForInlineAsmConstraint(StringRef C, MVT VT) const {
return std::make_pair(0U, nullptr);

if (Prefix == "$f") { // Parse $f0-$f31.
// If the size of FP registers is 64-bit or Reg is an even number, select
// the 64-bit register class. Otherwise, select the 32-bit register class.
if (VT == MVT::Other)
VT = (Subtarget.isFP64bit() || !(Reg % 2)) ? MVT::f64 : MVT::f32;
// If the target is single-float only, always select 32-bit registers.
// Otherwise, if the size of FP registers is 64-bit or Reg is an even number,
// select the 64-bit register class; failing that, select the 32-bit register
// class.
if (VT == MVT::Other) {
if (Subtarget.isSingleFloat())
VT = MVT::f32;
else
VT = (Subtarget.isFP64bit() || !(Reg % 2)) ? MVT::f64 : MVT::f32;
Comment on lines +4302 to +4306

Contributor:
Not sure this is properly covered in the tests; the only asm is a full clobber list, which I'm assuming is there to stress calling convention handling.

Contributor Author:
You are right, I'll prepare a test case for it.

Contributor Author:
I added a new test case for it. Also, while writing that test I discovered another small bug in the inline asm constraint logic; I fixed it in the latest revision and added a test for it as well.

}

RC = getRegClassFor(VT);
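To make the new `$f` register selection easier to reason about when writing the test case discussed above, the decision can be sketched as a standalone function. This is only an illustration: `FpSubtarget`, `FPType` and `defaultFPTypeFor` are hypothetical names, and the two flags stand in for `Subtarget.isSingleFloat()` and `Subtarget.isFP64bit()`.

```cpp
#include <cassert>

enum class FPType { F32, F64 };

// Hypothetical stand-in for the two MipsSubtarget queries used in the hunk.
struct FpSubtarget {
  bool singleFloat; // models Subtarget.isSingleFloat()
  bool fp64bit;     // models Subtarget.isFP64bit()
};

// Mirrors the patched logic for an untyped "$fN" constraint: single-float
// targets always get f32; otherwise 64-bit FPUs or even-numbered registers
// get f64, and odd-numbered registers on 32-bit FPUs get f32.
FPType defaultFPTypeFor(const FpSubtarget &st, unsigned reg) {
  assert(reg < 32 && "$f registers are numbered 0-31");
  if (st.singleFloat)
    return FPType::F32;
  return (st.fp64bit || reg % 2 == 0) ? FPType::F64 : FPType::F32;
}
```

The case that changes with this patch is an even-numbered register on a single-float target, which previously resolved to f64.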

@@ -4338,10 +4344,12 @@ MipsTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
return std::make_pair(0U, &Mips::CPU16RegsRegClass);
return std::make_pair(0U, &Mips::GPR32RegClass);
}
if ((VT == MVT::i64 || (VT == MVT::f64 && Subtarget.useSoftFloat())) &&
if ((VT == MVT::i64 || (VT == MVT::f64 && Subtarget.useSoftFloat()) ||
(VT == MVT::f64 && Subtarget.isSingleFloat())) &&
!Subtarget.isGP64bit())
return std::make_pair(0U, &Mips::GPR32RegClass);
if ((VT == MVT::i64 || (VT == MVT::f64 && Subtarget.useSoftFloat())) &&
if ((VT == MVT::i64 || (VT == MVT::f64 && Subtarget.useSoftFloat()) ||
(VT == MVT::f64 && Subtarget.isSingleFloat())) &&
Subtarget.isGP64bit())
return std::make_pair(0U, &Mips::GPR64RegClass);
// This will generate an error message
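The two widened conditions in this hunk share one predicate and differ only in the GPR width. A compact way to read them is sketched below. All names are hypothetical and the enum covers only the value types relevant here; the real code checks narrower types first and otherwise falls through to an error path that this sketch represents as `None`.

```cpp
enum class ValueType { I32, I64, F32, F64 };
enum class RegClass { None, GPR32, GPR64 };

// Hypothetical stand-in for the MipsSubtarget queries used in the hunk.
struct GprSubtarget {
  bool softFloat;   // models Subtarget.useSoftFloat()
  bool singleFloat; // models Subtarget.isSingleFloat()
  bool gp64bit;     // models Subtarget.isGP64bit()
};

// An f64 value has no FP register to live in on soft-float targets and,
// with this patch, on single-float targets, so it is treated like i64 and
// placed in a general purpose register class.
RegClass wideGPRClassFor(ValueType vt, const GprSubtarget &st) {
  const bool f64InGPR =
      vt == ValueType::F64 && (st.softFloat || st.singleFloat);
  if (vt == ValueType::I64 || f64InGPR)
    return st.gp64bit ? RegClass::GPR64 : RegClass::GPR32;
  return RegClass::None; // stands in for the error path in the original
}
```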
38 changes: 30 additions & 8 deletions llvm/lib/Target/Mips/MipsRegisterInfo.cpp
@@ -102,14 +102,25 @@ MipsRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
: CSR_Interrupt_32_SaveList;
}

if (Subtarget.isSingleFloat())
return CSR_SingleFloatOnly_SaveList;
// N64 ABI
if (Subtarget.isABI_N64()) {
if (Subtarget.isSingleFloat())
return CSR_N64_SingleFloat_SaveList;

if (Subtarget.isABI_N64())
return CSR_N64_SaveList;
}

// N32 ABI
if (Subtarget.isABI_N32()) {
if (Subtarget.isSingleFloat())
return CSR_N32_SingleFloat_SaveList;

if (Subtarget.isABI_N32())
return CSR_N32_SaveList;
}

// O32 ABI
if (Subtarget.isSingleFloat())
return CSR_O32_SingleFloat_SaveList;

if (Subtarget.isFP64bit())
return CSR_O32_FP64_SaveList;
@@ -124,14 +135,25 @@ const uint32_t *
MipsRegisterInfo::getCallPreservedMask(const MachineFunction &MF,
CallingConv::ID) const {
const MipsSubtarget &Subtarget = MF.getSubtarget<MipsSubtarget>();
if (Subtarget.isSingleFloat())
return CSR_SingleFloatOnly_RegMask;
// N64 ABI
if (Subtarget.isABI_N64()) {
if (Subtarget.isSingleFloat())
return CSR_N64_SingleFloat_RegMask;

if (Subtarget.isABI_N64())
return CSR_N64_RegMask;
}

// N32 ABI
if (Subtarget.isABI_N32()) {
if (Subtarget.isSingleFloat())
return CSR_N32_SingleFloat_RegMask;

if (Subtarget.isABI_N32())
return CSR_N32_RegMask;
}

// O32 ABI
if (Subtarget.isSingleFloat())
return CSR_O32_SingleFloat_RegMask;

if (Subtarget.isFP64bit())
return CSR_O32_FP64_RegMask;
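Both functions in this file now pick the ABI first and only then check for single-float, where the old code returned one single-float list before looking at the ABI. The dispatch order can be summarised with the sketch below. The types and the function are hypothetical; the returned strings are the list names from MipsCallingConv.td, and the cases that follow the FP64 check are not visible in the hunk, so they are left as an empty string rather than guessed.

```cpp
#include <string>

enum class ABI { O32, N32, N64 };

// Hypothetical stand-in for the MipsSubtarget queries used in the hunk.
struct CsrSubtarget {
  ABI abi;
  bool singleFloat; // models Subtarget.isSingleFloat()
  bool fp64bit;     // models Subtarget.isFP64bit()
};

// ABI is decided first; single-float then selects the matching variant.
std::string calleeSavedListName(const CsrSubtarget &st) {
  switch (st.abi) {
  case ABI::N64:
    return st.singleFloat ? "CSR_N64_SingleFloat" : "CSR_N64";
  case ABI::N32:
    return st.singleFloat ? "CSR_N32_SingleFloat" : "CSR_N32";
  case ABI::O32:
    break;
  }
  if (st.singleFloat)
    return "CSR_O32_SingleFloat";
  if (st.fp64bit)
    return "CSR_O32_FP64";
  return ""; // remaining O32 cases are outside the lines shown in the diff
}
```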
15 changes: 15 additions & 0 deletions llvm/lib/Target/Mips/MipsSEISelLowering.cpp
@@ -28,6 +28,7 @@
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SelectionDAGNodes.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/CodeGen/TargetLowering.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"
#include "llvm/CodeGen/ValueTypes.h"
#include "llvm/CodeGenTypes/MachineValueType.h"
@@ -211,6 +212,20 @@ MipsSETargetLowering::MipsSETargetLowering(const MipsTargetMachine &TM,
}
}

// Targets with 64-bit integer registers but no 64-bit floating-point
// registers do not support conversion between them.
if (Subtarget.isGP64bit() && Subtarget.isSingleFloat() &&
!Subtarget.useSoftFloat()) {
setOperationAction(ISD::FP_TO_SINT, MVT::i64, Expand);
setOperationAction(ISD::FP_TO_UINT, MVT::i64, Expand);
setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Expand);
setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i64, Expand);
setOperationAction(ISD::SINT_TO_FP, MVT::i64, Expand);
setOperationAction(ISD::UINT_TO_FP, MVT::i64, Expand);
setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i64, Expand);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i64, Expand);
}

setOperationAction(ISD::SMUL_LOHI, MVT::i32, Custom);
setOperationAction(ISD::UMUL_LOHI, MVT::i32, Custom);
setOperationAction(ISD::MULHS, MVT::i32, Custom);
148 changes: 148 additions & 0 deletions llvm/test/CodeGen/Mips/cconv/arguments-hard-single-float-varargs.ll
@@ -0,0 +1,148 @@
; RUN: llc -mtriple=mips -relocation-model=static -mattr=single-float < %s \
; RUN: | FileCheck --check-prefixes=ALL,SYM32,O32 %s
; RUN: llc -mtriple=mipsel -relocation-model=static -mattr=single-float < %s \
; RUN: | FileCheck --check-prefixes=ALL,SYM32,O32 %s

; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n32 -mattr=single-float < %s \
; RUN: | FileCheck --check-prefixes=ALL,SYM32,N32,NEW,NEWBE %s
; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n32 -mattr=single-float < %s \
; RUN: | FileCheck --check-prefixes=ALL,SYM32,N32,NEW,NEWLE %s

; RUN: llc -mtriple=mips64 -relocation-model=static -target-abi n64 -mattr=single-float < %s \
; RUN: | FileCheck --check-prefixes=ALL,SYM64,N64,NEW,NEWBE %s
; RUN: llc -mtriple=mips64el -relocation-model=static -target-abi n64 -mattr=single-float < %s \
; RUN: | FileCheck --check-prefixes=ALL,SYM64,N64,NEW,NEWLE %s

@floats = global [11 x float] zeroinitializer
@doubles = global [11 x double] zeroinitializer

define void @double_args(double %a, ...)
nounwind {
entry:
%0 = getelementptr [11 x double], ptr @doubles, i32 0, i32 1
store volatile double %a, ptr %0

%ap = alloca ptr
call void @llvm.va_start(ptr %ap)
%b = va_arg ptr %ap, double
%1 = getelementptr [11 x double], ptr @doubles, i32 0, i32 2
store volatile double %b, ptr %1
call void @llvm.va_end(ptr %ap)
ret void
}

; ALL-LABEL: double_args:
; We won't test the way the global address is calculated in this test. This is
; just to get the register number for the other checks.
; SYM32-DAG: addiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)
; SYM64-DAG: daddiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(doubles)

; O32 forbids using floating point registers for the non-variable portion.
; N32/N64 allow it.
; O32-DAG: sw $4, 8([[R2]])
; O32-DAG: sw $5, 12([[R2]])
; NEW-DAG: sd $4, 8([[R2]])

; The varargs portion is dumped to stack
; O32-DAG: sw $6, 16($sp)
; O32-DAG: sw $7, 20($sp)
; NEW-DAG: sd $5, 8($sp)
; NEW-DAG: sd $6, 16($sp)
; NEW-DAG: sd $7, 24($sp)
; NEW-DAG: sd $8, 32($sp)
; NEW-DAG: sd $9, 40($sp)
; NEW-DAG: sd $10, 48($sp)
; NEW-DAG: sd $11, 56($sp)

; Get the varargs pointer
; O32 has 4 bytes padding, 4 bytes for the varargs pointer, and 8 bytes reserved
; for arguments 1 and 2.
; N32/N64 has 8 bytes for the varargs pointer, and no reserved area.
; O32-DAG: addiu [[VAPTR:\$[0-9]+]], $sp, 16
; O32-DAG: sw [[VAPTR]], 4($sp)
; N32-DAG: addiu [[VAPTR:\$[0-9]+]], $sp, 8
; N32-DAG: sw [[VAPTR]], 4($sp)
; N64-DAG: daddiu [[VAPTR:\$[0-9]+]], $sp, 8
; N64-DAG: sd [[VAPTR]], 0($sp)

; Increment the pointer then get the varargs arg
; LLVM will rebind the load to the stack pointer instead of the varargs pointer
; during lowering. This is fine and doesn't change the behaviour.
; O32-DAG: addiu [[VAPTR]], [[VAPTR]], 8
; N32-DAG: addiu [[VAPTR]], [[VAPTR]], 8
; N64-DAG: daddiu [[VAPTR]], [[VAPTR]], 8
; O32-DAG: lw [[R3:\$[0-9]+]], 16($sp)
; O32-DAG: lw [[R4:\$[0-9]+]], 20($sp)
; O32-DAG: sw [[R3]], 16([[R2]])
; O32-DAG: sw [[R4]], 20([[R2]])
; NEW-DAG: ld [[R3:\$[0-9]+]], 8($sp)
; NEW-DAG: sd [[R3]], 16([[R2]])

define void @float_args(float %a, ...) nounwind {
entry:
%0 = getelementptr [11 x float], ptr @floats, i32 0, i32 1
store volatile float %a, ptr %0

%ap = alloca ptr
call void @llvm.va_start(ptr %ap)
%b = va_arg ptr %ap, float
%1 = getelementptr [11 x float], ptr @floats, i32 0, i32 2
store volatile float %b, ptr %1
call void @llvm.va_end(ptr %ap)
ret void
}

; ALL-LABEL: float_args:
; We won't test the way the global address is calculated in this test. This is
; just to get the register number for the other checks.
; SYM32-DAG: addiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(floats)
; SYM64-DAG: daddiu [[R2:\$[0-9]+]], ${{[0-9]+}}, %lo(floats)

; The first four arguments are the same in O32/N32/N64.
; The non-variable portion should be unaffected.
; O32-DAG: mtc1 $4, $f0
; O32-DAG: swc1 $f0, 4([[R2]])
; NEW-DAG: swc1 $f12, 4([[R2]])

; The varargs portion is dumped to stack
; O32-DAG: sw $5, 12($sp)
; O32-DAG: sw $6, 16($sp)
; O32-DAG: sw $7, 20($sp)
; NEW-DAG: sd $5, 8($sp)
; NEW-DAG: sd $6, 16($sp)
; NEW-DAG: sd $7, 24($sp)
; NEW-DAG: sd $8, 32($sp)
; NEW-DAG: sd $9, 40($sp)
; NEW-DAG: sd $10, 48($sp)
; NEW-DAG: sd $11, 56($sp)

; Get the varargs pointer
; O32 has 4 bytes padding, 4 bytes for the varargs pointer, and should have 8
; bytes reserved for arguments 1 and 2 (the first float arg) but as discussed in
; arguments-float.ll, GCC doesn't agree with MD00305 and treats floats as 4
; bytes so we only have 12 bytes total.
; N32/N64 has 8 bytes for the varargs pointer, and no reserved area.
; O32-DAG: addiu [[VAPTR:\$[0-9]+]], $sp, 12
; O32-DAG: sw [[VAPTR]], 4($sp)
; N32-DAG: addiu [[VAPTR:\$[0-9]+]], $sp, 8
; N32-DAG: sw [[VAPTR]], 4($sp)
; N64-DAG: daddiu [[VAPTR:\$[0-9]+]], $sp, 8
; N64-DAG: sd [[VAPTR]], 0($sp)

; Increment the pointer then get the varargs arg
; LLVM will rebind the load to the stack pointer instead of the varargs pointer
; during lowering. This is fine and doesn't change the behaviour.
; Also, in big-endian mode the offset must be increased by 4 to retrieve the
; correct half of the argument slot.
;
; O32-DAG: addiu [[VAPTR]], [[VAPTR]], 4
; N32-DAG: addiu [[VAPTR]], [[VAPTR]], 8
; N64-DAG: daddiu [[VAPTR]], [[VAPTR]], 8
; O32-DAG: lwc1 [[FTMP1:\$f[0-9]+]], 12($sp)
; NEWLE-DAG: lwc1 [[FTMP1:\$f[0-9]+]], 8($sp)
; NEWBE-DAG: lwc1 [[FTMP1:\$f[0-9]+]], 12($sp)
; ALL-DAG: swc1 [[FTMP1]], 8([[R2]])
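The differing `lwc1` offsets for NEWLE and NEWBE follow from where a 4-byte float sits inside an 8-byte argument slot. A small C++ sketch of that arithmetic is given below. The function name and parameters are hypothetical, and the model assumes that a value narrower than its slot occupies the high-address end of the slot on big-endian targets, as the test comments describe.

```cpp
#include <cassert>

// Byte offset, relative to the start of the varargs save area, from which
// the index-th variadic argument of argSize bytes is loaded when every
// argument occupies a slotSize-byte slot.
unsigned varargLoadOffset(unsigned index, unsigned slotSize,
                          unsigned argSize, bool bigEndian) {
  assert(argSize <= slotSize && "argument must fit in its slot");
  const unsigned slotStart = index * slotSize;
  // Big-endian: skip the padding that precedes the value within the slot.
  return bigEndian ? slotStart + (slotSize - argSize) : slotStart;
}
```

With the N32/N64 save area starting at 8($sp), a float in the first slot is read from 8 + 0 on little-endian and 8 + 4 on big-endian, matching the 8($sp) and 12($sp) checks. On O32 the slot is 4 bytes, so endianness does not move the load.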

declare void @llvm.va_start(ptr)
declare void @llvm.va_copy(ptr, ptr)
declare void @llvm.va_end(ptr)