Commit 2f18b5e

[AArch64] Add fpext and fpround costs (#119292)
This adds some basic costs for fpext and fpround. Many of these were already handled by the generic costing routines, but this adjusts the costs for larger vector types that can use fcvtn+fcvtn2 rather than fcvtn+fcvtn+concat. The costs should now more closely match the codegen shown at https://godbolt.org/z/r3P9Mf8ez, for example.
1 parent 553058f commit 2f18b5e

4 files changed: +123 -98 lines changed

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

Lines changed: 25 additions & 0 deletions
@@ -2800,6 +2800,31 @@ InstructionCost AArch64TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
       {ISD::SIGN_EXTEND, MVT::v16i32, MVT::v16i8, 6},
       {ISD::ZERO_EXTEND, MVT::v16i32, MVT::v16i8, 6},
 
+      // FP Ext and trunc
+      {ISD::FP_EXTEND, MVT::f64, MVT::f32, 1}, // fcvt
+      {ISD::FP_EXTEND, MVT::v2f64, MVT::v2f32, 1}, // fcvtl
+      {ISD::FP_EXTEND, MVT::v4f64, MVT::v4f32, 2}, // fcvtl+fcvtl2
+      // FP16
+      {ISD::FP_EXTEND, MVT::f32, MVT::f16, 1}, // fcvt
+      {ISD::FP_EXTEND, MVT::f64, MVT::f16, 1}, // fcvt
+      {ISD::FP_EXTEND, MVT::v4f32, MVT::v4f16, 1}, // fcvtl
+      {ISD::FP_EXTEND, MVT::v8f32, MVT::v8f16, 2}, // fcvtl+fcvtl2
+      {ISD::FP_EXTEND, MVT::v2f64, MVT::v2f16, 2}, // fcvtl+fcvtl
+      {ISD::FP_EXTEND, MVT::v4f64, MVT::v4f16, 3}, // fcvtl+fcvtl2+fcvtl
+      {ISD::FP_EXTEND, MVT::v8f64, MVT::v8f16, 6}, // 2 * fcvtl+fcvtl2+fcvtl
+      // FP Ext and trunc
+      {ISD::FP_ROUND, MVT::f32, MVT::f64, 1}, // fcvt
+      {ISD::FP_ROUND, MVT::v2f32, MVT::v2f64, 1}, // fcvtn
+      {ISD::FP_ROUND, MVT::v4f32, MVT::v4f64, 2}, // fcvtn+fcvtn2
+      // FP16
+      {ISD::FP_ROUND, MVT::f16, MVT::f32, 1}, // fcvt
+      {ISD::FP_ROUND, MVT::f16, MVT::f64, 1}, // fcvt
+      {ISD::FP_ROUND, MVT::v4f16, MVT::v4f32, 1}, // fcvtn
+      {ISD::FP_ROUND, MVT::v8f16, MVT::v8f32, 2}, // fcvtn+fcvtn2
+      {ISD::FP_ROUND, MVT::v2f16, MVT::v2f64, 2}, // fcvtn+fcvtn
+      {ISD::FP_ROUND, MVT::v4f16, MVT::v4f64, 3}, // fcvtn+fcvtn2+fcvtn
+      {ISD::FP_ROUND, MVT::v8f16, MVT::v8f64, 6}, // 2 * fcvtn+fcvtn2+fcvtn
+
       // LowerVectorINT_TO_FP:
       {ISD::SINT_TO_FP, MVT::v2f32, MVT::v2i32, 1},
       {ISD::SINT_TO_FP, MVT::v4f32, MVT::v4i32, 1},
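The entries above are rows keyed on (opcode, destination type, source type). As a rough illustration of how such a table can be consulted, here is a minimal, self-contained sketch. The `Op`, `VT`, `CostEntry`, `kTable`, and `lookupCastCost` names are hypothetical stand-ins invented for this example, not LLVM's own types or helpers, and only a handful of rows from the commit are copied in.

```cpp
#include <array>
#include <optional>

// Hypothetical stand-ins for ISD opcodes and MVT simple types (not LLVM's).
enum class Op { FpExtend, FpRound };
enum class VT { v2f16, v4f16, v8f16, v2f32, v4f32, v8f32, v2f64, v4f64, v8f64 };

// One row of the table: converting Src to Dst with Opcode costs Cost.
struct CostEntry {
  Op Opcode;
  VT Dst;
  VT Src;
  unsigned Cost;
};

// A few rows copied from the commit; the rest are omitted for brevity.
constexpr std::array<CostEntry, 6> kTable = {{
    {Op::FpExtend, VT::v2f64, VT::v2f32, 1}, // fcvtl
    {Op::FpExtend, VT::v4f64, VT::v4f32, 2}, // fcvtl+fcvtl2
    {Op::FpExtend, VT::v4f64, VT::v4f16, 3}, // fcvtl+fcvtl2+fcvtl
    {Op::FpRound, VT::v4f32, VT::v4f64, 2},  // fcvtn+fcvtn2
    {Op::FpRound, VT::v2f16, VT::v2f64, 2},  // fcvtn+fcvtn
    {Op::FpRound, VT::v8f16, VT::v8f64, 6},  // 2 * fcvtn+fcvtn2+fcvtn
}};

// Linear search for an exact (opcode, dst, src) match.
// Returns std::nullopt when the table has no row for the conversion,
// in which case a caller would fall back to some generic estimate.
std::optional<unsigned> lookupCastCost(Op Opcode, VT Dst, VT Src) {
  for (const CostEntry &E : kTable)
    if (E.Opcode == Opcode && E.Dst == Dst && E.Src == Src)
      return E.Cost;
  return std::nullopt;
}
```

In this sketch, `lookupCastCost(Op::FpRound, VT::v4f32, VT::v4f64)` yields 2, while a shape with no row, such as `v8f32` to `v8f64`, yields `std::nullopt`.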

llvm/test/Analysis/CostModel/AArch64/cast.ll

Lines changed: 16 additions & 16 deletions
@@ -274,34 +274,34 @@ define i32 @casts_no_users() {
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r69 = uitofp i64 undef to double
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r80 = fptrunc double undef to float
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r81 = fptrunc <2 x double> undef to <2 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r82 = fptrunc <4 x double> undef to <4 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r83 = fptrunc <8 x double> undef to <8 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %r84 = fptrunc <16 x double> undef to <16 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r82 = fptrunc <4 x double> undef to <4 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r83 = fptrunc <8 x double> undef to <8 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r84 = fptrunc <16 x double> undef to <16 x float>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %truncf64f16 = fptrunc double undef to half
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %truncv2f64f16 = fptrunc <2 x double> undef to <2 x half>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %truncv2f64f16 = fptrunc <2 x double> undef to <2 x half>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %truncv4f64f16 = fptrunc <4 x double> undef to <4 x half>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %truncv8f64f16 = fptrunc <8 x double> undef to <8 x half>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %truncv16f64f16 = fptrunc <16 x double> undef to <16 x half>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %truncv8f64f16 = fptrunc <8 x double> undef to <8 x half>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %truncv16f64f16 = fptrunc <16 x double> undef to <16 x half>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %truncv32f16 = fptrunc float undef to half
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %truncv2f32f16 = fptrunc <2 x float> undef to <2 x half>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %truncv4f32f16 = fptrunc <4 x float> undef to <4 x half>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %truncv8f32f16 = fptrunc <8 x float> undef to <8 x half>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %truncv16f32f16 = fptrunc <16 x float> undef to <16 x half>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %truncv8f32f16 = fptrunc <8 x float> undef to <8 x half>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %truncv16f32f16 = fptrunc <16 x float> undef to <16 x half>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r85 = fpext float undef to double
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r86 = fpext <2 x float> undef to <2 x double>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r87 = fpext <4 x float> undef to <4 x double>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r88 = fpext <8 x float> undef to <8 x double>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %r89 = fpext <16 x float> undef to <16 x double>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r87 = fpext <4 x float> undef to <4 x double>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r88 = fpext <8 x float> undef to <8 x double>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r89 = fpext <16 x float> undef to <16 x double>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %extf16f32 = fpext half undef to float
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %extv2f16f32 = fpext <2 x half> undef to <2 x float>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %extv4f16f32 = fpext <4 x half> undef to <4 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %extv8f16f32 = fpext <8 x half> undef to <8 x float>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %extv16f16f32 = fpext <16 x half> undef to <16 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %extv8f16f32 = fpext <8 x half> undef to <8 x float>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %extv16f16f32 = fpext <16 x half> undef to <16 x float>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %extf16f64 = fpext half undef to double
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %extv2f16f64 = fpext <2 x half> undef to <2 x double>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %extv2f16f64 = fpext <2 x half> undef to <2 x double>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %extv4f16f64 = fpext <4 x half> undef to <4 x double>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %extv8f16f64 = fpext <8 x half> undef to <8 x double>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %extv16f16f64 = fpext <16 x half> undef to <16 x double>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %extv8f16f64 = fpext <8 x half> undef to <8 x double>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %extv16f16f64 = fpext <16 x half> undef to <16 x double>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r90 = fptoui <2 x float> undef to <2 x i1>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r91 = fptosi <2 x float> undef to <2 x i1>
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r92 = fptoui <2 x float> undef to <2 x i8>
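The updated expectations for vectors wider than any table row (for example 8 for `<16 x double>` to `<16 x float>`) are consistent with a simple model in which an oversized vector is split in half and the cost of each half is summed. The sketch below illustrates that arithmetic only. It assumes, as a simplification, that splitting adds no extra cost; `Elt`, `tableCost`, and `modelCost` are hypothetical names for this example and do not reproduce LLVM's legalization logic.

```cpp
#include <optional>

// Hypothetical element kinds for this sketch.
enum class Elt { F16, F32, F64 };

// Costs copied from the commit's table. In the commit, the FP_EXTEND and
// FP_ROUND rows for a given pair of types carry the same cost, so one
// function serves both directions here.
std::optional<unsigned> tableCost(Elt Narrow, Elt Wide, unsigned NumElts) {
  if (Narrow == Elt::F32 && Wide == Elt::F64) {
    switch (NumElts) {
    case 1: return 1;
    case 2: return 1;
    case 4: return 2;
    default: return std::nullopt;
    }
  }
  if (Narrow == Elt::F16 && Wide == Elt::F32) {
    switch (NumElts) {
    case 1: return 1;
    case 4: return 1;
    case 8: return 2;
    default: return std::nullopt;
    }
  }
  if (Narrow == Elt::F16 && Wide == Elt::F64) {
    switch (NumElts) {
    case 1: return 1;
    case 2: return 2;
    case 4: return 3;
    case 8: return 6;
    default: return std::nullopt;
    }
  }
  return std::nullopt;
}

// Assumed model: use the table when a row exists; otherwise split a wide
// vector (8 or more elements) into two halves and double the half cost.
// Smaller shapes missing from the table (e.g. 2 x half <-> 2 x float) are
// handled by other paths in LLVM and are out of scope for this sketch.
std::optional<unsigned> modelCost(Elt Narrow, Elt Wide, unsigned NumElts) {
  if (auto Cost = tableCost(Narrow, Wide, NumElts))
    return Cost;
  if (NumElts < 8 || NumElts % 2 != 0)
    return std::nullopt;
  auto Half = modelCost(Narrow, Wide, NumElts / 2);
  if (!Half)
    return std::nullopt;
  return 2 * *Half;
}
```

Under these assumptions, `modelCost(Elt::F32, Elt::F64, 16)` gives 8, `modelCost(Elt::F16, Elt::F32, 16)` gives 4, and `modelCost(Elt::F16, Elt::F64, 16)` gives 12, which agree with the new CHECK lines in the test above.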

llvm/test/Analysis/CostModel/AArch64/sve-cast.ll

Lines changed: 6 additions & 6 deletions
@@ -600,14 +600,14 @@ define i32 @casts_no_users() {
 ; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r69 = uitofp i64 undef to double
 ; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r80 = fptrunc double undef to float
 ; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r81 = fptrunc <2 x double> undef to <2 x float>
-; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r82 = fptrunc <4 x double> undef to <4 x float>
-; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r83 = fptrunc <8 x double> undef to <8 x float>
-; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %r84 = fptrunc <16 x double> undef to <16 x float>
+; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r82 = fptrunc <4 x double> undef to <4 x float>
+; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r83 = fptrunc <8 x double> undef to <8 x float>
+; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r84 = fptrunc <16 x double> undef to <16 x float>
 ; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r85 = fpext float undef to double
 ; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r86 = fpext <2 x float> undef to <2 x double>
-; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r87 = fpext <4 x float> undef to <4 x double>
-; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r88 = fpext <8 x float> undef to <8 x double>
-; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %r89 = fpext <16 x float> undef to <16 x double>
+; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r87 = fpext <4 x float> undef to <4 x double>
+; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r88 = fpext <8 x float> undef to <8 x double>
+; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r89 = fpext <16 x float> undef to <16 x double>
 ; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r90 = fptoui <2 x float> undef to <2 x i1>
 ; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r91 = fptosi <2 x float> undef to <2 x i1>
 ; CHECK-SVE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r92 = fptoui <2 x float> undef to <2 x i8>
