-
Notifications
You must be signed in to change notification settings - Fork 13.6k
PowerPC/VSX: Select FMINNUM and FMAXNUM #135739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
See also: #112195 |
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-powerpc Author: YunQiang Su (wzssyqa) ChangesIn LangRef, we claim that FMINNUM and FMAXNUM should follow the minNum and maxNum operators in IEEE754-2008. PowerPC/VSX does have these instructions XSMINDP and XSMAXDP. Note: FMAXNUM_IEEE and FMINNUM_IEEE will be removed in future. Full diff: https://github.com/llvm/llvm-project/pull/135739.diff 3 Files Affected:
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 1f75425752a78..0ee93b5e8012c 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -777,6 +777,10 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
setOperationAction(ISD::FMAXNUM_IEEE, MVT::f32, Legal);
setOperationAction(ISD::FMINNUM_IEEE, MVT::f64, Legal);
setOperationAction(ISD::FMINNUM_IEEE, MVT::f32, Legal);
+ setOperationAction(ISD::FMAXNUM, MVT::f64, Legal);
+ setOperationAction(ISD::FMAXNUM, MVT::f32, Legal);
+ setOperationAction(ISD::FMINNUM, MVT::f64, Legal);
+ setOperationAction(ISD::FMINNUM, MVT::f32, Legal);
}
if (Subtarget.hasAltivec()) {
diff --git a/llvm/lib/Target/PowerPC/PPCInstrVSX.td b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
index 19448210f5db1..695e28bf1493c 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrVSX.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
@@ -2731,6 +2731,8 @@ def : Pat<(f32 (fneg (fabs f32:$S))),
(COPY_TO_REGCLASS $S, VSFRC)), VSSRC))>;
// f32 Min.
+def : Pat<(f32 (fminnum f32:$A, f32:$B)),
+ (f32 FpMinMax.F32Min)>;
def : Pat<(f32 (fminnum_ieee f32:$A, f32:$B)),
(f32 FpMinMax.F32Min)>;
def : Pat<(f32 (fminnum_ieee (fcanonicalize f32:$A), f32:$B)),
@@ -2742,6 +2744,8 @@ def : Pat<(f32 (fminnum_ieee (fcanonicalize f32:$A), (fcanonicalize f32:$B))),
// F32 Max.
def : Pat<(f32 (fmaxnum_ieee f32:$A, f32:$B)),
(f32 FpMinMax.F32Max)>;
+def : Pat<(f32 (fmaxnum f32:$A, f32:$B)),
+ (f32 FpMinMax.F32Max)>;
def : Pat<(f32 (fmaxnum_ieee (fcanonicalize f32:$A), f32:$B)),
(f32 FpMinMax.F32Max)>;
def : Pat<(f32 (fmaxnum_ieee f32:$A, (fcanonicalize f32:$B))),
@@ -2750,6 +2754,8 @@ def : Pat<(f32 (fmaxnum_ieee (fcanonicalize f32:$A), (fcanonicalize f32:$B))),
(f32 FpMinMax.F32Max)>;
// f64 Min.
+def : Pat<(f64 (fminnum f64:$A, f64:$B)),
+ (f64 (XSMINDP $A, $B))>;
def : Pat<(f64 (fminnum_ieee f64:$A, f64:$B)),
(f64 (XSMINDP $A, $B))>;
def : Pat<(f64 (fminnum_ieee (fcanonicalize f64:$A), f64:$B)),
@@ -2759,6 +2765,8 @@ def : Pat<(f64 (fminnum_ieee f64:$A, (fcanonicalize f64:$B))),
def : Pat<(f64 (fminnum_ieee (fcanonicalize f64:$A), (fcanonicalize f64:$B))),
(f64 (XSMINDP $A, $B))>;
// f64 Max.
+def : Pat<(f64 (fmaxnum f64:$A, f64:$B)),
+ (f64 (XSMAXDP $A, $B))>;
def : Pat<(f64 (fmaxnum_ieee f64:$A, f64:$B)),
(f64 (XSMAXDP $A, $B))>;
def : Pat<(f64 (fmaxnum_ieee (fcanonicalize f64:$A), f64:$B)),
diff --git a/llvm/test/CodeGen/PowerPC/scalar-min-max.ll b/llvm/test/CodeGen/PowerPC/scalar-min-max.ll
index 216d498e85411..f6ea0d9cc2328 100644
--- a/llvm/test/CodeGen/PowerPC/scalar-min-max.ll
+++ b/llvm/test/CodeGen/PowerPC/scalar-min-max.ll
@@ -117,13 +117,12 @@ define dso_local float @testfmax_fast(float %a, float %b) local_unnamed_addr {
;
; NO-FAST-P9-LABEL: testfmax_fast:
; NO-FAST-P9: # %bb.0: # %entry
-; NO-FAST-P9-NEXT: xsmaxcdp f1, f1, f2
+; NO-FAST-P9-NEXT: xsmaxdp f1, f1, f2
; NO-FAST-P9-NEXT: blr
;
; NO-FAST-P8-LABEL: testfmax_fast:
; NO-FAST-P8: # %bb.0: # %entry
-; NO-FAST-P8-NEXT: xssubsp f0, f2, f1
-; NO-FAST-P8-NEXT: fsel f1, f0, f2, f1
+; NO-FAST-P8-NEXT: xsmaxdp f1, f1, f2
; NO-FAST-P8-NEXT: blr
entry:
%cmp = fcmp nnan ninf ogt float %a, %b
@@ -138,13 +137,12 @@ define dso_local double @testdmax_fast(double %a, double %b) local_unnamed_addr
;
; NO-FAST-P9-LABEL: testdmax_fast:
; NO-FAST-P9: # %bb.0: # %entry
-; NO-FAST-P9-NEXT: xsmaxcdp f1, f1, f2
+; NO-FAST-P9-NEXT: xsmaxdp f1, f1, f2
; NO-FAST-P9-NEXT: blr
;
; NO-FAST-P8-LABEL: testdmax_fast:
; NO-FAST-P8: # %bb.0: # %entry
-; NO-FAST-P8-NEXT: xssubdp f0, f2, f1
-; NO-FAST-P8-NEXT: fsel f1, f0, f2, f1
+; NO-FAST-P8-NEXT: xsmaxdp f1, f1, f2
; NO-FAST-P8-NEXT: blr
entry:
%cmp = fcmp nnan ninf ogt double %a, %b
@@ -159,13 +157,12 @@ define dso_local float @testfmin_fast(float %a, float %b) local_unnamed_addr {
;
; NO-FAST-P9-LABEL: testfmin_fast:
; NO-FAST-P9: # %bb.0: # %entry
-; NO-FAST-P9-NEXT: xsmincdp f1, f1, f2
+; NO-FAST-P9-NEXT: xsmindp f1, f1, f2
; NO-FAST-P9-NEXT: blr
;
; NO-FAST-P8-LABEL: testfmin_fast:
; NO-FAST-P8: # %bb.0: # %entry
-; NO-FAST-P8-NEXT: xssubsp f0, f1, f2
-; NO-FAST-P8-NEXT: fsel f1, f0, f2, f1
+; NO-FAST-P8-NEXT: xsmindp f1, f1, f2
; NO-FAST-P8-NEXT: blr
entry:
%cmp = fcmp nnan ninf olt float %a, %b
@@ -180,13 +177,12 @@ define dso_local double @testdmin_fast(double %a, double %b) local_unnamed_addr
;
; NO-FAST-P9-LABEL: testdmin_fast:
; NO-FAST-P9: # %bb.0: # %entry
-; NO-FAST-P9-NEXT: xsmincdp f1, f1, f2
+; NO-FAST-P9-NEXT: xsmindp f1, f1, f2
; NO-FAST-P9-NEXT: blr
;
; NO-FAST-P8-LABEL: testdmin_fast:
; NO-FAST-P8: # %bb.0: # %entry
-; NO-FAST-P8-NEXT: xssubdp f0, f1, f2
-; NO-FAST-P8-NEXT: fsel f1, f0, f2, f1
+; NO-FAST-P8-NEXT: xsmindp f1, f1, f2
; NO-FAST-P8-NEXT: blr
entry:
%cmp = fcmp nnan ninf olt double %a, %b
|
ping |
def : Pat<(f32 (fminnum f32:$A, f32:$B)), | ||
(f32 FpMinMax.F32Min)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect a more direct replacement, removing the legality of the IEEE variants. At least should avoid duplicating the patterns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some steps to remove IEEE variants:
- Support fminnum/fmaxnum on all architectures with fminnum_ieee/fmaxnum_ieee.
- Remove all the references of FMAXNUM_IEEE/FMINNUM_IEEE in the common code
- Remove fminnum_ieee/fmaxnum_ieee in the code of backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a patch for the first step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a need to have a target have coexisting support for both fminnum and fminnum_ieee. We can fully migrate the target to the new behavior in one go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But we need sometime to switch FMAXNUM_IEEE to FMAXNUM in common code, such as in TargetLowering.cpp
etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we drop fminnum_ieee/fmaxnum_ieee, we will have some z-turn of the test suite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
z-turn?
But we need sometime to switch FMAXNUM_IEEE to FMAXNUM
Yes, but you can get there by removing the use a target at a time. Here if you make FMAXNUM legal, and stop making FMAXNUM_IEEE legal, the target starts interpreting the nodes in the new way and stops using FMAXNUM_IEEE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
z-turn?
I mean we will need to change the testcases, and then will need to change them back.
But we need sometime to switch FMAXNUM_IEEE to FMAXNUM
Yes, but you can get there by removing the use a target at a time. Here if you make FMAXNUM legal, and stop making FMAXNUM_IEEE legal, the target starts interpreting the nodes in the new way and stops using FMAXNUM_IEEE.
Yes. But we haven't change the code in arch-indepent code.
I will have a try with PowerPC/VSX.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a try. If we drop FMAXNUM_IEEE/FMINNUM_IEEE now,
llvm/test/CodeGen/PowerPC/fminimum-fmaximum.ll
will not emit correct instructions, as
expandFMINIMUM_FMAXIMUM
uses FMINNUM_IEEE/FMAXNUM_IEEE.
2f89d4d
to
3de59f1
Compare
3de59f1
to
372784d
Compare
Let's wait for #137367 |
PowerPC with VSX has vector instructions: XVMAXSP/XVMINSP/XVMAXDP/XVMINDP which follow the semantics of minNUM/maxNUM of IEEE754-2008; and scaler instructions XSMINDP/XSMAXDP which also follow semantics of minNUM/maxNUM of IEEE754-2008. Let's use them to define FMAXNUM_IEEE and FMINNUM_IEEE. Currently, some Pat<(f64 (fminnum_ieee (fcanonicalize .. are defined. They are not correct. Let's remove them. In the future patch, we will define fcanonicalize for PowerPC/VSX, then `fminimunnum/fmaximumnum` will be usable.
This reverts commit dbb3904.
In LangRef, we claim that FMINNUM and FMAXNUM should follow the minNum and maxNum operators in IEEE754-2008. PowerPC/VSX does have these instructions XSMINDP and XSMAXDP.
In LangRef, we claim that FMINNUM and FMAXNUM should follow the minNum and maxNum operators in IEEE754-2008.
PowerPC/VSX does have these instructions XSMINDP and XSMAXDP.
Note: FMAXNUM_IEEE and FMINNUM_IEEE will be removed in future.