Skip to content

PowerPC/VSX: Select FMINNUM and FMAXNUM #135739

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

wzssyqa
Copy link
Contributor

@wzssyqa wzssyqa commented Apr 15, 2025

In LangRef, we claim that FMINNUM and FMAXNUM should follow the minNum and maxNum operators in IEEE754-2008.

PowerPC/VSX does have these instructions XSMINDP and XSMAXDP.

Note: FMAXNUM_IEEE and FMINNUM_IEEE will be removed in future.

@wzssyqa
Copy link
Contributor Author

wzssyqa commented Apr 15, 2025

See also: #112195

@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-powerpc

Author: YunQiang Su (wzssyqa)

Changes

In LangRef, we claim that FMINNUM and FMAXNUM should follow the minNum and maxNum operators in IEEE754-2008.

PowerPC/VSX does have these instructions XSMINDP and XSMAXDP.

Note: FMAXNUM_IEEE and FMINNUM_IEEE will be removed in future.


Full diff: https://github.com/llvm/llvm-project/pull/135739.diff

3 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCISelLowering.cpp (+4)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrVSX.td (+8)
  • (modified) llvm/test/CodeGen/PowerPC/scalar-min-max.ll (+8-12)
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 1f75425752a78..0ee93b5e8012c 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -777,6 +777,10 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
     setOperationAction(ISD::FMAXNUM_IEEE, MVT::f32, Legal);
     setOperationAction(ISD::FMINNUM_IEEE, MVT::f64, Legal);
     setOperationAction(ISD::FMINNUM_IEEE, MVT::f32, Legal);
+    setOperationAction(ISD::FMAXNUM, MVT::f64, Legal);
+    setOperationAction(ISD::FMAXNUM, MVT::f32, Legal);
+    setOperationAction(ISD::FMINNUM, MVT::f64, Legal);
+    setOperationAction(ISD::FMINNUM, MVT::f32, Legal);
   }
 
   if (Subtarget.hasAltivec()) {
diff --git a/llvm/lib/Target/PowerPC/PPCInstrVSX.td b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
index 19448210f5db1..695e28bf1493c 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrVSX.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
@@ -2731,6 +2731,8 @@ def : Pat<(f32 (fneg (fabs f32:$S))),
                (COPY_TO_REGCLASS $S, VSFRC)), VSSRC))>;
 
 // f32 Min.
+def : Pat<(f32 (fminnum f32:$A, f32:$B)),
+          (f32 FpMinMax.F32Min)>;
 def : Pat<(f32 (fminnum_ieee f32:$A, f32:$B)),
           (f32 FpMinMax.F32Min)>;
 def : Pat<(f32 (fminnum_ieee (fcanonicalize f32:$A), f32:$B)),
@@ -2742,6 +2744,8 @@ def : Pat<(f32 (fminnum_ieee (fcanonicalize f32:$A), (fcanonicalize f32:$B))),
 // F32 Max.
 def : Pat<(f32 (fmaxnum_ieee f32:$A, f32:$B)),
           (f32 FpMinMax.F32Max)>;
+def : Pat<(f32 (fmaxnum f32:$A, f32:$B)),
+          (f32 FpMinMax.F32Max)>;
 def : Pat<(f32 (fmaxnum_ieee (fcanonicalize f32:$A), f32:$B)),
           (f32 FpMinMax.F32Max)>;
 def : Pat<(f32 (fmaxnum_ieee f32:$A, (fcanonicalize f32:$B))),
@@ -2750,6 +2754,8 @@ def : Pat<(f32 (fmaxnum_ieee (fcanonicalize f32:$A), (fcanonicalize f32:$B))),
           (f32 FpMinMax.F32Max)>;
 
 // f64 Min.
+def : Pat<(f64 (fminnum f64:$A, f64:$B)),
+          (f64 (XSMINDP $A, $B))>;
 def : Pat<(f64 (fminnum_ieee f64:$A, f64:$B)),
           (f64 (XSMINDP $A, $B))>;
 def : Pat<(f64 (fminnum_ieee (fcanonicalize f64:$A), f64:$B)),
@@ -2759,6 +2765,8 @@ def : Pat<(f64 (fminnum_ieee f64:$A, (fcanonicalize f64:$B))),
 def : Pat<(f64 (fminnum_ieee (fcanonicalize f64:$A), (fcanonicalize f64:$B))),
           (f64 (XSMINDP $A, $B))>;
 // f64 Max.
+def : Pat<(f64 (fmaxnum f64:$A, f64:$B)),
+          (f64 (XSMAXDP $A, $B))>;
 def : Pat<(f64 (fmaxnum_ieee f64:$A, f64:$B)),
           (f64 (XSMAXDP $A, $B))>;
 def : Pat<(f64 (fmaxnum_ieee (fcanonicalize f64:$A), f64:$B)),
diff --git a/llvm/test/CodeGen/PowerPC/scalar-min-max.ll b/llvm/test/CodeGen/PowerPC/scalar-min-max.ll
index 216d498e85411..f6ea0d9cc2328 100644
--- a/llvm/test/CodeGen/PowerPC/scalar-min-max.ll
+++ b/llvm/test/CodeGen/PowerPC/scalar-min-max.ll
@@ -117,13 +117,12 @@ define dso_local float @testfmax_fast(float %a, float %b) local_unnamed_addr {
 ;
 ; NO-FAST-P9-LABEL: testfmax_fast:
 ; NO-FAST-P9:       # %bb.0: # %entry
-; NO-FAST-P9-NEXT:    xsmaxcdp f1, f1, f2
+; NO-FAST-P9-NEXT:    xsmaxdp f1, f1, f2
 ; NO-FAST-P9-NEXT:    blr
 ;
 ; NO-FAST-P8-LABEL: testfmax_fast:
 ; NO-FAST-P8:       # %bb.0: # %entry
-; NO-FAST-P8-NEXT:    xssubsp f0, f2, f1
-; NO-FAST-P8-NEXT:    fsel f1, f0, f2, f1
+; NO-FAST-P8-NEXT:    xsmaxdp f1, f1, f2
 ; NO-FAST-P8-NEXT:    blr
 entry:
   %cmp = fcmp nnan ninf ogt float %a, %b
@@ -138,13 +137,12 @@ define dso_local double @testdmax_fast(double %a, double %b) local_unnamed_addr
 ;
 ; NO-FAST-P9-LABEL: testdmax_fast:
 ; NO-FAST-P9:       # %bb.0: # %entry
-; NO-FAST-P9-NEXT:    xsmaxcdp f1, f1, f2
+; NO-FAST-P9-NEXT:    xsmaxdp f1, f1, f2
 ; NO-FAST-P9-NEXT:    blr
 ;
 ; NO-FAST-P8-LABEL: testdmax_fast:
 ; NO-FAST-P8:       # %bb.0: # %entry
-; NO-FAST-P8-NEXT:    xssubdp f0, f2, f1
-; NO-FAST-P8-NEXT:    fsel f1, f0, f2, f1
+; NO-FAST-P8-NEXT:    xsmaxdp f1, f1, f2
 ; NO-FAST-P8-NEXT:    blr
 entry:
   %cmp = fcmp nnan ninf ogt double %a, %b
@@ -159,13 +157,12 @@ define dso_local float @testfmin_fast(float %a, float %b) local_unnamed_addr {
 ;
 ; NO-FAST-P9-LABEL: testfmin_fast:
 ; NO-FAST-P9:       # %bb.0: # %entry
-; NO-FAST-P9-NEXT:    xsmincdp f1, f1, f2
+; NO-FAST-P9-NEXT:    xsmindp f1, f1, f2
 ; NO-FAST-P9-NEXT:    blr
 ;
 ; NO-FAST-P8-LABEL: testfmin_fast:
 ; NO-FAST-P8:       # %bb.0: # %entry
-; NO-FAST-P8-NEXT:    xssubsp f0, f1, f2
-; NO-FAST-P8-NEXT:    fsel f1, f0, f2, f1
+; NO-FAST-P8-NEXT:    xsmindp f1, f1, f2
 ; NO-FAST-P8-NEXT:    blr
 entry:
   %cmp = fcmp nnan ninf olt float %a, %b
@@ -180,13 +177,12 @@ define dso_local double @testdmin_fast(double %a, double %b) local_unnamed_addr
 ;
 ; NO-FAST-P9-LABEL: testdmin_fast:
 ; NO-FAST-P9:       # %bb.0: # %entry
-; NO-FAST-P9-NEXT:    xsmincdp f1, f1, f2
+; NO-FAST-P9-NEXT:    xsmindp f1, f1, f2
 ; NO-FAST-P9-NEXT:    blr
 ;
 ; NO-FAST-P8-LABEL: testdmin_fast:
 ; NO-FAST-P8:       # %bb.0: # %entry
-; NO-FAST-P8-NEXT:    xssubdp f0, f1, f2
-; NO-FAST-P8-NEXT:    fsel f1, f0, f2, f1
+; NO-FAST-P8-NEXT:    xsmindp f1, f1, f2
 ; NO-FAST-P8-NEXT:    blr
 entry:
   %cmp = fcmp nnan ninf olt double %a, %b

@wzssyqa wzssyqa requested review from arsenm and lei137 April 15, 2025 03:12
@wzssyqa
Copy link
Contributor Author

wzssyqa commented Apr 25, 2025

ping

Comment on lines 2734 to 2742
def : Pat<(f32 (fminnum f32:$A, f32:$B)),
(f32 FpMinMax.F32Min)>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect a more direct replacement, removing the legality of the IEEE variants. At least should avoid duplicating the patterns

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need some steps to remove IEEE variants:

  • Support fminnum/fmaxnum on all architectures with fminnum_ieee/fmaxnum_ieee.
  • Remove all the references of FMAXNUM_IEEE/FMINNUM_IEEE in the common code
  • Remove fminnum_ieee/fmaxnum_ieee in the code of backend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a patch for the first step.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a need to have a target have coexisting support for both fminnum and fminnum_ieee. We can fully migrate the target to the new behavior in one go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But we need sometime to switch FMAXNUM_IEEE to FMAXNUM in common code, such as in TargetLowering.cpp etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we drop fminnum_ieee/fmaxnum_ieee, we will have some z-turn of the test suite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

z-turn?

But we need sometime to switch FMAXNUM_IEEE to FMAXNUM

Yes, but you can get there by removing the use a target at a time. Here if you make FMAXNUM legal, and stop making FMAXNUM_IEEE legal, the target starts interpreting the nodes in the new way and stops using FMAXNUM_IEEE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

z-turn?

I mean we will need to change the testcases, and then will need to change them back.

But we need sometime to switch FMAXNUM_IEEE to FMAXNUM

Yes, but you can get there by removing the use a target at a time. Here if you make FMAXNUM legal, and stop making FMAXNUM_IEEE legal, the target starts interpreting the nodes in the new way and stops using FMAXNUM_IEEE.

Yes. But we haven't change the code in arch-indepent code.
I will have a try with PowerPC/VSX.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a try. If we drop FMAXNUM_IEEE/FMINNUM_IEEE now,
llvm/test/CodeGen/PowerPC/fminimum-fmaximum.ll will not emit correct instructions, as
expandFMINIMUM_FMAXIMUM uses FMINNUM_IEEE/FMAXNUM_IEEE.

@wzssyqa wzssyqa requested a review from arsenm April 25, 2025 12:28
@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Apr 25, 2025
@wzssyqa wzssyqa marked this pull request as draft April 25, 2025 17:19
@wzssyqa
Copy link
Contributor Author

wzssyqa commented Apr 25, 2025

Let's wait for #137367

wzssyqa added 5 commits May 8, 2025 10:14
PowerPC with VSX has vector instructions:
   XVMAXSP/XVMINSP/XVMAXDP/XVMINDP
which follow the semantics of minNUM/maxNUM of IEEE754-2008;

and scaler instructions
   XSMINDP/XSMAXDP
which also follow semantics of minNUM/maxNUM of IEEE754-2008.

Let's use them to define FMAXNUM_IEEE and FMINNUM_IEEE.

Currently, some
   Pat<(f64 (fminnum_ieee (fcanonicalize ..
are defined. They are not correct. Let's remove them.
In the future patch, we will define fcanonicalize for PowerPC/VSX,
then `fminimunnum/fmaximumnum` will be usable.
In LangRef, we claim that FMINNUM and FMAXNUM should follow the
minNum and maxNum operators in IEEE754-2008.

PowerPC/VSX does have these instructions XSMINDP and XSMAXDP.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:PowerPC llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants