Commit 3675761

[NVVM] Upgrade nvvm.ptr.* intrinsics to addrspace cast (#109710)

Remove the following intrinsics, which can be trivially replaced with an `addrspacecast`:

* llvm.nvvm.ptr.gen.to.global
* llvm.nvvm.ptr.gen.to.shared
* llvm.nvvm.ptr.gen.to.constant
* llvm.nvvm.ptr.gen.to.local
* llvm.nvvm.ptr.global.to.gen
* llvm.nvvm.ptr.shared.to.gen
* llvm.nvvm.ptr.constant.to.gen
* llvm.nvvm.ptr.local.to.gen

Also, clean up the NVPTX lowering of `addrspacecast`, making it more concise.
1 parent 8b5e841 commit 3675761

10 files changed, +146 -228 lines changed


llvm/docs/NVPTXUsage.rst

Lines changed: 0 additions & 63 deletions
@@ -127,69 +127,6 @@ Example: 64-bit PTX for CUDA Driver API: ``nvptx64-nvidia-cuda``
 NVPTX Intrinsics
 ================

-Address Space Conversion
-------------------------
-
-'``llvm.nvvm.ptr.*.to.gen``' Intrinsics
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Syntax:
-"""""""
-
-These are overloaded intrinsics. You can use these on any pointer types.
-
-.. code-block:: llvm
-
-    declare ptr @llvm.nvvm.ptr.global.to.gen.p0.p1(ptr addrspace(1))
-    declare ptr @llvm.nvvm.ptr.shared.to.gen.p0.p3(ptr addrspace(3))
-    declare ptr @llvm.nvvm.ptr.constant.to.gen.p0.p4(ptr addrspace(4))
-    declare ptr @llvm.nvvm.ptr.local.to.gen.p0.p5(ptr addrspace(5))
-
-Overview:
-"""""""""
-
-The '``llvm.nvvm.ptr.*.to.gen``' intrinsics convert a pointer in a non-generic
-address space to a generic address space pointer.
-
-Semantics:
-""""""""""
-
-These intrinsics modify the pointer value to be a valid generic address space
-pointer.
-
-
-'``llvm.nvvm.ptr.gen.to.*``' Intrinsics
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Syntax:
-"""""""
-
-These are overloaded intrinsics. You can use these on any pointer types.
-
-.. code-block:: llvm
-
-    declare ptr addrspace(1) @llvm.nvvm.ptr.gen.to.global.p1.p0(ptr)
-    declare ptr addrspace(3) @llvm.nvvm.ptr.gen.to.shared.p3.p0(ptr)
-    declare ptr addrspace(4) @llvm.nvvm.ptr.gen.to.constant.p4.p0(ptr)
-    declare ptr addrspace(5) @llvm.nvvm.ptr.gen.to.local.p5.p0(ptr)
-
-Overview:
-"""""""""
-
-The '``llvm.nvvm.ptr.gen.to.*``' intrinsics convert a pointer in the generic
-address space to a pointer in the target address space. Note that these
-intrinsics are only useful if the address space of the target address space of
-the pointer is known. It is not legal to use address space conversion
-intrinsics to convert a pointer from one non-generic address space to another
-non-generic address space.
-
-Semantics:
-""""""""""
-
-These intrinsics modify the pointer value to be a valid pointer in the target
-non-generic address space.
-
-
 Reading PTX Special Registers
 -----------------------------

llvm/docs/ReleaseNotes.rst

Lines changed: 12 additions & 0 deletions
@@ -69,6 +69,18 @@ Changes to the LLVM IR
   * ``llvm.nvvm.rotate.right.b64``
   * ``llvm.nvvm.rotate.b64``

+* Remove the following intrinsics which can be replaced with an
+  ``addrspacecast``:
+
+  * ``llvm.nvvm.ptr.gen.to.global``
+  * ``llvm.nvvm.ptr.gen.to.shared``
+  * ``llvm.nvvm.ptr.gen.to.constant``
+  * ``llvm.nvvm.ptr.gen.to.local``
+  * ``llvm.nvvm.ptr.global.to.gen``
+  * ``llvm.nvvm.ptr.shared.to.gen``
+  * ``llvm.nvvm.ptr.constant.to.gen``
+  * ``llvm.nvvm.ptr.local.to.gen``
+
 Changes to LLVM infrastructure
 ------------------------------

llvm/include/llvm/IR/IntrinsicsNVVM.td

Lines changed: 12 additions & 38 deletions
@@ -30,10 +30,18 @@
 // * llvm.nvvm.max.ui --> select(x ule y, x, y)
 // * llvm.nvvm.max.ull --> ibid.
 // * llvm.nvvm.h2f --> llvm.convert.to.fp16.f32
-// * llvm.nvvm.bitcast.f2i --> bitcast
-// * llvm.nvvm.bitcast.i2f --> ibid.
-// * llvm.nvvm.bitcast.d2ll --> ibid.
-// * llvm.nvvm.bitcast.ll2d --> ibid.
+// * llvm.nvvm.bitcast.f2i --> bitcast
+// * llvm.nvvm.bitcast.i2f --> ibid.
+// * llvm.nvvm.bitcast.d2ll --> ibid.
+// * llvm.nvvm.bitcast.ll2d --> ibid.
+// * llvm.nvvm.ptr.gen.to.global --> addrspacecast
+// * llvm.nvvm.ptr.gen.to.shared --> ibid.
+// * llvm.nvvm.ptr.gen.to.constant --> ibid.
+// * llvm.nvvm.ptr.gen.to.local --> ibid.
+// * llvm.nvvm.ptr.global.to.gen --> ibid.
+// * llvm.nvvm.ptr.shared.to.gen --> ibid.
+// * llvm.nvvm.ptr.constant.to.gen --> ibid.
+// * llvm.nvvm.ptr.local.to.gen --> ibid.

 def llvm_global_ptr_ty : LLVMQualPointerType<1>; // (global)ptr
 def llvm_shared_ptr_ty : LLVMQualPointerType<3>; // (shared)ptr
@@ -1602,40 +1610,6 @@ def int_nvvm_ldg_global_p : Intrinsic<[llvm_anyptr_ty],
     [IntrReadMem, IntrArgMemOnly, IntrNoCallback, IntrWillReturn, NoCapture<ArgIndex<0>>],
     "llvm.nvvm.ldg.global.p">;

-// Use for generic pointers
-// - These intrinsics are used to convert address spaces.
-// - The input pointer and output pointer must have the same type, except for
-//   the address-space. (This restriction is not enforced here as there is
-//   currently no way to describe it).
-// - This complements the llvm bitcast, which can be used to cast one type
-//   of pointer to another type of pointer, while the address space remains
-//   the same.
-def int_nvvm_ptr_local_to_gen: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-    [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-    "llvm.nvvm.ptr.local.to.gen">;
-def int_nvvm_ptr_shared_to_gen: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-    [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-    "llvm.nvvm.ptr.shared.to.gen">;
-def int_nvvm_ptr_global_to_gen: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-    [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-    "llvm.nvvm.ptr.global.to.gen">;
-def int_nvvm_ptr_constant_to_gen: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-    [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-    "llvm.nvvm.ptr.constant.to.gen">;
-
-def int_nvvm_ptr_gen_to_global: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-    [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-    "llvm.nvvm.ptr.gen.to.global">;
-def int_nvvm_ptr_gen_to_shared: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-    [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-    "llvm.nvvm.ptr.gen.to.shared">;
-def int_nvvm_ptr_gen_to_local: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-    [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-    "llvm.nvvm.ptr.gen.to.local">;
-def int_nvvm_ptr_gen_to_constant: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-    [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-    "llvm.nvvm.ptr.gen.to.constant">;
-
 // Used in nvvm internally to help address space opt and ptx code generation
 // This is for params that are passed to kernel functions by pointer by-val.
 def int_nvvm_ptr_gen_to_param: Intrinsic<[llvm_anyptr_ty],

llvm/lib/IR/AutoUpgrade.cpp

Lines changed: 19 additions & 0 deletions
@@ -1275,6 +1275,16 @@ static bool upgradeIntrinsicFunction1(Function *F, Function *&NewFn,
       else if (Name.consume_front("rotate."))
         // nvvm.rotate.{b32,b64,right.b64}
         Expand = Name == "b32" || Name == "b64" || Name == "right.b64";
+      else if (Name.consume_front("ptr.gen.to."))
+        // nvvm.ptr.gen.to.{local,shared,global,constant}
+        Expand = Name.starts_with("local") || Name.starts_with("shared") ||
+                 Name.starts_with("global") || Name.starts_with("constant");
+      else if (Name.consume_front("ptr."))
+        // nvvm.ptr.{local,shared,global,constant}.to.gen
+        Expand =
+            (Name.consume_front("local") || Name.consume_front("shared") ||
+             Name.consume_front("global") || Name.consume_front("constant")) &&
+            Name.starts_with(".to.gen");
       else
         Expand = false;

@@ -2338,6 +2348,15 @@ static Value *upgradeNVVMIntrinsicCall(StringRef Name, CallBase *CI,
     Value *ZExtShiftAmt = Builder.CreateZExt(CI->getOperand(1), Int64Ty);
     Rep = Builder.CreateIntrinsic(Int64Ty, Intrinsic::fshr,
                                   {Arg, Arg, ZExtShiftAmt});
+  } else if ((Name.consume_front("ptr.gen.to.") &&
+              (Name.starts_with("local") || Name.starts_with("shared") ||
+               Name.starts_with("global") || Name.starts_with("constant"))) ||
+             (Name.consume_front("ptr.") &&
+              (Name.consume_front("local") || Name.consume_front("shared") ||
+               Name.consume_front("global") ||
+               Name.consume_front("constant")) &&
+              Name.starts_with(".to.gen"))) {
+    Rep = Builder.CreateAddrSpaceCast(CI->getArgOperand(0), CI->getType());
   } else {
     Intrinsic::ID IID = shouldUpgradeNVPTXBF16Intrinsic(Name);
     if (IID != Intrinsic::not_intrinsic &&
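
The name matching in the two hunks above can be tried out in isolation. The following is a minimal sketch that assumes only the standard library: `consumeFront` and `startsWith` are hypothetical stand-ins for `StringRef::consume_front` and `StringRef::starts_with`, and `isRemovedPtrConversion` is a hypothetical wrapper, not a function in AutoUpgrade.cpp. It expects the name with the `llvm.nvvm.` prefix already stripped, as the comments in the diff suggest.

```cpp
#include <string_view>

// Stand-in for StringRef::starts_with.
static bool startsWith(std::string_view Name, std::string_view Prefix) {
  return Name.substr(0, Prefix.size()) == Prefix;
}

// Stand-in for StringRef::consume_front: drop Prefix from Name if present.
static bool consumeFront(std::string_view &Name, std::string_view Prefix) {
  if (!startsWith(Name, Prefix))
    return false;
  Name.remove_prefix(Prefix.size());
  return true;
}

// Returns true if Name (without "llvm.nvvm.") is one of the eight removed
// conversions. Prefix matching is used for the address-space part so that
// overload suffixes such as ".p1.p0" are accepted.
bool isRemovedPtrConversion(std::string_view Name) {
  constexpr std::string_view Spaces[] = {"local", "shared", "global",
                                         "constant"};

  // "ptr.gen.to." must be tested before the shorter "ptr." prefix;
  // otherwise "ptr.gen.to.global" would take the second branch and fail.
  if (consumeFront(Name, "ptr.gen.to.")) {
    for (std::string_view Space : Spaces)
      if (startsWith(Name, Space))
        return true;
    return false; // e.g. ptr.gen.to.param stays an intrinsic
  }

  if (consumeFront(Name, "ptr.")) {
    for (std::string_view Space : Spaces)
      if (consumeFront(Name, Space))
        return startsWith(Name, ".to.gen");
    return false;
  }

  return false;
}
```

With this sketch, `ptr.gen.to.global.p1.p0` and `ptr.shared.to.gen.p0.p3` are recognized, whereas `ptr.gen.to.param` and `ptr.global.to.shared` are not.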

llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp

Lines changed: 28 additions & 30 deletions
@@ -1109,38 +1109,38 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
   AddrSpaceCastSDNode *CastN = cast<AddrSpaceCastSDNode>(N);
   unsigned SrcAddrSpace = CastN->getSrcAddressSpace();
   unsigned DstAddrSpace = CastN->getDestAddressSpace();
+  SDLoc DL(N);
   assert(SrcAddrSpace != DstAddrSpace &&
          "addrspacecast must be between different address spaces");

   if (DstAddrSpace == ADDRESS_SPACE_GENERIC) {
     // Specific to generic
+
+    if (TM.is64Bit() && TM.getPointerSizeInBits(SrcAddrSpace) == 32) {
+      SDValue CvtNone =
+          CurDAG->getTargetConstant(NVPTX::PTXCvtMode::NONE, DL, MVT::i32);
+      SDNode *Cvt = CurDAG->getMachineNode(NVPTX::CVT_u64_u32, DL, MVT::i64,
+                                           Src, CvtNone);
+      Src = SDValue(Cvt, 0);
+    }
+
     unsigned Opc;
     switch (SrcAddrSpace) {
     default: report_fatal_error("Bad address space in addrspacecast");
     case ADDRESS_SPACE_GLOBAL:
       Opc = TM.is64Bit() ? NVPTX::cvta_global_64 : NVPTX::cvta_global;
       break;
     case ADDRESS_SPACE_SHARED:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(SrcAddrSpace) == 32
-                                ? NVPTX::cvta_shared_6432
-                                : NVPTX::cvta_shared_64)
-                         : NVPTX::cvta_shared;
+      Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
       break;
     case ADDRESS_SPACE_CONST:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(SrcAddrSpace) == 32
-                                ? NVPTX::cvta_const_6432
-                                : NVPTX::cvta_const_64)
-                         : NVPTX::cvta_const;
+      Opc = TM.is64Bit() ? NVPTX::cvta_const_64 : NVPTX::cvta_const;
       break;
     case ADDRESS_SPACE_LOCAL:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(SrcAddrSpace) == 32
-                                ? NVPTX::cvta_local_6432
-                                : NVPTX::cvta_local_64)
-                         : NVPTX::cvta_local;
+      Opc = TM.is64Bit() ? NVPTX::cvta_local_64 : NVPTX::cvta_local;
       break;
     }
-    ReplaceNode(N, CurDAG->getMachineNode(Opc, SDLoc(N), N->getValueType(0),
-                                          Src));
+    ReplaceNode(N, CurDAG->getMachineNode(Opc, DL, N->getValueType(0), Src));
     return;
   } else {
     // Generic to specific
@@ -1153,30 +1153,28 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
       Opc = TM.is64Bit() ? NVPTX::cvta_to_global_64 : NVPTX::cvta_to_global;
       break;
     case ADDRESS_SPACE_SHARED:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(DstAddrSpace) == 32
-                                ? NVPTX::cvta_to_shared_3264
-                                : NVPTX::cvta_to_shared_64)
-                         : NVPTX::cvta_to_shared;
+      Opc = TM.is64Bit() ? NVPTX::cvta_to_shared_64 : NVPTX::cvta_to_shared;
       break;
     case ADDRESS_SPACE_CONST:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(DstAddrSpace) == 32
-                                ? NVPTX::cvta_to_const_3264
-                                : NVPTX::cvta_to_const_64)
-                         : NVPTX::cvta_to_const;
+      Opc = TM.is64Bit() ? NVPTX::cvta_to_const_64 : NVPTX::cvta_to_const;
       break;
     case ADDRESS_SPACE_LOCAL:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(DstAddrSpace) == 32
-                                ? NVPTX::cvta_to_local_3264
-                                : NVPTX::cvta_to_local_64)
-                         : NVPTX::cvta_to_local;
+      Opc = TM.is64Bit() ? NVPTX::cvta_to_local_64 : NVPTX::cvta_to_local;
       break;
     case ADDRESS_SPACE_PARAM:
-      Opc = TM.is64Bit() ? NVPTX::nvvm_ptr_gen_to_param_64
-                         : NVPTX::nvvm_ptr_gen_to_param;
+      Opc = TM.is64Bit() ? NVPTX::IMOV64rr : NVPTX::IMOV32rr;
       break;
     }
-    ReplaceNode(N, CurDAG->getMachineNode(Opc, SDLoc(N), N->getValueType(0),
-                                          Src));
+
+    SDNode *CVTA = CurDAG->getMachineNode(Opc, DL, N->getValueType(0), Src);
+    if (TM.is64Bit() && TM.getPointerSizeInBits(DstAddrSpace) == 32) {
+      SDValue CvtNone =
+          CurDAG->getTargetConstant(NVPTX::PTXCvtMode::NONE, DL, MVT::i32);
+      CVTA = CurDAG->getMachineNode(NVPTX::CVT_u32_u64, DL, MVT::i32,
+                                    SDValue(CVTA, 0), CvtNone);
+    }
+
+    ReplaceNode(N, CVTA);
     return;
   }
 }
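
The restructured lowering above replaces the combined `_6432`/`_3264` opcodes with a plain 64-bit `cvta` plus a separate width conversion. A minimal standalone sketch of that decision, assuming nothing from LLVM, is given below. `TargetInfo`, `AddrSpace`, and `planAddrSpaceCast` are hypothetical names; the returned strings merely echo the opcode names in the diff, and the `ADDRESS_SPACE_PARAM` case is left out for brevity. The short-pointer rule (shared, const, and local may be 32 bits wide on a 64-bit target) is an assumption drawn from the removed `useShortPtr*` predicates.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class AddrSpace { Generic, Global, Shared, Const, Local };

// Hypothetical stand-in for the target machine queries used in the diff.
struct TargetInfo {
  bool Is64Bit;
  bool ShortPointers; // assumption: shared/const/local use 32-bit pointers

  unsigned pointerSizeInBits(AddrSpace AS) const {
    if (!Is64Bit)
      return 32;
    const bool CanBeShort = AS == AddrSpace::Shared ||
                            AS == AddrSpace::Const || AS == AddrSpace::Local;
    return (ShortPointers && CanBeShort) ? 32 : 64;
  }
};

static std::string spaceName(AddrSpace AS) {
  switch (AS) {
  case AddrSpace::Global: return "global";
  case AddrSpace::Shared: return "shared";
  case AddrSpace::Const:  return "const";
  case AddrSpace::Local:  return "local";
  case AddrSpace::Generic: break;
  }
  return "";
}

// Returns the machine opcodes, in emission order, for one addrspacecast.
std::vector<std::string> planAddrSpaceCast(const TargetInfo &TM, AddrSpace Src,
                                           AddrSpace Dst) {
  assert(Src != Dst && "addrspacecast must be between different address spaces");
  assert((Src == AddrSpace::Generic) != (Dst == AddrSpace::Generic) &&
         "exactly one side must be generic");

  std::vector<std::string> Ops;
  const std::string Suffix = TM.Is64Bit ? "_64" : "";

  if (Dst == AddrSpace::Generic) {
    // Specific to generic: widen a short source pointer first, then convert.
    if (TM.Is64Bit && TM.pointerSizeInBits(Src) == 32)
      Ops.push_back("CVT_u64_u32");
    Ops.push_back("cvta_" + spaceName(Src) + Suffix);
  } else {
    // Generic to specific: convert first, then narrow to a short pointer.
    Ops.push_back("cvta_to_" + spaceName(Dst) + Suffix);
    if (TM.Is64Bit && TM.pointerSizeInBits(Dst) == 32)
      Ops.push_back("CVT_u32_u64");
  }
  return Ops;
}
```

Under these assumptions, a shared-to-generic cast with short pointers produces `CVT_u64_u32` followed by `cvta_shared_64`, and the reverse direction produces `cvta_to_shared_64` followed by `CVT_u32_u64`. Keeping the width change as its own step is what lets each `case` in the diff collapse to a single line.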

llvm/lib/Target/NVPTX/NVPTXInstrInfo.td

Lines changed: 0 additions & 4 deletions
@@ -174,10 +174,6 @@ def hasSM90a : Predicate<"Subtarget->getFullSmVersion() == 901">;
 def hasSHFL : Predicate<"!(Subtarget->getSmVersion() >= 70"
                         "&& Subtarget->getPTXVersion() >= 64)">;

-def useShortPtrLocal : Predicate<"TM.is64Bit() && TM.getPointerSizeInBits(ADDRESS_SPACE_LOCAL) == 32">;
-def useShortPtrShared : Predicate<"TM.is64Bit() && TM.getPointerSizeInBits(ADDRESS_SPACE_SHARED) == 32">;
-def useShortPtrConst : Predicate<"TM.is64Bit() && TM.getPointerSizeInBits(ADDRESS_SPACE_CONST) == 32">;
-
 def useFP16Math: Predicate<"Subtarget->allowFP16Math()">;
 def hasBF16Math: Predicate<"Subtarget->hasBF16Math()">;
