
Commit 870b376

[DirectX] Support the CBufferLoadLegacy operation (#128699)
Fixes #112992
1 parent 5d501c6 commit 870b376

9 files changed: +342 -7 lines changed

llvm/docs/DirectX/DXILResources.rst

Lines changed: 120 additions & 6 deletions
@@ -277,7 +277,7 @@ Examples:
 Accessing Resources as Memory
 -----------------------------

-*relevant types: Buffers, CBuffer, and Textures*
+*relevant types: Buffers and Textures*

 Loading and storing from resources is generally represented in LLVM using
 operations on memory that is only accessible via a handle object. Given a
@@ -321,12 +321,11 @@ Examples:
 Loads, Samples, and Gathers
 ---------------------------

-*relevant types: Buffers, CBuffers, and Textures*
+*relevant types: Buffers and Textures*

-All load, sample, and gather operations in DXIL return a `ResRet`_ type, and
-CBuffer loads return a similar `CBufRet`_ type. These types are structs
-containing 4 elements of some basic type, and in the case of `ResRet` a 5th
-element that is used by the `CheckAccessFullyMapped`_ operation. Some of these
+All load, sample, and gather operations in DXIL return a `ResRet`_ type. These
+types are structs containing 4 elements of some basic type, and a 5th element
+that is used by the `CheckAccessFullyMapped`_ operation. Some of these
 operations, like `RawBufferLoad`_ include a mask and/or alignment that tell us
 some information about how to interpret those four values.

@@ -632,3 +631,118 @@ Examples:
            target("dx.RawBuffer", i8, 1, 0, 0) %buffer,
            i32 %index, i32 0, <4 x double> %data)

+Constant Buffer Loads
+---------------------
+
+*relevant types: CBuffers*
+
+The `CBufferLoadLegacy`_ operation, which despite the name is the only
+supported way to load from a cbuffer in any DXIL version, loads a single "row"
+of a cbuffer, which is exactly 16 bytes. The return value of the operation is
+represented by a `CBufRet`_ type, which has variants for 2 64-bit values, 4
+32-bit values, and 8 16-bit values.
+
+We represent these in LLVM IR with 3 separate operations, which return a
+2-element, 4-element, or 8-element struct respectively.
+
+.. _CBufferLoadLegacy: https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#cbufferLoadLegacy
+.. _CBufRet: https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#cbufferloadlegacy
+
+.. list-table:: ``@llvm.dx.resource.load.cbufferrow.4``
+   :header-rows: 1
+
+   * - Argument
+     -
+     - Type
+     - Description
+   * - Return value
+     -
+     - A struct of 4 32-bit values
+     - A single row of a cbuffer, interpreted as 4 32-bit values
+   * - ``%buffer``
+     - 0
+     - ``target(dx.CBuffer, ...)``
+     - The buffer to load from
+   * - ``%index``
+     - 1
+     - ``i32``
+     - Index into the buffer
+
+Examples:
+
+.. code-block:: llvm
+
+   %ret = call {float, float, float, float}
+       @llvm.dx.resource.load.cbufferrow.4(
+           target("dx.CBuffer", target("dx.Layout", {float}, 4, 0)) %buffer,
+           i32 %index)
+   %ret = call {i32, i32, i32, i32}
+       @llvm.dx.resource.load.cbufferrow.4(
+           target("dx.CBuffer", target("dx.Layout", {i32}, 4, 0)) %buffer,
+           i32 %index)
+
+.. list-table:: ``@llvm.dx.resource.load.cbufferrow.2``
+   :header-rows: 1
+
+   * - Argument
+     -
+     - Type
+     - Description
+   * - Return value
+     -
+     - A struct of 2 64-bit values
+     - A single row of a cbuffer, interpreted as 2 64-bit values
+   * - ``%buffer``
+     - 0
+     - ``target(dx.CBuffer, ...)``
+     - The buffer to load from
+   * - ``%index``
+     - 1
+     - ``i32``
+     - Index into the buffer
+
+Examples:
+
+.. code-block:: llvm
+
+   %ret = call {double, double}
+       @llvm.dx.resource.load.cbufferrow.2(
+           target("dx.CBuffer", target("dx.Layout", {double}, 8, 0)) %buffer,
+           i32 %index)
+   %ret = call {i64, i64}
+       @llvm.dx.resource.load.cbufferrow.2(
+           target("dx.CBuffer", target("dx.Layout", {i64}, 4, 0)) %buffer,
+           i32 %index)
+
+.. list-table:: ``@llvm.dx.resource.load.cbufferrow.8``
+   :header-rows: 1
+
+   * - Argument
+     -
+     - Type
+     - Description
+   * - Return value
+     -
+     - A struct of 8 16-bit values
+     - A single row of a cbuffer, interpreted as 8 16-bit values
+   * - ``%buffer``
+     - 0
+     - ``target(dx.CBuffer, ...)``
+     - The buffer to load from
+   * - ``%index``
+     - 1
+     - ``i32``
+     - Index into the buffer
+
+Examples:
+
+.. code-block:: llvm
+
+   %ret = call {half, half, half, half, half, half, half, half}
+       @llvm.dx.resource.load.cbufferrow.8(
+           target("dx.CBuffer", target("dx.Layout", {half}, 2, 0)) %buffer,
+           i32 %index)
+   %ret = call {i16, i16, i16, i16, i16, i16, i16, i16}
+       @llvm.dx.resource.load.cbufferrow.8(
+           target("dx.CBuffer", target("dx.Layout", {i16}, 2, 0)) %buffer,
+           i32 %index)
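
Reviewer note: the documentation above describes a cbuffer as a sequence of 16-byte rows, each viewed as 2, 4, or 8 elements. The sketch below is not part of the commit. It shows one way a frontend might turn a byte offset into the row to load and the struct field to extract. `RowLocation` and `locateInRow` are hypothetical names, and the sketch assumes that `%index` counts 16-byte rows and that the offset is aligned to the element size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not part of the commit.
struct RowLocation {
  std::uint32_t Row;     // assumed to be the value passed as %index
  std::uint32_t Element; // field to extract from the returned struct
};

// One cbuffer row is exactly 16 bytes, per the documentation in this diff.
constexpr std::uint32_t RowBytes = 16;

// Map a byte offset to (row, element) for an element of ElemBytes bytes.
// Assumes ElemBytes is 2, 4, or 8 and that ByteOffset is aligned to it,
// so an element never straddles two rows.
inline RowLocation locateInRow(std::uint32_t ByteOffset,
                               std::uint32_t ElemBytes) {
  assert(ElemBytes == 2 || ElemBytes == 4 || ElemBytes == 8);
  assert(ByteOffset % ElemBytes == 0);
  return {ByteOffset / RowBytes, (ByteOffset % RowBytes) / ElemBytes};
}
```

For instance, a `float` at byte offset 20 would sit in row 1, element 1, so under these assumptions it would be read with the `.4` variant and an `extractvalue` at field 1.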

llvm/include/llvm/IR/IntrinsicsDirectX.td

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,21 @@ def int_dx_resource_store_rawbuffer
4545
[], [llvm_any_ty, llvm_i32_ty, llvm_i32_ty, llvm_any_ty],
4646
[IntrWriteMem]>;
4747

48+
// dx.resource.load.cbufferrow encodes the number of elements returned in the
49+
// function name. The total size of the return should always be 128 bits.
50+
def int_dx_resource_load_cbufferrow_8
51+
: DefaultAttrsIntrinsic<
52+
[llvm_any_ty, llvm_any_ty, llvm_any_ty, llvm_any_ty,
53+
llvm_any_ty, llvm_any_ty, llvm_any_ty, llvm_any_ty],
54+
[llvm_any_ty, llvm_i32_ty], [IntrReadMem]>;
55+
def int_dx_resource_load_cbufferrow_4
56+
: DefaultAttrsIntrinsic<
57+
[llvm_any_ty, llvm_any_ty, llvm_any_ty, llvm_any_ty],
58+
[llvm_any_ty, llvm_i32_ty], [IntrReadMem]>;
59+
def int_dx_resource_load_cbufferrow_2
60+
: DefaultAttrsIntrinsic<[llvm_any_ty, llvm_any_ty],
61+
[llvm_any_ty, llvm_i32_ty], [IntrReadMem]>;
62+
4863
def int_dx_resource_updatecounter
4964
: DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_any_ty, llvm_i8_ty],
5065
[IntrInaccessibleMemOrArgMemOnly]>;
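
Reviewer note: the TableGen comment above says the element count is encoded in the intrinsic name and that the return is always 128 bits in total. The sketch below is illustrative only and not part of the commit. It shows how the name suffix follows from the element width. `elementsPerRow` and `cbufferRowIntrinsicBase` are hypothetical helpers, and the returned string is only the unmangled base name, without the overload suffixes LLVM appends.

```cpp
#include <cassert>
#include <string>

// Number of elements in one 128-bit row, or 0 for widths this sketch does
// not model. Only 16-, 32-, and 64-bit elements appear in the commit.
constexpr unsigned elementsPerRow(unsigned ElemBits) {
  return (ElemBits == 16 || ElemBits == 32 || ElemBits == 64)
             ? 128 / ElemBits
             : 0;
}

// Hypothetical helper: build the unmangled intrinsic base name for a given
// element width, e.g. 32 bits selects the ".4" variant.
inline std::string cbufferRowIntrinsicBase(unsigned ElemBits) {
  unsigned N = elementsPerRow(ElemBits);
  assert(N != 0 && "only 16-, 32-, and 64-bit elements are modeled");
  return "llvm.dx.resource.load.cbufferrow." + std::to_string(N);
}
```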

llvm/lib/Target/DirectX/DXIL.td

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -46,6 +46,12 @@ def ResRetDoubleTy : DXILOpParamType;
4646
def ResRetInt16Ty : DXILOpParamType;
4747
def ResRetInt32Ty : DXILOpParamType;
4848
def ResRetInt64Ty : DXILOpParamType;
49+
def CBufRetHalfTy : DXILOpParamType;
50+
def CBufRetFloatTy : DXILOpParamType;
51+
def CBufRetDoubleTy : DXILOpParamType;
52+
def CBufRetInt16Ty : DXILOpParamType;
53+
def CBufRetInt32Ty : DXILOpParamType;
54+
def CBufRetInt64Ty : DXILOpParamType;
4955
def HandleTy : DXILOpParamType;
5056
def ResBindTy : DXILOpParamType;
5157
def ResPropsTy : DXILOpParamType;
@@ -816,6 +822,19 @@ def CreateHandle : DXILOp<57, createHandle> {
   let attributes = [Attributes<DXIL1_0, [ReadOnly]>];
 }

+def CBufferLoadLegacy : DXILOp<59, cbufferLoadLegacy> {
+  let Doc = "loads a value from a constant buffer resource";
+  // Handle, Index
+  let arguments = [HandleTy, Int32Ty];
+  let result = OverloadTy;
+  let overloads = [Overloads<DXIL1_0, [
+    CBufRetHalfTy, CBufRetFloatTy, CBufRetDoubleTy, CBufRetInt16Ty,
+    CBufRetInt32Ty, CBufRetInt64Ty
+  ]>];
+  let stages = [Stages<DXIL1_0, [all_stages]>];
+  let attributes = [Attributes<DXIL1_0, [ReadOnly]>];
+}
+
 def BufferLoad : DXILOp<68, bufferLoad> {
   let Doc = "reads from a TypedBuffer";
   // Handle, Coord0, Coord1

llvm/lib/Target/DirectX/DXILOpBuilder.cpp

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -201,6 +201,29 @@ static StructType *getResRetType(Type *ElementTy) {
201201
return getOrCreateStructType(TypeName, FieldTypes, Ctx);
202202
}
203203

204+
static StructType *getCBufRetType(Type *ElementTy) {
205+
LLVMContext &Ctx = ElementTy->getContext();
206+
OverloadKind Kind = getOverloadKind(ElementTy);
207+
std::string TypeName = constructOverloadTypeName(Kind, "dx.types.CBufRet.");
208+
209+
// 64-bit types only have two elements
210+
if (ElementTy->isDoubleTy() || ElementTy->isIntegerTy(64))
211+
return getOrCreateStructType(TypeName, {ElementTy, ElementTy}, Ctx);
212+
213+
// 16-bit types pack 8 elements and have .8 in their name to differentiate
214+
// from min-precision types.
215+
if (ElementTy->isHalfTy() || ElementTy->isIntegerTy(16)) {
216+
TypeName += ".8";
217+
return getOrCreateStructType(TypeName,
218+
{ElementTy, ElementTy, ElementTy, ElementTy,
219+
ElementTy, ElementTy, ElementTy, ElementTy},
220+
Ctx);
221+
}
222+
223+
return getOrCreateStructType(
224+
TypeName, {ElementTy, ElementTy, ElementTy, ElementTy}, Ctx);
225+
}
226+
204227
static StructType *getHandleType(LLVMContext &Ctx) {
205228
return getOrCreateStructType("dx.types.Handle", PointerType::getUnqual(Ctx),
206229
Ctx);
@@ -265,6 +288,18 @@ static Type *getTypeFromOpParamType(OpParamType Kind, LLVMContext &Ctx,
     return getResRetType(Type::getInt32Ty(Ctx));
   case OpParamType::ResRetInt64Ty:
     return getResRetType(Type::getInt64Ty(Ctx));
+  case OpParamType::CBufRetHalfTy:
+    return getCBufRetType(Type::getHalfTy(Ctx));
+  case OpParamType::CBufRetFloatTy:
+    return getCBufRetType(Type::getFloatTy(Ctx));
+  case OpParamType::CBufRetDoubleTy:
+    return getCBufRetType(Type::getDoubleTy(Ctx));
+  case OpParamType::CBufRetInt16Ty:
+    return getCBufRetType(Type::getInt16Ty(Ctx));
+  case OpParamType::CBufRetInt32Ty:
+    return getCBufRetType(Type::getInt32Ty(Ctx));
+  case OpParamType::CBufRetInt64Ty:
+    return getCBufRetType(Type::getInt64Ty(Ctx));
   case OpParamType::HandleTy:
     return getHandleType(Ctx);
   case OpParamType::ResBindTy:
@@ -535,6 +570,10 @@ StructType *DXILOpBuilder::getResRetType(Type *ElementTy) {
   return ::getResRetType(ElementTy);
 }

+StructType *DXILOpBuilder::getCBufRetType(Type *ElementTy) {
+  return ::getCBufRetType(ElementTy);
+}
+
 StructType *DXILOpBuilder::getHandleType() {
   return ::getHandleType(IRB.getContext());
 }
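
Reviewer note: the branching in `getCBufRetType` can be read without LLVM on hand. The following plain C++ model is a sketch, not the commit's code. It reproduces only the decision about element count and the `.8` name suffix. `CBufRetShape` and `modelCBufRet` are hypothetical names. The `OverloadSuffix` argument stands in for whatever `constructOverloadTypeName` appends, so the exact spellings used below (such as `f32`) are assumptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the StructType the real function returns.
struct CBufRetShape {
  std::string Name;
  unsigned NumElements;
};

// Model of getCBufRetType's branching. Unlike the original, which falls
// through to the 4-element case for every other type, this model rejects
// widths other than 16, 32, and 64 bits.
inline CBufRetShape modelCBufRet(const std::string &OverloadSuffix,
                                 unsigned ElemBits) {
  assert(ElemBits == 16 || ElemBits == 32 || ElemBits == 64);
  std::string Name = "dx.types.CBufRet." + OverloadSuffix;
  if (ElemBits == 64) // 64-bit types only have two elements
    return {Name, 2};
  if (ElemBits == 16) // 16-bit types pack 8 elements and get a ".8" suffix
    return {Name + ".8", 8};
  return {Name, 4};
}
```

In every branch the element count times the element width comes to 128 bits, which matches the invariant stated in `IntrinsicsDirectX.td`.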

llvm/lib/Target/DirectX/DXILOpBuilder.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,9 @@ class DXILOpBuilder {
5050
/// Get a `%dx.types.ResRet` type with the given element type.
5151
StructType *getResRetType(Type *ElementTy);
5252

53+
/// Get a `%dx.types.CBufRet` type with the given element type.
54+
StructType *getCBufRetType(Type *ElementTy);
55+
5356
/// Get the `%dx.types.Handle` type.
5457
StructType *getHandleType();
5558

llvm/lib/Target/DirectX/DXILOpLowering.cpp

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -569,6 +569,32 @@ class OpLowerer {
569569
});
570570
}
571571

572+
[[nodiscard]] bool lowerCBufferLoad(Function &F) {
573+
IRBuilder<> &IRB = OpBuilder.getIRB();
574+
575+
return replaceFunction(F, [&](CallInst *CI) -> Error {
576+
IRB.SetInsertPoint(CI);
577+
578+
Type *OldTy = cast<StructType>(CI->getType())->getElementType(0);
579+
Type *ScalarTy = OldTy->getScalarType();
580+
Type *NewRetTy = OpBuilder.getCBufRetType(ScalarTy);
581+
582+
Value *Handle =
583+
createTmpHandleCast(CI->getArgOperand(0), OpBuilder.getHandleType());
584+
Value *Index = CI->getArgOperand(1);
585+
586+
Expected<CallInst *> OpCall = OpBuilder.tryCreateOp(
587+
OpCode::CBufferLoadLegacy, {Handle, Index}, CI->getName(), NewRetTy);
588+
if (Error E = OpCall.takeError())
589+
return E;
590+
if (Error E = replaceNamedStructUses(CI, *OpCall))
591+
return E;
592+
593+
CI->eraseFromParent();
594+
return Error::success();
595+
});
596+
}
597+
572598
[[nodiscard]] bool lowerUpdateCounter(Function &F) {
573599
IRBuilder<> &IRB = OpBuilder.getIRB();
574600
Type *Int32Ty = IRB.getInt32Ty();
@@ -808,6 +834,11 @@ class OpLowerer {
       case Intrinsic::dx_resource_store_rawbuffer:
         HasErrors |= lowerBufferStore(F, /*IsRaw=*/true);
         break;
+      case Intrinsic::dx_resource_load_cbufferrow_2:
+      case Intrinsic::dx_resource_load_cbufferrow_4:
+      case Intrinsic::dx_resource_load_cbufferrow_8:
+        HasErrors |= lowerCBufferLoad(F);
+        break;
       case Intrinsic::dx_resource_updatecounter:
         HasErrors |= lowerUpdateCounter(F);
         break;
break;
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+; We use llc for this test so that we don't abort after the first error.
+; RUN: not llc %s -o /dev/null 2>&1 | FileCheck %s
+
+target triple = "dxil-pc-shadermodel6.6-compute"
+
+declare void @f32_user(float)
+declare void @f64_user(double)
+declare void @f16_user(half)
+
+; CHECK: error:
+; CHECK-SAME: in function four64
+; CHECK-SAME: Type mismatch between intrinsic and DXIL op
+define void @four64() "hlsl.export" {
+  %buffer = call target("dx.CBuffer", target("dx.Layout", {double}, 8, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  %load = call {double, double, double, double} @llvm.dx.resource.load.cbufferrow.4(
+      target("dx.CBuffer", target("dx.Layout", {double}, 8, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {double, double, double, double} %load, 0
+
+  call void @f64_user(double %data)
+
+  ret void
+}
+
+; CHECK: error:
+; CHECK-SAME: in function two32
+; CHECK-SAME: Type mismatch between intrinsic and DXIL op
+define void @two32() "hlsl.export" {
+  %buffer = call target("dx.CBuffer", target("dx.Layout", {float}, 4, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  %load = call {float, float} @llvm.dx.resource.load.cbufferrow.2(
+      target("dx.CBuffer", target("dx.Layout", {float}, 4, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {float, float} %load, 0
+
+  call void @f32_user(float %data)
+
+  ret void
+}
+
+declare { double, double, double, double } @llvm.dx.resource.load.cbufferrow.4.f64.f64.f64.f64.tdx.CBuffer_tdx.Layout_sl_f64s_8_0tt(target("dx.CBuffer", target("dx.Layout", { double }, 8, 0)), i32)
+declare { float, float } @llvm.dx.resource.load.cbufferrow.2.f32.f32.tdx.CBuffer_tdx.Layout_sl_f32s_4_0tt(target("dx.CBuffer", target("dx.Layout", { float }, 4, 0)), i32)
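
Reviewer note: both functions in this test return a struct whose total size is not 128 bits (4 x 64 = 256 for `four64`, 2 x 32 = 64 for `two32`). The sketch below is not the check the lowering performs, which is not shown in this excerpt. It only models the 128-bit invariant from `IntrinsicsDirectX.td`, and `fillsOneRow` and `checkRowShapes` are hypothetical names.

```cpp
#include <cassert>

// True when Count elements of ElemBits bits each add up to exactly one
// 128-bit cbuffer row.
constexpr bool fillsOneRow(unsigned Count, unsigned ElemBits) {
  return Count * ElemBits == 128;
}

// Walk through the shapes that appear in the documentation and in this test.
inline void checkRowShapes() {
  assert(fillsOneRow(2, 64));  // {double, double}: documented .2 form
  assert(fillsOneRow(4, 32));  // {float, float, float, float}: .4 form
  assert(fillsOneRow(8, 16));  // eight halves: .8 form
  assert(!fillsOneRow(4, 64)); // four64: four doubles, 256 bits
  assert(!fillsOneRow(2, 32)); // two32: two floats, 64 bits
}
```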
