Commit 215f92b

[AMDGPU] Fix crash in the SILoadStoreOptimizer (#93862)
The optimizer does not properly handle the case where the address calculation uses V_ADDC_U32 0, 0, carry-in (i.e. with both src0 and src1 as immediates).
1 parent c4dad9a commit 215f92b
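
The guard described in the commit message can be illustrated outside of LLVM with a minimal sketch. `Operand`, `BaseAndOffset`, and `splitRegPlusImm` are hypothetical stand-ins invented for this example, not LLVM APIs, and the sketch assumes that the code following the check treats Src0 as a register base. It only models why "Src1 is an immediate" is not a sufficient condition once both sources are immediates.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for a machine operand: a register or an immediate.
struct Operand {
  bool IsImm;
  int64_t Value; // immediate value, or register number when !IsImm

  bool isImm() const { return IsImm; }
  int64_t getImm() const {
    assert(IsImm && "not an immediate operand");
    return Value;
  }
  unsigned getReg() const {
    assert(!IsImm && "not a register operand");
    return static_cast<unsigned>(Value);
  }
};

// Hypothetical result type: a register base plus a constant offset.
struct BaseAndOffset {
  unsigned BaseReg;
  int64_t Offset;
};

// Canonicalize so that an immediate, if present, ends up in Src1, then
// accept only the "register + immediate" shape. Returns false otherwise.
bool splitRegPlusImm(const Operand &A, const Operand &B, BaseAndOffset &Out) {
  const Operand *Src0 = &A;
  const Operand *Src1 = &B;
  if (Src0->isImm())
    std::swap(Src0, Src1);

  // Checking only !Src1->isImm() would let the "0, 0" case through,
  // and the getReg() call below would then hit its assertion.
  if (!Src1->isImm() || Src0->isImm())
    return false;

  Out = BaseAndOffset{Src0->getReg(), Src1->getImm()};
  return true;
}
```

With this sketch, a call such as `splitRegPlusImm(Operand{true, 0}, Operand{true, 0}, R)` returns false instead of reaching `getReg()` on an immediate, which is the shape of input the new test exercises.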

2 files changed: +27 -1 lines changed

llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp

Lines changed: 1 addition & 1 deletion
@@ -2034,7 +2034,7 @@ void SILoadStoreOptimizer::processBaseWithConstOffset(const MachineOperand &Base
   if (Src0->isImm())
     std::swap(Src0, Src1);
 
-  if (!Src1->isImm())
+  if (!Src1->isImm() || Src0->isImm())
     return;
 
   uint64_t Offset1 = Src1->getImm();
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -march=amdgcn -mcpu=gfx900 -run-pass=si-load-store-opt -o - %s | FileCheck --check-prefix=GCN %s
+
+# This used to crash
+
+---
+name: analyze_addc_0_0
+body: |
+  bb.1.entry:
+    liveins: $vgpr0
+
+    ; GCN-LABEL: name: analyze_addc_0_0
+    ; GCN: liveins: $vgpr0
+    ; GCN-NEXT: {{ $}}
+    ; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
+    ; GCN-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], 16, 0, implicit $exec
+    ; GCN-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 0, 0, killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+    ; GCN-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+    ; GCN-NEXT: [[GLOBAL_LOAD_DWORDX4_:%[0-9]+]]:vreg_128 = GLOBAL_LOAD_DWORDX4 [[REG_SEQUENCE]], 0, 0, implicit $exec
+    %0:vgpr_32 = COPY $vgpr0
+    %1:vgpr_32, %2:sreg_64_xexec = V_ADD_CO_U32_e64 %0, 16, 0, implicit $exec
+    %3:vgpr_32, dead %26:sreg_64_xexec = V_ADDC_U32_e64 0, 0, killed %2, 0, implicit $exec
+    %4:vreg_64 = REG_SEQUENCE %1, %subreg.sub0, %3, %subreg.sub1
+    %5:vreg_128 = GLOBAL_LOAD_DWORDX4 %4, 0, 0, implicit $exec
+
+...
