[Clang][FMV] Stop emitting implicit default version using target_clones. #141808
With the current behavior the following example yields a linker error: "multiple definition of `foo.default'"

    // Translation Unit 1
    __attribute__((target_clones("dotprod, sve"))) int foo(void) { return 1; }

    // Translation Unit 2
    int foo(void) { return 0; }
    __attribute__((target_version("dotprod"))) int foo(void);
    __attribute__((target_version("sve"))) int foo(void);
    int bar(void) { return foo(); }

That is because foo.default is generated twice: TU1 implicitly adds a default version to the target_clones list, and TU2 defines the default itself. As a user I don't find this particularly intuitive. If I wanted the default to be generated in TU1 I'd rather write target_clones("dotprod, sve", "default") explicitly.

When changing the code I noticed that the RISC-V target defers the resolver emission when encountering a target_version definition. This seems accidental, since deferring only makes sense for AArch64, where we emit a resolver only once we've processed the entire TU, and only if the default version is present. I've changed this so that RISC-V emits the resolver immediately. I adjusted the codegen tests since the functions now appear in a different order.

Implements ARM-software/acle#377
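The rules described above can be sketched as a small standalone model. This is not Clang code: `Arch`, `ClonesDecl`, `isAccepted`, `emittedVersions` and `emitsResolver` are hypothetical names made up for illustration, and the model ignores details such as duplicate version strings. It only mirrors what the description and the patch state for a single `target_clones` declaration in one TU.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified model; none of these names exist in Clang.
enum class Arch { AArch64, Other };

struct ClonesDecl {
  std::vector<std::string> versions; // strings written in target_clones(...)
  bool isDefined;                    // does this TU contain the function body?
};

static bool hasDefault(const ClonesDecl &d) {
  return std::find(d.versions.begin(), d.versions.end(), "default") !=
         d.versions.end();
}

// Sema rule after the patch: only non-AArch64 targets require "default".
bool isAccepted(Arch arch, const ClonesDecl &d) {
  return hasDefault(d) || arch == Arch::AArch64;
}

// Versions whose bodies this TU emits. Nothing is appended implicitly any
// more, so "default" is emitted only when the user wrote it.
std::vector<std::string> emittedVersions(const ClonesDecl &d) {
  return d.isDefined ? d.versions : std::vector<std::string>{};
}

// Resolver rule for target_clones after the patch: AArch64 needs a default
// version defined in this TU; other targets always emit the resolver.
bool emitsResolver(Arch arch, const ClonesDecl &d) {
  if (arch != Arch::AArch64)
    return true;
  return d.isDefined && hasDefault(d);
}
```

Under this model, TU1 from the example emits only the dotprod and sve versions and no resolver, so the default version and the resolver both come from TU2 and the duplicate definition goes away.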
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-codegen
Author: Alexandros Lamprineas (labrinea)
Patch is 143.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/141808.diff
13 Files Affected:
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 6d2c705338ecf..9383c57d3c991 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -4237,19 +4237,19 @@ void CodeGenModule::EmitMultiVersionFunctionDefinition(GlobalDecl GD,
       EmitGlobalFunctionDefinition(GD.getWithMultiVersionIndex(I), nullptr);
   } else if (auto *TC = FD->getAttr<TargetClonesAttr>()) {
     for (unsigned I = 0; I < TC->featuresStrs_size(); ++I)
-      // AArch64 favors the default target version over the clone if any.
-      if ((!TC->isDefaultVersion(I) || !getTarget().getTriple().isAArch64()) &&
-          TC->isFirstOfVersion(I))
+      if (TC->isFirstOfVersion(I))
         EmitGlobalFunctionDefinition(GD.getWithMultiVersionIndex(I), nullptr);
-    // Ensure that the resolver function is also emitted.
-    GetOrCreateMultiVersionResolver(GD);
   } else
     EmitGlobalFunctionDefinition(GD, GV);

-  // Defer the resolver emission until we can reason whether the TU
-  // contains a default target version implementation.
-  if (FD->isTargetVersionMultiVersion())
-    AddDeferredMultiVersionResolverToEmit(GD);
+  // Ensure that the resolver function is also emitted.
+  if (FD->isTargetVersionMultiVersion() || FD->isTargetClonesMultiVersion()) {
+    // On AArch64 defer the resolver emission until the entire TU is processed.
+    if (getTarget().getTriple().isAArch64())
+      AddDeferredMultiVersionResolverToEmit(GD);
+    else
+      GetOrCreateMultiVersionResolver(GD);
+  }
 }

 void CodeGenModule::EmitGlobalDefinition(GlobalDecl GD, llvm::GlobalValue *GV) {
@@ -4351,7 +4351,7 @@ void CodeGenModule::emitMultiVersionFunctions() {
     };

     // For AArch64, a resolver is only emitted if a function marked with
-    // target_version("default")) or target_clones() is present and defined
+    // target_version("default")) or target_clones("default") is defined
     // in this TU. For other architectures it is always emitted.
     bool ShouldEmitResolver = !getTarget().getTriple().isAArch64();
     SmallVector<CodeGenFunction::FMVResolverOption, 10> Options;
@@ -4374,12 +4374,11 @@ void CodeGenModule::emitMultiVersionFunctions() {
             TVA->getFeatures(Feats, Delim);
             Options.emplace_back(Func, Feats);
           } else if (const auto *TC = CurFD->getAttr<TargetClonesAttr>()) {
-            if (IsDefined)
-              ShouldEmitResolver = true;
             for (unsigned I = 0; I < TC->featuresStrs_size(); ++I) {
               if (!TC->isFirstOfVersion(I))
                 continue;
-
+              if (TC->isDefaultVersion(I) && IsDefined)
+                ShouldEmitResolver = true;
               llvm::Function *Func = createFunction(CurFD, I);
               Feats.clear();
               if (getTarget().getTriple().isX86()) {
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 54bac40982eda..6a202f5c6c167 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -3494,13 +3494,7 @@ static void handleTargetClonesAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
   if (HasCommas && AL.getNumArgs() > 1)
     S.Diag(AL.getLoc(), diag::warn_target_clone_mixed_values);

-  if (S.Context.getTargetInfo().getTriple().isAArch64() && !HasDefault) {
-    // Add default attribute if there is no one
-    HasDefault = true;
-    Strings.push_back("default");
-  }
-
-  if (!HasDefault) {
+  if (!HasDefault && !S.Context.getTargetInfo().getTriple().isAArch64()) {
     S.Diag(AL.getLoc(), diag::err_target_clone_must_have_default);
     return;
   }
diff --git a/clang/test/AST/attr-target-version.c b/clang/test/AST/attr-target-version.c
index 52ac0e61b5a59..b537f5e685a31 100644
--- a/clang/test/AST/attr-target-version.c
+++ b/clang/test/AST/attr-target-version.c
@@ -1,7 +1,7 @@
// RUN: %clang_cc1 -triple aarch64-linux-gnu -ast-dump %s | FileCheck %s
int __attribute__((target_version("sve2-bitperm + sha2"))) foov(void) { return 1; }
-int __attribute__((target_clones(" lse + fp + sha3 "))) fooc(void) { return 2; }
+int __attribute__((target_clones(" lse + fp + sha3 ", "default"))) fooc(void) { return 2; }
// CHECK: TargetVersionAttr
// CHECK: sve2-bitperm + sha2
// CHECK: TargetClonesAttr
diff --git a/clang/test/CodeGen/AArch64/fmv-detection.c b/clang/test/CodeGen/AArch64/fmv-detection.c
index 44702a04e532e..e585140a1eb08 100644
--- a/clang/test/CodeGen/AArch64/fmv-detection.c
+++ b/clang/test/CodeGen/AArch64/fmv-detection.c
@@ -97,7 +97,7 @@ __attribute__((target_version("wfxt"))) int fmv(void) { return 0; }
__attribute__((target_version("cssc+fp"))) int fmv(void);
-__attribute__((target_version("default"))) int fmv(void);
+__attribute__((target_version("default"))) int fmv(void) { return 0; }
int caller() {
return fmv();
@@ -121,380 +121,6 @@ int caller() {
// CHECK-NEXT: ret i32 0
//
//
-// CHECK-LABEL: define {{[^@]+}}@fmv.resolver() comdat {
-// CHECK-NEXT: resolver_entry:
-// CHECK-NEXT: call void @__init_cpu_features_resolver()
-// CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 2304
-// CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 2304
-// CHECK-NEXT: [[TMP3:%.*]] = and i1 true, [[TMP2]]
-// CHECK-NEXT: br i1 [[TMP3]], label [[RESOLVER_RETURN:%.*]], label [[RESOLVER_ELSE:%.*]]
-// CHECK: resolver_return:
-// CHECK-NEXT: ret ptr @fmv._McsscMfp
-// CHECK: resolver_else:
-// CHECK-NEXT: [[TMP4:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP5:%.*]] = and i64 [[TMP4]], 2048
-// CHECK-NEXT: [[TMP6:%.*]] = icmp eq i64 [[TMP5]], 2048
-// CHECK-NEXT: [[TMP7:%.*]] = and i1 true, [[TMP6]]
-// CHECK-NEXT: br i1 [[TMP7]], label [[RESOLVER_RETURN1:%.*]], label [[RESOLVER_ELSE2:%.*]]
-// CHECK: resolver_return1:
-// CHECK-NEXT: ret ptr @fmv._Mcssc
-// CHECK: resolver_else2:
-// CHECK-NEXT: [[TMP8:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP9:%.*]] = and i64 [[TMP8]], 576460752303423488
-// CHECK-NEXT: [[TMP10:%.*]] = icmp eq i64 [[TMP9]], 576460752303423488
-// CHECK-NEXT: [[TMP11:%.*]] = and i1 true, [[TMP10]]
-// CHECK-NEXT: br i1 [[TMP11]], label [[RESOLVER_RETURN3:%.*]], label [[RESOLVER_ELSE4:%.*]]
-// CHECK: resolver_return3:
-// CHECK-NEXT: ret ptr @fmv._Mmops
-// CHECK: resolver_else4:
-// CHECK-NEXT: [[TMP12:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP13:%.*]] = and i64 [[TMP12]], 144119586256651008
-// CHECK-NEXT: [[TMP14:%.*]] = icmp eq i64 [[TMP13]], 144119586256651008
-// CHECK-NEXT: [[TMP15:%.*]] = and i1 true, [[TMP14]]
-// CHECK-NEXT: br i1 [[TMP15]], label [[RESOLVER_RETURN5:%.*]], label [[RESOLVER_ELSE6:%.*]]
-// CHECK: resolver_return5:
-// CHECK-NEXT: ret ptr @fmv._Msme2
-// CHECK: resolver_else6:
-// CHECK-NEXT: [[TMP16:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP17:%.*]] = and i64 [[TMP16]], 72061992218723072
-// CHECK-NEXT: [[TMP18:%.*]] = icmp eq i64 [[TMP17]], 72061992218723072
-// CHECK-NEXT: [[TMP19:%.*]] = and i1 true, [[TMP18]]
-// CHECK-NEXT: br i1 [[TMP19]], label [[RESOLVER_RETURN7:%.*]], label [[RESOLVER_ELSE8:%.*]]
-// CHECK: resolver_return7:
-// CHECK-NEXT: ret ptr @fmv._Msme-i16i64
-// CHECK: resolver_else8:
-// CHECK-NEXT: [[TMP20:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP21:%.*]] = and i64 [[TMP20]], 36033195199759104
-// CHECK-NEXT: [[TMP22:%.*]] = icmp eq i64 [[TMP21]], 36033195199759104
-// CHECK-NEXT: [[TMP23:%.*]] = and i1 true, [[TMP22]]
-// CHECK-NEXT: br i1 [[TMP23]], label [[RESOLVER_RETURN9:%.*]], label [[RESOLVER_ELSE10:%.*]]
-// CHECK: resolver_return9:
-// CHECK-NEXT: ret ptr @fmv._Msme-f64f64
-// CHECK: resolver_else10:
-// CHECK-NEXT: [[TMP24:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP25:%.*]] = and i64 [[TMP24]], 18014398509481984
-// CHECK-NEXT: [[TMP26:%.*]] = icmp eq i64 [[TMP25]], 18014398509481984
-// CHECK-NEXT: [[TMP27:%.*]] = and i1 true, [[TMP26]]
-// CHECK-NEXT: br i1 [[TMP27]], label [[RESOLVER_RETURN11:%.*]], label [[RESOLVER_ELSE12:%.*]]
-// CHECK: resolver_return11:
-// CHECK-NEXT: ret ptr @fmv._Mwfxt
-// CHECK: resolver_else12:
-// CHECK-NEXT: [[TMP28:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP29:%.*]] = and i64 [[TMP28]], 1125899906842624
-// CHECK-NEXT: [[TMP30:%.*]] = icmp eq i64 [[TMP29]], 1125899906842624
-// CHECK-NEXT: [[TMP31:%.*]] = and i1 true, [[TMP30]]
-// CHECK-NEXT: br i1 [[TMP31]], label [[RESOLVER_RETURN13:%.*]], label [[RESOLVER_ELSE14:%.*]]
-// CHECK: resolver_return13:
-// CHECK-NEXT: ret ptr @fmv._Mbti
-// CHECK: resolver_else14:
-// CHECK-NEXT: [[TMP32:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP33:%.*]] = and i64 [[TMP32]], 562949953421312
-// CHECK-NEXT: [[TMP34:%.*]] = icmp eq i64 [[TMP33]], 562949953421312
-// CHECK-NEXT: [[TMP35:%.*]] = and i1 true, [[TMP34]]
-// CHECK-NEXT: br i1 [[TMP35]], label [[RESOLVER_RETURN15:%.*]], label [[RESOLVER_ELSE16:%.*]]
-// CHECK: resolver_return15:
-// CHECK-NEXT: ret ptr @fmv._Mssbs
-// CHECK: resolver_else16:
-// CHECK-NEXT: [[TMP36:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP37:%.*]] = and i64 [[TMP36]], 70368744177664
-// CHECK-NEXT: [[TMP38:%.*]] = icmp eq i64 [[TMP37]], 70368744177664
-// CHECK-NEXT: [[TMP39:%.*]] = and i1 true, [[TMP38]]
-// CHECK-NEXT: br i1 [[TMP39]], label [[RESOLVER_RETURN17:%.*]], label [[RESOLVER_ELSE18:%.*]]
-// CHECK: resolver_return17:
-// CHECK-NEXT: ret ptr @fmv._Msb
-// CHECK: resolver_else18:
-// CHECK-NEXT: [[TMP40:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP41:%.*]] = and i64 [[TMP40]], 17592186044416
-// CHECK-NEXT: [[TMP42:%.*]] = icmp eq i64 [[TMP41]], 17592186044416
-// CHECK-NEXT: [[TMP43:%.*]] = and i1 true, [[TMP42]]
-// CHECK-NEXT: br i1 [[TMP43]], label [[RESOLVER_RETURN19:%.*]], label [[RESOLVER_ELSE20:%.*]]
-// CHECK: resolver_return19:
-// CHECK-NEXT: ret ptr @fmv._Mmemtag
-// CHECK: resolver_else20:
-// CHECK-NEXT: [[TMP44:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP45:%.*]] = and i64 [[TMP44]], 4398180795136
-// CHECK-NEXT: [[TMP46:%.*]] = icmp eq i64 [[TMP45]], 4398180795136
-// CHECK-NEXT: [[TMP47:%.*]] = and i1 true, [[TMP46]]
-// CHECK-NEXT: br i1 [[TMP47]], label [[RESOLVER_RETURN21:%.*]], label [[RESOLVER_ELSE22:%.*]]
-// CHECK: resolver_return21:
-// CHECK-NEXT: ret ptr @fmv._Msme
-// CHECK: resolver_else22:
-// CHECK-NEXT: [[TMP48:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP49:%.*]] = and i64 [[TMP48]], 2268816540448
-// CHECK-NEXT: [[TMP50:%.*]] = icmp eq i64 [[TMP49]], 2268816540448
-// CHECK-NEXT: [[TMP51:%.*]] = and i1 true, [[TMP50]]
-// CHECK-NEXT: br i1 [[TMP51]], label [[RESOLVER_RETURN23:%.*]], label [[RESOLVER_ELSE24:%.*]]
-// CHECK: resolver_return23:
-// CHECK-NEXT: ret ptr @fmv._Msve2-sm4
-// CHECK: resolver_else24:
-// CHECK-NEXT: [[TMP52:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP53:%.*]] = and i64 [[TMP52]], 1169304924928
-// CHECK-NEXT: [[TMP54:%.*]] = icmp eq i64 [[TMP53]], 1169304924928
-// CHECK-NEXT: [[TMP55:%.*]] = and i1 true, [[TMP54]]
-// CHECK-NEXT: br i1 [[TMP55]], label [[RESOLVER_RETURN25:%.*]], label [[RESOLVER_ELSE26:%.*]]
-// CHECK: resolver_return25:
-// CHECK-NEXT: ret ptr @fmv._Msve2-sha3
-// CHECK: resolver_else26:
-// CHECK-NEXT: [[TMP56:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP57:%.*]] = and i64 [[TMP56]], 619549098240
-// CHECK-NEXT: [[TMP58:%.*]] = icmp eq i64 [[TMP57]], 619549098240
-// CHECK-NEXT: [[TMP59:%.*]] = and i1 true, [[TMP58]]
-// CHECK-NEXT: br i1 [[TMP59]], label [[RESOLVER_RETURN27:%.*]], label [[RESOLVER_ELSE28:%.*]]
-// CHECK: resolver_return27:
-// CHECK-NEXT: ret ptr @fmv._Msve2-bitperm
-// CHECK: resolver_else28:
-// CHECK-NEXT: [[TMP60:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP61:%.*]] = and i64 [[TMP60]], 344671224576
-// CHECK-NEXT: [[TMP62:%.*]] = icmp eq i64 [[TMP61]], 344671224576
-// CHECK-NEXT: [[TMP63:%.*]] = and i1 true, [[TMP62]]
-// CHECK-NEXT: br i1 [[TMP63]], label [[RESOLVER_RETURN29:%.*]], label [[RESOLVER_ELSE30:%.*]]
-// CHECK: resolver_return29:
-// CHECK-NEXT: ret ptr @fmv._Msve2-aes
-// CHECK: resolver_else30:
-// CHECK-NEXT: [[TMP64:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP65:%.*]] = and i64 [[TMP64]], 69793284352
-// CHECK-NEXT: [[TMP66:%.*]] = icmp eq i64 [[TMP65]], 69793284352
-// CHECK-NEXT: [[TMP67:%.*]] = and i1 true, [[TMP66]]
-// CHECK-NEXT: br i1 [[TMP67]], label [[RESOLVER_RETURN31:%.*]], label [[RESOLVER_ELSE32:%.*]]
-// CHECK: resolver_return31:
-// CHECK-NEXT: ret ptr @fmv._Msve2
-// CHECK: resolver_else32:
-// CHECK-NEXT: [[TMP68:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP69:%.*]] = and i64 [[TMP68]], 35433545984
-// CHECK-NEXT: [[TMP70:%.*]] = icmp eq i64 [[TMP69]], 35433545984
-// CHECK-NEXT: [[TMP71:%.*]] = and i1 true, [[TMP70]]
-// CHECK-NEXT: br i1 [[TMP71]], label [[RESOLVER_RETURN33:%.*]], label [[RESOLVER_ELSE34:%.*]]
-// CHECK: resolver_return33:
-// CHECK-NEXT: ret ptr @fmv._Mf64mm
-// CHECK: resolver_else34:
-// CHECK-NEXT: [[TMP72:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP73:%.*]] = and i64 [[TMP72]], 18253676800
-// CHECK-NEXT: [[TMP74:%.*]] = icmp eq i64 [[TMP73]], 18253676800
-// CHECK-NEXT: [[TMP75:%.*]] = and i1 true, [[TMP74]]
-// CHECK-NEXT: br i1 [[TMP75]], label [[RESOLVER_RETURN35:%.*]], label [[RESOLVER_ELSE36:%.*]]
-// CHECK: resolver_return35:
-// CHECK-NEXT: ret ptr @fmv._Mf32mm
-// CHECK: resolver_else36:
-// CHECK-NEXT: [[TMP76:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP77:%.*]] = and i64 [[TMP76]], 1073807616
-// CHECK-NEXT: [[TMP78:%.*]] = icmp eq i64 [[TMP77]], 1073807616
-// CHECK-NEXT: [[TMP79:%.*]] = and i1 true, [[TMP78]]
-// CHECK-NEXT: br i1 [[TMP79]], label [[RESOLVER_RETURN37:%.*]], label [[RESOLVER_ELSE38:%.*]]
-// CHECK: resolver_return37:
-// CHECK-NEXT: ret ptr @fmv._Msve
-// CHECK: resolver_else38:
-// CHECK-NEXT: [[TMP80:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP81:%.*]] = and i64 [[TMP80]], 134218496
-// CHECK-NEXT: [[TMP82:%.*]] = icmp eq i64 [[TMP81]], 134218496
-// CHECK-NEXT: [[TMP83:%.*]] = and i1 true, [[TMP82]]
-// CHECK-NEXT: br i1 [[TMP83]], label [[RESOLVER_RETURN39:%.*]], label [[RESOLVER_ELSE40:%.*]]
-// CHECK: resolver_return39:
-// CHECK-NEXT: ret ptr @fmv._Mbf16
-// CHECK: resolver_else40:
-// CHECK-NEXT: [[TMP84:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP85:%.*]] = and i64 [[TMP84]], 67109632
-// CHECK-NEXT: [[TMP86:%.*]] = icmp eq i64 [[TMP85]], 67109632
-// CHECK-NEXT: [[TMP87:%.*]] = and i1 true, [[TMP86]]
-// CHECK-NEXT: br i1 [[TMP87]], label [[RESOLVER_RETURN41:%.*]], label [[RESOLVER_ELSE42:%.*]]
-// CHECK: resolver_return41:
-// CHECK-NEXT: ret ptr @fmv._Mi8mm
-// CHECK: resolver_else42:
-// CHECK-NEXT: [[TMP88:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP89:%.*]] = and i64 [[TMP88]], 16777472
-// CHECK-NEXT: [[TMP90:%.*]] = icmp eq i64 [[TMP89]], 16777472
-// CHECK-NEXT: [[TMP91:%.*]] = and i1 true, [[TMP90]]
-// CHECK-NEXT: br i1 [[TMP91]], label [[RESOLVER_RETURN43:%.*]], label [[RESOLVER_ELSE44:%.*]]
-// CHECK: resolver_return43:
-// CHECK-NEXT: ret ptr @fmv._Mfrintts
-// CHECK: resolver_else44:
-// CHECK-NEXT: [[TMP92:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP93:%.*]] = and i64 [[TMP92]], 288230376164294656
-// CHECK-NEXT: [[TMP94:%.*]] = icmp eq i64 [[TMP93]], 288230376164294656
-// CHECK-NEXT: [[TMP95:%.*]] = and i1 true, [[TMP94]]
-// CHECK-NEXT: br i1 [[TMP95]], label [[RESOLVER_RETURN45:%.*]], label [[RESOLVER_ELSE46:%.*]]
-// CHECK: resolver_return45:
-// CHECK-NEXT: ret ptr @fmv._Mrcpc3
-// CHECK: resolver_else46:
-// CHECK-NEXT: [[TMP96:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP97:%.*]] = and i64 [[TMP96]], 12582912
-// CHECK-NEXT: [[TMP98:%.*]] = icmp eq i64 [[TMP97]], 12582912
-// CHECK-NEXT: [[TMP99:%.*]] = and i1 true, [[TMP98]]
-// CHECK-NEXT: br i1 [[TMP99]], label [[RESOLVER_RETURN47:%.*]], label [[RESOLVER_ELSE48:%.*]]
-// CHECK: resolver_return47:
-// CHECK-NEXT: ret ptr @fmv._Mrcpc2
-// CHECK: resolver_else48:
-// CHECK-NEXT: [[TMP100:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP101:%.*]] = and i64 [[TMP100]], 4194304
-// CHECK-NEXT: [[TMP102:%.*]] = icmp eq i64 [[TMP101]], 4194304
-// CHECK-NEXT: [[TMP103:%.*]] = and i1 true, [[TMP102]]
-// CHECK-NEXT: br i1 [[TMP103]], label [[RESOLVER_RETURN49:%.*]], label [[RESOLVER_ELSE50:%.*]]
-// CHECK: resolver_return49:
-// CHECK-NEXT: ret ptr @fmv._Mrcpc
-// CHECK: resolver_else50:
-// CHECK-NEXT: [[TMP104:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP105:%.*]] = and i64 [[TMP104]], 2097920
-// CHECK-NEXT: [[TMP106:%.*]] = icmp eq i64 [[TMP105]], 2097920
-// CHECK-NEXT: [[TMP107:%.*]] = and i1 true, [[TMP106]]
-// CHECK-NEXT: br i1 [[TMP107]], label [[RESOLVER_RETURN51:%.*]], label [[RESOLVER_ELSE52:%.*]]
-// CHECK: resolver_return51:
-// CHECK-NEXT: ret ptr @fmv._Mfcma
-// CHECK: resolver_else52:
-// CHECK-NEXT: [[TMP108:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP109:%.*]] = and i64 [[TMP108]], 1048832
-// CHECK-NEXT: [[TMP110:%.*]] = icmp eq i64 [[TMP109]], 1048832
-// CHECK-NEXT: [[TMP111:%.*]] = and i1 true, [[TMP110]]
-// CHECK-NEXT: br i1 [[TMP111]], label [[RESOLVER_RETURN53:%.*]], label [[RESOLVER_ELSE54:%.*]]
-// CHECK: resolver_return53:
-// CHECK-NEXT: ret ptr @fmv._Mjscvt
-// CHECK: resolver_else54:
-// CHECK-NEXT: [[TMP112:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP113:%.*]] = and i64 [[TMP112]], 786432
-// CHECK-NEXT: [[TMP114:%.*]] = icmp eq i64 [[TMP113]], 786432
-// CHECK-NEXT: [[TMP115:%.*]] = and i1 true, [[TMP114]]
-// CHECK-NEXT: br i1 [[TMP115]], label [[RESOLVER_RETURN55:%.*]], label [[RESOLVER_ELSE56:%.*]]
-// CHECK: resolver_return55:
-// CHECK-NEXT: ret ptr @fmv._Mdpb2
-// CHECK: resolver_else56:
-// CHECK-NEXT: [[TMP116:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP117:%.*]] = and i64 [[TMP116]], 262144
-// CHECK-NEXT: [[TMP118:%.*]] = icmp eq i64 [[TMP117]], 262144
-// CHECK-NEXT: [[TMP119:%.*]] = and i1 true, [[TMP118]]
-// CHECK-NEXT: br i1 [[TMP119]], label [[RESOLVER_RETURN57:%.*]], label [[RESOLVER_ELSE58:%.*]]
-// CHECK: resolver_return57:
-// CHECK-NEXT: ret ptr @fmv._Mdpb
-// CHECK: resolver_else58:
-// CHECK-NEXT: [[TMP120:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP121:%.*]] = and i64 [[TMP120]], 131072
-// CHECK-NEXT: [[TMP122:%.*]] = icmp eq i64 [[TMP121]], 131072
-// CHECK-NEXT: [[TMP123:%.*]] = and i1 true, [[TMP122]]
-// CHECK-NEXT: br i1 [[TMP123]], label [[RESOLVER_RETURN59:%.*]], label [[RESOLVER_ELSE60:%.*]]
-// CHECK: resolver_return59:
-// CHECK-NEXT: ret ptr @fmv._Mdit
-// CHECK: resolver_else60:
-// CHECK-NEXT: [[TMP124:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP125:%.*]] = and i64 [[TMP124]], 66312
-// CHECK-NEXT: [[TMP126:%.*]] = icmp eq i64 [[TMP125]], 66312
-// CHECK-NEXT: [[TMP127:%.*]] = and i1 true, [[TMP126]]
-// CHECK-NEXT: br i1 [[TMP127]], label [[RESOLVER_RETURN61:%.*]], ...
[truncated]
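The CHECK lines removed from fmv-detection.c above all follow one pattern: load the CPU feature word, test `(features & mask) == mask` for each version in priority order, and return the first match. A rough standalone model of that shape is sketched below. `ResolverOption` and `resolve` are hypothetical names, the two masks are copied from the first two checks in the diff, and the fallback symbol is an assumption because the end of the resolver is truncated here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one resolver branch: a feature mask and the symbol
// returned when every bit of the mask is set.
struct ResolverOption {
  std::uint64_t mask;
  std::string symbol;
};

// Walk the options in priority order and return the first whose mask is
// fully contained in cpuFeatures; otherwise return the fallback symbol.
std::string resolve(std::uint64_t cpuFeatures,
                    const std::vector<ResolverOption> &options,
                    const std::string &fallback) {
  for (const ResolverOption &opt : options)
    if ((cpuFeatures & opt.mask) == opt.mask)
      return opt.symbol;
  return fallback;
}

// First two branches of the fmv resolver as they appear in the CHECK lines.
const std::vector<ResolverOption> kFmvOptions = {
    {2304, "fmv._McsscMfp"},
    {2048, "fmv._Mcssc"},
};
```

Calling `resolve(2048, kFmvOptions, "fmv.default")` skips the first branch, since bit 256 is missing, and selects `fmv._Mcssc`. The name `fmv.default` is only a placeholder for whatever the truncated tail of the resolver returns.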
-// CHECK-NEXT: ret ptr @fmv._Msve2-sha3
-// CHECK: resolver_else26:
-// CHECK-NEXT: [[TMP56:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP57:%.*]] = and i64 [[TMP56]], 619549098240
-// CHECK-NEXT: [[TMP58:%.*]] = icmp eq i64 [[TMP57]], 619549098240
-// CHECK-NEXT: [[TMP59:%.*]] = and i1 true, [[TMP58]]
-// CHECK-NEXT: br i1 [[TMP59]], label [[RESOLVER_RETURN27:%.*]], label [[RESOLVER_ELSE28:%.*]]
-// CHECK: resolver_return27:
-// CHECK-NEXT: ret ptr @fmv._Msve2-bitperm
-// CHECK: resolver_else28:
-// CHECK-NEXT: [[TMP60:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP61:%.*]] = and i64 [[TMP60]], 344671224576
-// CHECK-NEXT: [[TMP62:%.*]] = icmp eq i64 [[TMP61]], 344671224576
-// CHECK-NEXT: [[TMP63:%.*]] = and i1 true, [[TMP62]]
-// CHECK-NEXT: br i1 [[TMP63]], label [[RESOLVER_RETURN29:%.*]], label [[RESOLVER_ELSE30:%.*]]
-// CHECK: resolver_return29:
-// CHECK-NEXT: ret ptr @fmv._Msve2-aes
-// CHECK: resolver_else30:
-// CHECK-NEXT: [[TMP64:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP65:%.*]] = and i64 [[TMP64]], 69793284352
-// CHECK-NEXT: [[TMP66:%.*]] = icmp eq i64 [[TMP65]], 69793284352
-// CHECK-NEXT: [[TMP67:%.*]] = and i1 true, [[TMP66]]
-// CHECK-NEXT: br i1 [[TMP67]], label [[RESOLVER_RETURN31:%.*]], label [[RESOLVER_ELSE32:%.*]]
-// CHECK: resolver_return31:
-// CHECK-NEXT: ret ptr @fmv._Msve2
-// CHECK: resolver_else32:
-// CHECK-NEXT: [[TMP68:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP69:%.*]] = and i64 [[TMP68]], 35433545984
-// CHECK-NEXT: [[TMP70:%.*]] = icmp eq i64 [[TMP69]], 35433545984
-// CHECK-NEXT: [[TMP71:%.*]] = and i1 true, [[TMP70]]
-// CHECK-NEXT: br i1 [[TMP71]], label [[RESOLVER_RETURN33:%.*]], label [[RESOLVER_ELSE34:%.*]]
-// CHECK: resolver_return33:
-// CHECK-NEXT: ret ptr @fmv._Mf64mm
-// CHECK: resolver_else34:
-// CHECK-NEXT: [[TMP72:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP73:%.*]] = and i64 [[TMP72]], 18253676800
-// CHECK-NEXT: [[TMP74:%.*]] = icmp eq i64 [[TMP73]], 18253676800
-// CHECK-NEXT: [[TMP75:%.*]] = and i1 true, [[TMP74]]
-// CHECK-NEXT: br i1 [[TMP75]], label [[RESOLVER_RETURN35:%.*]], label [[RESOLVER_ELSE36:%.*]]
-// CHECK: resolver_return35:
-// CHECK-NEXT: ret ptr @fmv._Mf32mm
-// CHECK: resolver_else36:
-// CHECK-NEXT: [[TMP76:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP77:%.*]] = and i64 [[TMP76]], 1073807616
-// CHECK-NEXT: [[TMP78:%.*]] = icmp eq i64 [[TMP77]], 1073807616
-// CHECK-NEXT: [[TMP79:%.*]] = and i1 true, [[TMP78]]
-// CHECK-NEXT: br i1 [[TMP79]], label [[RESOLVER_RETURN37:%.*]], label [[RESOLVER_ELSE38:%.*]]
-// CHECK: resolver_return37:
-// CHECK-NEXT: ret ptr @fmv._Msve
-// CHECK: resolver_else38:
-// CHECK-NEXT: [[TMP80:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP81:%.*]] = and i64 [[TMP80]], 134218496
-// CHECK-NEXT: [[TMP82:%.*]] = icmp eq i64 [[TMP81]], 134218496
-// CHECK-NEXT: [[TMP83:%.*]] = and i1 true, [[TMP82]]
-// CHECK-NEXT: br i1 [[TMP83]], label [[RESOLVER_RETURN39:%.*]], label [[RESOLVER_ELSE40:%.*]]
-// CHECK: resolver_return39:
-// CHECK-NEXT: ret ptr @fmv._Mbf16
-// CHECK: resolver_else40:
-// CHECK-NEXT: [[TMP84:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP85:%.*]] = and i64 [[TMP84]], 67109632
-// CHECK-NEXT: [[TMP86:%.*]] = icmp eq i64 [[TMP85]], 67109632
-// CHECK-NEXT: [[TMP87:%.*]] = and i1 true, [[TMP86]]
-// CHECK-NEXT: br i1 [[TMP87]], label [[RESOLVER_RETURN41:%.*]], label [[RESOLVER_ELSE42:%.*]]
-// CHECK: resolver_return41:
-// CHECK-NEXT: ret ptr @fmv._Mi8mm
-// CHECK: resolver_else42:
-// CHECK-NEXT: [[TMP88:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP89:%.*]] = and i64 [[TMP88]], 16777472
-// CHECK-NEXT: [[TMP90:%.*]] = icmp eq i64 [[TMP89]], 16777472
-// CHECK-NEXT: [[TMP91:%.*]] = and i1 true, [[TMP90]]
-// CHECK-NEXT: br i1 [[TMP91]], label [[RESOLVER_RETURN43:%.*]], label [[RESOLVER_ELSE44:%.*]]
-// CHECK: resolver_return43:
-// CHECK-NEXT: ret ptr @fmv._Mfrintts
-// CHECK: resolver_else44:
-// CHECK-NEXT: [[TMP92:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP93:%.*]] = and i64 [[TMP92]], 288230376164294656
-// CHECK-NEXT: [[TMP94:%.*]] = icmp eq i64 [[TMP93]], 288230376164294656
-// CHECK-NEXT: [[TMP95:%.*]] = and i1 true, [[TMP94]]
-// CHECK-NEXT: br i1 [[TMP95]], label [[RESOLVER_RETURN45:%.*]], label [[RESOLVER_ELSE46:%.*]]
-// CHECK: resolver_return45:
-// CHECK-NEXT: ret ptr @fmv._Mrcpc3
-// CHECK: resolver_else46:
-// CHECK-NEXT: [[TMP96:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP97:%.*]] = and i64 [[TMP96]], 12582912
-// CHECK-NEXT: [[TMP98:%.*]] = icmp eq i64 [[TMP97]], 12582912
-// CHECK-NEXT: [[TMP99:%.*]] = and i1 true, [[TMP98]]
-// CHECK-NEXT: br i1 [[TMP99]], label [[RESOLVER_RETURN47:%.*]], label [[RESOLVER_ELSE48:%.*]]
-// CHECK: resolver_return47:
-// CHECK-NEXT: ret ptr @fmv._Mrcpc2
-// CHECK: resolver_else48:
-// CHECK-NEXT: [[TMP100:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP101:%.*]] = and i64 [[TMP100]], 4194304
-// CHECK-NEXT: [[TMP102:%.*]] = icmp eq i64 [[TMP101]], 4194304
-// CHECK-NEXT: [[TMP103:%.*]] = and i1 true, [[TMP102]]
-// CHECK-NEXT: br i1 [[TMP103]], label [[RESOLVER_RETURN49:%.*]], label [[RESOLVER_ELSE50:%.*]]
-// CHECK: resolver_return49:
-// CHECK-NEXT: ret ptr @fmv._Mrcpc
-// CHECK: resolver_else50:
-// CHECK-NEXT: [[TMP104:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP105:%.*]] = and i64 [[TMP104]], 2097920
-// CHECK-NEXT: [[TMP106:%.*]] = icmp eq i64 [[TMP105]], 2097920
-// CHECK-NEXT: [[TMP107:%.*]] = and i1 true, [[TMP106]]
-// CHECK-NEXT: br i1 [[TMP107]], label [[RESOLVER_RETURN51:%.*]], label [[RESOLVER_ELSE52:%.*]]
-// CHECK: resolver_return51:
-// CHECK-NEXT: ret ptr @fmv._Mfcma
-// CHECK: resolver_else52:
-// CHECK-NEXT: [[TMP108:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP109:%.*]] = and i64 [[TMP108]], 1048832
-// CHECK-NEXT: [[TMP110:%.*]] = icmp eq i64 [[TMP109]], 1048832
-// CHECK-NEXT: [[TMP111:%.*]] = and i1 true, [[TMP110]]
-// CHECK-NEXT: br i1 [[TMP111]], label [[RESOLVER_RETURN53:%.*]], label [[RESOLVER_ELSE54:%.*]]
-// CHECK: resolver_return53:
-// CHECK-NEXT: ret ptr @fmv._Mjscvt
-// CHECK: resolver_else54:
-// CHECK-NEXT: [[TMP112:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP113:%.*]] = and i64 [[TMP112]], 786432
-// CHECK-NEXT: [[TMP114:%.*]] = icmp eq i64 [[TMP113]], 786432
-// CHECK-NEXT: [[TMP115:%.*]] = and i1 true, [[TMP114]]
-// CHECK-NEXT: br i1 [[TMP115]], label [[RESOLVER_RETURN55:%.*]], label [[RESOLVER_ELSE56:%.*]]
-// CHECK: resolver_return55:
-// CHECK-NEXT: ret ptr @fmv._Mdpb2
-// CHECK: resolver_else56:
-// CHECK-NEXT: [[TMP116:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP117:%.*]] = and i64 [[TMP116]], 262144
-// CHECK-NEXT: [[TMP118:%.*]] = icmp eq i64 [[TMP117]], 262144
-// CHECK-NEXT: [[TMP119:%.*]] = and i1 true, [[TMP118]]
-// CHECK-NEXT: br i1 [[TMP119]], label [[RESOLVER_RETURN57:%.*]], label [[RESOLVER_ELSE58:%.*]]
-// CHECK: resolver_return57:
-// CHECK-NEXT: ret ptr @fmv._Mdpb
-// CHECK: resolver_else58:
-// CHECK-NEXT: [[TMP120:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP121:%.*]] = and i64 [[TMP120]], 131072
-// CHECK-NEXT: [[TMP122:%.*]] = icmp eq i64 [[TMP121]], 131072
-// CHECK-NEXT: [[TMP123:%.*]] = and i1 true, [[TMP122]]
-// CHECK-NEXT: br i1 [[TMP123]], label [[RESOLVER_RETURN59:%.*]], label [[RESOLVER_ELSE60:%.*]]
-// CHECK: resolver_return59:
-// CHECK-NEXT: ret ptr @fmv._Mdit
-// CHECK: resolver_else60:
-// CHECK-NEXT: [[TMP124:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP125:%.*]] = and i64 [[TMP124]], 66312
-// CHECK-NEXT: [[TMP126:%.*]] = icmp eq i64 [[TMP125]], 66312
-// CHECK-NEXT: [[TMP127:%.*]] = and i1 true, [[TMP126]]
-// CHECK-NEXT: br i1 [[TMP127]], label [[RESOLVER_RETURN61:%.*]], ...
[truncated]
With the current behavior the following example yields a linker error: "multiple definition of `foo.default'"
That is because foo.default is generated twice. As a user I don't find this particularly intuitive. If I wanted the default to be generated in TU1 I'd rather write target_clones("dotprod, sve", "default") explicitly.
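To make the explicit spelling concrete, here is a minimal sketch of what TU1 could look like when the default version is requested on purpose. The `FMV_DEMO` macro is hypothetical and exists only so that the snippet still builds as a plain function on toolchains without AArch64 function multiversioning support. The attribute spelling uses one string per version, which I assume is equivalent to the comma-separated form quoted above.

```c
/* Sketch only. FMV_DEMO is a hypothetical opt-in macro, not part of Clang.
 * Define it when building with an AArch64 Clang whose runtime provides the
 * FMV resolver support; otherwise foo() is an ordinary function. */
#if defined(FMV_DEMO) && defined(__clang__) && defined(__aarch64__)
/* "default" is listed explicitly, so this TU is expected to own foo.default. */
__attribute__((target_clones("dotprod", "sve", "default")))
#endif
int foo(void) { return 1; }

/* Every clone has the same body, so the caller sees 1 whichever one is picked. */
int bar(void) { return foo(); }
```

With this spelling, a second TU should not provide its own unannotated `int foo(void)` definition, since that would again produce a second `foo.default`.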
While changing the code I noticed that the RISC-V target defers resolver emission when it encounters a target_version definition. This seems accidental, since deferral only makes sense for AArch64, where we emit a resolver only once we've processed the entire TU, and only if the default version is present. I've changed this so that RISC-V emits the resolver immediately. I adjusted the codegen tests since the functions now appear in a different order.
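For readers skimming the CHECK lines above, each resolver step loads the feature word, masks it, compares the result against the same mask, and returns the matching version on success. The sketch below models that chain in plain C under stated assumptions: the `fmv_version` struct, the `resolve` helper, and the stub functions are hypothetical names for illustration, not Clang or compiler-rt APIs. The two masks are copied from the `dpb` and `dit` checks in the diff, and the final fallback assumes a default version is present.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*fmv_fn)(void);

/* Hypothetical table entry: the feature bits a version requires, and the
 * function to hand back when all of those bits are set. */
struct fmv_version {
    uint64_t mask;
    fmv_fn impl;
};

/* Stand-ins for fmv._Mdpb, fmv._Mdit and fmv.default. */
static int fmv_dpb(void) { return 2; }
static int fmv_dit(void) { return 1; }
static int fmv_default(void) { return 0; }

/* Same order as in the CHECK lines: dpb is tested before dit. */
static const struct fmv_version versions[] = {
    { UINT64_C(262144), fmv_dpb },
    { UINT64_C(131072), fmv_dit },
};

/* Mirrors the and / icmp eq / br sequence: the first version whose mask is
 * fully covered by `features` wins; otherwise fall through to the default. */
static fmv_fn resolve(uint64_t features) {
    for (size_t i = 0; i < sizeof versions / sizeof versions[0]; ++i) {
        if ((features & versions[i].mask) == versions[i].mask)
            return versions[i].impl;
    }
    return fmv_default;
}
```

In the generated IR the feature word comes from `@__aarch64_cpu_features`; here it is passed in as a parameter so the selection logic can be exercised on any host.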
Implements ARM-software/acle#377