rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS #139385

joboet · 2025-04-04T17:50:11Z

nikic

Sorry, totally missed this one...

nikic · 2025-05-24T20:37:41Z

compiler/rustc_codegen_llvm/src/builder.rs

        // Cast to default address space if globals are in a different addrspace
-        self.cx().const_pointercast(s, self.type_ptr())
+        let g = self.cx().const_pointercast(g, self.type_ptr());


`llvm.threadlocal.address` requires its argument to be a global value; it can't be a cast. So we should move this cast after the call to the intrinsic.

Well, that obviously didn't work. Quick question: can globals emitted by rustc actually be in a different address space? And if so, how would I get the address space number for an addrspacecast?

I think this just needs `const_pointercast` replaced with `pointercast`.

Ah right, I overlooked that that existed...
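
For readers following the thread, the ordering constraint discussed above can be sketched with a toy model. This is only an illustration under stated assumptions: the `Value` enum and the `threadlocal_address` and `pointercast` functions below are hypothetical stand-ins and are not the rustc or LLVM APIs. The model only encodes the rule from the review, namely that the intrinsic accepts a thread-local global directly and rejects a cast of one, so the cast has to be applied to the intrinsic's result.

```rust
// Toy model of the ordering constraint. All types and functions here are
// hypothetical stand-ins, not rustc or LLVM APIs.

#[derive(Debug, Clone, PartialEq)]
enum Value {
    /// A global variable; `thread_local` marks it as TLS.
    Global { thread_local: bool },
    /// A pointer cast wrapping another value.
    Cast(Box<Value>),
    /// The result of the modeled `llvm.threadlocal.address` call.
    TlsAddress(Box<Value>),
}

/// Models the rule from the review: the argument must be a thread-local
/// global itself, not a cast of one.
fn threadlocal_address(arg: Value) -> Result<Value, String> {
    if matches!(arg, Value::Global { thread_local: true }) {
        Ok(Value::TlsAddress(Box::new(arg)))
    } else {
        Err(format!("expected a thread-local global, got {:?}", arg))
    }
}

/// Models a pointer cast applied to any value.
fn pointercast(value: Value) -> Value {
    Value::Cast(Box::new(value))
}

fn main() {
    let tls = Value::Global { thread_local: true };

    // Cast first, then call the intrinsic: rejected in this model.
    let cast_first = threadlocal_address(pointercast(tls.clone()));
    assert!(cast_first.is_err());

    // Call the intrinsic first, then cast its result: accepted.
    let call_first = threadlocal_address(tls).map(pointercast);
    assert!(matches!(call_first, Ok(Value::Cast(_))));
    println!("{:?}", call_first);
}
```

In the model, the accepted ordering wraps the intrinsic result in a cast, which mirrors the "move this cast after the call" suggestion.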

nikic · 2025-05-24T20:39:28Z

@bors try @rust-timer queue

rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS Fixes #136044 r? `@nikic`

bors · 2025-05-24T20:40:46Z

⌛ Trying commit 5abeaa8 with merge 473cc4a...

bors · 2025-05-24T22:37:04Z

☀️ Try build successful - checks-actions
Build commit: 473cc4a (473cc4ab79c0e988e63684339ab40c813e82141a)

rust-timer · 2025-05-25T01:03:28Z

Finished benchmarking commit (473cc4a): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.8%	[-2.9%, -0.2%]	25
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.8%	[-2.9%, -0.2%]	25

Max RSS (memory usage)

Results (primary 3.4%, secondary 2.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.4%	[3.0%, 3.6%]	3
Regressions ❌ (secondary)	2.6%	[2.0%, 6.1%]	18
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	3.4%	[3.0%, 3.6%]	3

Cycles

Results (primary -1.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.9%	[-2.8%, -1.3%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.9%	[-2.8%, -1.3%]	3

Binary size

Results (primary -0.2%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.2%	[-1.1%, -0.1%]	13
Improvements ✅ (secondary)	-0.1%	[-0.1%, -0.1%]	37
All ❌✅ (primary)	-0.2%	[-1.1%, -0.1%]	13

Bootstrap: 775.41s -> 775.194s (-0.03%)
Artifact size: 366.27 MiB -> 366.27 MiB (-0.00%)

rustbot · 2025-05-29T14:08:09Z

⚠️ Warning ⚠️

This PR is based on an upstream commit that is 55 days old.

It's recommended to update your branch according to the rustc-dev-guide.

nikic · 2025-05-30T08:57:43Z

@bors r+

bors · 2025-05-30T08:57:46Z

📌 Commit e4d9b06 has been approved by nikic

It is now in the queue for this repository.

nikic · 2025-05-30T08:58:35Z

compiler/rustc_codegen_llvm/src/builder.rs

+            self.pointercast(pointer, self.type_ptr())
+        } else {
+            // Cast to default address space if globals are in a different addrspace
+            self.cx().const_pointercast(global, self.type_ptr())


It would have been fine to use pointercast for both cases, but this is ok as well.

bors · 2025-05-30T15:40:00Z

⌛ Testing commit e4d9b06 with merge 15825b7...

bors · 2025-05-30T18:55:30Z

☀️ Test successful - checks-actions
Approved by: nikic
Pushing 15825b7 to master...

github-actions · 2025-05-30T18:58:26Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing aa5832b (parent) -> 15825b7 (this PR)

Test differences

No test diffs found

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 15825b7161f8bd6a3482211fbf6727a52aa1166b --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

dist-apple-various: 6415.0s -> 10260.3s (59.9%)
x86_64-apple-1: 7306.0s -> 10726.8s (46.8%)
x86_64-apple-2: 6857.2s -> 4914.5s (-28.3%)
dist-x86_64-apple: 7809.5s -> 9551.7s (22.3%)
dist-riscv64-linux: 5016.5s -> 5734.1s (14.3%)
dist-x86_64-msvc-alt: 7674.8s -> 8772.5s (14.3%)
aarch64-apple: 5034.3s -> 5648.4s (12.2%)
dist-aarch64-apple: 5475.7s -> 4819.7s (-12.0%)
dist-powerpc64-linux: 5435.1s -> 5880.7s (8.2%)
x86_64-msvc-1: 8798.0s -> 8229.3s (-6.5%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.
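
The percentages in the job duration list appear to be plain relative changes, (new - old) / old. A minimal sketch that reproduces a few of the figures from this thread, assuming that formula; `percent_change` is a hypothetical helper and not part of citool or rust-timer:

```rust
/// Relative change from `old` to `new`, in percent.
/// Hypothetical helper for illustration only.
fn percent_change(old: f64, new: f64) -> f64 {
    (new - old) / old * 100.0
}

fn main() {
    // dist-apple-various: 6415.0s -> 10260.3s
    println!("{:.1}%", percent_change(6415.0, 10260.3)); // prints "59.9%"
    // x86_64-apple-2: 6857.2s -> 4914.5s
    println!("{:.1}%", percent_change(6857.2, 4914.5)); // prints "-28.3%"
    // Bootstrap from the try-build perf run: 775.41s -> 775.194s
    println!("{:.2}%", percent_change(775.41, 775.194)); // prints "-0.03%"
}
```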

rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS Fixes rust-lang#136044 r? `@nikic`

rust-timer · 2025-05-31T05:28:48Z

Finished benchmarking commit (15825b7): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

- If the regression was expected or you think it can be justified, please write a comment with sufficient written justification, and add @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
- If you think that you know of a way to resolve the regression, try to create a new PR with a fix for the regression.
- If you do not understand the regression or you think that it is just noise, you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.3%	[0.3%, 0.3%]	1
Improvements ✅ (primary)	-0.7%	[-2.0%, -0.2%]	22
Improvements ✅ (secondary)	-1.4%	[-1.5%, -1.4%]	3
All ❌✅ (primary)	-0.7%	[-2.0%, -0.2%]	22

Max RSS (memory usage)

Results (secondary -0.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.6%	[0.4%, 0.8%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.6%	[-1.2%, -0.5%]	7
All ❌✅ (primary)	-	-	0

Cycles

Results (primary -1.0%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.9%	[0.4%, 1.8%]	8
Improvements ✅ (primary)	-1.0%	[-1.2%, -0.9%]	2
Improvements ✅ (secondary)	-0.8%	[-1.2%, -0.7%]	7
All ❌✅ (primary)	-1.0%	[-1.2%, -0.9%]	2

Binary size

Results (primary -0.1%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.1%	[-0.1%, -0.1%]	11
Improvements ✅ (secondary)	-0.1%	[-0.1%, -0.1%]	37
All ❌✅ (primary)	-0.1%	[-0.1%, -0.1%]	11

Bootstrap: 777.274s -> 778.613s (0.17%)
Artifact size: 370.15 MiB -> 370.16 MiB (0.00%)

rustbot assigned nikic Apr 4, 2025

rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) labels Apr 4, 2025

joboet force-pushed the threadlocal_address branch from ec467ba to 6471d6a Compare April 4, 2025 18:38

joboet force-pushed the threadlocal_address branch from 6471d6a to 5abeaa8 Compare April 4, 2025 19:48

nikic reviewed May 24, 2025

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 24, 2025

bors added a commit that referenced this pull request May 24, 2025

Auto merge of #139385 - joboet:threadlocal_address, r=<try>

473cc4a

rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS Fixes #136044 r? `@nikic`

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 25, 2025

joboet force-pushed the threadlocal_address branch from 5abeaa8 to a9bcc5e Compare May 29, 2025 10:41

rustbot added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label May 29, 2025

rustc_codegen_llvm: use threadlocal.address intrinsic to access TLS

e4d9b06

joboet force-pushed the threadlocal_address branch from a9bcc5e to e4d9b06 Compare May 29, 2025 14:08

bors added the S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) label and removed the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) label May 30, 2025

nikic reviewed May 30, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label May 30, 2025

bors merged commit 15825b7 into rust-lang:master May 30, 2025
10 checks passed

rustbot added this to the 1.89.0 milestone May 30, 2025

notriddle pushed a commit to notriddle/rust that referenced this pull request May 30, 2025

Auto merge of rust-lang#139385 - joboet:threadlocal_address, r=nikic

3f0036a

rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS Fixes rust-lang#136044 r? `@nikic`

rustbot added the perf-regression Performance regression. label May 31, 2025
