[experimental, do not merge!] a faster implementation of Polonius and a more compact DenseBitSet implementation #141583

tage64 · 2025-05-26T12:02:38Z

This is the union of #141326 and #141325, a Polonius experiment combined with a more compact version of DenseBitSet. I would like to get a perf-run of this @lqd.

r? lqd

This commit modifies DenseBitSet so that it uses only one word on the stack instead of 4 words as before, allowing for faster clones. The downside is that it can store at most 63 elements on the stack, as opposed to 128 for the previous implementation.
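To make the one-word trade-off concrete, here is a minimal sketch of an inline bit set that fits in a single `u64`. It is not the code from this PR: the type `InlineBits`, the choice of the top bit as a tag, and all method names are assumptions made for illustration only. Reserving one bit as a tag is one plausible reason a 64-bit word would hold 63 elements rather than 64.

```rust
/// Hypothetical illustration only; this is NOT the `DenseBitSet` from the PR.
/// A set of small integers stored in a single 64-bit word.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct InlineBits(u64);

impl InlineBits {
    /// Assumption: the top bit is reserved as a tag meaning "stored inline".
    const TAG: u64 = 1 << 63;
    /// With one bit reserved, 63 elements (0..=62) fit in the word.
    const CAPACITY: usize = 63;

    fn new_empty() -> Self {
        InlineBits(Self::TAG)
    }

    /// Inserts `elem` and returns `true` if the set changed.
    fn insert(&mut self, elem: usize) -> bool {
        assert!(elem < Self::CAPACITY, "element does not fit inline");
        let old = self.0;
        self.0 |= 1u64 << elem;
        self.0 != old
    }

    fn contains(&self, elem: usize) -> bool {
        elem < Self::CAPACITY && (self.0 >> elem) & 1 == 1
    }

    /// Number of elements, ignoring the tag bit.
    fn count(&self) -> u32 {
        (self.0 & !Self::TAG).count_ones()
    }
}

fn main() {
    let mut set = InlineBits::new_empty();
    assert!(set.insert(0));
    assert!(set.insert(62));
    assert!(!set.insert(62)); // already present, so nothing changes
    assert!(set.contains(62) && !set.contains(5));
    assert_eq!(set.count(), 2);
    // The whole set is one word, so cloning it is a plain 8-byte copy.
    assert_eq!(std::mem::size_of::<InlineBits>(), 8);
}
```

A real implementation would also need a fallback representation for larger domains, which this sketch leaves out.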

The new implementation of DenseBitSet doesn't store the exact domain size, so the hash values for identical sets with different domain sizes may be equal.
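The hashing consequence can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `BitsOnly`, its constructor and `hash_of` are hypothetical names. The point is only that when the domain size is checked at construction but not stored, two sets with the same elements become the same value and therefore hash identically.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in: only the bits are stored, never the domain size.
#[derive(Hash, PartialEq, Eq, Debug)]
struct BitsOnly(u64);

impl BitsOnly {
    /// `domain_size` is used for bounds checking and then discarded.
    fn new(domain_size: usize, elems: &[usize]) -> Self {
        assert!(domain_size <= 63);
        let mut bits = 0u64;
        for &elem in elems {
            assert!(elem < domain_size);
            bits |= 1u64 << elem;
        }
        BitsOnly(bits)
    }
}

/// Hashes any value with the standard library's default hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // The set {1, 3}, once over a domain of 10 and once over a domain of 20.
    let small_domain = BitsOnly::new(10, &[1, 3]);
    let large_domain = BitsOnly::new(20, &[1, 3]);
    // Without a stored domain size the two values are indistinguishable.
    assert_eq!(small_domain, large_domain);
    assert_eq!(hash_of(&small_domain), hash_of(&large_domain));
}
```

A test asserting that such hashes differ can no longer hold for a layout like this, which matches the reason given for removing it.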

rustbot · 2025-05-26T12:02:46Z

This PR changes a file inside tests/crashes. If a crash was fixed, please move into the corresponding ui subdir and add 'Fixes #' to the PR description to autoclose the issue upon merge.

Some changes occurred in src/tools/compiletest

cc @jieyouxu

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

Some changes occurred in coverage instrumentation.

cc @Zalathar

lqd · 2025-05-26T12:48:30Z

For perf tests, your PRs should be drafts so they’re not merged and also they won’t ping a lot of people until you want them to do so for review.

@bors try @rust-timer queue

bors · 2025-05-26T12:49:43Z

⌛ Trying commit f2585eb with merge 16435a6...

bors · 2025-05-26T14:47:00Z

☀️ Try build successful - checks-actions
Build commit: 16435a6 (16435a66301a7d82ae6ff183f6d9a7005cb3f471)

Kobzol · 2025-05-26T17:12:48Z

> For perf tests, your PRs should be drafts so they’re not merged and also they won’t ping a lot of people until you want them to do so for review.

I think that drafts still ping ppl right now, although they shouldn't request a reviewer.

lqd · 2025-05-26T17:25:47Z

Sometimes we don't want to ping people, so I hope that's incorrect and they still don't ping.

rust-timer · 2025-05-26T23:53:43Z

Finished benchmarking commit (16435a6): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	2.1%	[0.1%, 5.1%]	7
Regressions ❌ (secondary)	1.4%	[0.3%, 3.6%]	12
Improvements ✅ (primary)	-0.5%	[-3.0%, -0.1%]	69
Improvements ✅ (secondary)	-1.0%	[-2.9%, -0.1%]	47
All ❌✅ (primary)	-0.3%	[-3.0%, 5.1%]	76

Max RSS (memory usage)

Results (primary 70.8%, secondary 25.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	76.9%	[0.9%, 191.0%]	12
Regressions ❌ (secondary)	25.3%	[2.3%, 44.2%]	8
Improvements ✅ (primary)	-2.2%	[-2.2%, -2.2%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	70.8%	[-2.2%, 191.0%]	13

Cycles

Results (primary 1.4%, secondary 1.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.4%	[1.0%, 3.6%]	16
Regressions ❌ (secondary)	2.5%	[1.9%, 3.3%]	5
Improvements ✅ (primary)	-1.6%	[-2.8%, -0.9%]	5
Improvements ✅ (secondary)	-1.6%	[-2.0%, -1.3%]	3
All ❌✅ (primary)	1.4%	[-2.8%, 3.6%]	21

Binary size

Results (primary 0.0%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.1%]	46
Regressions ❌ (secondary)	0.1%	[0.0%, 0.1%]	29
Improvements ✅ (primary)	-1.1%	[-1.1%, -1.1%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.0%	[-1.1%, 0.1%]	47

Bootstrap: 776.359s -> 774.569s (-0.23%)
Artifact size: 366.28 MiB -> 366.23 MiB (-0.02%)
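The percentage next to the bootstrap time is the relative change between the two measurements. A minimal sketch of that arithmetic is below; the function name `percent_change` is an assumption for illustration and not part of rust-timer. Because the displayed inputs are already rounded, recomputing from them may not reproduce every reported figure exactly.

```rust
/// Relative change from `before` to `after`, in percent.
/// Hypothetical helper for illustration only.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Bootstrap times from the report above, in seconds.
    let change = percent_change(776.359, 774.569);
    println!("{change:.2}%");
    assert!((change - (-0.23)).abs() < 0.005);
}
```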

lqd · 2025-05-27T15:57:12Z

cranelift-codegen is still seeing a big +5% hit but funnily enough I likely have a -10% improvement for it.

I haven't looked into the code yet apart from our discussions and walkthrough, are the max-rss increases expected?

tage64 · 2025-05-28T08:21:58Z

> cranelift-codegen is still seeing a big +5% hit but funnily enough I likely have a -10% improvement for it.

Great! How do you get that improvement? In the borrow checker, or somewhere else?

> I haven't looked into the code yet apart from our discussions and walkthrough, are the max-rss increases expected?

No, the max-rss increase in a few crates is not expected. I don't know where it comes from. It is at least not caused by the DenseBitSet implementation, because we see the same regression in #141326, which is the same as this PR minus the DenseBitSet changes. It has to be investigated further.

lqd · 2025-05-28T09:15:43Z

In move/init for liveness in the borrowck, #141667

Tage Johansson added 12 commits May 21, 2025 10:02

Create Horatio: A faster implementation of Polonius.

1b58937

compiletest: Change Polonius compare mode to run polonius=next instea…

95e1a43

…d of legacy.

fix documentation errors

1bfbc9f

fix Clippy warning about cloning reference counted pointers

37e44c5

remove tests/crashes/135646.rs as it doesn't crash anymore

2522f24

make some code blocks in comments explicitly text so they are …

baa12bc

…not interpretted as Rust doctests

DenseBitSet: Solve a few bugs related to overflow in SHR.

280bee5

fix documentation errors

5081f73

remove a test checking that the hash for bit sets of different domain

f5f88ea

sizes are different The new implementation of DenseBitSet doesn't store the exact domain size, so of course the hash values for identical sets with different domain sizes may be equal.

in rustc_index: fix compilation error by changing the rustc_serialize…

042debb

… dependency to be required instead of optional

in rustc_index: revert "fix compilation error by changing the rustc_s…

f2585eb

…erialize dependency to be required instead of optional", and introduce conditional compilation instead.

rustbot assigned lqd May 26, 2025

lqd marked this pull request as draft May 26, 2025 12:45

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 26, 2025

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels May 26, 2025
