Optimize break patterns #107638

zhangyunhao116 · 2023-02-03T15:58:50Z

Use wyrand instead of calling XORSHIFT twice in break patterns on 64-bit platforms. In the benchmark below, the new PRNG is about 2x faster than the previous one.
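For readers unfamiliar with the two generators, here is a minimal sketch of the comparison described above. It is illustrative only: the function names are hypothetical, and the wyrand constants are taken from one published version of wyhash (other versions use different constants), so this should not be read as the exact code in the PR.

```rust
/// One step of Marsaglia's 32-bit xorshift (shift triple 13, 17, 5).
fn xorshift32(state: &mut u32) -> u32 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    x
}

/// The "call XORSHIFT twice" approach: glue two 32-bit outputs into 64 bits.
/// (Hypothetical helper name, for illustration.)
fn gen_u64_two_calls(state: &mut u32) -> u64 {
    let hi = xorshift32(state) as u64;
    let lo = xorshift32(state) as u64;
    (hi << 32) | lo
}

/// A wyrand-style step: add a constant to the state, then fold the 128-bit
/// product of the state and a masked copy of it down to 64 bits.
/// Assumption: the constants follow one published wyhash version.
fn wyrand(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0xa076_1d64_78bd_642f);
    let t = (*state as u128) * ((*state ^ 0xe703_7ed1_a0b4_28db) as u128);
    ((t >> 64) as u64) ^ (t as u64)
}
```

The two-call version runs six dependent shift/xor steps per 64-bit result, while the wyrand-style step needs one addition, one widening multiply, and one xor, which is a plausible source of the measured difference.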

Benchmark results (via https://gist.github.com/zhangyunhao116/11ef41a150f5c23bb47d86255fbeba89):

old                     time:   [1.3258 ns 1.3262 ns 1.3266 ns]
                        change: [+0.5901% +0.6731% +0.7791%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe

new                     time:   [657.65 ps 657.89 ps 658.18 ps]
                        change: [-1.6910% -1.6110% -1.5256%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

rustbot · 2023-02-03T15:58:58Z

r? @scottmcm

(rustbot has picked a reviewer for you, use r? to override)

rustbot · 2023-02-03T15:59:01Z

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

Stabilizing library features
Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
Changing public documentation in ways that create new stability guarantees
Changing observable runtime behavior of library APIs

scottmcm · 2023-02-03T21:11:37Z

I'm going to flip this over to the actual lib team, not me, since I don't know what the algorithms policy is for stuff like this
r? rust-lang/libs

(I also wonder if 64-bit randomness is even all that useful here, since the odds of someone sorting more than 2³² items seem incredibly low, and if they're really sorting/selecting more, maybe breaking patterns in "only" the local over-4GB-of-RAM-anyway area might be sufficient.)

zhangyunhao116 · 2023-02-04T02:58:46Z

Thanks! I agree that a slice longer than 2^32 elements is a rare case, but wyrand (which generates 64 bits) is even 10% faster than XORSHIFT (which generates 32 bits) on 64-bit platforms. It might be better to use the faster algorithm and still support this rare case.

cuviper · 2023-02-20T20:01:54Z

Did you consider the 64-bit version of xorshift? (<<13; >>7; <<17) With your benchmark setup, 64-bit xorshift performs almost the same as the wyrand version on my system. I'd feel more comfortable staying in the same PRNG family, rather than introducing any uncertainty around the wyrand unlicense. (Which may be fine, but we have to figure that out.)
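As a point of reference, the 64-bit xorshift variant with the shift triple mentioned above (13, 7, 17) can be sketched as follows. The function name is hypothetical and this is a standalone illustration, not necessarily the exact form used in the library.

```rust
/// One step of Marsaglia's 64-bit xorshift with shifts (<<13, >>7, <<17).
/// The state must be nonzero: zero is a fixed point of the generator.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}
```

A single call yields a full 64-bit value with three shift/xor steps, compared with six steps when two 32-bit outputs are combined.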

In break_patterns these calls are also repeated in a short loop, mixed with other operations, so it's not obvious to extrapolate from your benchmark to how that performance will play out in reality.
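To make the "short loop, mixed with other operations" point concrete, here is a simplified sketch of what a pattern-breaking routine of this kind might look like. The function name, the loop shape, and the use of a `u64` state on every target are assumptions made for illustration; the library's real `break_patterns` may differ in detail.

```rust
/// Simplified sketch: swap three elements near the middle of `v` with
/// pseudorandomly chosen positions, to disturb adversarial patterns.
fn break_patterns_sketch<T>(v: &mut [T]) {
    let len = v.len();
    if len < 8 {
        return;
    }

    // Seed from the length so the routine stays deterministic.
    // `len >= 8`, so the xorshift state is never zero.
    let mut state = len as u64;
    let mut gen_usize = || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state as usize
    };

    // Reduce modulo a power of two instead of using a costly `%`.
    let modulus = len.next_power_of_two();
    // Positions around the middle are where pivot candidates tend to come from.
    let pos = len / 4 * 2;

    for i in 0..3 {
        // `modulus < 2 * len`, so one conditional subtraction brings
        // `other` into the range `[0, len - 1]`.
        let mut other = gen_usize() & (modulus - 1);
        if other >= len {
            other -= len;
        }
        v.swap(pos - 1 + i, other);
    }
}
```

In a routine like this the generator runs only a handful of times per call, next to masking, a branch, and a swap, so a microbenchmark of the generator alone may overstate the end-to-end effect.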

zhangyunhao116 · 2023-02-22T16:00:29Z

I think using the 64-bit version of XORSHIFT is good. On my server, WYHASH is just 40 ps (about 5%~10%) faster than it. Considering that gen_usize is not a hot spot, staying with the same algorithm is acceptable.

cuviper · 2023-02-22T20:22:34Z

@bors r+ rollup=never (for perf)

bors · 2023-02-22T20:22:36Z

📌 Commit e107ca0 has been approved by cuviper

It is now in the queue for this repository.

bors · 2023-02-22T20:25:49Z

📌 Commit e107ca0 has been approved by cuviper

It is now in the queue for this repository.

bors · 2023-02-25T03:01:43Z

⌛ Testing commit e107ca0 with merge 6ffabf3...

bors · 2023-02-25T06:14:28Z

☀️ Test successful - checks-actions
Approved by: cuviper
Pushing 6ffabf3 to master...

rust-timer · 2023-02-25T07:32:01Z

Finished benchmarking commit (6ffabf3): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.3%	[-2.3%, -2.3%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-2.3%	[-2.3%, -2.3%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.8%	[-4.9%, -1.7%]	15
All ❌✅ (primary)	-	-	0

scottmcm · 2023-02-25T07:37:53Z

Wow! Are those cycle results legit? What's rustc sorting to get that?

cuviper · 2023-02-26T02:36:36Z

I think that may be recovering from a fluke in the prior cycles comparison that regressed by the same amount.

rustbot assigned scottmcm Feb 3, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 3, 2023

zhangyunhao116 force-pushed the pdqsort-rand branch 5 times, most recently from 195966c to 1e81c81 Compare February 3, 2023 17:08

rustbot assigned cuviper and unassigned scottmcm Feb 3, 2023

zhangyunhao116 force-pushed the pdqsort-rand branch from 1e81c81 to 2b14ea3 Compare February 22, 2023 16:00

zhangyunhao116 closed this Feb 22, 2023

zhangyunhao116 force-pushed the pdqsort-rand branch from 2b14ea3 to 3b4d6e0 Compare February 22, 2023 16:02

Optimize break patterns

e107ca0

zhangyunhao116 reopened this Feb 22, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 22, 2023

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 22, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 22, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 25, 2023

bors merged commit 6ffabf3 into rust-lang:master Feb 25, 2023

rustbot added this to the 1.69.0 milestone Feb 25, 2023

zhangyunhao116 deleted the pdqsort-rand branch March 3, 2023 11:27

zhangyunhao116 restored the pdqsort-rand branch March 3, 2023 11:28

zhangyunhao116 deleted the pdqsort-rand branch March 3, 2023 11:28

zhangyunhao116 restored the pdqsort-rand branch March 3, 2023 11:28

zhangyunhao116 deleted the pdqsort-rand branch March 3, 2023 16:22
