-
Notifications
You must be signed in to change notification settings - Fork 7.9k
random: Optimize data flow for the generate
function of native engines
#13043
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Instead of returning the generated `uint64_t` and providing the size (i.e. the number of bytes of the generated value) out-of-band via the `last_generated_size` member of the `php_random_status` struct, the `generate` function is now expected to return a new `php_random_result` struct containing both the `size` and the `result`. This has two benefits, one for the developer: It's no longer possible to forget setting `last_generated_size` to the correct value, because it now happens at the time of returning from the function. and the other benefit is for performance: The `php_random_result` struct will be returned as a register pair, thus the `size` will be directly available without reloading it from main memory. Checking a simplified version of `php_random_range64()` on Compiler Explorer (“Godbolt”) with clang 17 shows a single change in the resulting assembly showcasing the improvement (https://godbolt.org/z/G4WjdYxqx): - add rbp, qword ptr [r14] + add rbp, rdx Empirical testing confirms a measurable performance increase for the `Randomizer::getBytes()` method: <?php $e = new Random\Engine\Xoshiro256StarStar(0); $r = new Random\Randomizer($e); var_dump(strlen($r->getBytes(100000000))); goes from 250ms (before the change) to 220ms (after the change). While generating 100 MB of random data certainly is not the most common use case, it confirms the theoretical improvement in practice.
cb883b6
to
ba89f9b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me, but will let other people have a look too
Sorry for the delay in checking, but I think it is good! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as well.
@@ -218,11 +217,15 @@ typedef struct _php_random_status_state_user { | |||
zend_function *generate_method; | |||
} php_random_status_state_user; | |||
|
|||
typedef struct _php_random_result { | |||
const uint64_t result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The order of all initializers is inversed, so maybe inverse it here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've initially had it reversed, but this order minimizes the differences in the generated assembly, because .result
stays in rax
(with .size
being in rdx
, instead of vice versa).
Instead of returning the generated
uint64_t
and providing the size (i.e. the number of bytes of the generated value) out-of-band via thelast_generated_size
member of thephp_random_status
struct, thegenerate
function is now expected to return a newphp_random_result
struct containing both thesize
and theresult
.This has two benefits, one for the developer:
It's no longer possible to forget setting
last_generated_size
to the correct value, because it now happens at the time of returning from the function.and the other benefit is for performance:
The
php_random_result
struct will be returned as a register pair, thus thesize
will be directly available without reloading it from main memory.Checking a simplified version of
php_random_range64()
on Compiler Explorer (“Godbolt”) with clang 17 shows a single change in the resulting assembly showcasing the improvement (https://godbolt.org/z/G4WjdYxqx):Empirical testing confirms a measurable performance increase for the
Randomizer::getBytes()
method:goes from 250ms (before the change) to 220ms (after the change). While generating 100 MB of random data certainly is not the most common use case, it confirms the theoretical improvement in practice.