Allow some array functions to operate in-place using a peep-hole optimization
This patch essentially implements GH-9881 (and more).
Patterns like `$x = array_function($x, ...)` seem to be quite common in
user code. Therefore it seems to make sense to create a "peep-hole
optimization" to prevent copies of the array.
This patch optimizes the `$x = array_function($x, ...)` pattern and the
`$x = array_function(temporary, ...)` pattern for these functions:
array_merge, array_unique, array_replace, and array_intersect.
With these limitations:
- array_{unique,intersect} only do the temporary optimization because the
comparison may throw.
- array_{merge,replace} only optimize CVs in the non-recursive case because the
recursive version may throw.
- array_merge optimization works only if the array is packed and is
without holes.
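To illustrate why the packed, hole-free restriction makes an in-place array_merge simple, here is a minimal sketch on a toy array type. It is not the php-src implementation: `struct toy_packed`, its fields, and `toy_merge_in_place()` are hypothetical names, and the elements are plain `long` values rather than zvals. Since array_merge renumbers integer keys, a packed array with no holes already has keys 0..n-1, so appending the new values gives the same result as building a fresh array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a packed array; all names here are hypothetical. */
struct toy_packed {
    long  *slots;     /* element storage */
    size_t used;      /* slots in use, including holes left by removals */
    size_t live;      /* live elements; equals `used` when there are no holes */
    size_t capacity;  /* allocated slots */
    bool   packed;    /* true if keys are implicit 0..used-1 */
};

/* Appends src[0..n-1] to dst in place.
 * Returns false if dst does not qualify (or allocation fails), in which
 * case dst is left untouched and the caller should take the copying path. */
bool toy_merge_in_place(struct toy_packed *dst, const long *src, size_t n)
{
    assert(dst != NULL);

    /* Only packed arrays without holes keep their keys when appended to. */
    if (!dst->packed || dst->used != dst->live) {
        return false;
    }
    if (n == 0) {
        return true;
    }
    if (dst->used + n > dst->capacity) {
        size_t new_cap = dst->capacity ? dst->capacity : 8;
        while (new_cap < dst->used + n) {
            new_cap *= 2;
        }
        long *p = realloc(dst->slots, new_cap * sizeof *p);
        if (p == NULL) {
            return false;
        }
        dst->slots = p;
        dst->capacity = new_cap;
    }
    memcpy(dst->slots + dst->used, src, n * sizeof *src);
    dst->used += n;
    dst->live += n;
    return true;
}
```

In this sketch an array with a hole (`live < used`) is rejected, because appending would leave the gap in place while a real array_merge would renumber the keys past it.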
The optimization works by checking if the array function is immediately
followed by an assignment which overrides the input. In that case we can
do the operation in-place instead of copying the array. Note that this is
limited to CVs at the moment, and can't handle more complex scenarios
like array or object assignments.
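The check described above can be sketched on a toy instruction stream. This is an illustration only and does not use the real Zend VM structures: `enum toy_opcode`, `struct toy_op`, and `next_op_overwrites_input()` are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy opcodes; hypothetical, not the real VM opcode set. */
enum toy_opcode { TOY_DO_CALL, TOY_ASSIGN_CV, TOY_ASSIGN_DIM, TOY_OTHER };

/* Toy instruction. Unused fields are set to -1 by convention. */
struct toy_op {
    enum toy_opcode opcode;
    int result_slot;  /* TOY_DO_CALL: temporary that receives the return value */
    int target_cv;    /* TOY_ASSIGN_*: variable being assigned to */
    int value_slot;   /* TOY_ASSIGN_*: temporary holding the assigned value */
};

/* Returns true if the instruction right after the call at ops[call_idx]
 * is a plain variable assignment that stores the call's result into
 * input_cv, i.e. the input array is about to be overwritten anyway. */
bool next_op_overwrites_input(const struct toy_op *ops, size_t n_ops,
                              size_t call_idx, int input_cv)
{
    assert(call_idx < n_ops);

    if (ops[call_idx].opcode != TOY_DO_CALL) {
        return false;
    }
    if (call_idx + 1 >= n_ops) {
        return false;  /* no following instruction to inspect */
    }
    const struct toy_op *next = &ops[call_idx + 1];
    /* Array-element or property assignments are deliberately not matched. */
    return next->opcode == TOY_ASSIGN_CV
        && next->value_slot == ops[call_idx].result_slot
        && next->target_cv == input_cv;
}
```

Only the single following instruction is inspected, which is what keeps the check cheap; an assignment such as `$a[0] = array_merge($a, ...)` would correspond to `TOY_ASSIGN_DIM` here and is rejected.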
For the temporary case it suffices to check if the refcount of the input
array is 1 and the array is non-persistent and non-immutable.
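The temporary-case condition amounts to a predicate along these lines. The struct layout, flag names, and function name are assumptions made for this sketch; the real engine has its own refcount and flag representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for the sketch. */
#define TOY_FLAG_IMMUTABLE  (1u << 0)
#define TOY_FLAG_PERSISTENT (1u << 1)

/* Toy array header carrying only what the check needs. */
struct toy_array {
    uint32_t refcount;
    uint32_t flags;
};

/* A temporary may be modified in place only if nothing else holds a
 * reference to it and it is neither immutable nor persistent. */
bool can_modify_temporary_in_place(const struct toy_array *arr)
{
    assert(arr != NULL);
    return arr->refcount == 1
        && (arr->flags & (TOY_FLAG_IMMUTABLE | TOY_FLAG_PERSISTENT)) == 0;
}
```

Any other holder of the array (refcount above 1) would observe the modification, so in that case the copying path is still needed.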
The current approach is a bit ugly though: it looks at the VM
instructions from within a function to check if the optimization is
possible, which is a bit odd.
I considered extending opcache as an alternative, but I believe this would
require adding a whole bunch of machinery for only a few users.
Looking at the assembly of prepare_in_place_array_modify_if_possible()
it looks pretty light-weight, about 95 bytes / 29 instructions on my
x86-64 Linux laptop.
** Safety **
There are some array functions which take some sort of copy of the input
array into a temporary C array for sorting.
(e.g. array_unique, array_diff, and array_intersect do this).
Since we no longer take a copy in all cases, we must check if it's
possible that a value is accessed that was already destroyed.
For array_unique: cmpdata will never be removed so that will never reach
refcount 0. And when something is removed, it is the previous value of
cmpdata, not the one used later. So this seems okay.
For array_intersect: a previous pointer (ptr[0] - 1) is accessed.
But this can't be a destroyed value because the pointer is first moved forward.
** Results **
Using this benchmark script
https://gist.github.com/nielsdos/ae5a2dddc53c61749ae31c908aa78e98
I get:
=== array_merge $a = array_merge($a, ...) ===
before 4.3821 sec
after 0.0022 sec
=== array_merge temporary ===
before 0.1265 sec
after 0.0479 sec
=== array_unique temporary ===
before 0.9297 sec
after 0.8498 sec
=== array_replace $a = array_replace($a, ...) ===
before 0.0810 sec
after 0.0083 sec
=== array_replace temporary ===
before 0.1261 sec
after 0.0534 sec
=== array_intersect temporary (no significant improvement because dominated by sorting) ===
before 30.499 sec
after 30.356 sec