Skip to content

Allow internal functions to declare if they support compile-time evaluation. #7780

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 20, 2021

Conversation

TysonAndre
Copy link
Contributor

@TysonAndre TysonAndre commented Dec 15, 2021

https://wiki.php.net/rfc/strtolower-ascii means that these functions no longer
depend on the current locale in php 8.2. Before that, this was unsafe to
evaluate at compile time.

Followup to GH-7506

Add strcmp/strcasecmp/strtolower/strtoupper functions

Add bin2hex/hex2bin and related functions

Add array_key_first/array_key_last

Update test of garbage collection using strtolower

  • Make sure that this tests the right thing when opcache is enabled.
    Update this as an example for future tests.

@TysonAndre TysonAndre changed the title Do compile-time evaluation of locale-independent case conversion Evaluate more functions at compile time, do compile-time evaluation of locale-independent case conversion, Dec 15, 2021
@nikic
Copy link
Member

nikic commented Dec 16, 2021

If you're going to optimize this, please switch it to a prepopulated hash table.

@nikic
Copy link
Member

nikic commented Dec 16, 2021

A nice solution might be to include this information as a flag in zend_func_info, which is generated from stubs.

@TysonAndre TysonAndre force-pushed the opcache-case-insensitive branch 2 times, most recently from 6e39a4f to add5428 Compare December 17, 2021 00:42
@TysonAndre
Copy link
Contributor Author

TysonAndre commented Dec 17, 2021

bin2hex hex2bin strlen mb_strlen would be good here as well, maybe also preg_match preg_replace

bin2hex/hex2bin are done, strlen is already optimized to an opcode rather than a function call, and mb_strlen can't be done because it depends on ini settings that can change at runtime

The encoding parameter is the character encoding. If it is omitted or null, the internal character encoding value will be used.

preg_replace - I've avoided that because of https://www.regular-expressions.info/catastrophic.html potentially taking exponential time in the worst case, and bugs that have been seen in the past with PCRE's jit on different platforms, though maybe that isn't a problem in practice for compiling source code

preg_match - same, also has a reference parameter

A nice solution might be to include this information as a flag in zend_func_info, which is generated from stubs.

I'd considered doing that but didn't know if there were reasons that this wasn't already done - though, until the recent refactoring by @nikic it wouldn't even have been possible to catch notices/exceptions during compilation

This would enable PECLs to start using this, which would be convenient if done properly.

I also had a question on when it's safe to do this for floating-point math. Do we assume that rounding modes don't change (or does that not apply for pow)? I see "pow" is already added, but would "log", "exp", "pi", etc. be reasonable to add (not really common to need, but now it has no cost to add)

  • More functions should be added in subsequent PRs instead to be easier to finish reviewing this

@TysonAndre TysonAndre changed the title Evaluate more functions at compile time, do compile-time evaluation of locale-independent case conversion, Allow internal functions to declare if they support compile-time evaluation. Dec 17, 2021
@TysonAndre
Copy link
Contributor Author

A nice solution might be to include this information as a flag in zend_func_info, which is generated from stubs.

Done, the build is passing except for a spurious travis failure. This is ready for review.

I also had a question on when it's safe to do this for floating-point math. Do we assume that rounding modes don't change (or does that not apply for pow)?

Never mind, I see Zend/zend_float.c has ensured that for a really long time, in zend_init_fpu - as of https://wiki.php.net/rfc/rounding

I was thinking of float-to-string conversion historically not being done in opcache until php 8.0, for https://wiki.php.net/rfc/locale_independent_float_to_string

@staabm
Copy link
Contributor

staabm commented Dec 19, 2021

will this change have an relevant impact on how long files need to compile?
in the CLI where we don't use opcache in the regular case, might this be an issue?

@TysonAndre
Copy link
Contributor Author

will this change have an relevant impact on how long files need to compile?

It should be a tiny bit faster, I'd expect, with no longer having a long chain of string comparisons when compiling functions that have constant operands.

PHP is already evaluating the most commonly used functions at compile-time. Apart from the new case-insensitive string functions (strtolower, strtoupper, strcasecmp, etc) any new functions added here are less commonly used.

This also only affects compilation of calls that have constant operands - if the operands are not inferrable as constants at compile time then this code path doesn't get called.

opcache.file_cache is an option if CLI compilation time is a concern.

in the CLI where we don't use opcache in the regular case, might this be an issue?

If opcache.enable_cli isn't enabled or opcache isn't loaded, this has no impact, it isn't called.

Also, in the interactive CLI (php -a) the optimizer isn't used, same as eval, though the required files would have opcache get used

@nikic
Copy link
Member

nikic commented Dec 19, 2021

@kocsismate fyi

…uation.

https://wiki.php.net/rfc/strtolower-ascii means that these functions no longer
depend on the current locale in php 8.2. Before that, this was unsafe to
evaluate at compile time.

Followup to phpGH-7506

Add strcmp/strcasecmp/strtolower/strtoupper functions

Add bin2hex/hex2bin and related functions

Add array_key_first/array_key_last

Update test of garbage collection using strtolower

- Make sure that this tests the right thing when opcache is enabled.
  Update this as an example for future tests.
@TysonAndre TysonAndre force-pushed the opcache-case-insensitive branch from 2b17d40 to b6f17b4 Compare December 19, 2021 15:08
@kocsismate
Copy link
Member

kocsismate commented Dec 20, 2021

Thanks for the ping (I already had a quick look at the PR before), and I liked it very much :)

A question has just came to my mind though: wouldn't it make more sense to use a more general name for the annotation, like @pure, just like what static analysis tools use? As far as I can understand, pure functions can be evaluated at compile time (- slow ones according to your definition in the comments). However, I'm fine with @compile-time-eval as well :)

UPDATE: I've just realized that @pure is probably not a good idea, since you cannot evaluate every pure function at compile time.

@TysonAndre
Copy link
Contributor Author

A question has just came to my mind though: wouldn't it make more sense to use a more general name for the annotation, like @pure, just like what static analysis tools use? As far as I can understand, pure functions can be evaluated at compile time (- slow ones according to your definition in the comments). However, I'm fine with @compile-time-eval as well :)

UPDATE: I've just realized that @pure is probably not a good idea, since you cannot evaluate every pure function at compile time.

I agree, I'd avoided that name for that reason (functions may be too slow, use up too much memory, infinitely loop, etc).

That name would also make the use of @pure in stubs differ from userland phpdoc (e.g. for stubs for IDEs/analyzers), if phpdoc were to ever add @pure as a standard annotation and make it more likely PECL developers would use it in a place where it would be unsafe.

src/Psalm/DocComment.php
34:        'generator-return', 'ignore-falsable-return', 'variadic', 'pure',

@TysonAndre TysonAndre merged commit 32e2d97 into php:master Dec 20, 2021
mvorisek added a commit to mvorisek/php-src that referenced this pull request Mar 3, 2023
mvorisek added a commit to mvorisek/php-src that referenced this pull request Mar 3, 2023
mvorisek added a commit to mvorisek/php-src that referenced this pull request Mar 7, 2023
nielsdos pushed a commit that referenced this pull request Mar 7, 2023
These functions were accidentally removed from being CTE in GH-7780.
This patch brings them back.

Closes GH-10768.
mvorisek added a commit to mvorisek/php-src that referenced this pull request Mar 15, 2023
nielsdos pushed a commit that referenced this pull request Mar 15, 2023
This was accidentally added in GH-7780, but since it takes a callable
argument, this flag is useless on this function.

Closes GH-10859.
mvorisek added a commit to mvorisek/php-src that referenced this pull request Mar 15, 2023
This was accidentally added in phpGH-7780, but since it takes a callable
argument, this flag is useless on this function.

Closes phpGH-10859.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants