Allow comments between intersection types and by-ref params #10125

iluuu1994 · 2022-12-18T00:05:55Z

Follow up to GH-10084. Found a way to do this after all. /cc @olleharstedt

iluuu1994 · 2022-12-18T00:36:06Z

/cc @cmb69 Also mentioning you as you expressed distaste for a regex solution. Let me know if this is fine or if you prefer to see this compared to GLP first. I agree that we should switch to GLP sooner or later.

olleharstedt · 2022-12-18T09:11:10Z

Impressive :)

cmb69 · 2022-12-19T13:23:10Z

Let's say that I'm generally not happy with complicating the parser with work-arounds, but this is okay for me.

Girgias

This is kinda cursed... but sure why not.

iluuu1994 · 2023-01-26T16:38:33Z

FWIW I also just tried switching to Bison GLR and it's very simple. Bison also allows declaring the expected number of conflicts using %expect[-rr] n to avoid introducing unwanted conflicts in future additions. There is a performance impact here, as the parser clones itself when encountering a conflict and exploring all branches. However that will be highly dependent on the number of conflicts and the program itself.

master...iluuu1994:bison-glr

@bwoebi Would be interested in your opinion here.

Girgias · 2023-01-26T16:42:53Z

I can imagine @dseguy be frustrating about merging back the ampersand tokens, as I know he found it useful for Exakat. :)

cmb69 · 2023-01-26T17:31:22Z

master...iluuu1994:bison-glr

Cool! (I guess more could be simplified.)

There is a performance impact here, as the parser clones itself when encountering a conflict and exploring all branches. However that will be highly dependent on the number of conflicts and the program itself.

We would need to measure, but assuming OPcache is enabled, some performance degradation in the parser shouldn't be a big deal.

But given that the tokens would change, I don't think we can introduce that in a minor version.

iluuu1994 · 2023-01-26T17:40:42Z

@cmb69

I guess more could be simplified.

Yes, there are more lexer/parser hacks we could get rid of. I just wanted to see if it worked.

But given that the tokens would change, I don't think we can introduce that in a minor version.

The T_AMPERSAND was replaced with the context-sensitive tokens in PHP 8.1 with intersection types. Not saying this shouldn't wait for PHP 9, just that we don't have any rules on that.

cmb69 · 2023-01-26T17:50:32Z

The T_AMPERSAND was replaced with the context-sensitive tokens in PHP 8.1 with intersection types.

Ah, indeed! So maybe it might be an option to switch to GLR in the next minor. Should be discussed on the ML, anyway.

bwoebi · 2023-01-26T17:52:48Z

glr

:-)

So, the good thing about glr is that it allows bison to do much more lookahead and the bad thing is that you may have more lookahead.

In general, a GLR parser can take quadratic or cubic worst-case time, and the current Bison parser even takes exponential time and space for some grammars. ~ https://www.gnu.org/software/bison/manual/html_node/Simple-GLR-Parsers.html

The problem is that you now potentially run into degenerate cases (which is hard to test for, given that e.g. it might occur in big generated codes), if the grammar is not carefully constrained, especially with grammars as complex as PHPs.

Even if it's not a problem today (as our grammar currently is well constrained, given it runs at all with no GLR), that's a problem for future developments because it's very hard to estimate how "well constrained" a new addition is, until you get the bug reports from users that their code which worked well up to now suddenly takes a second to compile.

And that's why we haven't used GLR yet. It's a nice catch-all solution, with hidden drawbacks.

Even though I'd love to see GLR because it makes things simpler from a dev perspective... I'd rather not.

dseguy · 2023-01-26T20:58:49Z

I can imagine @dseguy be frustrating about merging back the ampersand tokens, as I know he found it useful for Exakat. :)

It is quite wicked syntax, indeed.
This is going to require some more processing, though nothing impossible.

iluuu1994 · 2023-02-02T15:12:19Z

@bwoebi Thanks for your insight. I'm fine with this approach then. For generics we can use the turbofish syntax, if it ever comes to that. I don't know of any other addition atm that would require a GLR parser.

iluuu1994 requested a review from bwoebi December 18, 2022 00:06

github-actions bot added the Category: Engine label Dec 18, 2022

iluuu1994 force-pushed the allow-comments-between-intersection-types branch from 3671e1a to dc6daf2 Compare December 18, 2022 00:07

iluuu1994 mentioned this pull request Dec 18, 2022

PHP 8.0+ does no longer accept a comment in the whitespace between ampersand and argument in function signature #10083

Closed

Girgias approved these changes Dec 22, 2022

View reviewed changes

Allow comments between intersection types and by-ref params

8f899d2

iluuu1994 force-pushed the allow-comments-between-intersection-types branch from dc6daf2 to 8f899d2 Compare February 2, 2023 16:59

iluuu1994 closed this in f291d37 Feb 2, 2023

This was referenced Oct 25, 2024

PHP 8.3 | Tokenizer/PHP: support "yield from" with comments PHPCSStandards/PHP_CodeSniffer#647

Merged

PHP 8.3 | ✨ New PHPCompatibility.Generators.NewYieldFromComment sniff PHPCompatibility/PHPCompatibility#1792

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Allow comments between intersection types and by-ref params #10125

Allow comments between intersection types and by-ref params #10125

Uh oh!

iluuu1994 commented Dec 18, 2022

Uh oh!

iluuu1994 commented Dec 18, 2022

Uh oh!

olleharstedt commented Dec 18, 2022

Uh oh!

cmb69 commented Dec 19, 2022

Uh oh!

Girgias left a comment

Uh oh!

iluuu1994 commented Jan 26, 2023

Uh oh!

Girgias commented Jan 26, 2023

Uh oh!

cmb69 commented Jan 26, 2023

Uh oh!

iluuu1994 commented Jan 26, 2023 •

edited

Loading

Uh oh!

cmb69 commented Jan 26, 2023

Uh oh!

bwoebi commented Jan 26, 2023 •

edited

Loading

Uh oh!

dseguy commented Jan 26, 2023

Uh oh!

iluuu1994 commented Feb 2, 2023

Uh oh!

Uh oh!

Allow comments between intersection types and by-ref params #10125

Allow comments between intersection types and by-ref params #10125

Uh oh!

Conversation

iluuu1994 commented Dec 18, 2022

Uh oh!

iluuu1994 commented Dec 18, 2022

Uh oh!

olleharstedt commented Dec 18, 2022

Uh oh!

cmb69 commented Dec 19, 2022

Uh oh!

Girgias left a comment

Choose a reason for hiding this comment

Uh oh!

iluuu1994 commented Jan 26, 2023

Uh oh!

Girgias commented Jan 26, 2023

Uh oh!

cmb69 commented Jan 26, 2023

Uh oh!

iluuu1994 commented Jan 26, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cmb69 commented Jan 26, 2023

Uh oh!

bwoebi commented Jan 26, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dseguy commented Jan 26, 2023

Uh oh!

iluuu1994 commented Feb 2, 2023

Uh oh!

Uh oh!

iluuu1994 commented Jan 26, 2023 •

edited

Loading

bwoebi commented Jan 26, 2023 •

edited

Loading