Skip to content
This repository was archived by the owner on Jun 15, 2023. It is now read-only.

Do not parse past Eof #542

Merged
merged 3 commits into from
Jun 16, 2022
Merged

Do not parse past Eof #542

merged 3 commits into from
Jun 16, 2022

Conversation

cristianoc
Copy link
Contributor

Parsing past Eof can lead to infinite loops, as it violates the assumptions of the termination checker.
See: #540

This PR makes Parser.next assert false when called on Eof. This should not happen as one should check the token before calling Parser.next.

The one exception is during lookahead, for which we provide a nextUnsafe function which does not make progress.

@cristianoc
Copy link
Contributor Author

cristianoc commented Jun 14, 2022

This turns potential infinite loops into runtime exceptions when calling Parser.next.
I wad checking the code for exceptions, and realised there are a few parts where there is very nearly an infinite loop (it's an innocent-looking code change away). Which after this change manifests as a runtime exception.
In here #543 the exception analysis is used to make sure this does not happen.

@cristianoc
Copy link
Contributor Author

Given how noisy it would be to give static guarantees, one can shoot for detection. When look-ahead it taken into account, real code never goes 1 further than the first Eol. This could then reported back as a syntax error message rather than crashing the compiler/editor.

@cristianoc
Copy link
Contributor Author

Summarising: preventing infinite loops arising from cases where one keeps on reading Eof over and over again statically is a bit tricky right now.
But, it's fairly easy to detect when this happens. The current PR does just that: detection. This will only fire when there's another mistake in the implementation of the parser.
Currently, detection gives an assert false, but perhaps it makes sense to give dedicated error instead. Notice that recovery is not an option in this case (unless one were to use effect handlers in the implementation).

@cristianoc
Copy link
Contributor Author

At the same time the result of #543 can be used to see where the logic had to be modified, and try to use that to construct infinite loop repros.

Here is one:

let foo = x =>
  switch x {
  | `${

@cristianoc
Copy link
Contributor Author

Another case to investigate: https://github.com/rescript-lang/syntax/pull/543/files#r897400872

@cristianoc
Copy link
Contributor Author

Parsing past Eof can lead to infinite loops, as it violates the assumptions of the termination checker.
See: #540

This PR makes `Parser.next` assert false when called on Eof. This should not happen as one should check the token before calling `Parser.next`.

The one exception is during lookahead, for which we provide a `nextUnsafe` function which does not make progress.
@cristianoc
Copy link
Contributor Author

The last unclear case found during audit does not seem a real infinite loop.
So this should cover all cases considered.

@cristianoc
Copy link
Contributor Author

Added an issue to track better error messages: #550

@cristianoc cristianoc merged commit eeaa86f into master Jun 16, 2022
@cristianoc cristianoc deleted the eof branch June 16, 2022 00:04
cristianoc added a commit that referenced this pull request Jun 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant