Skip to content

Port incremental parsing ability to CodeBlockItem #1685

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 11, 2023

Conversation

StevenWong12
Copy link
Contributor

@StevenWong12 StevenWong12 commented May 21, 2023

I want to make this pr as a start effort to port incremental parsing ability to SwiftParser.

@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch 2 times, most recently from 69f7cf2 to 7a19dca Compare June 3, 2023 07:33
@StevenWong12
Copy link
Contributor Author

StevenWong12 commented Jun 3, 2023

To prevent incremental parse from missing the legal parts of re-used nodes ahead, I made following changes:

  • Add a property firstPotentialTokenChoices to SyntaxKind to represent the potential introducer for each SyntaxKind.
  • Add a property syntaxLayout to SyntaxKind to represent the layout of nodes in the form of SyntaxKind.
  • Add a property nextPotentialTokenChoices to RawSyntax to compute the next potential token of a certain node when it is initialized, which is essentially the next expected optional child of a node, and will be located by the last non-nil child. This mechanism is supported by syntaxLayout and firstPotentialTokenChoices.
  • Judge whether a node can be reused according to whether nextPotentialTokenChoices contains lookahead.currentToken.

This is still WIP but I am open to feedback.

@StevenWong12
Copy link
Contributor Author

StevenWong12 commented Jun 4, 2023

I think this pr is ready for review.

I will squash the commits when there aren't any big problems with the design.

@StevenWong12 StevenWong12 marked this pull request as ready for review June 4, 2023 07:32
@StevenWong12 StevenWong12 requested a review from ahoppen as a code owner June 4, 2023 07:32
@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch from 3ffcc72 to 0055ebe Compare June 11, 2023 15:23
@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch from 70b5a43 to 33f1a29 Compare June 19, 2023 09:13
fileprivate var edits: ConcurrentEdits?
fileprivate var reusedDelegate: IncrementalParseReusedNodeDelegate?

public var furthestOffset: Int?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split IncrementalParseTransition into two types as discussed offline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add a IncrementalParseNodeAffectRangeCollector to collect this and make it as a trigger to enable incremental parse.

Comment on lines +53 to +75
func getOffsetToStart(_ token: Lexer.Lexeme) -> Int {
return self.sourceBufferStart.distance(to: token.cursor)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not part of this PR: Can we use getOffsetToStart instead of offset(of:).

@@ -27,22 +27,30 @@ extension Parser {
/// i.e. how far it looked ahead.
var tokensConsumed: Int = 0

var parseTransition: IncrementalParseTransition?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, we could store furthestOffset in a standalone struct and pass that over here as an UnsafeMutablePointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have to record the offset in both Parser.peek() and Lookahead.consumeAnyToken(), I make this pointer a member of LexemeSequence.

Comment on lines 45 to 51
_ = Parser.parse(source: buf.bindMemory(to: UInt8.self))
previousTreeDict[file] = Parser.parse(source: buf.bindMemory(to: UInt8.self), parseTransition: parseTransition)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a flag to decide whether we want to do incremental parsing on subsequent iterations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a flag about this, but I'm not quite sure whether the code is still a bit hacky.

@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch from b0cf0d5 to 92d7ceb Compare June 25, 2023 13:01
@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch from 92d7ceb to accde01 Compare June 26, 2023 16:13
@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch from 5edfc14 to 82c73e5 Compare July 3, 2023 03:51
@StevenWong12
Copy link
Contributor Author

I add two Parser.parse methods which accept IncrementalParseTransition and return (tree: SourceFileSyntax, nodeAffectRangeCollector: IncrementalParseNodeAffectRangeCollector) as the incremental parse entries. Then we could use the nodeAffectRangeCollector returned by last parse to initialize IncrementalParseTransition. What do you think about it?

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very close now. I mostly have comments about naming and doc comments.

@@ -20,26 +20,39 @@
/// This is also used for testing purposes to ensure incremental reparsing
/// worked as expected.
public protocol IncrementalParseReusedNodeDelegate {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not part of this PR but I think we should consider change IncrementalParseReusedNodeDelegate to just be a callback function on IncrementalParseTransition, i.e. removing this protocol altogether. That’s more swifty than the Objective-C delegate pattern.

@@ -237,6 +262,7 @@ public struct Parser {
extension Parser {
/// Retrieves the token following the current token without consuming it.
func peek() -> Lexer.Lexeme {
lexemes.recordFurthestOffset()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just something that crossed my mind: Can’t we just call recordFurthestOffset from LexemeSequences.advance and LexemeSequence.peek? That way we can’t forget to call it from any Parser or Lookahead functions that interact with LexemeSequence. Or is there a performance regression associated with that?

Copy link
Contributor Author

@StevenWong12 StevenWong12 Jul 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't do that because I was trap into some weird bugs and fail the tests then. And it turned out that it was because I called recordNextTokenInLookaheadTracker in the wrong time🥹.

Or is there a performance regression associated with that?

About the performance, see my comments below.

@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch 2 times, most recently from bc9be97 to 05ea6d0 Compare July 8, 2023 15:47
@StevenWong12
Copy link
Contributor Author

I re-run some performance tests at two points.

  1. About the SyntaxCursor, we actually get many benefits from it (obvious in 100-iterations run) and I've been mis-using it before.
image
  1. About calling recordNextTokenInLookaheadTracker in LexemeSequence. There aren't very much difference between calling in LexemeSequence and Lookahead.comsumeAnyToken/Parser.peek in terms of performance. And the difference is too small to be caught by instruments. I think calling them in LexemeSequence is more clear.
image

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better performance, I love it.

I think the only non-trivial question that remains is who’s responsible for calling recordNextTokenInLookaheadTracker, everything else is super minor now.

@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch from 05ea6d0 to fdcd570 Compare July 10, 2023 13:50
Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last comment to make a method private, otherwise this is looking great now 🕺🏽💃🏽.

Use `LookaheadTracker` to track the furthest offset we reached and use `LookaheadRanges` to record the offset as the source range where might affect the parse of a node. And we could use that information to judge whether a node can be reused
@StevenWong12 StevenWong12 force-pushed the enable_incremental_parsing branch from fdcd570 to 1e6406d Compare July 10, 2023 14:40
@ahoppen
Copy link
Member

ahoppen commented Jul 10, 2023

🚀

@swift-ci Please test

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@ahoppen
Copy link
Member

ahoppen commented Jul 11, 2023

swiftlang/swift-stress-tester#242

@swift-ci Please test macOS

@ahoppen
Copy link
Member

ahoppen commented Jul 11, 2023

swiftlang/swift-stress-tester#242

@swift-ci Please test Windows

@StevenWong12
Copy link
Contributor Author

Oops, the same problem as before. Maybe we need to re-trigger again🥲

@ahoppen
Copy link
Member

ahoppen commented Jul 11, 2023

swiftlang/swift-stress-tester#242

@swift-ci Please test macOS

1 similar comment
@ahoppen
Copy link
Member

ahoppen commented Jul 11, 2023

swiftlang/swift-stress-tester#242

@swift-ci Please test macOS

@ahoppen ahoppen merged commit 6507f71 into swiftlang:main Jul 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants