Skip to content

Parse global matching options #133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 28, 2022
Merged

Conversation

hamishknight
Copy link
Contributor

Parse the PCRE global matching options e.g (*CR), (*UTF), and (*LIMIT_DEPTH=d). Such options may only appear at the very start of the regex, and as such are stored on the top-level AST type.

@hamishknight
Copy link
Contributor Author

@swift-ci please test Linux

Copy link
Member

@milseman milseman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. I think it's worth thinking (briefly and in a follow-up PR!) about what the pretty-print APIs should be, but this seems fine as it is.

@@ -34,7 +34,7 @@ extension AST.Node {
) -> String {
var printer = PrettyPrinter()
printer.printAsCanonical(
self,
.init(self, globalOptions: nil),
delimiters: delimiters,
terminateLine: terminateLine)
return printer.finish()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intuition is that we then want to hoist this to AST instead of on nodes. The render which takes various options makes sense on the AST so it picks up global options, etc, and the recursive logic is on node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an overload for AST too, this overload is mainly for convenience e.g a few call sites doing throw unsupported(node.renderAsCanonical()). Though we could reduce the duplication here by having it call into the other overload, let me change that.

guard ast.root == expectedAST
|| ast.root._dump() == expectedAST._dump() // EQ workaround
guard ast == expectedAST
|| ast._dump() == expectedAST._dump() // EQ workaround
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, this is still here! Any chance of adding a bool parameter for something like "ignore trivia", which we can approximate with the dump check? IIRC, this was mostly for trivia like non-semantic whitespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is awful, IIRC it's done to ignore both trivia and source location info. We could add a flag, though that wouldn't be sufficient to handle the character class trivia stripping which happens in _dumpBase in #136. Really we ought to strip trivia from the entire tree prior to dumping (or if we strip enough, just compare with ==), or otherwise have a parent type (like PrettyPrinter) that manages the dumping logic as it walks the tree and can store an ignoreTrivia flag. I can look into that in a follow-up.

Parse the PCRE global matching options e.g `(*CR)`,
`(*UTF)`, and `(*LIMIT_DEPTH=d)`. Such options may
only appear at the very start of the regex, and as
such are stored on the top-level AST type.
@hamishknight
Copy link
Contributor Author

@swift-ci please test Linux

@hamishknight hamishknight merged commit cf2d910 into swiftlang:main Jan 28, 2022
@hamishknight hamishknight deleted the round-again branch January 28, 2022 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants