Skip to content

Parse Oniguruma callout and absent function syntax #129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 26, 2022

Conversation

hamishknight
Copy link
Contributor

Parse both variants of Oniguruma callout syntax (*name[tag]{args, ...}) and (?{...}[]), as well as Oniguruma absent functions (?~|). In addition, diagnose ending )s that don't close any groups.

@hamishknight
Copy link
Contributor Author

@swift-ci please test Linux

Copy link
Member

@milseman milseman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM

@@ -185,6 +188,43 @@ extension AST {
}
}

public struct AbsentFunction: Hashable, _ASTNode {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know Oniguruma uses this "absent function" name, but is there a better or more descriptive name? What does this do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, its primary role it to match against an inverse of a pattern (though the .expression variant also specifies what it can match). So maybe something like MatchInverse or Exclusion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. This is Oniguruma's approach to the regex-inversion problem (i.e. inverting each component of a concatenation is surprising and not the same as inverting the regex). I'm fine keeping this name with a comment for now, as a term-of-art or something to be renamed later.

@rxwei any thoughts on how to model pattern inversions in the DSL?

) -> Located<AST.AbsentFunction.Start>? {
recordLoc { src in
if src.tryEat(sequence: "(?~|") { return .withPipe }
if src.tryEat(sequence: "(?~") { return .withoutPipe }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a more descriptive name than "with(out) pipe"?

Copy link
Contributor Author

@hamishknight hamishknight Jan 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean withoutPipe is only used for the repeater syntax, so we could call it repeater? And withPipe could maybe be normal or something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Oniguruma calls them repeater and stopper, which sounds good enough to me for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this is just for the opening syntax, (?~| is not just for absent stoppers, but also for absent expressions and clearers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left the naming as-is for now, but am happy to change in a follow-up

Previously we just silently stopped parsing at
these due to them being a termination condition
for recursive parsing. Pick them up and diagnose
when not doing a recursive parse.
Parse named and 'of-contents' Oniguruma callouts.
This requires generalizing the group name handling
to be for arbitrary identifiers, which may have a
specific kind for diagnostics.
Parse the 4 varieties of absent function syntax
supported by Oniguruma.
@hamishknight
Copy link
Contributor Author

@swift-ci please test Linux

@hamishknight hamishknight merged commit fbdc959 into swiftlang:main Jan 26, 2022
@hamishknight hamishknight deleted the oniguruma-calling branch January 26, 2022 18:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants