-
Notifications
You must be signed in to change notification settings - Fork 50
Parse Oniguruma callout and absent function syntax #129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci please test Linux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM
@@ -185,6 +188,43 @@ extension AST { | |||
} | |||
} | |||
|
|||
public struct AbsentFunction: Hashable, _ASTNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know Oniguruma uses this "absent function" name, but is there a better or more descriptive name? What does this do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, its primary role it to match against an inverse of a pattern (though the .expression
variant also specifies what it can match). So maybe something like MatchInverse
or Exclusion
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. This is Oniguruma's approach to the regex-inversion problem (i.e. inverting each component of a concatenation is surprising and not the same as inverting the regex). I'm fine keeping this name with a comment for now, as a term-of-art or something to be renamed later.
@rxwei any thoughts on how to model pattern inversions in the DSL?
) -> Located<AST.AbsentFunction.Start>? { | ||
recordLoc { src in | ||
if src.tryEat(sequence: "(?~|") { return .withPipe } | ||
if src.tryEat(sequence: "(?~") { return .withoutPipe } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a more descriptive name than "with(out) pipe"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean withoutPipe
is only used for the repeater syntax, so we could call it repeater
? And withPipe
could maybe be normal
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Oniguruma calls them repeater
and stopper
, which sounds good enough to me for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this is just for the opening syntax, (?~|
is not just for absent stoppers, but also for absent expressions and clearers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left the naming as-is for now, but am happy to change in a follow-up
Previously we just silently stopped parsing at these due to them being a termination condition for recursive parsing. Pick them up and diagnose when not doing a recursive parse.
Parse named and 'of-contents' Oniguruma callouts. This requires generalizing the group name handling to be for arbitrary identifiers, which may have a specific kind for diagnostics.
Parse the 4 varieties of absent function syntax supported by Oniguruma.
fdd98c6
to
f6a0240
Compare
@swift-ci please test Linux |
Parse both variants of Oniguruma callout syntax
(*name[tag]{args, ...})
and(?{...}[])
, as well as Oniguruma absent functions(?~|)
. In addition, diagnose ending)
s that don't close any groups.