Support text segment boundary anchors #178

Merged
natecook1000 merged 6 commits into swiftlang:main from ahoy_grapheme_boundary
Mar 3, 2022

natecook1000
Member

This enables the `\y` and `\Y` anchors in regex literals and `Anchor.textSegmentBoundary` in the DSL.

Note: This also includes `UnicodeScalar` conformance to `RegexProtocol`, which acts like Unicode scalar literals in regex literals.

@natecook1000
Member Author

@swift-ci Please test Linux platform

@natecook1000
Member Author

@swift-ci Please test Linux platform

Member

@milseman milseman left a comment

Quick pass over.

@@ -96,12 +96,16 @@ extension Compiler.ByteCodeGen {
     }

   case .textSegment:
-    // This we should be able to do!
-    throw Unsupported(#"\y (text segment)"#)
+    builder.buildAssert { (input, pos, _) in
Member

Are these anchors' semantics dependent on options?

CC @hamishknight

Contributor

Yes, the `y{g}` and `y{w}` matching options switch these anchors between matching extended grapheme cluster boundaries and word boundaries, respectively (the former is the default). That sounds like what the TODO is referencing, though.
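
To illustrate the default (extended grapheme cluster) behavior described above, here is a minimal sketch that uses only the Swift standard library. `isGraphemeBoundary` is a hypothetical helper written for this example, not the assertion the PR adds to the bytecode generator. It treats a position as a boundary when it coincides with the start of a `Character` or with either end of the string; `\Y` would be the negation of this check. The word boundary mode selected by `y{w}` is not shown.

```swift
// Hypothetical helper for illustration only; not the PR's implementation.
// Reports whether `position` falls on an extended grapheme cluster boundary.
func isGraphemeBoundary(_ input: String, at position: String.Index) -> Bool {
    // The start and end of the input always count as boundaries.
    if position == input.startIndex || position == input.endIndex {
        return true
    }
    // `input.indices` yields the start index of every Character, and a
    // Character in Swift is an extended grapheme cluster.
    return input.indices.contains(position)
}

// "é" spelled as "e" followed by U+0301 COMBINING ACUTE ACCENT, then "a".
let text = "e\u{301}a"
let scalars = text.unicodeScalars

// Position between "e" and the combining accent: inside one grapheme cluster.
let afterE = scalars.index(after: scalars.startIndex)
// Position between the combining accent and "a": between two clusters.
let afterAccent = scalars.index(after: afterE)

print(isGraphemeBoundary(text, at: afterE))       // false
print(isGraphemeBoundary(text, at: afterAccent))  // true
```

Under these assumptions, a `\y` anchor would succeed at `afterAccent` and fail at `afterE`, while `\Y` would do the opposite.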

public typealias Match = Substring

public var regex: Regex<Match> {
  .init(ast: atom(.scalar(self)))
Member

We need to stop constructing ASTs in the DSL. Can you use DSLTree, and if not, let me know what's missing?

Member Author

Switched this and the other primitives to use DSLTree 👍🏻

@natecook1000
Member Author

@swift-ci Please test Linux platform

Member

@milseman milseman left a comment

LGTM

@milseman
Member

@swift-ci please test

natecook1000 and others added 2 commits March 2, 2022 00:14
Co-authored-by: Michael Ilseman <michael.ilseman@gmail.com>
Co-authored-by: Michael Ilseman <michael.ilseman@gmail.com>
@natecook1000
Member Author

@swift-ci Please test Linux platform

@natecook1000
Member Author

@swift-ci Please test Linux platform

@natecook1000 natecook1000 merged commit 7b26028 into swiftlang:main Mar 3, 2022
@natecook1000 natecook1000 deleted the ahoy_grapheme_boundary branch March 3, 2022 14:52
3 participants