Replace DynamicCaptures with AnyRegexOutput. #222

Merged: 2 commits, Mar 21, 2022

rxwei (Contributor) commented Mar 21, 2022

AnyRegexOutput is a collection type that represents a type-erased regex output with two use cases:

  • When a Regex is initialized from a string, e.g. Regex("(a)|(b)|c"), it has type Regex<AnyRegexOutput>. One can iterate over the output to access output elements (the whole match followed by any captures).
  • When working with an existing code base that creates Regex values dynamically from strings, one can type-erase the match result of a strongly typed regex, e.g. Regex<(Substring, Substring, Substring)>.Match, and use it as a drop-in replacement for the dynamic regex match.
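The two use cases above can be sketched roughly as follows. This is an illustration written against the regex API as it later shipped in Swift 5.7, which may differ in detail from the API at the time of this PR; the variable names are made up for the example.

```swift
// Assumes a Swift 5.7+ toolchain (macOS 13+ on Apple platforms).

// Use case 1: a pattern supplied as a string yields `Regex<AnyRegexOutput>`.
let dynamicRegex = try! Regex("(a)|(b)|c")
let dynamicMatch = "b".firstMatch(of: dynamicRegex)!

// Iterate the output: element 0 is the whole match, followed by one element
// per capture group. A group that did not participate has a nil substring.
let pieces: [String?] = dynamicMatch.output.map { element in
  element.substring.map { String($0) }
}
for (index, piece) in pieces.enumerated() {
  print(index, piece ?? "(did not participate)")
}

// Use case 2: type-erase the match of a statically typed regex so it can be
// handled by the same code path as the dynamic match above.
let typedRegex = #/(a)|(b)|c/#
let typedMatch = "b".firstMatch(of: typedRegex)!
let erased = AnyRegexOutput(typedMatch)
print(erased.count)
```

Code that already walks an `AnyRegexOutput` element by element does not need to know whether the match came from a string pattern or from a typed regex.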

This has been pitched as part of the [regex type](https://forums.swift.org/t/pitch-regex-type-and-overview/56029).


Example:

```swift
let regex = try! Regex(#"([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s+;\s+(\w+).*"#)
let line = """
  A6F0..A6F1    ; Extend # Mn   [2] BAMUM COMBINING MARK KOQNDON..BAMUM \
  COMBINING MARK TUKWENTIS
  """
let match = line.firstMatch(of: regex)!
let output = match.output // `AnyRegexOutput`
output.0            // => (the entire match)
output[0].bounds    // => (the bounds of the entire match)
output[0].substring // => (the entire match)
output[1].substring // => "A6F0"
output[2].substring // => "A6F1"
output[3].substring // => "Extend"
```

rxwei added 2 commits March 20, 2022 23:04
- Rename the `Match` associatedtype in `RegexComponent` to `Output`.
- Rename `MatchResult` to `Regex.Match`.

The new names have been pitched as part of the [regex type](https://forums.swift.org/t/pitch-regex-type-and-overview/56029) and the [regex builder DSL](https://forums.swift.org/t/pitch-regex-builder-dsl/56007).
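As a rough illustration of how the renamed types read at a call site, the sketch below spells out `Regex<Output>.Match` explicitly and reads its `output`. It is written against the API as it later shipped in Swift 5.7 and is not taken from this PR; the pattern and variable names are invented for the example.

```swift
// Assumes a Swift 5.7+ toolchain (macOS 13+ on Apple platforms).

// A typed regex literal: the output is a tuple of the whole match and captures.
let pair = #/(\w+)=(\w+)/#

// The match type is nested inside `Regex`, parameterized by the output type.
let pairMatch: Regex<(Substring, Substring, Substring)>.Match =
  "key=value".firstMatch(of: pair)!

// `output` carries the whole match followed by each capture.
let (whole, key, value) = pairMatch.output
print(whole, key, value)
```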
rxwei requested a review from milseman on Mar 21, 2022, 06:21
rxwei (Contributor Author) commented Mar 21, 2022

Please review commit b5dd0ef, since this PR is built on top of another PR.

rxwei (Contributor Author) commented Mar 21, 2022

@swift-ci please test

milseman (Member) left a comment

LGTM, thanks for tackling this!

```swift
fileprivate struct ElementRepresentation {
  /// The depth of `Optional`s wrapping the underlying value. For example,
  /// `Substring` has optional depth `0`, and `Int??` has optional depth `2`.
  let optionalDepth: Int
  // ...
}
```
milseman (Member) commented:

Unrelated to this PR, did we decide on an optional nesting story?

rxwei (Contributor Author) replied:

There’s no non-factorial solution to optional flattening this year, unfortunately. So we’ll have nested optionals.
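To make the notion of optional depth from the `optionalDepth` doc comment concrete, here is a minimal sketch of one way to compute it for an arbitrary type. The protocol `OptionalDepthCounting` and the function `optionalDepth(of:)` are hypothetical names for illustration and are not how this PR implements it.

```swift
// Hypothetical helper protocol: only `Optional` conforms to it.
protocol OptionalDepthCounting {
  static var depth: Int { get }
}

extension Optional: OptionalDepthCounting {
  // One level for this `Optional`, plus the depth of whatever it wraps.
  static var depth: Int {
    1 + ((Wrapped.self as? any OptionalDepthCounting.Type)?.depth ?? 0)
  }
}

/// Returns how many `Optional` layers wrap the underlying type.
func optionalDepth(of type: Any.Type) -> Int {
  (type as? any OptionalDepthCounting.Type)?.depth ?? 0
}

print(optionalDepth(of: Substring.self))                     // 0
print(optionalDepth(of: Optional<Optional<Int>>.self))       // 2, i.e. `Int??`
```

With nested optionals kept rather than flattened, a depth like this tells the consumer how many layers to unwrap before reaching the captured value.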

rxwei merged commit f00bfe0 into swiftlang:main on Mar 21, 2022
rxwei deleted the anyregexoutput branch on Mar 21, 2022, 17:08