From 9cdd0fac10438dac4e5adcfce9fa789363db12cc Mon Sep 17 00:00:00 2001 From: Richard Wei Date: Wed, 16 Mar 2022 16:44:12 -0700 Subject: [PATCH 1/3] DSL pitch minor fixes. --- Documentation/Evolution/RegexBuilderDSL.md | 13 +++---------- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/Documentation/Evolution/RegexBuilderDSL.md b/Documentation/Evolution/RegexBuilderDSL.md index 00776f116..91e7c686a 100644 --- a/Documentation/Evolution/RegexBuilderDSL.md +++ b/Documentation/Evolution/RegexBuilderDSL.md @@ -1,17 +1,10 @@ # Regex builder DSL * Proposal: [SE-NNNN](NNNN-filename.md) -* Authors: [Richard Wei](https://github.com/rxwei), ... +* Authors: [Richard Wei](https://github.com/rxwei) * Review Manager: TBD -* Status: **Awaiting implementation** - -*During the review process, add the following fields as needed:* - -* Implementation: [apple/swift#NNNNN](https://github.com/apple/swift/pull/NNNNN) or [apple/swift-evolution-staging#NNNNN](https://github.com/apple/swift-evolution-staging/pull/NNNNN) -* Decision Notes: [Rationale](https://forums.swift.org/), [Additional Commentary](https://forums.swift.org/) -* Bugs: [SR-NNNN](https://bugs.swift.org/browse/SR-NNNN), [SR-MMMM](https://bugs.swift.org/browse/SR-MMMM) -* Previous Revision: [1](https://github.com/apple/swift-evolution/blob/...commit-ID.../proposals/NNNN-filename.md) -* Previous Proposal: [SE-XXXX](XXXX-filename.md** +* Implementation: [apple/swift-experimental-string-processing](https://github.com/apple/swift-experimental-string-processing/tree/main/Sources/_StringProcessing/RegexDSL) +* Status: **Pitch** **Table of Contents** - [Introduction](#introduction) From 872633c20f97fe8c1d470da42209b760786d088e Mon Sep 17 00:00:00 2001 From: Richard Wei Date: Wed, 16 Mar 2022 16:58:48 -0700 Subject: [PATCH 2/3] Fix typos. 
--- Documentation/Evolution/RegexBuilderDSL.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/Evolution/RegexBuilderDSL.md b/Documentation/Evolution/RegexBuilderDSL.md index 91e7c686a..5880701c5 100644 --- a/Documentation/Evolution/RegexBuilderDSL.md +++ b/Documentation/Evolution/RegexBuilderDSL.md @@ -1231,7 +1231,7 @@ The proposed feature relies heavily upon overloads of `buildBlock` and `buildPar ## Alternatives considered -### Operators for quantification and alterantion +### Operators for quantification and alternation While `ChoiceOf` and quantifier functions provide a general way of creating alternations and quantifications, we recognize that some synctactic sugar can be useful for creating one-liners like in textual regexes, e.g. infix operator `|`, postfix operator `*`, etc. @@ -1366,7 +1366,7 @@ However, given that one-or-more (`+`), zero-or-more (`*`) and optional (`?`) are One could argue that type such as `OneOrMore` could be defined as a top-level function that returns `Regex`. While it is entirely possible to do so, it would lose the name scoping benefits of a type and pollute the top-level namespace with `O(arity^2)` overloads of quantifiers, `capture`, `tryCapture`, etc. This could be detrimental to the usefulness of code completion. -Another reason to use types instead of free functions is consistency with existing result-buidler-based DSLs such as SwiftUI. +Another reason to use types instead of free functions is consistency with existing result-builder-based DSLs such as SwiftUI. 
[Declarative String Processing]: https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/DeclarativeStringProcessing.md [Strongly Typed Regex Captures]: https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/Evolution/StronglyTypedCaptures.md From 4ba9fd38895dce43b78edbed3cc5432ffc70c499 Mon Sep 17 00:00:00 2001 From: Richard Wei Date: Wed, 16 Mar 2022 17:50:10 -0700 Subject: [PATCH 3/3] DSL pitch fixes. --- Documentation/Evolution/RegexBuilderDSL.md | 162 +++++++++++++++------ 1 file changed, 115 insertions(+), 47 deletions(-) diff --git a/Documentation/Evolution/RegexBuilderDSL.md b/Documentation/Evolution/RegexBuilderDSL.md index 5880701c5..3b6eaedc7 100644 --- a/Documentation/Evolution/RegexBuilderDSL.md +++ b/Documentation/Evolution/RegexBuilderDSL.md @@ -333,36 +333,43 @@ public enum RegexComponentBuilder { line: Int = #line, column: Int = #column ) -> Component - - /// Provides support for “if” statements in multi-statement closures, producing - /// conditional content for the “then” branch. - public static func buildEither( - first component: Component - ) -> Regex { - component - } - - /// Provides support for “if-else” statements in multi-statement closures, - /// producing conditional content for the “else” branch. - public static func buildEither( - second component: Component - ) -> Regex { - component - } } ``` When it comes to concatenation, `RegexComponentBuilder` utilizes the [recently proposed `buildPartialBlock` feature](https://forums.swift.org/t/pitch-buildpartialblock-for-result-builders/55561/1) to be able to concatenate all components' capture types to a single result tuple. `buildPartialBlock(first:)` provides support for creating a regex from a single component, and `buildPartialBlock(accumulated:next:)` support for creating a regex from multiple results. 
-Before Swift supports variadic generics, `buildPartialBlock(accumulated:next:)` must be overloaded to support concatenating regexes of supported capture quantities (arities). Due to the need for concatenating any pair of regexes that make up 10 captures, `buildPartialBlock(accumulated:next:)` is overloaded up to `arity^2` times. +Before Swift supports variadic generics, `buildPartialBlock(first:)` and `buildPartialBlock(accumulated:next:)` must be overloaded to support concatenating regexes of supported capture quantities (arities). +- `buildPartialBlock(first:)` is overloaded `arity` times such that a unary block with a component of any supported capture arity will produce a regex with capture type `Substring` followed by the component's capture types. The base overload, `buildPartialBlock(first:) -> Regex`, must be marked with `@_disfavoredOverload` to prevent it from shadowing other overloads. +- `buildPartialBlock(accumulated:next:)` is overloaded up to `arity^2` times to account for all possible pairs of regexes that make up 10 captures. -In the initial version of the DSL, we plan to support regexes with up to 10 captures, as 10 captures are sufficient for most use cases. These overloads can be superceded by a variadic version of `buildPartialBlock(accumulated:next:)` in a future release. +In the initial version of the DSL, we plan to support regexes with up to 10 captures, as 10 captures are sufficient for most use cases. These overloads can be superseded by variadic versions of `buildPartialBlock(first:)` and `buildPartialBlock(accumulated:next:)` in a future release. ```swift extension RegexComponentBuilder { + // The following builder methods implement what would be possible with + // variadic generics (using imaginary syntax) as a single method: + // + // public static func buildPartialBlock< + // R, WholeMatch, Capture... 
+ // >( + // first component: Component + // ) -> Regex<(Substring, Capture...)> + // where Component.Output == (WholeMatch, Capture...), + + @_disfavoredOverload public static func buildPartialBlock( - first r: Compoment - ) -> Regex + first r: Component + ) -> Regex + + public static func buildPartialBlock( + first r: Component + ) -> Regex<(Substring, C0)> where R.Output == (W, C0) + + public static func buildPartialBlock( + first r: Component + ) -> Regex<(Substring, C0, C1)> where R.Output == (W, C0, C1) + + // ... `O(arity)` overloads of `buildPartialBlock(first:)` // The following builder methods implement what would be possible with // variadic generics (using imaginary syntax) as a single method: @@ -372,18 +379,18 @@ extension RegexComponentBuilder { // AccumulatedCapture..., NextCapture..., // Accumulated: RegexComponent, Next: RegexComponent // >( - // accumulated: Accumulated, next: Next + // accumulated: Accumulated, next: Component // ) -> Regex<(Substring, AccumulatedCapture..., NextCapture...)> // where Accumulated.Output == (AccumulatedWholeMatch, AccumulatedCapture...), // Next.Output == (NextWholeMatch, NextCapture...) public static func buildPartialBlock( accumulated: R0, next: Component - ) -> Regex<(Substring, C0)> where R0.Output == W0, R1.Output == (W1, C0) + ) -> Regex<(Substring, C0)> where R0.Output == W0, R1.Output == (W1, C0) public static func buildPartialBlock( accumulated: R0, next: Component - ) -> Regex<(Substring, C0, C1)> where R0.Output == W0, R1.Output == (W1, C0, C1) + ) -> Regex<(Substring, C0, C1)> where R0.Output == W0, R1.Output == (W1, C0, C1) public static func buildPartialBlock( accumulated: R0, next: Component @@ -393,10 +400,68 @@ extension RegexComponentBuilder { } ``` -To support `if` statements, `buildOptional(_:)` is defined with overloads to support up to 10 captures because each capture type needs to be transformed to an optional. 
The overload for non-capturing regexes, due to the lack of generic constraints, must be annotated with `@_disfavoredOverload` in order not to become the default choice by the compiler. We expect that a variadic-generic version of this method will eventually superceded all of these overloads. +To support `if` statements, `buildEither(first:)`, `buildEither(second:)` and `buildOptional(_:)` are defined with overloads to support up to 10 captures because each capture type needs to be transformed to an optional. The overload for non-capturing regexes, due to the lack of generic constraints, must be annotated with `@_disfavoredOverload` in order not to shadow other overloads. We expect that variadic-generic versions of these methods will eventually supersede all of these overloads. ```swift extension RegexComponentBuilder { + // The following builder methods implement what would be possible with + // variadic generics (using imaginary syntax) as a single method: + // + // public static func buildEither< + // Component, WholeMatch, Capture... + // >( + // first component: Component + // ) -> Regex<(Substring, Capture...)> + // where Component.Output == (WholeMatch, Capture...) + + public static func buildEither( + first component: Component + ) -> Regex { + component + } + + public static func buildEither( + first component: Component + ) -> Regex<(Substring, C0)> where R.Output == (W, C0) { + component + } + + public static func buildEither( + first component: Component + ) -> Regex<(Substring, C0, C1)> where R.Output == (W, C0, C1) { + component + } + + // The following builder methods implement what would be possible with + // variadic generics (using imaginary syntax) as a single method: + // + // public static func buildEither< + // Component, WholeMatch, Capture... + // >( + // second component: Component + // ) -> Regex<(Substring, Capture...)> + // where Component.Output == (WholeMatch, Capture...) 
+ + public static func buildEither( + second component: Component + ) -> Regex { + component + } + + public static func buildEither( + second component: Component + ) -> Regex<(Substring, C0)> where R.Output == (W, C0) { + component + } + + public static func buildEither( + second component: Component + ) -> Regex<(Substring, C0, C1)> where R.Output == (W, C0, C1) { + component + } + + // ... `O(arity)` overloads of `buildEither(_:)` + // The following builder methods implement what would be possible with // variadic generics (using imaginary syntax) as a single method: // @@ -420,10 +485,6 @@ extension RegexComponentBuilder { ) -> Regex<(Substring, C0?, C1?)> // ... `O(arity)` overloads of `buildOptional(_:)` - - public static func buildOptional( - _ component: Component? - ) -> Regex<(Substring, C0?, C1?, C2?, C3?, C4?, C5?, C6?, C7?, C8, C9?)> where R.Output == (W, C0, C1, C2, C3, C4, C5, C6, C7, C8, C9) } ``` @@ -454,10 +515,6 @@ extension RegexComponentBuilder { ) -> Regex<(Substring, C0?, C1?)> // ... `O(arity)` overloads of `buildLimitedAvailability(_:)` - - public static func buildLimitedAvailability( - _ component: Component - ) -> Regex<(Substring, C0?, C1?, C2?, C3?, C4?, C5?, C6?, C7?, C8, C9?)> where R.Output == (W, C0, C1, C2, C3, C4, C5, C6, C7, C8, C9) } ``` @@ -498,15 +555,7 @@ public struct ChoiceOf: RegexComponent { ```swift @resultBuilder public enum AlternationBuilder { - /// A builder component that stores a regex component and its source location - /// for debugging purposes. - public struct Component { - public var value: Value - public var file: String - public var function: String - public var line: Int - public var column: Int - } + public typealias Component = RegexComponentBuilder.Component /// Returns a component by wrapping the component regex in `Component` and /// recording its source location. 
@@ -518,9 +567,28 @@ public enum AlternationBuilder { column: Int = #column ) -> Component + // The following builder methods implement what would be possible with + // variadic generics (using imaginary syntax) as a single method: + // + // public static func buildPartialBlock< + // R, WholeMatch, Capture... + // >( + // first component: Component + // ) -> Regex<(Substring, Capture?...)> + // where Component.Output == (WholeMatch, Capture...), + + @_disfavoredOverload public static func buildPartialBlock( - first: Component - ) -> Regex + first r: Component + ) -> Regex + + public static func buildPartialBlock( + first r: Component + ) -> Regex<(Substring, C0?)> where R.Output == (W, C0) + + public static func buildPartialBlock( + first r: Component + ) -> Regex<(Substring, C0?, C1?)> where R.Output == (W, C0, C1) // The following builder methods implement what would be possible with // variadic generics (using imaginary syntax) as a single method: @@ -970,7 +1038,7 @@ extension Repeat { _ behavior: QuantificationBehavior = .eagerly ) where Output == (Substring, C0), - Compoment.Output == (Substring, C0), + Component.Output == (Substring, C0), R.Bound == Int public init( @@ -979,7 +1047,7 @@ extension Repeat { @RegexComponentBuilder _ component: () -> Component ) where Output == (Substring, C0), - Compoment.Output == (Substring, C0), + Component.Output == (Substring, C0), R.Bound == Int public init( @@ -1118,7 +1186,7 @@ let regex = Regex { } ``` -Variants of `capture` and `tryCapture` accept a `Reference` argument. References can be used to achieve named captures and named backreferences from textual regexes. +Variants of `Capture` and `TryCapture` accept a `Reference` argument. References can be used to achieve named captures and named backreferences from textual regexes. 
```swift public struct Reference: RegexComponent { @@ -1153,7 +1221,7 @@ A regex is considered invalid when it contains a use of reference without it eve In textual regex, one can refer to a subpattern to avoid duplicating the subpattern, for example: ``` -(you|I) say (goodbye|hello); (?0) say (?1) +(you|I) say (goodbye|hello); (?1) say (?2) ``` The above regex is equivalent to