Documentation/Evolution/RegexSyntax.md
+19 -18 (19 additions & 18 deletions)
@@ -83,7 +83,7 @@ A regex literal may be prefixed with a sequence of [global matching options](#pc
83
83
84
84
Alternatives are a series of expressions concatenated together. The concatentation ends with either a `|` denoting the end of the alternative or a `)` denoting the end of a recursively parsed group.
85
85
86
-
Alternation has a lower precedence than concatenation or other operations, so e.g `abc|def` matches against `abc` or `def`..
86
+
Alternation has a lower precedence than concatenation or other operations, so e.g `abc|def` matches against `abc` or `def`.
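
The precedence rule stated in this line can be checked with a short sketch. It uses Python's `re` module as a stand-in engine, which is an assumption made for illustration (the proposal targets Swift regex); the helper `fully_matches` and the constant names are hypothetical.

```python
import re

# `abc|def` parses as (abc)|(def): alternation binds more loosely
# than concatenation.
ALTERNATION = re.compile(r"abc|def")

# To alternate over a single character, explicit grouping is required.
GROUPED = re.compile(r"ab(?:c|d)ef")


def fully_matches(pattern, text):
    """Return True when `pattern` matches the whole of `text`."""
    return pattern.fullmatch(text) is not None


for candidate in ("abc", "def", "abdef"):
    print(candidate, fully_matches(ALTERNATION, candidate), fully_matches(GROUPED, candidate))
```

If alternation bound more tightly than concatenation, `abc|def` would accept `abdef`; under the rule above it does not, and only the explicitly grouped pattern does.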
87
87
88
88
### Concatenated subexpressions
89
89
@@ -130,9 +130,9 @@ Subexpressions can be quantified, meaning they will be repeated some number of t
130
130
131
131
Behavior can further be refined by a subsequent `?` or `+`:
132
132
133
-
-`x*`_eager_: consume as much of input as possible
134
-
-`x*?`_reluctant_: consume as little of the input as possible
135
-
-`x*+`: _possessive_: eager and never relinquishes any input consumed
133
+
-`x*`_eager_: consume as much of input as possible.
134
+
-`x*?`_reluctant_: consume as little of the input as possible.
135
+
-`x*+`: _possessive_: eager and never relinquishes any input consumed.
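
A rough sketch of the three quantifier behaviors listed above, again using Python's `re` module as a stand-in engine (an assumption for illustration, not the Swift implementation). Native possessive syntax (`x*+`) is only available in newer Python versions, so the possessive case is emulated here with the lookahead-plus-backreference idiom `(?=(a*))\1`; that emulation is a swapped-in technique and is not taken from the proposal.

```python
import re

TEXT = "<a><b>"

# Eager: `.*` consumes as much input as possible, then backtracks
# only far enough to let the final `>` match.
eager = re.match(r"<.*>", TEXT).group()

# Reluctant: `.*?` consumes as little input as possible.
reluctant = re.match(r"<.*?>", TEXT).group()

# An eager quantifier relinquishes input so the trailing `a` can match.
backtracking = re.fullmatch(r"a*a", "aaa")

# Possessive-like: the lookahead captures the longest run of `a` and the
# backreference consumes it without ever giving any of it back, so the
# trailing `a` has nothing left to match.
possessive_like = re.fullmatch(r"(?=(a*))\1a", "aaa")

print(eager, reluctant, backtracking is not None, possessive_like is not None)
```

The last two patterns differ only in whether the quantified part may give input back, which is the distinction between eager and possessive matching.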
136
136
137
137
### Atoms
138
138
@@ -187,7 +187,7 @@ These escape sequences each denote a specific scalar value.
187
187
-`\f`: The form-feed character `U+C`.
188
188
-`\n`: The newline character `U+A`.
189
189
-`\r`: The carriage return character `U+D`.
190
-
-`\t`: The tab character `U+9`
190
+
-`\t`: The tab character `U+9`.
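
As a quick check of the scalar values listed above, the sketch below confirms each escape against its code point. It relies on Python's `re` module, whose handling of these four escapes is assumed to agree with the proposal; the names `ESCAPES` and `matches_scalar` are hypothetical.

```python
import re

# Escape sequence (as written in a pattern) -> Unicode scalar value.
ESCAPES = {
    r"\f": 0x0C,  # form feed
    r"\n": 0x0A,  # newline
    r"\r": 0x0D,  # carriage return
    r"\t": 0x09,  # tab
}


def matches_scalar(escape, scalar):
    """Return True when the escape matches exactly the given scalar."""
    return re.fullmatch(escape, chr(scalar)) is not None


for escape, scalar in ESCAPES.items():
    print(escape, f"U+{scalar:X}", matches_scalar(escape, scalar))
```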
191
191
192
192
#### Builtin character classes
193
193
@@ -218,12 +218,12 @@ Precise definitions of character classes is discussed in [Character Classes for
These sequences define a unicode scalar value using hexadecimal or octal notation
235
+
These sequences define a unicode scalar value using hexadecimal or octal notation.
236
236
237
237
`\x`, when not followed by any hexadecimal digit characters, is treated as `\0`, matching PCRE's behavior.
238
238
@@ -362,7 +362,7 @@ A script run e.g `(*script_run:...)` specifies that the contents must match agai
362
362
BalancingGroupBody -> Identifier? '-' Identifier
363
363
```
364
364
365
-
Introduced by .NET, balancing groups extend the `GroupNameBody` syntax to support the ability to refer to a prior group. Upon matching, the prior group is deleted, and any intermediate matched input becomes the capture of the current group.
365
+
Introduced by .NET, [balancing groups][balancing-groups] extend the `GroupNameBody` syntax to support the ability to refer to a prior group. Upon matching, the prior group is deleted, and any intermediate matched input becomes the capture of the current group.
366
366
367
367
#### Group numbering
368
368
@@ -467,9 +467,9 @@ We support all the matching options accepted by PCRE, ICU, and Oniguruma. In add
467
467
468
468
These options are specific to the Swift regex matching engine and control the semantic level at which matching takes place.
A reference is an abstract identifier for a particular capturing group in a regular expression. It can either be named or numbered, and in the latter case may be specified relative to the current group. For example `-2` refers to the capture group `N - 2` where `N` is the number of the next capture group. References may refer to groups ahead of the current position e.g `+3`, or the name of a future group. These may be useful in recursive cases where the group being referenced has been matched in a prior iteration.
484
484
485
-
A backreference may optionally include a recursion level in certain cases, which is a syntactic element inherited from Oniguruma that allows the reference to specify a capture relative to a given recursion level.
485
+
A backreference may optionally include a recursion level in certain cases, which is a syntactic element inherited [from Oniguruma][oniguruma-syntax] that allows the reference to specify a capture relative to a given recursion level.
An absent function is an Oniguruma feature that allows for the easy inversion of a given pattern. There are 4 variants of the syntax:
642
+
An absent function is an [Oniguruma][oniguruma-syntax] feature that allows for the easy inversion of a given pattern. There are 4 variants of the syntax:
643
643
644
644
-`(?~|absent|expr)`: Absent expression, which attempts to match against `expr`, but is limited by the range that is not matched by `absent`.
645
645
-`(?~absent)`: Absent repeater, which matches against any input not matched by `absent`. Equivalent to `(?~|absent|\O*)`.
@@ -848,3 +848,4 @@ Note that this proposal regards _syntactic_ support, and does not necessarily me