Commit a206f99

Update RegexSyntax.md
1 parent bafe039 commit a206f99

1 file changed: +19 -18 lines changed

Documentation/Evolution/RegexSyntax.md

Lines changed: 19 additions & 18 deletions
@@ -83,7 +83,7 @@ A regex literal may be prefixed with a sequence of [global matching options](#pc

 Alternatives are a series of expressions concatenated together. The concatentation ends with either a `|` denoting the end of the alternative or a `)` denoting the end of a recursively parsed group.

-Alternation has a lower precedence than concatenation or other operations, so e.g `abc|def` matches against `abc` or `def`..
+Alternation has a lower precedence than concatenation or other operations, so e.g `abc|def` matches against `abc` or `def`.

 ### Concatenated subexpressions

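The precedence rule in this hunk can be checked with a small sketch. It uses Python's `re` module as a stand-in (an assumption made for illustration; the Swift regex engine is not exercised here), since alternation binds the same way in both syntaxes. The variable names are hypothetical.

```python
import re

# `abc|def` parses as (abc)|(def), not as ab(c|d)ef.
pattern = re.compile(r"abc|def")

matches_abc = pattern.fullmatch("abc") is not None
matches_def = pattern.fullmatch("def") is not None
# "abdef" could only match if `|` bound tighter than concatenation.
matches_abdef = pattern.fullmatch("abdef") is not None

# An explicit group is needed to limit the alternation to `c|d`.
grouped = re.compile(r"ab(?:c|d)ef")
grouped_matches_abdef = grouped.fullmatch("abdef") is not None
```

Here `matches_abdef` is false while `grouped_matches_abdef` is true, which is the behavior the changed sentence describes.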
@@ -130,9 +130,9 @@ Subexpressions can be quantified, meaning they will be repeated some number of t

 Behavior can further be refined by a subsequent `?` or `+`:

-- `x*` _eager_: consume as much of input as possible
-- `x*?` _reluctant_: consume as little of the input as possible
-- `x*+`: _possessive_: eager and never relinquishes any input consumed
+- `x*` _eager_: consume as much of input as possible.
+- `x*?` _reluctant_: consume as little of the input as possible.
+- `x*+`: _possessive_: eager and never relinquishes any input consumed.

 ### Atoms

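The three quantifier behaviors listed in this hunk can be sketched as follows. This again uses Python's `re` as a stand-in rather than the Swift engine; possessive quantifiers are assumed to be available only on Python 3.11 or later, so that part is guarded by a version check. The sample input is hypothetical.

```python
import re
import sys

text = "<a><b>"

# Eager: `.*` consumes as much as possible, then gives input back
# until the final `>` can match.
eager = re.match(r"<.*>", text).group(0)

# Reluctant: `.*?` consumes as little as possible, so it stops at
# the first `>`.
reluctant = re.match(r"<.*?>", text).group(0)

# Eager `a*` gives back one `a` so the trailing `a` can match.
eager_gives_back = re.fullmatch(r"a*a", "aaa") is not None

# Possessive `a*+` keeps everything it consumed, so the trailing
# `a` has nothing left to match. Only attempted on Python 3.11+.
possessive_matches = None
if sys.version_info >= (3, 11):
    possessive_matches = re.fullmatch(r"a*+a", "aaa") is not None
```

The eager match spans the whole input, the reluctant match covers only `<a>`, and the possessive pattern fails where its eager counterpart succeeds.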
@@ -187,7 +187,7 @@ These escape sequences each denote a specific scalar value.
 - `\f`: The form-feed character `U+C`.
 - `\n`: The newline character `U+A`.
 - `\r`: The carriage return character `U+D`.
-- `\t`: The tab character `U+9`
+- `\t`: The tab character `U+9`.

 #### Builtin character classes

@@ -218,12 +218,12 @@ Precise definitions of character classes is discussed in [Character Classes for

 ```
 UnicodeScalar -> '\u{' HexDigit{1...} '}'
-| '\u' HexDigit{4}
-| '\x{' HexDigit{1...} '}'
-| '\x' HexDigit{0...2}
-| '\U' HexDigit{8}
-| '\o{' OctalDigit{1...} '}'
-| '\0' OctalDigit{0...3}
+| '\u' HexDigit{4}
+| '\x{' HexDigit{1...} '}'
+| '\x' HexDigit{0...2}
+| '\U' HexDigit{8}
+| '\o{' OctalDigit{1...} '}'
+| '\0' OctalDigit{0...3}

 HexDigit -> [0-9a-zA-Z]
 OctalDigit -> [0-7]
@@ -232,7 +232,7 @@ NamedScalar -> '\N{' ScalarName '}'
 ScalarName -> 'U+' HexDigit{1...8} | [\s\w-]+
 ```

-These sequences define a unicode scalar value using hexadecimal or octal notation
+These sequences define a unicode scalar value using hexadecimal or octal notation.

 `\x`, when not followed by any hexadecimal digit characters, is treated as `\0`, matching PCRE's behavior.

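As a rough illustration of the hexadecimal forms in the grammar above, the sketch below spells the scalar `U+0041` three ways. It is limited to the forms that Python's `re` also accepts (`\uHHHH`, `\xHH`, `\UHHHHHHHH`); the braced forms such as `\u{...}` and the bare `\x` fallback described above are not supported by Python and are not covered. The dictionary keys are hypothetical labels.

```python
import re

# Three spellings of the same scalar, U+0041 ("A").
patterns = {
    "u4": r"\u0041",      # '\u' HexDigit{4}
    "x2": r"\x41",        # '\x' followed by two hex digits
    "U8": r"\U00000041",  # '\U' HexDigit{8}
}

# Each pattern should match exactly the one-character string "A".
results = {
    name: re.fullmatch(pattern, "A") is not None
    for name, pattern in patterns.items()
}
```

All three entries in `results` come out true, since each escape denotes the same scalar value.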
@@ -362,7 +362,7 @@ A script run e.g `(*script_run:...)` specifies that the contents must match agai
 BalancingGroupBody -> Identifier? '-' Identifier
 ```

-Introduced by .NET, balancing groups extend the `GroupNameBody` syntax to support the ability to refer to a prior group. Upon matching, the prior group is deleted, and any intermediate matched input becomes the capture of the current group.
+Introduced by .NET, [balancing groups][balancing-groups] extend the `GroupNameBody` syntax to support the ability to refer to a prior group. Upon matching, the prior group is deleted, and any intermediate matched input becomes the capture of the current group.

 #### Group numbering

@@ -467,9 +467,9 @@ We support all the matching options accepted by PCRE, ICU, and Oniguruma. In add

 These options are specific to the Swift regex matching engine and control the semantic level at which matching takes place.

-- `X`: Grapheme cluster matching
-- `u`: Unicode scalar matching
-- `b`: Byte matching
+- `X`: Grapheme cluster matching.
+- `u`: Unicode scalar matching.
+- `b`: Byte matching.

 ### References

@@ -482,7 +482,7 @@ RecursionLevel -> '+' <Int> | '-' <Int>

 A reference is an abstract identifier for a particular capturing group in a regular expression. It can either be named or numbered, and in the latter case may be specified relative to the current group. For example `-2` refers to the capture group `N - 2` where `N` is the number of the next capture group. References may refer to groups ahead of the current position e.g `+3`, or the name of a future group. These may be useful in recursive cases where the group being referenced has been matched in a prior iteration.

-A backreference may optionally include a recursion level in certain cases, which is a syntactic element inherited from Oniguruma that allows the reference to specify a capture relative to a given recursion level.
+A backreference may optionally include a recursion level in certain cases, which is a syntactic element inherited [from Oniguruma][oniguruma-syntax] that allows the reference to specify a capture relative to a given recursion level.

 #### Backreferences

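The idea of a reference as an identifier for a capturing group, either numbered or named, can be sketched with Python's `re`. This is only an approximation of the syntax under discussion: Python spells named groups as `(?P<name>...)` and named references as `(?P=name)`, and it has no relative references (`-2`, `+3`) or recursion levels, so those are not shown. The sample inputs are hypothetical.

```python
import re

# Numbered reference: `\1` must match whatever group 1 captured,
# so this finds a doubled word character.
numbered = re.compile(r"(\w)\1")
doubled = numbered.search("hello").group(0)

# Named reference: `(?P=q)` refers to the group named `q`, so the
# closing quote must be the same character as the opening one.
named = re.compile(r"(?P<q>['\"]).*?(?P=q)")
quoted = named.search("say 'hi' now").group(0)

# With no repeated character there is nothing for `\1` to match.
no_double = numbered.search("abc")
```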
@@ -639,7 +639,7 @@ AbsentFunction -> '(?~' RegexNode ')'
 | '(?~|)'
 ```

-An absent function is an Oniguruma feature that allows for the easy inversion of a given pattern. There are 4 variants of the syntax:
+An absent function is an [Oniguruma][oniguruma-syntax] feature that allows for the easy inversion of a given pattern. There are 4 variants of the syntax:

 - `(?~|absent|expr)`: Absent expression, which attempts to match against `expr`, but is limited by the range that is not matched by `absent`.
 - `(?~absent)`: Absent repeater, which matches against any input not matched by `absent`. Equivalent to `(?~|absent|\O*)`.
@@ -848,3 +848,4 @@ Note that this proposal regards _syntactic_ support, and does not necessarily me
 [unicode-prop-value-aliases]: https://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
 [unicode-scripts]: https://www.unicode.org/reports/tr24/#Script
 [unicode-script-extensions]: https://www.unicode.org/reports/tr24/#Script_Extensions
+[balancing-groups]: https://docs.microsoft.com/en-us/dotnet/standard/base-types/grouping-constructs-in-regular-expressions#balancing-group-definitions
