@@ -24,14 +24,16 @@ Literals are tokens used in [literal expressions].

#### Characters and strings

- |                                              | Example         | `#` sets\* | Characters  | Escapes             |
- |----------------------------------------------|-----------------|------------|-------------|---------------------|
- | [Character](#character-literals)             | `'H'`           | 0          | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
- | [String](#string-literals)                   | `"hello"`       | 0          | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
- | [Raw string](#raw-string-literals)           | `r#"hello"#`    | <256       | All Unicode | `N/A`               |
- | [Byte](#byte-literals)                       | `b'H'`          | 0          | All ASCII   | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
- | [Byte string](#byte-string-literals)         | `b"hello"`      | 0          | All ASCII   | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
- | [Raw byte string](#raw-byte-string-literals) | `br#"hello"#`   | <256       | All ASCII   | `N/A`               |
+ |                                              | Example         | `#` sets\* | Characters      | Escapes             |
+ |----------------------------------------------|-----------------|------------|-----------------|---------------------|
+ | [Character](#character-literals)             | `'H'`           | 0          | All Unicode     | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
+ | [String](#string-literals)                   | `"hello"`       | 0          | All Unicode     | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
+ | [Raw string](#raw-string-literals)           | `r#"hello"#`    | <256       | All Unicode     | `N/A`               |
+ | [Byte](#byte-literals)                       | `b'H'`          | 0          | All ASCII       | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
+ | [Byte string](#byte-string-literals)         | `b"hello"`      | 0          | All ASCII       | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
+ | [Raw byte string](#raw-byte-string-literals) | `br#"hello"#`   | <256       | All ASCII       | `N/A`               |
+ | [C string](#c-string-literals)               | `c"hello"`      | 0          | non-`NUL` ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
+ | [Raw C string](#raw-c-string-literals)       | `cr#"hello"#`   | <256       | non-`NUL` ASCII | `N/A`               |

\* The number of `#`s on each side of the same literal must be equivalent.

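As a reading aid for the table in this hunk, here is a minimal sketch of the six literal kinds that existed before this change, with the type each one produces. It is an illustration only and not part of the diff; the helper name `raw_with_hashes` is hypothetical, and the new C string rows are left out so that the sketch compiles under any edition.

```rust
/// Hypothetical helper: a raw string delimited by two `#` on each side,
/// so the body may contain `"#` without ending the literal.
fn raw_with_hashes() -> &'static str {
    r##"a "# b"##
}

fn main() {
    // One literal per pre-existing row of the table, annotated with its type.
    let ch: char = 'H';
    let s: &str = "hello";
    let raw: &str = r#"hello"#;
    let byte: u8 = b'H';
    let bytes: &[u8; 5] = b"hello";
    let raw_bytes: &[u8; 5] = br#"hello"#;

    // Raw and non-raw forms denote the same value when no escapes are involved.
    assert_eq!(s, raw);
    assert_eq!(bytes, raw_bytes);
    assert_eq!(byte as char, ch);

    // The number of `#` on each side must match (the table's footnote).
    assert_eq!(raw_with_hashes(), "a \"# b");
}
```

The non-raw equivalent of the raw literal needs a quote escape (`\"`), which is what the "Escapes" column records.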
@@ -328,6 +330,76 @@ b"\x52"; b"R"; br"R"; // R
b"\\x52"; br"\x52";                  // \x52
```

+ ### C string and raw C string literals
+
+ #### C string literals
+
+ > **<sup>Lexer</sup>**\
+ > C_STRING_LITERAL :\
+ > &nbsp;&nbsp; `c"` ( ASCII_FOR_C_STRING | BYTE_ESCAPE | STRING_CONTINUE )<sup>\*</sup> `"` SUFFIX<sup>?</sup>
+ >
+ > ASCII_FOR_C_STRING :\
+ > &nbsp;&nbsp; _any non-NUL ASCII (i.e. 0x01 to 0x7F), except_ `"`, `\` _and IsolatedCR_
+
+ A non-raw _C string literal_ is a sequence of ASCII characters and _escapes_,
+ preceded by the characters `U+0063` (`c`) and `U+0022` (double-quote), and
+ followed by the character `U+0022`. If the character `U+0022` is present within
+ the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
+ Alternatively, a C string literal can be a _raw C string literal_, defined
+ below. The type of a C string literal is `&core::ffi::CStr`.
+
+ Some additional _escapes_ are available in non-raw C string literals. An
+ escape starts with a `U+005C` (`\`) and continues with one of the
+ following forms:
+
+ * A _byte escape_ starts with `U+0078` (`x`) and is followed by exactly
+   two _hex digits_. It denotes the byte equal to the provided hex value. The
+   byte escape sequence `\x00` is forbidden, as C strings may not contain `NUL`.
+ * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
+   (`r`), or `U+0074` (`t`), denoting the byte values `0x0A` (ASCII LF),
+   `0x0D` (ASCII CR) or `0x09` (ASCII HT) respectively.
+ * The _backslash escape_ is the character `U+005C` (`\`) which must be
+   escaped in order to denote its ASCII encoding `0x5C`.
+
+ #### Raw C string literals
+
+ > **<sup>Lexer</sup>**\
+ > RAW_C_STRING_LITERAL :\
+ > &nbsp;&nbsp; `cr` RAW_C_STRING_CONTENT SUFFIX<sup>?</sup>
+ >
+ > RAW_C_STRING_CONTENT :\
+ > &nbsp;&nbsp; &nbsp;&nbsp; `"` ASCII_EXCEPT_NUL<sup>* (non-greedy)</sup> `"`\
+ > &nbsp;&nbsp; | `#` RAW_C_STRING_CONTENT `#`
+ >
+ > ASCII_EXCEPT_NUL :\
+ > &nbsp;&nbsp; _any non-NUL ASCII (i.e. 0x01 to 0x7F)_
+
+ Raw C string literals do not process any escapes. They start with the
+ character `U+0063` (`c`), followed by `U+0072` (`r`), followed by fewer than 256
+ of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The
+ _raw string body_ can contain any sequence of non-`NUL` ASCII characters and is terminated
+ only by another `U+0022` (double-quote) character, followed by the same number of
+ `U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote)
+ character. A raw C string literal cannot contain any non-ASCII byte.
+
+ All characters contained in the raw string body represent their ASCII encoding;
+ the characters `U+0022` (double-quote) (except when followed by at least as
+ many `U+0023` (`#`) characters as were used to start the raw string literal) or
+ `U+005C` (`\`) do not have any special meaning.
+
+ Examples for C string literals:
+
+ ```rust
+ c"foo"; cr"foo";                     // foo
+ c"\"foo\""; cr#""foo""#;             // "foo"
+
+ c"foo #\"# bar";
+ cr##"foo #"# bar"##;                 // foo #"# bar
+
+ c"\x52"; c"R"; cr"R";                // R
+ c"\\x52"; cr"\x52";                  // \x52
+ ```
+
### Number literals

A _number literal_ is either an _integer literal_ or a _floating-point
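The examples added in this diff can be checked against a compiler with a sketch along these lines. It assumes a toolchain on which C string literals are stable (Rust 1.77 or later, edition 2021 or later); the helper name `payload` is hypothetical, and the sketch only exercises the ASCII and byte-escape behaviour described in the added text.

```rust
use std::ffi::CStr;

/// Hypothetical helper: the bytes of a C string without the trailing NUL.
fn payload(s: &CStr) -> &[u8] {
    s.to_bytes()
}

fn main() {
    // Both forms have type `&CStr`; the NUL terminator is appended implicitly.
    let plain: &CStr = c"foo";
    let raw: &CStr = cr"foo";
    assert_eq!(plain, raw);
    assert_eq!(plain.to_bytes_with_nul(), b"foo\0");

    // Byte escapes are processed in `c"..."` but not in `cr"..."`.
    assert_eq!(payload(c"\x52"), b"R");
    assert_eq!(payload(cr"\x52"), b"\\x52");
    assert_eq!(payload(c"\\x52"), payload(cr"\x52"));

    // A double quote needs a quote escape, or a raw literal with `#` delimiters.
    assert_eq!(payload(c"\"foo\""), payload(cr#""foo""#));
    assert_eq!(payload(c"foo #\"# bar"), payload(cr##"foo #"# bar"##));

    // Whitespace escapes denote LF, CR and HT respectively.
    assert_eq!(payload(c"\n\r\t"), &[0x0A_u8, 0x0D, 0x09][..]);
}
```

Writing `c"\x00"` or embedding a literal NUL is rejected at compile time, so that rule cannot be shown as a runtime assertion here.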