@@ -42,6 +42,15 @@ If you have suggestions to make, please try to focus them on *reductions* to
42
42
the language: possible features that can be combined or omitted. We aim to
43
43
keep the size and complexity of the language under control.
44
44
45
+ **Note on grammar:** The grammar for Rust given in this document is rough and
46
+ very incomplete; only a modest number of sections have accompanying grammar
47
+ rules. Formalizing the grammar accepted by the Rust parser is ongoing work,
48
+ but future versions of this document will contain a complete
49
+ grammar. Moreover, we hope that this grammar will be extracted and verified
50
+ as LL(1) by an automated grammar-analysis tool, and further tested against the
51
+ Rust sources. Preliminary versions of this automation exist, but are not yet
52
+ complete.
53
+
45
54
# Notation
46
55
47
56
Rust's grammar is defined over Unicode codepoints, each conventionally
@@ -81,13 +90,6 @@ Where:
81
90
82
91
This EBNF dialect should hopefully be familiar to many readers.
83
92
84
- The grammar for Rust given in this document is extracted and verified as LL(1)
85
- by an automated grammar-analysis tool, and further tested against the Rust
86
- sources. The generated parser is currently *not* the one used by the Rust
87
- compiler itself, but in the future we hope to relate the two together more
88
- precisely. As of this writing they are only related by testing against
89
- existing source code.
90
-
91
93
## Unicode productions
92
94
93
95
A small number of productions in Rust's grammar permit Unicode codepoints
@@ -917,7 +919,7 @@ In this example, `nonempty_list` is a predicate---it can be used in a
917
919
typestate constraint---but the auxiliary function ` pure_length ` is
918
920
not.
919
921
920
- *ToDo:* should actually define referential transparency.
922
+ *TODO:* should actually define referential transparency.
921
923
922
924
The effect checking rules previously enumerated are a restricted set of
923
925
typechecking rules meant to approximate the universe of observably
@@ -933,7 +935,7 @@ blocks, the compiler provides no static guarantee that the code will behave as
933
935
expected at runtime. Rather, the programmer has an independent obligation to
934
936
verify the semantics of the predicates they write.
935
937
936
- *ToDo:* last two sentences are vague.
938
+ *TODO:* last two sentences are vague.
937
939
938
940
An example of a predicate that uses an unchecked block:
939
941
@@ -1327,6 +1329,12 @@ declaring a function-local item.
1327
1329
1328
1330
#### Slot declarations
1329
1331
1332
+ ~~~~~~~~ {.ebnf .gram}
1333
+ let_decl : "let" pat [':' type ] ? [ init ] ? ';' ;
1334
+ init : [ '=' | '<-' ] expr ;
1335
+ ~~~~~~~~
1336
+
1337
+
1330
1338
A _slot declaration_ has one of two forms:
1331
1339
1332
1340
* ` let ` ` pattern ` ` optional-init ` ;
@@ -1382,6 +1390,12 @@ values.
1382
1390
1383
1391
### Record expressions
1384
1392
1393
+ ~~~~~~~~ {.ebnf .gram}
1394
+ rec_expr : '{' ident ':' expr
1395
+ [ ',' ident ':' expr ] *
1396
+ [ "with" expr ] '}'
1397
+ ~~~~~~~~
1398
+
1385
1399
A _[record](#record-types) expression_ is one or more comma-separated
1386
1400
name-value pairs enclosed by braces. A fieldname can be any identifier
1387
1401
(including reserved words), and is separated from its value expression
@@ -1414,6 +1428,10 @@ let base = {x: 1, y: 2, z: 3};
1414
1428
1415
1429
### Field expressions
1416
1430
1431
+ ~~~~~~~~ {.ebnf .gram}
1432
+ field_expr : expr '.' expr
1433
+ ~~~~~~~~
1434
+
1417
1435
A dot can be used to access a field in a record.
1418
1436
1419
1437
~~~~~~~~ {.field}
@@ -1439,6 +1457,10 @@ expression on the left of the dot.
1439
1457
1440
1458
### Vector expressions
1441
1459
1460
+ ~~~~~~~~ {.ebnf .gram}
1461
+ vec_expr : '[' "mutable" ? [ expr [ ',' expr ] * ] ? ']'
1462
+ ~~~~~~~~
1463
+
1442
1464
A _[vector](#vector-types) expression_ is written by enclosing zero or
1443
1465
more comma-separated expressions of uniform type in square brackets.
1444
1466
The keyword ` mutable ` can be written after the opening bracket to
@@ -1453,6 +1475,11 @@ When no mutability is specified, the vector is immutable.
1453
1475
1454
1476
### Index expressions
1455
1477
1478
+ ~~~~~~~~ {.ebnf .gram}
1479
+ idx_expr : expr '[' expr ']'
1480
+ ~~~~~~~~
1481
+
1482
+
1456
1483
[Vector](#vector-types)-typed expressions can be indexed by writing a
1457
1484
square-bracket-enclosed expression (the index) after them. When the
1458
1485
vector is mutable, the resulting _lval_ can be assigned to.
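
As a rough illustration of indexing a mutable vector and assigning through the index, here is a minimal sketch in present-day Rust syntax, which differs from the dialect this document describes. The function name `set_and_get` is hypothetical and chosen only for this example.

```rust
/// Writes `value` into slot `i` of a mutable vector by indexing,
/// then reads the same slot back. Panics if `i` is out of bounds.
fn set_and_get(v: &mut Vec<i32>, i: usize, value: i32) -> i32 {
    v[i] = value; // the indexed place is assignable because `v` is mutable
    v[i]
}
```

Calling `set_and_get(&mut v, 1, 20)` on a vector holding `1, 2, 3` leaves it holding `1, 20, 3`.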
@@ -1492,6 +1519,13 @@ operators, before the expression they apply to.
1492
1519
1493
1520
### Binary operator expressions
1494
1521
1522
+ ~~~~~~~~ {.ebnf .gram}
1523
+ binop_expr : expr binop expr ;
1524
+ ~~~~~~~~
1525
+
1526
+ Binary operator expressions are given in terms of
1527
+ [operator precedence](#operator-precedence).
1528
+
1495
1529
#### Arithmetic operators
1496
1530
1497
1531
Binary arithmetic expressions require both their operands to be of the
@@ -1672,10 +1706,15 @@ as
1672
1706
== !=
1673
1707
&&
1674
1708
||
1709
+ = <- <->
1675
1710
~~~~
1676
1711
1677
1712
### Unary copy expressions
1678
1713
1714
+ ~~~~~~~~ {.ebnf .gram}
1715
+ copy_expr : "copy" expr ;
1716
+ ~~~~~~~~
1717
+
1679
1718
A _unary copy expression_ consists of the unary `copy` operator applied to
1680
1719
some argument expression.
1681
1720
@@ -1684,8 +1723,8 @@ copies the resulting value, allocating any memory necessary to hold the new
1684
1723
copy.
1685
1724
1686
1725
[Shared boxes](#shared-box-types) (type `@`) are, as usual, shallow-copied, as
1687
- they may be cyclic. [Unique boxes](unique-box-types), [vectors](#vector-types)
1688
- and similar unique types are deep-copied.
1726
+ they may be cyclic. [Unique boxes](#unique-box-types),
1727
+ [vectors](#vector-types) and similar unique types are deep-copied.
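
The shallow versus deep distinction can be sketched in present-day Rust, which differs from the dialect documented here. This example assumes `Rc` as a loose analogue of a shared box and `Vec` as a uniquely owned type; the function names are hypothetical.

```rust
use std::rc::Rc;

/// Cloning an `Rc` copies only the pointer: both handles refer to
/// the same allocation, which is what "shallow copy" means here.
fn shallow_copy_shares(value: i32) -> bool {
    let a = Rc::new(value);
    let b = Rc::clone(&a);
    Rc::ptr_eq(&a, &b)
}

/// Cloning a `Vec` copies its elements, so mutating the clone
/// leaves the original untouched ("deep copy").
fn deep_copy_is_independent() -> (Vec<i32>, Vec<i32>) {
    let original = vec![1, 2, 3];
    let mut copy = original.clone();
    copy[0] = 99;
    (original, copy)
}
```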
1689
1728
1690
1729
Since the binary [assignment operator](#assignment-operator) `=` performs a
1691
1730
copy implicitly, the unary copy operator is typically only used to cause an
@@ -1707,6 +1746,10 @@ assert v[0] == 1; // Original was not modified
1707
1746
1708
1747
### Unary move expressions
1709
1748
1749
+ ~~~~~~~~ {.ebnf .gram}
1750
+ move_expr : "move" expr ;
1751
+ ~~~~~~~~
1752
+
1710
1753
This is used to indicate that the referenced _lval_ must be moved out,
1711
1754
rather than copied, when evaluating this expression. It will only have
1712
1755
an effect when the expression is _stored_ somewhere or passed to a
@@ -1796,6 +1839,11 @@ way.
1796
1839
1797
1840
### While expressions
1798
1841
1842
+ ~~~~~~~~ {.ebnf .gram}
1843
+ while_expr : "while" expr '{' block '}'
1844
+ | "do" '{' block '}' "while" expr ;
1845
+ ~~~~~~~~
1846
+
1799
1847
A ` while ` expression is a loop construct. A ` while ` loop may be either a
1800
1848
simple ` while ` or a ` do ` -` while ` loop.
1801
1849
@@ -1813,7 +1861,7 @@ loop body. If it evaluates to `false`, control exits the loop.
1813
1861
An example of a simple ` while ` expression:
1814
1862
1815
1863
~~~~
1816
- while (i < 10) {
1864
+ while i < 10 {
1817
1865
print("hello\n");
1818
1866
i = i + 1;
1819
1867
}
@@ -1825,17 +1873,25 @@ An example of a `do`-`while` expression:
1825
1873
do {
1826
1874
print("hello\n");
1827
1875
i = i + 1;
1828
- } while (i < 10);
1876
+ } while i < 10;
1829
1877
~~~~
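
The difference between the two loop forms can be sketched in present-day Rust, which has no `do`-`while` and differs in other ways from the dialect documented here. In this sketch the `do`-`while` form is emulated with `loop` and `break`; the function names are hypothetical.

```rust
/// Simple `while`: the condition is tested before every iteration,
/// so the body may run zero times.
fn count_while(mut i: i32) -> u32 {
    let mut runs = 0;
    while i < 10 {
        runs += 1;
        i += 1;
    }
    runs
}

/// `do`-`while` emulated with `loop`: the body runs once before the
/// condition is tested for the first time.
fn count_do_while(mut i: i32) -> u32 {
    let mut runs = 0;
    loop {
        runs += 1;
        i += 1;
        if i >= 10 {
            break;
        }
    }
    runs
}
```

Starting from `i = 10`, the simple form runs its body zero times while the emulated `do`-`while` form runs it once.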
1830
1878
1831
1879
1832
1880
### Break expressions
1833
1881
1882
+ ~~~~~~~~ {.ebnf .gram}
1883
+ break_expr : "break" ;
1884
+ ~~~~~~~~
1885
+
1834
1886
Executing a ` break ` expression immediately terminates the innermost loop
1835
1887
enclosing it. It is only permitted in the body of a loop.
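
A minimal sketch of the "innermost loop" rule, written in present-day Rust rather than the dialect documented here. The function name `count_below_diagonal` is hypothetical.

```rust
/// Counts the (row, col) pairs visited with col < row in an n-by-n grid.
/// The `break` leaves only the inner loop; the outer loop keeps going.
fn count_below_diagonal(n: u32) -> u32 {
    let mut visited = 0;
    for row in 0..n {
        for col in 0..n {
            if col == row {
                break; // terminates the inner `for` only
            }
            visited += 1;
        }
    }
    visited
}
```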
1836
1888
1837
1889
### Continue expressions
1838
1890
1891
+ ~~~~~~~~ {.ebnf .gram}
1892
+ cont_expr : "cont" ;
1893
+ ~~~~~~~~
1894
+
1839
1895
Evaluating a ` cont ` expression immediately terminates the current iteration of
1840
1896
the innermost loop enclosing it, returning control to the loop *head*. In the
1841
1897
case of a ` while ` loop, the head is the conditional expression controlling the
@@ -1847,6 +1903,10 @@ A `cont` expression is only permitted in the body of a loop.
1847
1903
1848
1904
### For expressions
1849
1905
1906
+ ~~~~~~~~ {.ebnf .gram}
1907
+ for_expr : "for" pat "in" expr '{' block '}' ;
1908
+ ~~~~~~~~
1909
+
1850
1910
A _for loop_ is controlled by a vector or string. The for loop bounds-checks
1851
1911
the underlying sequence *once* when initiating the loop, then repeatedly
1852
1912
executes the loop body with the loop variable referencing the successive
@@ -1865,6 +1925,14 @@ for e: foo in v {
1865
1925
1866
1926
### If expressions
1867
1927
1928
+ ~~~~~~~~ {.ebnf .gram}
1929
+ if_expr : "if" expr '{' block '}'
1930
+ [ else_tail ] ? ;
1931
+
1932
+ else_tail : "else" [ if_expr
1933
+ | '{' block '}' ] ;
1934
+ ~~~~~~~~
1935
+
1868
1936
An ` if ` expression is a conditional branch in program control. The form of
1869
1937
an ` if ` expression is a condition expression, followed by a consequent
1870
1938
block, any number of ` else if ` conditions and blocks, and an optional
@@ -1879,6 +1947,15 @@ then any `else` block is executed.
1879
1947
1880
1948
### Alternative expressions
1881
1949
1950
+ ~~~~~~~~ {.ebnf .gram}
1951
+ alt_expr : "alt" expr '{' alt_arm [ '|' alt_arm ] * '}' ;
1952
+
1953
+ alt_arm : alt_pat '{' block '}' ;
1954
+
1955
+ alt_pat : pat [ "to" pat ] ? [ "if" expr ] ? ;
1956
+ ~~~~~~~~
1957
+
1958
+
1882
1959
An `alt` expression branches on a *pattern*. The exact form of matching that
1883
1960
occurs depends on the pattern. Patterns consist of some combination of
1884
1961
literals, destructured tag constructors, records and tuples, variable binding
@@ -1971,13 +2048,21 @@ let message = alt maybe_digit {
1971
2048
1972
2049
### Fail expressions
1973
2050
2051
+ ~~~~~~~~ {.ebnf .gram}
2052
+ fail_expr : "fail" expr ? ;
2053
+ ~~~~~~~~
2054
+
1974
2055
Evaluating a `fail` expression causes a task to enter the *failing* state. In
1975
2056
the *failing* state, a task unwinds its stack, destroying all frames and
1976
2057
freeing all resources until it reaches its entry frame, at which point it
1977
2058
halts execution in the *dead* state.
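
The unwinding behaviour described above can be approximated in present-day Rust, which differs from the dialect documented here. This sketch assumes `panic!` as a loose analogue of `fail`, a thread as a loose analogue of a task, and the default unwinding panic strategy; `Resource`, `failing_task`, and `run_and_count_drops` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Counts how many `Resource` values have been destroyed.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

struct Resource;

impl Drop for Resource {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Acquires two resources, then fails. Unwinding runs both destructors.
fn failing_task() {
    let _outer = Resource;
    let _inner = Resource;
    panic!("task failed");
}

/// Runs `failing_task` on its own thread and reports whether it failed
/// and how many resources were freed while its stack unwound.
fn run_and_count_drops() -> (bool, usize) {
    let before = DROPPED.load(Ordering::SeqCst);
    let failed = thread::spawn(failing_task).join().is_err();
    let after = DROPPED.load(Ordering::SeqCst);
    (failed, after - before)
}
```

The failure stays confined to the spawned thread: the caller observes it only through the `Err` returned by `join`.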
1978
2059
1979
2060
### Note expressions
1980
2061
2062
+ ~~~~~~~~ {.ebnf .gram}
2063
+ note_expr : "note" expr ;
2064
+ ~~~~~~~~
2065
+
1981
2066
**Note: Note expressions are not yet supported by the compiler.**
1982
2067
1983
2068
A ` note ` expression has no effect during normal execution. The purpose of a
@@ -2023,6 +2108,10 @@ expression.
2023
2108
2024
2109
### Return expressions
2025
2110
2111
+ ~~~~~~~~ {.ebnf .gram}
2112
+ ret_expr : "ret" expr ? ;
2113
+ ~~~~~~~~
2114
+
2026
2115
Return expressions are denoted with the keyword ` ret ` . Evaluating a ` ret `
2027
2116
expression^[A `ret` expression is analogous to a `return` expression
2028
2117
in the C family.] moves its argument into the output slot of the current
@@ -2042,6 +2131,10 @@ fn max(a: int, b: int) -> int {
2042
2131
2043
2132
### Log expressions
2044
2133
2134
+ ~~~~~~~~ {.ebnf .gram}
2135
+ log_expr : "log" '(' level ',' expr ')' ;
2136
+ ~~~~~~~~
2137
+
2045
2138
Evaluating a ` log ` expression may, depending on runtime configuration, cause a
2046
2139
value to be appended to an internal diagnostic logging buffer provided by the
2047
2140
runtime or emitted to a system console. Log expressions are enabled or
@@ -2094,6 +2187,10 @@ when it is changed.
2094
2187
2095
2188
### Check expressions
2096
2189
2190
+ ~~~~~~~~ {.ebnf .gram}
2191
+ check_expr : "check" call_expr ;
2192
+ ~~~~~~~~
2193
+
2097
2194
A ` check ` expression connects dynamic assertions made at run-time to the
2098
2195
static [typestate system](#typestate-system). A `check` expression takes a
2099
2196
constraint to check at run-time. If the constraint holds at run-time, control
@@ -2134,13 +2231,21 @@ fn test() {
2134
2231
2135
2232
**Note: Prove expressions are not yet supported by the compiler.**
2136
2233
2234
+ ~~~~~~~~ {.ebnf .gram}
2235
+ prove_expr : "prove" call_expr ;
2236
+ ~~~~~~~~
2237
+
2137
2238
A ` prove ` expression has no run-time effect. Its purpose is to statically
2138
2239
check (and document) that its argument constraint holds at its expression
2139
2240
entry point. If its argument typestate does not hold, under the typestate
2140
2241
algorithm, the program containing it will fail to compile.
2141
2242
2142
2243
### Claim expressions
2143
2244
2245
+ ~~~~~~~~ {.ebnf .gram}
2246
+ claim_expr : "claim" call_expr ;
2247
+ ~~~~~~~~
2248
+
2144
2249
A ` claim ` expression is an unsafe variant on a ` check ` expression that is not
2145
2250
actually checked at runtime. Thus, using a ` claim ` implies a proof obligation
2146
2251
to ensure---without compiler assistance---that an assertion always holds.
@@ -2183,6 +2288,10 @@ if check even(x) {
2183
2288
2184
2289
### Assert expressions
2185
2290
2291
+ ~~~~~~~~ {.ebnf .gram}
2292
+ assert_expr : "assert" expr ;
2293
+ ~~~~~~~~
2294
+
2186
2295
An ` assert ` expression is similar to a ` check ` expression, except
2187
2296
the condition may be any boolean-typed expression, and the compiler makes no
2188
2297
use of the knowledge that the condition holds if the program continues to