@@ -512,7 +512,7 @@ of St. Andrews (St. Andrews, Fife, UK).
Additional specific influences can be seen from the following languages:
@itemize
@item The structural algebraic types and compilation manager of SML.
- @item The syntax-extension systems of Camlp4 and the Common Lisp readtable.
+ @c @item The syntax-extension systems of Camlp4 and the Common Lisp readtable.
@item The deterministic destructor system of C++.
@end itemize

@@ -599,12 +599,12 @@ U+0009 (tab, @code{'\t'}), U+000A (LF, @code{'\n'}), U+000D (CR, @code{'\r'}).
A @dfn{single-line comment} is any sequence of Unicode characters beginning
with U+002F U+002F (@code{"//"}) and extending to the next U+000A character,
@emph{excluding} cases in which such a sequence occurs within a string literal
- token or a syntactic extension token.
+ token.

A @dfn{multi-line comment} is any sequence of Unicode characters beginning
with U+002F U+002A (@code{"/*"}) and ending with U+002A U+002F (@code{"*/"}),
@emph{excluding} cases in which such a sequence occurs within a string literal
- token or a syntactic extension token. Multi-line comments may be nested.
+ token. Multi-line comments may be nested.

@node Ref.Lex.Ident
@subsection Ref.Lex.Ident
@@ -875,11 +875,11 @@ escaped in order to denote @emph{itself}.
@c * Ref.Lex.Syntax:: Syntactic extension tokens.

Syntactic extensions are marked with the @emph{pound} sigil U+0023 (@code{#}),
- followed by a qualified name of a compile-time imported module item, an
- optional parenthesized list of @emph{parsed expressions}, and an optional
- brace-enclosed region of free-form text (with brace-matching and
- brace-escaping used to determine the limit of the
- region). @xref{Ref.Comp.Syntax}.
+ followed by an identifier, one of @code{fmt}, @code{env},
+ @code{concat_idents}, @code{ident_to_str}, @code{log_syntax}, @code{macro}, or
+ the name of a user-defined macro. This is followed by a vector literal. (Its
+ value will be interpreted syntactically; in particular, it need not be
+ well-typed.)

@emph{TODO: formalize those terms more}.

@@ -1039,7 +1039,6 @@ Compilation Manager, a @emph{unit} in the Owens and Flatt module system, or a
@itemize
@item Metadata about the crate, such as author, name, version, and copyright.
@item The source-file and directory modules that make up the crate.
- @item The set of syntax extensions to enable for the crate.
@item Any external crates or native modules that the crate imports to its top level.
@item The organization of the crate's internal namespace.
@item The set of names exported from the crate.
@@ -1086,11 +1085,13 @@ or Mach-O. The loadable object contains extensive DWARF metadata, describing:
derived from the same @code{use} directives that guided compile-time imports.
@end itemize

- The @code{syntax} directives of a crate are similar to the @code{use}
- directives, except they govern the syntax extension namespace (accessed
- through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax})
- available only at compile time. A @code{syntax} directive also makes its
- extension available to all subsequent directives in the crate file.
+ @c This might come along sometime in the future.
+
+ @c The @code{syntax} directives of a crate are similar to the @code{use}
+ @c directives, except they govern the syntax extension namespace (accessed
+ @c through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax})
+ @c available only at compile time. A @code{syntax} directive also makes its
+ @c extension available to all subsequent directives in the crate file.

An example of a crate:

@@ -1104,9 +1105,6 @@ meta (author = "Jane Doe",
// Import a module.
use std (ver = "1.0");

- // Activate a syntax-extension.
- syntax re;
-
// Define some modules.
mod foo = "foo.rs";
mod bar @{
@@ -1123,8 +1121,8 @@ mod bar @{

In a crate, a @code{meta} directive associates free form key-value metadata
with the crate. This metadata can, in turn, be used in providing partial
- matching parameters to syntax-extension loading and crate importing
- directives, denoted by @code{syntax} and @code{use} keywords respectively.
+ matching parameters to crate importing directives, denoted by the @code{use}
+ keyword.

Alternatively, metadata can serve as a simple form of documentation.

@@ -1133,49 +1131,76 @@ Alternatively, metadata can serve as a simple form of documentation.
@c * Ref.Comp.Syntax:: Syntax extension.
@cindex Syntax extension

+ @c , statement or item
Rust provides a notation for @dfn{syntax extension}. The notation is a marked
- syntactic form that can appear as an expression, statement or item in the body
- of a Rust program, or as a directive in a Rust crate, and which causes the
- text enclosed within the marked form to be translated through a named
- extension function loaded into the compiler at compile-time.
-
- The compile-time extension function must return a value of the corresponding
- Rust AST type, either an expression node, a statement node or an item
- node. @footnote{The syntax-extension system is analogous to the extensible
- reader system provided by Lisp @emph{readtables}, or the Camlp4 system of
- Objective Caml.} @xref{Ref.Lex.Syntax}.
-
- A syntax extension is enabled by a @code{syntax} directive, which must occur
- in a crate file. When the Rust compiler encounters a @code{syntax} directive
- in a crate file, it immediately loads the named syntax extension, and makes it
- available for all subsequent crate directives within the enclosing block scope
- of the crate file, and all Rust source files referenced as modules from the
- enclosing block scope of the crate file.
-
- For example, this extension might provide a syntax for regular
- expression literals:
+ syntactic form that can appear as an expression in the body of a Rust
+ program. Syntax extensions make use of bracketed lists, which are
+ syntactically vector literals, but which have no run-time semantics. After
+ parsing, the notation is translated into Rust expressions. The name of the
+ extension determines the translation performed. The name may be one of the
+ built-in extensions listed below, or a user-defined extension, defined using
+ @code{macro}.

- @example
- // In a crate file:
+ @itemize
+ @item @code{fmt} expands into code to produce a formatted string, similar to
+ @code{printf} from C.
+ @item @code{env} expands into a string literal containing the value of that
+ environment variable at compile-time.
+ @item @code{concat_idents} expands into an identifier which is the
+ concatenation of its arguments.
+ @item @code{ident_to_str} expands into a string literal containing the name of
+ its argument (which must be a literal).
+ @item @code{log_syntax} causes the compiler to pretty-print its arguments.
+ @end itemize

- // Requests the 're' syntax extension from the compilation environment.
- syntax re;
+ Finally, @code{macro} is used to define a new macro. A macro can abstract over
+ second-class Rust concepts that are present in syntax. The arguments to
+ @code{macro} are a bracketed list of pairs (two-element lists). The pairs
+ consist of an invocation and the syntax to expand into. An example:

- // Also declares an import dependency on the module 're'.
- use re;
+ @example
+ #macro[[#apply[fn, [args, ...]], fn(args, ...)]];
+ @end example

- // Reference to a Rust source file as a module in the crate.
- mod foo = "foo.rs";
+ In this case, the invocation @code{#apply[sum, 5, 8, 6]} expands to
+ @code{sum(5,8,6)}. If @code{...} follows an expression (which need not be as
+ simple as a single identifier) in the input syntax, the matcher will expect an
+ arbitrary number of occurrences of the thing preceding it, and bind syntax to
+ the identifiers it contains. If it follows an expression in the output syntax,
+ it will transcribe that expression repeatedly, according to the identifiers
+ (bound to syntax) that it contains.

- @dots{}
+ The behavior of @code{...} is known as Macro By Example. It allows you to
+ write a macro with arbitrary repetition by specifying only one case of that
+ repetition, and following it by @code{...}, both where the repeated input is
+ matched, and where the repeated output must be transcribed. A more
+ sophisticated example:

- // In the source file "foo.rs", use the #re syntax extension and
- // the re module at run-time.
- let s: str = get_string();
- let pattern: regex = #re.pat @{ aa+b? @};
- let matched: bool = re.match(pattern, s);
+ @example
+ #macro[#zip_literals[[x, ...], [y, ...]],
+        [[x, y], ...]];
+ #macro[#unzip_literals[[x, y], ...],
+        [[x, ...], [y, ...]]];
@end example

+ In this case, @code{#zip_literals[[1,2,3], [1,2,3]]} expands to
+ @code{[[1,1],[2,2],[3,3]]}, and @code{#unzip_literals[[1,1], [2,2], [3,3]]}
+ expands to @code{[[1,2,3],[1,2,3]]}.
+
+ Macro expansion takes place outside-in: that is,
+ @code{#unzip_literals[#zip_literals[[1,2,3],[1,2,3]]]} will fail because
+ @code{unzip_literals} expects a list, not a macro invocation, as an
+ argument.
+
+ @c
+ The macro system currently has some limitations. It's not possible to
+ destructure anything other than vector literals (therefore, the arguments to
+ complicated macros will tend to be an ocean of square brackets). Macro
+ invocations and @code{...} can only appear in expression positions. Finally,
+ macro expansion is currently unhygienic. That is, name collisions between
+ macro-generated and user-written code can cause unintentional capture.
+
+
@page
@node Ref.Mem
@section Ref.Mem
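
For readers comparing notations, the Macro By Example behavior documented in the last hunk can be approximated in present-day Rust. The sketch below is an assumption-laden analogue, not the `#macro` system this change describes: it uses today's `macro_rules!`, where `$(...),*` plays the role of a trailing `...`. The helper `sum` is hypothetical, and `apply!` follows the pattern as written in the definition (`#apply[fn, [args, ...]]`), so its arguments are passed as a bracketed list.

```rust
// Analogue of `#apply`: call `$f` with every expression in the bracketed list.
macro_rules! apply {
    ($f:expr, [$($arg:expr),*]) => {
        $f($($arg),*)
    };
}

// Analogue of `#zip_literals`: two lists are matched separately, then
// transcribed in lockstep as a list of pairs. Lists of unequal length are
// rejected at compile time.
macro_rules! zip_literals {
    ([$($x:expr),*], [$($y:expr),*]) => {
        [$([$x, $y]),*]
    };
}

// Analogue of `#unzip_literals`: a sequence of pairs is matched once, then
// transcribed as two lists.
macro_rules! unzip_literals {
    ($([$x:expr, $y:expr]),*) => {
        [[$($x),*], [$($y),*]]
    };
}

// Hypothetical helper, used only to exercise `apply!`.
fn sum(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}
```

With these definitions, `apply!(sum, [5, 8, 6])` expands to `sum(5, 8, 6)`, and `zip_literals!([1, 2, 3], [1, 2, 3])` expands to `[[1, 1], [2, 2], [3, 3]]`. As in the documented system, nesting one invocation directly inside another (for example `unzip_literals!(zip_literals!(...))`) does not match, because the outer pattern sees the unexpanded inner invocation rather than a list. One difference worth noting: `macro_rules!` is hygienic for local variables, whereas the text above describes the `#macro` system as unhygienic.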