Commit 0aa42bc

Copyedit the macro tutorial
I hope I haven't introduced any grievous errors :-)
1 parent 22efa39 commit 0aa42bc

1 file changed

+73
-44
lines changed

doc/tutorial-macros.md

Lines changed: 73 additions & 44 deletions
@@ -2,10 +2,11 @@
22

33
# Introduction
44

5-
Functions are the programmer's primary tool of abstraction, but there are
6-
cases in which they are insufficient, because the programmer wants to
7-
abstract over concepts not represented as values. Consider the following
8-
example:
5+
Functions are the primary tool that programmers can use to build
6+
abstractions. Sometimes, though, programmers want to abstract over
7+
compile-time, syntactic structures rather than runtime values. For example,
8+
the following two code fragments both pattern-match on their input and return
9+
early in one case, doing nothing otherwise:
910

1011
~~~~
1112
# enum t { special_a(uint), special_b(uint) };
@@ -24,11 +25,12 @@ match input_2 {
2425
# }
2526
~~~~
2627

27-
This code could become tiresome if repeated many times. However, there is
28-
no reasonable function that could be written to solve this problem. In such a
29-
case, it's possible to define a macro to solve the problem. Macros are
28+
This code could become tiresome if repeated many times. However, there is no
29+
straightforward way to rewrite it without the repeated code, using functions
30+
alone. There is a solution, though: defining a macro. Macros are
3031
lightweight custom syntax extensions, themselves defined using the
31-
`macro_rules!` syntax extension:
32+
`macro_rules!` syntax extension. The following `early_return` macro captures
33+
the pattern in the above code:
3234

3335
~~~~
3436
# enum t { special_a(uint), special_b(uint) };
@@ -42,56 +44,85 @@ macro_rules! early_return(
4244
}
4345
);
4446
);
45-
// ...
47+
~~~~
48+
49+
Now, we can replace each `match` with an invocation of the `early_return`
50+
macro:
51+
52+
~~~~
4653
early_return!(input_1 special_a);
4754
// ...
4855
early_return!(input_2 special_b);
4956
# return 0;
5057
# }
5158
~~~~
5259

53-
Macros are defined in pattern-matching style:
60+
Macros are defined in pattern-matching style: in the above example, the text
61+
`($inp:expr $sp:ident)` that appears on the left-hand side of the `=>` is the
62+
*macro invocation syntax*, a pattern denoting how to write a call to the
63+
macro. The text on the right-hand side of the `=>`, beginning with `match
64+
$inp`, is the *macro transcription syntax*: what the macro expands to.
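
As a rough illustration of how the two sides of the `=>` fit together, here is a minimal sketch of an `early_return`-style macro written in current Rust syntax rather than the syntax this tutorial targets. The enum `T`, its variants, and the function `first_special` are hypothetical names chosen for the sketch, and the comma between the two arguments is an assumption needed because current Rust does not let an `expr` fragment be followed directly by an identifier.

```rust
// Hypothetical enum standing in for the tutorial's `t` type.
enum T {
    SpecialA(u32),
    SpecialB(u32),
}

// Left of `=>`: the invocation syntax (an expression, a comma, an identifier).
// Right of `=>`: the transcription, a `match` that returns early on a hit.
macro_rules! early_return {
    ($inp:expr, $sp:ident) => {
        match $inp {
            T::$sp(x) => {
                return x;
            }
            _ => {}
        }
    };
}

// Hypothetical caller: returns the payload of the first "special" input,
// or 0 when neither input matches its designated variant.
fn first_special(input_1: T, input_2: T) -> u32 {
    early_return!(input_1, SpecialA);
    early_return!(input_2, SpecialB);
    0
}

fn main() {
    // The first invocation matches, so the function returns 7.
    println!("{}", first_special(T::SpecialA(7), T::SpecialB(9)));
}
```

Because the `return` is transcribed into the caller's body, it leaves `first_special` itself, which is exactly what a plain helper function could not do.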
5465

5566
# Invocation syntax
5667

57-
On the left-hand-side of the `=>` is the macro invocation syntax. It is
58-
free-form, excepting the following rules:
68+
The macro invocation syntax specifies the syntax for the arguments to the
69+
macro. It appears on the left-hand side of the `=>` in a macro definition. It
70+
conforms to the following rules:
5971

60-
1. It must be surrounded in parentheses.
72+
1. It must be surrounded by parentheses.
6173
2. `$` has special meaning.
6274
3. The `()`s, `[]`s, and `{}`s it contains must balance. For example, `([)` is
6375
forbidden.
6476

77+
Otherwise, the invocation syntax is free-form.
78+
6579
To take as an argument a fragment of Rust code, write `$` followed by a name
66-
(for use on the right-hand side), followed by a `:`, followed by the sort of
67-
fragment to match (the most common ones are `ident`, `expr`, `ty`, `pat`, and
68-
`block`). Anything not preceded by a `$` is taken literally. The standard
80+
(for use on the right-hand side), followed by a `:`, followed by a *fragment
81+
specifier*. The fragment specifier denotes the sort of fragment to match. The
82+
most common fragment specifiers are:
83+
84+
* `ident` (an identifier, referring to a variable or item. Examples: `f`, `x`,
85+
`foo`.)
86+
* `expr` (an expression. Examples: `2 + 2`; `if true { 1 } else { 2 }`;
87+
`f(42)`.)
88+
* `ty` (a type. Examples: `int`, `~[(char, ~str)]`, `&T`.)
89+
* `pat` (a pattern, usually appearing in a `match` or on the left-hand side of
90+
a declaration. Examples: `Some(t)`; `(17, 'a')`; `_`.)
91+
* `block` (a sequence of actions. Example: `{ log(error, "hi"); return 12; }`.)
92+
93+
The parser interprets any token that's not preceded by a `$` literally. Rust's usual
6994
rules of tokenization apply.
7095

71-
So `($x:ident => (($e:expr)))`, though excessively fancy, would create a macro
72-
that could be invoked like `my_macro!(i=>(( 2+2 )))`.
96+
So `($x:ident -> (($e:expr)))`, though excessively fancy, would designate a macro
97+
that could be invoked like: `my_macro!(i->(( 2+2 )))`.
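
To make the "excessively fancy" pattern concrete, the following sketch defines a macro with that invocation syntax in current Rust. The transcription (`let $x = $e;`) and the function `compute` are assumptions made for the example; the tutorial does not say what such a macro would expand to.

```rust
// Hypothetical macro: the literal tokens `->`, `(`, and `)` must appear
// in the invocation exactly as written in the pattern.
macro_rules! my_macro {
    ($x:ident -> (($e:expr))) => {
        let $x = $e;
    };
}

fn compute() -> i32 {
    // `i->((` is tokenized as `i`, `->`, `(`, `(` by the usual rules.
    my_macro!(i->(( 2+2 )));
    i
}

fn main() {
    println!("{}", compute());
}
```

Only `$x` and `$e` capture anything; every other token in the pattern is matched literally.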
7398

7499
# Transcription syntax
75100

76101
The right-hand side of the `=>` follows the same rules as the left-hand side,
77-
except that `$` need only be followed by the name of the syntactic fragment
78-
to transcribe.
102+
except that a `$` need only be followed by the name of the syntactic fragment
103+
to transcribe into the macro expansion; its type need not be repeated.
79104

80-
The right-hand side must be surrounded by delimiters of some kind, and must be
81-
an expression; currently, user-defined macros can only be invoked in
82-
expression position (even though `macro_rules!` itself can be in item
83-
position).
105+
The right-hand side must be enclosed by delimiters, and must be
106+
an expression. Currently, invocations of user-defined macros can only appear in a context
107+
where the Rust grammar requires an expression, even though `macro_rules!` itself can appear
108+
in a context where the grammar requires an item.
84109

85110
# Multiplicity
86111

87112
## Invocation
88113

89-
Going back to the motivating example, suppose that we wanted each invocation
90-
of `early_return` to potentially accept multiple "special" identifiers. The
91-
syntax `$(...)*` accepts zero or more occurrences of its contents, much like
92-
the Kleene star operator in regular expressions. It also supports a separator
93-
token (a comma-separated list could be written `$(...),*`), and `+` instead of
94-
`*` to mean "at least one".
114+
Going back to the motivating example, recall that `early_return` expanded into
115+
a `match` that would `return` if the `match`'s scrutinee matched the
116+
"special case" identifier provided as the second argument to `early_return`,
117+
and do nothing otherwise. Now suppose that we wanted to write a
118+
version of `early_return` that could handle a variable number of "special"
119+
cases.
120+
121+
The syntax `$(...)*` on the left-hand side of the `=>` in a macro definition
122+
accepts zero or more occurrences of its contents. It works much
123+
like the `*` operator in regular expressions. It also supports a
124+
separator token (a comma-separated list could be written `$(...),*`), and `+`
125+
instead of `*` to mean "at least one".
95126

96127
~~~~
97128
# enum t { special_a(uint),special_b(uint),special_c(uint),special_d(uint)};
@@ -118,37 +149,35 @@ early_return!(input_2, [special_b]);
118149
## Transcription
119150

120151
As the above example demonstrates, `$(...)*` is also valid on the right-hand
121-
side of a macro definition. The behavior of Kleene star in transcription,
122-
especially in cases where multiple stars are nested, and multiple different
152+
side of a macro definition. The behavior of `*` in transcription,
153+
especially in cases where multiple `*`s are nested, and multiple different
123154
names are involved, can seem somewhat magical and unintuitive at first. The
124155
system that interprets them is called "Macro By Example". The two rules to
125156
keep in mind are (1) the behavior of `$(...)*` is to walk through one "layer"
126157
of repetitions for all of the `$name`s it contains in lockstep, and (2) each
127158
`$name` must be under at least as many `$(...)*`s as it was matched against.
128-
If it is under more, it'll will be repeated, as appropriate.
159+
If it is under more, it'll be repeated, as appropriate.
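
The two rules can be sketched in current Rust syntax as follows. The macros `pairs` and `add_to_each` and the functions around them are hypothetical, written only to illustrate the behavior described above.

```rust
// Rule (1): `$name` and `$value` were matched under the same `$(...),*`,
// so the transcription walks through them in lockstep.
macro_rules! pairs {
    ($($name:ident = $value:expr),*) => {
        vec![$((stringify!($name), $value)),*]
    };
}

// Rule (2): `$base` was matched under zero repetitions but is transcribed
// under one, so it is repeated once for every `$x`.
macro_rules! add_to_each {
    ($base:expr; $($x:expr),*) => {
        vec![$($base + $x),*]
    };
}

fn build_pairs() -> Vec<(&'static str, i32)> {
    pairs!(a = 1, b = 2, c = 3)
}

fn build_sums() -> Vec<i32> {
    add_to_each!(10; 1, 2, 3)
}

fn main() {
    println!("{:?}", build_pairs());
    println!("{:?}", build_sums());
}
```

Going the other way, transcribing `$x` outside any `$(...)*` in `add_to_each`, is rejected by the compiler, since `$x` would then be under fewer repetitions than it was matched against.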
129160

130161
## Parsing limitations
131162

132-
The parser used by the macro system is reasonably powerful, but the parsing of
133-
Rust syntax is restricted in two ways:
163+
The macro parser will parse Rust syntax with two limitations:
134164

135165
1. The parser will always parse as much as possible. For example, if the comma
136166
were omitted from the syntax of `early_return!` above, `input_1 [` would've
137167
been interpreted as the beginning of an array index. In fact, invoking the
138168
macro would have been impossible.
139169
2. The parser must have eliminated all ambiguity by the time it reaches a
140-
`$name:fragment_specifier`. This most often affects them when they occur in
141-
the beginning of, or immediately after, a `$(...)*`; requiring a distinctive
170+
`$name:fragment_specifier` declaration. This limitation can result in parse
171+
errors when declarations occur at the beginning of, or immediately after,
172+
a `$(...)*`. Changing the invocation syntax to require a distinctive
142173
token in front can solve the problem.
143174

144175
## A final note
145176

146177
Macros, as currently implemented, are not for the faint of heart. Even
147-
ordinary syntax errors can be more difficult to debug when they occur inside
148-
a macro, and errors caused by parse problems in generated code can be very
178+
ordinary syntax errors can be more difficult to debug when they occur inside a
179+
macro, and errors caused by parse problems in generated code can be very
149180
tricky. Invoking the `log_syntax!` macro can help elucidate intermediate
150-
states, using `trace_macros!(true)` will automatically print those
151-
intermediate states out, and using `--pretty expanded` as an argument to the
152-
compiler will show the result of expansion.
153-
154-
181+
states; invoking `trace_macros!(true)` will automatically print those
182+
intermediate states out, and passing the flag `--pretty expanded` as a
183+
command-line argument to the compiler will show the result of expansion.
