Commit a1cf766

sprinkle around a bunch of links
1 parent f05ff9c commit a1cf766

1 file changed: +85 -37 lines changed

src/macro-expansion.md

Lines changed: 85 additions & 37 deletions
Original file line number | Diff line number | Diff line change
@@ -55,15 +55,19 @@ iteration, this represents a compile error. Here is the [algorithm][original]:
5555
1. Repeat until `queue` is empty (or we make no progress, which is an error):
5656
0. [Resolve](./name-resolution.md) imports in our partially built crate as
5757
much as possible.
58-
1. Collect as many macro invocations as possible from our partially built
59-
crate (fn-like, attributes, derives) and add them to the queue.
58+
1. Collect as many macro [`Invocation`s][inv] as possible from our
59+
partially built crate (fn-like, attributes, derives) and add them to the
60+
queue.
6061
2. Dequeue the first element, and attempt to resolve it.
6162
3. If it's resolved:
62-
0. Run the macro's expander function that consumes tokens or AST and
63-
produces tokens or AST (depending on the macro kind).
63+
0. Run the macro's expander function that consumes a [`TokenStream`] or
64+
AST and produces a [`TokenStream`] or [`AstFragment`] (depending on
65+
the macro kind). (A `TokenStream` is a collection of [`TokenTree`s][`TokenTree`],
66+
each of which is a token (punctuation, identifier, or literal) or a
67+
delimited group (anything inside `()`/`[]`/`{}`)).
6468
- At this point, we know everything about the macro itself and can
65-
call `set_expn_data` to fill in its properties in the global data
66-
-- that is the hygiene data associated with `ExpnId`. (See [the
69+
call `set_expn_data` to fill in its properties in the global data;
70+
that is the hygiene data associated with `ExpnId`. (See [the
6771
"Hygiene" section below][hybelow]).
6872
1. Integrate that piece of AST into the big existing partially built
6973
AST. This is essentially where the "token-like mass" becomes a
@@ -94,6 +98,10 @@ iteration, this represents a compile error. Here is the [algorithm][original]:
9498
[`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html
9599
[`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html
96100
[hybelow]: #hygiene-and-heirarchies
101+
[`TokenTree`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/enum.TokenTree.html
102+
[`TokenStream`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
103+
[inv]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.Invocation.html
104+
[`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
97105

98106
If we make no progress in an iteration, then we have reached a compilation
99107
error (e.g. an undefined macro). We attempt to recover from failures
@@ -110,6 +118,27 @@ macro names in the above algorithm. However, we don't try to resolve other
110118
names yet. This happens later, as we will see in the [next
111119
chapter](./name-resolution.md).
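
The queue-driven loop described by the algorithm can be illustrated with a small toy model. This is only a sketch under loud assumptions: `ToyMacro` and `expand_all` are hypothetical names invented here, macro invocations are reduced to plain strings, and "resolution" is a map lookup. It is not how `rustc_expand` is implemented; it only shows the "retry unresolved invocations until the queue is empty or no progress is made" shape.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for a macro definition: expanding it yields more
/// invocations and may reveal new macro definitions.
#[derive(Clone, Default)]
pub struct ToyMacro {
    pub emits_invocations: Vec<String>,
    pub defines: Vec<(String, ToyMacro)>,
}

/// Expands everything reachable from `initial`.
/// Returns the order in which invocations were expanded, or the names that
/// were still unresolved when an iteration made no progress.
pub fn expand_all(
    mut known: HashMap<String, ToyMacro>,
    initial: Vec<String>,
) -> Result<Vec<String>, Vec<String>> {
    let mut queue: VecDeque<String> = initial.into();
    let mut expanded = Vec::new();

    while !queue.is_empty() {
        let mut progress = false;
        // One pass over the invocations that were queued at the start of
        // this iteration; newly produced ones wait for the next pass.
        for _ in 0..queue.len() {
            let name = queue.pop_front().expect("length checked above");
            match known.get(&name).cloned() {
                Some(mac) => {
                    // Resolved: "expand" it and integrate its output.
                    for (new_name, new_mac) in mac.defines {
                        known.insert(new_name, new_mac);
                    }
                    queue.extend(mac.emits_invocations);
                    expanded.push(name);
                    progress = true;
                }
                // Not resolvable yet: put it back and retry later.
                None => queue.push_back(name),
            }
        }
        if !progress {
            // Nothing could be resolved in a whole pass: report an error.
            return Err(queue.into_iter().collect());
        }
    }
    Ok(expanded)
}
```

In this model, an invocation of `inner` queued before the macro that defines it is simply retried on a later pass, while a name that never becomes resolvable ends the loop with an error.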
112120

121+
Here are some other notable data structures involved in expansion and integration:
122+
- [`Resolver`] - a trait used to break crate dependencies. This allows the resolver services to be used in [`rustc_ast`], despite [`rustc_resolve`] and pretty much everything else depending on [`rustc_ast`].
123+
- [`ExtCtxt`]/[`ExpansionData`] - various intermediate data kept and used by expansion
124+
infrastructure in the process of its work
125+
- [`Annotatable`] - a piece of AST that can be an attribute target, almost the same
126+
thing as `AstFragment`, except for types and patterns that can be produced by
127+
macros but cannot be annotated with attributes
128+
- [`MacResult`] - a "polymorphic" AST fragment, something that can turn into a
129+
different `AstFragment` depending on its [`AstFragmentKind`] - item,
130+
or expression, or pattern etc.
131+
132+
[`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
133+
[`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
134+
[`Resolver`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.Resolver.html
135+
[`ExtCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExtCtxt.html
136+
[`ExpansionData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExpansionData.html
137+
[`Annotatable`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.Annotatable.html
138+
[`MacResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MacResult.html
139+
[`AstFragmentKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragmentKind.html
140+
141+
113142
## Hygiene and Heirarchies
114143

115144
If you have ever used C/C++ preprocessor macros, you know that there are some
@@ -167,6 +196,10 @@ The context is attached to AST nodes. All AST nodes generated by macros have
167196
context attached. Additionally, there may be other nodes that have context
168197
attached, such as some desugared syntax (non-macro-expanded nodes are
169198
considered to just have the "root" context, as described below).
199+
Throughout the compiler, we use [`Span`s][span] to refer to code locations.
200+
This struct also has hygiene information attached to it, as we will see later.
201+
202+
[span]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/struct.Span.html
170203
171204
Because macro invocations and definitions can be nested, the syntax context of
172205
a node must be a hierarchy. For example, if we expand a macro and there is
@@ -184,24 +217,33 @@ an integer ID, assigned continuously starting from 0 as we discover new macro
184217
calls. All hierarchies start at [`ExpnId::root()`][rootid], which is its own
185218
parent.
186219
187-
The actual heirarchies are stored in [`HygieneData`][hd], and all of the
188-
hygiene-related algorithms are implemented in [`rustc_span::hygiene`][hy], with
189-
the exception of some hacks [`Resolver::resolve_crate_root`][hacks].
220+
All of the hygiene-related algorithms are implemented in
221+
[`rustc_span::hygiene`][hy], with the exception of some hacks
222+
in [`Resolver::resolve_crate_root`][hacks].
223+
224+
The actual hierarchies are stored in [`HygieneData`][hd]. This is a global
225+
piece of data containing hygiene and expansion info that can be accessed from
226+
any [`Ident`] without any context.
227+
190228
191229
[`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
192230
[rootid]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html#method.root
193231
[hd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.HygieneData.html
194232
[hy]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
195233
[hacks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate_root
234+
[`Ident`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Ident.html
196235
197236
### The Expansion Order Heirarchy
198237
199238
The first hierarchy tracks the order of expansions, i.e., when a macro
200239
invocation is in the output of another macro.
201240
202-
Here, the children in the heirarchy will be the "innermost" tokens.
241+
Here, the children in the hierarchy will be the "innermost" tokens. The
242+
[`ExpnData`] struct itself contains a subset of properties from both macro
243+
definition and macro call available through global data.
203244
[`ExpnData::parent`][edp] tracks the child -> parent link in this hierarchy.
204245
246+
[`ExpnData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html
205247
[edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent
206248
207249
For example,
@@ -226,11 +268,20 @@ The second hierarchy tracks the order of macro definitions, i.e., when we are
226268
expanding one macro, another macro definition is revealed in its output. This
227269
one is a bit tricky and more complex than the other two hierarchies.
228270

229-
Here, [`SyntaxContextData::parent`][scdp] is the child -> parent link here.
230-
[`SyntaxContext`][sc] is the whole chain in this hierarchy, and
231-
[`SyntaxContextData::outer_expns`][scdoe] are individual elements in the chain.
232-
The "chaining operator" is [`SyntaxContext::apply_mark`][am] in compiler code.
271+
[`SyntaxContext`][sc] represents a whole chain in this hierarchy via an ID.
272+
[`SyntaxContextData`][scd] contains data associated with the given
273+
`SyntaxContext`; mostly it is a cache for results of filtering that chain in
274+
different ways. [`SyntaxContextData::parent`][scdp] is the child -> parent
275+
link here, and [`SyntaxContextData::outer_expns`][scdoe] are individual
276+
elements in the chain. The "chaining operator" is
277+
[`SyntaxContext::apply_mark`][am] in compiler code.
278+
279+
A [`Span`][span], mentioned above, is actually just a compact representation of
280+
a code location and `SyntaxContext`. Likewise, an [`Ident`] is just an interned
281+
[`Symbol`] + `Span` (i.e. an interned string + hygiene data).
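
The relationship between these types can be sketched with simplified stand-ins. Everything below is an assumption made for illustration: the real `Span` is a compact encoded representation and the real `Symbol` uses the compiler's interner, whereas here `Span` is a plain struct, `Interner` is a hypothetical toy, and `same_binding_name` is an invented helper showing why hygiene data travels with the identifier.

```rust
use std::collections::HashMap;

/// Toy interned string: an index into `Interner::names`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Symbol(pub u32);

/// Toy ID of a chain of nested macro definitions.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SyntaxContext(pub u32);

/// Toy span: a code location plus a `SyntaxContext`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub lo: u32,
    pub hi: u32,
    pub ctxt: SyntaxContext,
}

/// Toy identifier: an interned string plus a span (and thus hygiene data).
#[derive(Clone, Copy, Debug)]
pub struct Ident {
    pub name: Symbol,
    pub span: Span,
}

/// Hypothetical minimal string interner.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, Symbol>,
    names: Vec<String>,
}

impl Interner {
    /// Returns the existing symbol for `s`, or allocates a new one.
    pub fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.ids.get(s) {
            return sym;
        }
        let sym = Symbol(self.names.len() as u32);
        self.names.push(s.to_string());
        self.ids.insert(s.to_string(), sym);
        sym
    }

    /// Looks up the string behind a symbol.
    pub fn resolve(&self, sym: Symbol) -> &str {
        &self.names[sym.0 as usize]
    }
}

/// Two identifiers name the same thing only if both the string and the
/// syntax context agree; the source location itself is ignored.
pub fn same_binding_name(a: Ident, b: Ident) -> bool {
    a.name == b.name && a.span.ctxt == b.span.ctxt
}
```

With this shape, an `x` written by the user and an `x` introduced by a macro expansion share a `Symbol` but differ in `SyntaxContext`, so they are kept apart without renaming either one.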
233282

283+
[`Symbol`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Symbol.html
284+
[scd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html
234285
[scdp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.parent
235286
[sc]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
236287
[scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
@@ -323,6 +374,24 @@ There are two types of macros in Rust:
323374
Rust parser will set aside the contents of macros and their invocations. Later,
324375
macros are expanded using these portions of the code.
325376

377+
Some important data structures/interfaces here:
378+
- [`SyntaxExtension`] - a lowered macro representation, contains its expander
379+
function, which transforms a `TokenStream` or AST into another `TokenStream`
380+
or AST + some additional data like stability, or a list of unstable features
381+
allowed inside the macro.
382+
- [`SyntaxExtensionKind`] - expander functions may have several different
383+
signatures (take one token stream, or two, or a piece of AST, etc). This is
384+
an enum that lists them.
385+
- [`ProcMacro`]/[`TTMacroExpander`]/[`AttrProcMacro`]/[`MultiItemModifier`] -
386+
traits representing the expander function signatures.
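
To make "several different signatures" concrete, here is a toy enum with two expander shapes. This is a sketch, not the real `SyntaxExtensionKind`: the names `ToyExtensionKind`, `expand_bang`, `expand_attr`, and `brace_wrap` are hypothetical, the token types are simplified stand-ins, and the real variants hold trait objects such as the ones listed above rather than plain closures.

```rust
/// Toy token tree: a single token or a delimited group.
#[derive(Clone, Debug, PartialEq)]
pub enum TokenTree {
    Token(String),
    /// Opening delimiter (`(`, `[`, or `{`) and the trees inside it.
    Delimited(char, Vec<TokenTree>),
}

/// Toy token stream: a collection of token trees.
pub type TokenStream = Vec<TokenTree>;

/// Hypothetical enum listing expander signatures.
pub enum ToyExtensionKind {
    /// Function-like macro: one token stream in, one out.
    Bang(Box<dyn Fn(TokenStream) -> TokenStream>),
    /// Attribute macro: attribute arguments plus the annotated item.
    Attr(Box<dyn Fn(TokenStream, TokenStream) -> TokenStream>),
}

impl ToyExtensionKind {
    /// Runs a function-like expander; `None` if the kind does not match.
    pub fn expand_bang(&self, input: TokenStream) -> Option<TokenStream> {
        match self {
            ToyExtensionKind::Bang(f) => Some(f(input)),
            ToyExtensionKind::Attr(_) => None,
        }
    }

    /// Runs an attribute expander; `None` if the kind does not match.
    pub fn expand_attr(&self, args: TokenStream, item: TokenStream) -> Option<TokenStream> {
        match self {
            ToyExtensionKind::Attr(f) => Some(f(args, item)),
            ToyExtensionKind::Bang(_) => None,
        }
    }
}

/// Example expander: wraps its input in a `{ ... }` group.
pub fn brace_wrap(input: TokenStream) -> TokenStream {
    vec![TokenTree::Delimited('{', input)]
}
```

The caller has to know which kind it holds before invoking the expander, which is the role the enum plays in this sketch.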
387+
388+
[`SyntaxExtension`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.SyntaxExtension.html
389+
[`SyntaxExtensionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.SyntaxExtensionKind.html
390+
[`ProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.ProcMacro.html
391+
[`TTMacroExpander`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.TTMacroExpander.html
392+
[`AttrProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.AttrProcMacro.html
393+
[`MultiItemModifier`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MultiItemModifier.html
394+
326395
## Macros By Example
327396

328397
MBEs have their own parser distinct from the normal Rust parser. When macros
@@ -492,11 +561,10 @@ Custom derives are a special type of proc macro.
492561

493562
TODO: more?
494563

495-
## Notes from petrochenkov discussion
564+
## Important Modules and Data Structures
496565

497-
TODO: sprinkle these links around the chapter...
566+
TODO: sprinkle these throughout the chapter as much as possible...
498567

499-
Where to find the code:
500568
- librustc_span/hygiene.rs - structures related to hygiene and expansion that are kept in global data (can be accessed from any Ident without any context)
501569
- librustc_span/lib.rs - some secondary methods like macro backtrace using primary methods from hygiene.rs
502570
- librustc_builtin_macros - implementations of built-in macros (including macro attributes and derives) and some other early code generation facilities like injection of standard library imports or generation of test harness.
@@ -511,23 +579,3 @@ Where to find the code:
511579
- librustc_ast/ext/tt - implementation of macro_rules, turns macro_rules DSL into something with signature Fn(TokenStream) -> TokenStream that can eat and produce tokens, @mark-i-m knows more about this
512580
- librustc_resolve/macros.rs - resolving macro paths, validating those resolutions, reporting various "not found"/"found, but it's unstable"/"expected x, found y" errors
513581
- librustc_middle/hir/map/def_collector.rs + librustc_resolve/build_reduced_graph.rs - integrate an AST fragment freshly expanded from a macro into various parent/child structures like module hierarchy or "definition paths"
514-
515-
Primary structures:
516-
- HygieneData - global piece of data containing hygiene and expansion info that can be accessed from any Ident without any context
517-
- ExpnId - ID of a macro call or desugaring (and also expansion of that call/desugaring, depending on context)
518-
- ExpnInfo/InternalExpnData - a subset of properties from both macro definition and macro call available through global data
519-
- SyntaxContext - ID of a chain of nested macro definitions (identified by ExpnIds)
520-
- SyntaxContextData - data associated with the given SyntaxContext, mostly a cache for results of filtering that chain in different ways
521-
- Span - a code location + SyntaxContext
522-
- Ident - interned string (Symbol) + Span, i.e. a string with attached hygiene data
523-
- TokenStream - a collection of TokenTrees
524-
- TokenTree - a token (punctuation, identifier, or literal) or a delimited group (anything inside ()/[]/{})
525-
- SyntaxExtension - a lowered macro representation, contains its expander function transforming a tokenstream or AST into tokenstream or AST + some additional data like stability, or a list of unstable features allowed inside the macro.
526-
- SyntaxExtensionKind - expander functions may have several different signatures (take one token stream, or two, or a piece of AST, etc), this is an enum that lists them
527-
- ProcMacro/TTMacroExpander/AttrProcMacro/MultiItemModifier - traits representing the expander signatures (TODO: change and rename the signatures into something more consistent)
528-
- Resolver - a trait used to break crate dependencies (so resolver services can be used in librustc_ast, despite librustc_resolve and pretty much everything else depending on librustc_ast)
529-
- ExtCtxt/ExpansionData - various intermediate data kept and used by expansion infra in the process of its work
530-
- AstFragment - a piece of AST that can be produced by a macro (may include multiple homogeneous AST nodes, like e.g. a list of items)
531-
- Annotatable - a piece of AST that can be an attribute target, almost same thing as AstFragment except for types and patterns that can be produced by macros but cannot be annotated with attributes (TODO: Merge into AstFragment)
532-
- MacResult - a "polymorphic" AST fragment, something that can turn into a different AstFragment depending on its context (aka AstFragmentKind - item, or expression, or pattern etc.)
533-
- Invocation/InvocationKind - a structure describing a macro call, these structures are collected by the expansion infra (InvocationCollector), queued, resolved, expanded when resolved, etc.
