Commit f05ff9c

committed
expand notes on expansion heirarchies
1 parent ba8620f commit f05ff9c

1 file changed

+128
-57
lines changed

src/macro-expansion.md

Lines changed: 128 additions & 57 deletions
@@ -163,82 +163,153 @@ only within the macro (i.e. it should not be visible outside the macro).
163163
[code_parse_int]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser/fn.parse_tt.html
164164
[parsing]: ./the-parser.html
165165
166-
TODO: expand these notes
166+
The context is attached to AST nodes. All AST nodes generated by macros have
167+
context attached. Additionally, there may be other nodes that have context
168+
attached, such as some desugared syntax (non-macro-expanded nodes are
169+
considered to just have the "root" context, as described below).
167170
168-
- Many AST nodes have some sort of syntax context, especially nodes from macros.
169-
- When we ask what is the syntax context of a node, the answer actually differs by what we are trying to do. Thus, we don't just keep track of a single context. There are in fact 3 different types of context used for different things.
170-
- Each type of context is tracked by an "expansion hierarchy". As we expand macros, new macro calls or macro definitions may be generated, leading to some nesting. This nesting is where the hierarchies come from. Each hierarchy tracks some different aspect, though, as we will see.
171-
- There are 3 expansion hierarchies
172-
- All macros receive an integer ID assigned continuously starting from 0 as we discover new macro calls
173-
- This is used as the `expn_id` where needed.
174-
- All hierarchies start at ExpnId::root, which is its own parent
175-
- The context of a node consists of a chain of expansions leading to `ExpnId::root`. A non-macro-expanded node has syntax context 0 (`SyntaxContext::empty()`) which represents just the root node.
176-
- There are vectors in `HygieneData` that contain expansion info.
177-
- There are entries here for both `SyntaxContext::empty()` and `ExpnId::root`, but they aren't used much.
171+
Because macro invocations and definitions can be nested, the syntax context of
172+
a node must be a hierarchy. For example, if we expand a macro and there is
173+
another macro invocation or definition in the generated output, then the syntax
174+
context should reflect the nesting.
178175
179-
1. Tracks expansion order: when a macro invocation is in the output of another macro.
180-
...
181-
expn_id2
182-
expn_id1
183-
InternalExpnData::parent is the child->parent link. That is the expn_id1 points to expn_id2 points to ...
176+
However, it turns out that there are actually a few types of context we may
177+
want to track for different purposes. Thus, there are not just one but _three_
178+
expansion hierarchies that together comprise the hygiene information for a
179+
crate.
184180
185-
Ex:
186-
macro_rules! foo { () => { println!(); } }
187-
fn main() { foo!(); }
181+
All of these hierarchies need some sort of "macro ID" to identify individual
182+
elements in the chain of expansions. This ID is [`ExpnId`]. All macros receive
183+
an integer ID, assigned continuously starting from 0 as we discover new macro
184+
calls. All hierarchies start at [`ExpnId::root()`][rootid], which is its own
185+
parent.
188186
189-
// Then AST nodes that are finally generated would have parent(expn_id_println) -> parent(expn_id_foo), right?
187+
The actual hierarchies are stored in [`HygieneData`][hd], and all of the
188+
hygiene-related algorithms are implemented in [`rustc_span::hygiene`][hy], with
189+
the exception of some hacks in [`Resolver::resolve_crate_root`][hacks].
190190
191-
2. Tracks macro definitions: when we are expanding one macro another macro definition is revealed in its output.
192-
...
193-
SyntaxContext2
194-
SyntaxContext1
195-
SyntaxContextData::parent is the child->parent link here.
196-
SyntaxContext is the whole chain in this hierarchy, and SyntaxContextData::outer_expns are individual elements in the chain.
191+
[`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
192+
[rootid]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html#method.root
193+
[hd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.HygieneData.html
194+
[hy]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
195+
[hacks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate_root
197196
198-
- For built-in macros (e.g. `line!()`) or stable proc macros: tokens produced by the macro are given the context `SyntaxContext::empty().apply_mark(expn_id)`
199-
- Such macros are considered to have been defined at the root.
200-
- For proc macros this is because they are always cross-crate and we don't have cross-crate hygiene implemented.
197+
### The Expansion Order Hierarchy
201198
202-
The second hierarchy has the context transplantation hack. See https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732.
199+
The first hierarchy tracks the order of expansions, i.e., when a macro
200+
invocation is in the output of another macro.
203201
204-
If the token had context X before being produced by a macro then after being produced by the macro it has context X -> macro_id.
202+
Here, the children in the hierarchy will be the "innermost" tokens.
203+
[`ExpnData::parent`][edp] tracks the child -> parent link in this hierarchy.
205204
206-
Ex:
207-
```rust
208-
macro m() { ident }
209-
```
205+
[edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent
210206
211-
Here `ident` originally has context SyntaxContext::root(). `ident` has context ROOT -> id(m) after it's produced by m.
212-
The "chaining operator" is `apply_mark` in compiler code.
207+
For example,
213208
214-
Ex:
209+
```rust,ignore
210+
macro_rules! foo { () => { println!(); } }
211+
212+
fn main() { foo!(); }
213+
```
214+
215+
In this code, the AST nodes that are finally generated would have the hierarchy:
216+
217+
```
218+
root
219+
    expn_id_foo
220+
        expn_id_println
221+
```
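
The parent links described above can be sketched with a small toy model. This is only an illustration: `ToyExpnId` and `ToyExpansions` are hypothetical names, not rustc types, and placing the root at index 0 is an assumption of the sketch. It shows IDs handed out in discovery order, a root that is its own parent, and a walk along child -> parent links for the `foo!`/`println!` example.

```rust
/// Toy stand-in for an expansion ID (hypothetical, not rustc's `ExpnId`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyExpnId(u32);

impl ToyExpnId {
    /// Assumption of this sketch: the root occupies ID 0.
    const ROOT: ToyExpnId = ToyExpnId(0);
}

/// Hypothetical table: entry `i` stores the parent of expansion `i`.
struct ToyExpansions {
    parents: Vec<ToyExpnId>,
}

impl ToyExpansions {
    fn new() -> Self {
        // The root is its own parent.
        ToyExpansions { parents: vec![ToyExpnId::ROOT] }
    }

    /// Hands out the next integer ID and records its parent.
    fn fresh(&mut self, parent: ToyExpnId) -> ToyExpnId {
        let id = ToyExpnId(self.parents.len() as u32);
        self.parents.push(parent);
        id
    }

    fn parent(&self, id: ToyExpnId) -> ToyExpnId {
        self.parents[id.0 as usize]
    }

    /// Follows child -> parent links up to the root, then returns the
    /// chain root-first.
    fn chain_from_root(&self, mut id: ToyExpnId) -> Vec<ToyExpnId> {
        let mut chain = vec![id];
        while id != ToyExpnId::ROOT {
            id = self.parent(id);
            chain.push(id);
        }
        chain.reverse();
        chain
    }
}

fn main() {
    let mut table = ToyExpansions::new();
    // `foo!()` is written in the crate source, so its parent is the root.
    let foo_id = table.fresh(ToyExpnId::ROOT);
    // `println!()` appears in the output of `foo!`, so its parent is `foo`.
    let println_id = table.fresh(foo_id);

    assert_eq!(table.parent(ToyExpnId::ROOT), ToyExpnId::ROOT);
    assert_eq!(
        table.chain_from_root(println_id),
        vec![ToyExpnId::ROOT, foo_id, println_id]
    );
}
```

Storing only the parent per expansion keeps each entry small; the full chain is recovered on demand by walking the links.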
222+
223+
### The Macro Definition Hierarchy
224+
225+
The second hierarchy tracks the order of macro definitions, i.e., when we are
226+
expanding one macro and another macro definition is revealed in its output. This
227+
one is a bit tricky and more complex than the other two hierarchies.
228+
229+
Here, [`SyntaxContextData::parent`][scdp] is the child -> parent link.
230+
[`SyntaxContext`][sc] is the whole chain in this hierarchy, and
231+
[`SyntaxContextData::outer_expns`][scdoe] are individual elements in the chain.
232+
The "chaining operator" is [`SyntaxContext::apply_mark`][am] in compiler code.
233+
234+
[scdp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.parent
235+
[sc]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
236+
[scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
237+
[am]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.apply_mark
238+
239+
For built-in macros, we use the context:
240+
`SyntaxContext::empty().apply_mark(expn_id)`, and such macros are considered to
241+
be defined at the hierarchy root. We do the same for proc-macros because we
242+
haven't implemented cross-crate hygiene yet.
243+
244+
If the token had context `X` before being produced by a macro then after being
245+
produced by the macro it has context `X -> macro_id`. Here are some examples:
246+
247+
Example 0:
248+
249+
```rust,ignore
250+
macro m() { ident }
251+
252+
m!();
253+
```
254+
255+
Here `ident` originally has context [`SyntaxContext::root()`][scr]. `ident` has
256+
context `ROOT -> id(m)` after it's produced by `m`.
257+
258+
[scr]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.root
215259

216-
```rust
217-
macro m() { macro n() { ident } }
218-
```
219-
In this example the ident has context ROOT originally, then ROOT -> id(m), then ROOT -> id(m) -> id(n).
220260

221-
Note that these chains are not entirely determined by their last element, in other words ExpnId is not isomorphic to SyntaxCtxt.
261+
Example 1:
222262

223-
Ex:
224-
```rust
225-
macro m($i: ident) { macro n() { ($i, bar) } }
263+
```rust,ignore
264+
macro m() { macro n() { ident } }
265+
266+
m!();
267+
n!();
268+
```
269+
In this example the `ident` has context `ROOT` originally, then `ROOT -> id(m)`
270+
after the first expansion, then `ROOT -> id(m) -> id(n)`.
271+
272+
Example 2:
273+
274+
Note that these chains are not entirely determined by their last element, in
275+
other words `ExpnId` is not isomorphic to `SyntaxContext`.
276+
277+
```rust,ignore
278+
macro m($i: ident) { macro n() { ($i, bar) } }
279+
280+
m!(foo);
281+
```
282+
283+
After all expansions, `foo` has context `ROOT -> id(n)` and `bar` has context
284+
`ROOT -> id(m) -> id(n)`.
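
The chains in the examples above can be mimicked with a toy model. This is a sketch under stated assumptions: `ToyCtxt` is a hypothetical type that stores the whole chain of marks as a vector of macro names, whereas the text describes rustc as storing child -> parent links. Its `apply_mark` only imitates the "chaining operator" and is not the compiler's implementation.

```rust
/// Toy syntax context (hypothetical): the marks applied so far, root first.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ToyCtxt(Vec<&'static str>);

impl ToyCtxt {
    /// The empty chain, standing in for the root context.
    fn root() -> Self {
        ToyCtxt(Vec::new())
    }

    /// Toy chaining operator: turns context `X` into `X -> mark`.
    fn apply_mark(&self, mark: &'static str) -> Self {
        let mut marks = self.0.clone();
        marks.push(mark);
        ToyCtxt(marks)
    }

    /// The last element of the chain, if any.
    fn outer_mark(&self) -> Option<&'static str> {
        self.0.last().copied()
    }

    /// Renders the chain in the `ROOT -> id(..)` notation used in the text.
    fn render(&self) -> String {
        let mut out = String::from("ROOT");
        for mark in &self.0 {
            out.push_str(" -> id(");
            out.push_str(mark);
            out.push(')');
        }
        out
    }
}

fn main() {
    // Example 2: the text gives `bar` the chain ROOT -> id(m) -> id(n)
    // and `foo` the chain ROOT -> id(n).
    let bar = ToyCtxt::root().apply_mark("m").apply_mark("n");
    let foo = ToyCtxt::root().apply_mark("n");

    assert_eq!(bar.render(), "ROOT -> id(m) -> id(n)");
    assert_eq!(foo.render(), "ROOT -> id(n)");

    // Same last element, different chains: the last mark alone does not
    // determine the context.
    assert_eq!(foo.outer_mark(), bar.outer_mark());
    assert_ne!(foo, bar);
}
```

The final two assertions restate the note in Example 2: two contexts can end in the same expansion and still be different contexts.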
226285

227-
m!(foo);
228-
```
286+
Finally, one last thing to mention is that currently, this hierarchy is subject
287+
to the ["context transplantation hack"][hack]. Basically, the more modern (and
288+
experimental) `macro` macros have stronger hygiene than the older MBE system,
289+
but this can result in weird interactions between the two. The hack is intended
290+
to make things "just work" for now.
229291

230-
After all expansions, foo has context ROOT -> id(n) and bar has context ROOT -> id(m) -> id(n)
292+
[hack]: https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732
231293

232-
3. Call-site: tracks the location of the macro invocation.
233-
Ex:
234-
If foo!(bar!(ident)) expands into ident
235-
then hierarchy 1 is root -> foo -> bar -> ident
236-
but hierarchy 3 is root -> ident
294+
### The Call-site Hierarchy
237295

238-
ExpnInfo::call_site is the child-parent link in this case.
296+
The third and final hierarchy tracks the location of macro invocations.
297+
298+
In this hierarchy [`ExpnData::call_site`][callsite] is the child -> parent link.
299+
300+
[callsite]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.call_site
301+
302+
Here is an example:
303+
304+
```rust,ignore
305+
macro bar($i: ident) { $i }
306+
macro foo($i: ident) { $i }
307+
308+
foo!(bar!(baz));
309+
```
239310

240-
- Hygiene-related algorithms are entirely in hygiene.rs
241-
- Some hacks in `resolve_crate_root`, though.
311+
For the `baz` AST node in the final output, the first hierarchy is `ROOT ->
312+
id(foo) -> id(bar) -> baz`, while the third hierarchy is `ROOT -> baz`.
242313

243314
## Producing Macro Output
244315
