Commit bc9c0e6

SPRINKLE ALL THE THINGS
1 parent a1cf766 commit bc9c0e6

2 files changed: +57 -34 lines changed

src/macro-expansion.md

Lines changed: 55 additions & 33 deletions
@@ -4,14 +4,27 @@
 > refactoring, so some of the links in this chapter may be broken.

 Rust has a very powerful macro system. In the previous chapter, we saw how the
-parser sets aside macros to be expanded. This chapter is about the process of
-expanding those macros iteratively until we have a complete AST for our crate
-with no unexpanded macros (or a compile error).
+parser sets aside macros to be expanded (it temporarily uses [placeholders]).
+This chapter is about the process of expanding those macros iteratively until
+we have a complete AST for our crate with no unexpanded macros (or a compile
+error).
+
+[placeholders]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/placeholders/index.html

 First, we will discuss the algorithm that expands and integrates macro output
 into ASTs. Next, we will take a look at how hygiene data is collected. Finally,
 we will look at the specifics of expanding different types of macros.

+Many of the algorithms and data structures described below are in [`rustc_expand`],
+with basic data structures in [`rustc_expand::base`][base].
+
+Also of note, `cfg` and `cfg_attr` are treated specially from other macros, and are
+handled in [`rustc_expand::config`][cfg].
+
+[`rustc_expand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/index.html
+[base]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/index.html
+[cfg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/config/index.html
+
 ## Expansion and AST Integration

 First of all, expansion happens at the crate level. Given a raw source code for
@@ -24,10 +37,7 @@ method on a whole crate. If it is not run on a full crate, it means we are
 doing _eager macro expansion_. Eager expansion means that we expand the
 arguments of a macro invocation before the macro invocation itself. This is
 implemented only for a few special built-in macros that expect literals (it's
-not a generally available feature of Rust). Eager expansion generally performs
-a subset of the things that lazy (normal) expansion does, so we will focus on
-lazy expansion for the rest of this chapter.
-
+not a generally available feature of Rust).
 As an example, consider the following:

 ```rust,ignore
@@ -40,7 +50,16 @@ foo!(bar!(baz));
 A lazy expansion would expand `foo!` first. An eager expansion would expand
 `bar!` first. Implementing eager expansion more generally would be challenging,
 but we implement it for a few special built-in macros for the sake of user
-experience.
+experience. The built-in macros are implemented in [`rustc_builtin_macros`],
+along with some other early code generation facilities like injection of
+standard library imports or generation of test harness. There are some
+additional helpers for building their AST fragments in
+[`rustc_expand::build`][reb]. Eager expansion generally performs a subset of
+the things that lazy (normal) expansion does, so we will focus on lazy
+expansion for the rest of this chapter.
+
+[`rustc_builtin_macros`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_builtin_macros/index.html
+[reb]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/build/index.html

 At a high level, [`fully_expand_fragment`][fef] works in iterations. We keep a
 queue of unresolved macro invocations (that is, macros we haven't found the
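
The lazy/eager distinction in the hunk above can be illustrated with a small toy model. This is only a sketch, not how `rustc_expand` is structured: `Invocation`, `lazy_order`, `eager_order`, and `example` are hypothetical names, and the model assumes that `foo!` passes its argument through unchanged, so `bar!` is still present after `foo!` has been expanded.

```rust
/// Toy model of a macro invocation: a macro name plus any macro invocations
/// that appear in its arguments. Hypothetical; not a rustc type.
struct Invocation {
    name: &'static str,
    args: Vec<Invocation>,
}

/// Lazy (normal) order: expand the outer invocation first, then the
/// invocations that were nested in its arguments.
fn lazy_order(inv: &Invocation, order: &mut Vec<&'static str>) {
    order.push(inv.name);
    for arg in &inv.args {
        lazy_order(arg, order);
    }
}

/// Eager order: expand the invocations in the arguments first, then the
/// outer invocation itself.
fn eager_order(inv: &Invocation, order: &mut Vec<&'static str>) {
    for arg in &inv.args {
        eager_order(arg, order);
    }
    order.push(inv.name);
}

/// Builds the `foo!(bar!(baz))` example; `baz` is not a macro, so it does
/// not appear as an invocation.
fn example() -> Invocation {
    Invocation {
        name: "foo",
        args: vec![Invocation { name: "bar", args: vec![] }],
    }
}
```

Calling `lazy_order` on `example()` records `foo` before `bar`, while `eager_order` records `bar` before `foo`, matching the description in the chapter text.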
@@ -114,10 +133,15 @@ fail at this point. The recovery happens by expanding unresolved macros into
 [err]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/enum.ExprKind.html#variant.Err

 Notice that name resolution is involved here: we need to resolve imports and
-macro names in the above algorithm. However, we don't try to resolve other
-names yet. This happens later, as we will see in the [next
+macro names in the above algorithm. This is done in
+[`rustc_resolve::macros`][mresolve], which resolves macro paths, validates
+those resolutions, and reports various errors (e.g. "not found" or "found, but
+it's unstable" or "expected x, found y"). However, we don't try to resolve
+other names yet. This happens later, as we will see in the [next
 chapter](./name-resolution.md).

+[mresolve]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/macros/index.html
+
 Here are some other notable data structures involved in expansion and integration:
 - [`Resolver`] - a trait used to break crate dependencies. This allows the resolver services to be used in [`rustc_ast`], despite [`rustc_resolve`] and pretty much everything else depending on [`rustc_ast`].
 - [`ExtCtxt`]/[`ExpansionData`] - various intermediate data kept and used by expansion
@@ -217,9 +241,9 @@ an integer ID, assigned continuously starting from 0 as we discover new macro
 calls. All hierarchies start at [`ExpnId::root()`][rootid], which is its own
 parent.

-All of the hygiene-related algorithms are implemented in
-[`rustc_span::hygiene`][hy], with the exception of some hacks
-[`Resolver::resolve_crate_root`][hacks].
+[`rustc_span::hygiene`][hy] contains all of the hygiene-related algorithms
+(with the exception of some hacks in [`Resolver::resolve_crate_root`][hacks])
+and structures related to hygiene and expansion that are kept in global data.

 The actual hierarchies are stored in [`HygieneData`][hd]. This is a global
 piece of data containing hygiene and expansion info that can be accessed from
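
To make the expansion hierarchy in the hunk above concrete, here is a minimal sketch of a parent table keyed by integer IDs. It is an assumption-laden toy, not the layout of `HygieneData`: `ExpnTable`, `fresh`, `parent`, and `ancestry` are hypothetical names, and this `ExpnId` is a local stand-in rather than the type from `rustc_span`.

```rust
/// Local stand-in for an expansion ID: an index into the table below.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ExpnId(u32);

/// Toy table mapping each expansion ID to its parent expansion.
struct ExpnTable {
    parents: Vec<ExpnId>,
}

impl ExpnTable {
    /// The root expansion, which always has ID 0.
    const ROOT: ExpnId = ExpnId(0);

    /// Creates a table containing only the root, which is its own parent.
    fn new() -> Self {
        ExpnTable { parents: vec![Self::ROOT] }
    }

    /// Assigns the next consecutive ID to a macro call found inside `parent`.
    fn fresh(&mut self, parent: ExpnId) -> ExpnId {
        let id = ExpnId(self.parents.len() as u32);
        self.parents.push(parent);
        id
    }

    /// Returns the parent of `id`.
    fn parent(&self, id: ExpnId) -> ExpnId {
        self.parents[id.0 as usize]
    }

    /// Walks from `id` up to the root, returning the chain including both ends.
    fn ancestry(&self, mut id: ExpnId) -> Vec<ExpnId> {
        let mut chain = vec![id];
        while id != Self::ROOT {
            id = self.parent(id);
            chain.push(id);
        }
        chain
    }
}
```

For `foo!(bar!(baz))`, calling `fresh(ExpnTable::ROOT)` for `foo` and then `fresh(foo)` for `bar` yields a chain that, read from the root, is `ROOT -> id(foo) -> id(bar)`.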
@@ -362,6 +386,13 @@ foo!(bar!(baz));
 For the `baz` AST node in the final output, the first hierarchy is `ROOT ->
 id(foo) -> id(bar) -> baz`, while the third hierarchy is `ROOT -> baz`.

+### Macro Backtraces
+
+Macro backtraces are implemented in [`rustc_span`] using the hygiene machinery
+in [`rustc_span::hygiene`][hy].
+
+[`rustc_span`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/index.html
+
 ## Producing Macro Output

 Above, we saw how the output of a macro is integrated into the AST for a crate,
@@ -551,7 +582,17 @@ stream, which is synthesized into the AST.

 It's worth noting that the token stream type used by proc macros is _stable_,
 so `rustc` does not use it internally (since our internal data structures are
-unstable).
+unstable). The compiler's token stream is
+[`rustc_ast::tokenstream::TokenStream`][rustcts], as previously. This is
+converted into the stable [`proc_macro::TokenStream`][stablets] and back in
+[`rustc_expand::proc_macro`][pm] and [`rustc_expand::proc_macro_server`][pms].
+Because the Rust ABI is unstable, we use the C ABI for this conversion.
+
+[tsmod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/index.html
+[rustcts]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
+[stablets]: https://doc.rust-lang.org/proc_macro/struct.TokenStream.html
+[pm]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/proc_macro/index.html
+[pms]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/proc_macro_server/index.html

 TODO: more here.

@@ -560,22 +601,3 @@ TODO: more here.
 Custom derives are a special type of proc macro.

 TODO: more?
-
-## Important Modules and Data Structures
-
-TODO: sprinkle these throughout the chapter as much as possible...
-
-- librustc_span/hygiene.rs - structures related to hygiene and expansion that are kept in global data (can be accessed from any Ident without any context)
-- librustc_span/lib.rs - some secondary methods like macro backtrace using primary methods from hygiene.rs
-- librustc_builtin_macros - implementations of built-in macros (including macro attributes and derives) and some other early code generation facilities like injection of standard library imports or generation of test harness.
-- librustc_ast/config.rs - implementation of cfg/cfg_attr (they are treated specially from other macros), should probably be moved into librustc_ast/ext.
-- librustc_ast/tokenstream.rs + librustc_ast/parse/token.rs - structures for compiler-side tokens, token trees, and token streams.
-- librustc_ast/ext - various expansion-related stuff
-- librustc_ast/ext/base.rs - basic structures used by expansion
-- librustc_ast/ext/expand.rs - some expansion structures and the bulk of expansion infrastructure code - collecting macro invocations, calling into resolve for them, calling their expanding functions, and integrating the results back into AST
-- librustc_ast/ext/placeholder.rs - the part of expand.rs responsible for "integrating the results back into AST"; basically, a "placeholder" is a temporary AST node replaced with macro expansion result nodes
-- librustc_ast/ext/builer.rs - helper functions for building AST for built-in macros in librustc_builtin_macros (and user-defined syntactic plugins previously), can probably be moved into librustc_builtin_macros these days
-- librustc_ast/ext/proc_macro.rs + librustc_ast/ext/proc_macro_server.rs - interfaces between the compiler and the stable proc_macro library, converting tokens and token streams between the two representations and sending them through C ABI
-- librustc_ast/ext/tt - implementation of macro_rules, turns macro_rules DSL into something with signature Fn(TokenStream) -> TokenStream that can eat and produce tokens, @mark-i-m knows more about this
-- librustc_resolve/macros.rs - resolving macro paths, validating those resolutions, reporting various "not found"/"found, but it's unstable"/"expected x, found y" errors
-- librustc_middle/hir/map/def_collector.rs + librustc_resolve/build_reduced_graph.rs - integrate an AST fragment freshly expanded from a macro into various parent/child structures like module hierarchy or "definition paths"

src/the-parser.md

Lines changed: 2 additions & 1 deletion
@@ -7,10 +7,11 @@ The very first thing the compiler does is take the program (in Unicode
 characters) and turn it into something the compiler can work with more
 conveniently than strings. This happens in two stages: Lexing and Parsing.

-Lexing takes strings and turns them into streams of tokens. For example,
+Lexing takes strings and turns them into streams of [tokens]. For example,
 `a.b + c` would be turned into the tokens `a`, `.`, `b`, `+`, and `c`.
 The lexer lives in [`librustc_lexer`][lexer].

+[tokens]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/token/index.html
 [lexer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html

 Parsing then takes streams of tokens and turns them into a structured
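
The `a.b + c` example in the hunk above can be reproduced with a very small hand-written lexer. This is a sketch for illustration only: the `Token` enum and `lex` function are hypothetical and handle just identifiers, `.`, and `+`; the real lexer in `rustc_lexer` covers the full token grammar and has a different interface.

```rust
/// Toy token type; the compiler's real token kinds are far richer.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Token {
    Ident(String),
    Dot,
    Plus,
}

/// Splits `src` into tokens, skipping whitespace.
/// Returns `Err` with the offending character if it is not recognized.
fn lex(src: &str) -> Result<Vec<Token>, char> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '.' {
            chars.next();
            tokens.push(Token::Dot);
        } else if c == '+' {
            chars.next();
            tokens.push(Token::Plus);
        } else if c.is_alphabetic() || c == '_' {
            // Consume the whole identifier before emitting it.
            let mut name = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_alphanumeric() || d == '_' {
                    name.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Ident(name));
        } else {
            return Err(c);
        }
    }
    Ok(tokens)
}
```

With this sketch, `lex("a.b + c")` produces the five tokens listed in the text: the identifier `a`, a dot, the identifier `b`, a plus, and the identifier `c`.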
