Commit b744e7e

Edit lexing
1 parent 0049fe6 commit b744e7e

1 file changed, +19 -0 lines changed

src/overview.md

Lines changed: 19 additions & 0 deletions
@@ -118,6 +118,25 @@ Command line argument parsing occurs in the [`rustc_driver`]. This crate
 defines the compile configuration that is requested by the user and passes it
 to the rest of the compilation process as a [`rustc_interface::Config`].

+### Lexing and parsing
+
+The raw Rust source text is analyzed by a low-level _lexer_ located in
+[`rustc_lexer`]. At this stage, the source text is turned into a stream of
+atomic source code units known as _tokens_. The lexer accepts Unicode
+(UTF-8 encoded) source text.
+
+The token stream passes through a higher-level lexer located in
+[`rustc_parse`] to prepare for the next stage of the compilation process. The
+[`StringReader`] struct is used at this stage to perform a set of validations
+and turn strings into interned symbols (_interning_ is discussed later).
+[String interning] is a way of storing only one immutable
+copy of each distinct string value.
+
+The lexer has a small interface and doesn't depend directly on the
+diagnostic infrastructure in `rustc`. Instead, it provides diagnostics as plain
+data, which are emitted in `rustc_parse::lexer` as real diagnostics.
+The lexer preserves full-fidelity information for both IDEs and proc macros.
+

 [String interning]: https://en.wikipedia.org/wiki/String_interning
 [`rustc_lexer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html
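The added section describes a low-level lexer that turns source text into tokens, reports problems as plain data, and preserves full fidelity. A minimal sketch of that idea follows. Every name in it (`TokenKind`, `Token`, `tokenize`, and the helper functions) is hypothetical and chosen for illustration; it is not the `rustc_lexer` API, and it handles only a tiny subset of what a Rust lexer must handle.

```rust
// Toy lexer sketch. All names are hypothetical, not taken from `rustc_lexer`.

/// The kinds of token this toy lexer recognizes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Whitespace,
    Ident,
    Number,
    /// A string literal. `terminated` is false when the closing quote is
    /// missing: the problem is recorded as data, not reported here.
    Str { terminated: bool },
    /// Any other single character, such as punctuation.
    Other,
}

/// A token is a kind plus its length in bytes. No text is copied or dropped,
/// so the token lengths always add up to the length of the input.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub len: usize,
}

/// Splits `src` into tokens that together cover every byte of the input.
pub fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut rest = src;
    while let Some(first) = rest.chars().next() {
        let (kind, len) = if first.is_whitespace() {
            (TokenKind::Whitespace, take_while(rest, char::is_whitespace))
        } else if first.is_alphabetic() || first == '_' {
            (TokenKind::Ident, take_while(rest, |c| c.is_alphanumeric() || c == '_'))
        } else if first.is_ascii_digit() {
            (TokenKind::Number, take_while(rest, |c| c.is_ascii_digit()))
        } else if first == '"' {
            string_literal(rest)
        } else {
            (TokenKind::Other, first.len_utf8())
        };
        tokens.push(Token { kind, len });
        rest = &rest[len..];
    }
    tokens
}

/// Byte length of the longest prefix of `s` whose chars all satisfy `pred`.
fn take_while(s: &str, pred: impl Fn(char) -> bool) -> usize {
    s.char_indices()
        .find(|&(_, c)| !pred(c))
        .map_or(s.len(), |(i, _)| i)
}

/// Scans a string literal starting at its opening quote (escapes are ignored).
fn string_literal(s: &str) -> (TokenKind, usize) {
    match s[1..].find('"') {
        // `i` is relative to `s[1..]`; +2 covers both quote characters.
        Some(i) => (TokenKind::Str { terminated: true }, i + 2),
        None => (TokenKind::Str { terminated: false }, s.len()),
    }
}
```

In this sketch, calling `tokenize("let x = \"hi")` produces seven tokens, the last of which is `Str { terminated: false }` with length 3. A caller with access to real diagnostic machinery could turn that flag into an error message, which is the division of labor the section describes between the low-level lexer and `rustc_parse`.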

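The section also says that strings are turned into interned symbols, with only one immutable copy kept per distinct string. The sketch below illustrates that general technique, assuming a simple map-plus-vector design. `Symbol` and `Interner` here are hypothetical illustrations written for this note and do not reflect how the compiler's own interner is implemented.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// A small integer handle standing in for an interned string.
/// (Hypothetical type for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

/// Keeps one shared, immutable allocation per distinct string.
#[derive(Default)]
pub struct Interner {
    /// Maps string contents to the symbol already assigned to them.
    ids: HashMap<Rc<str>, Symbol>,
    /// Maps a symbol's index back to its string.
    strings: Vec<Rc<str>>,
}

impl Interner {
    /// Returns the existing symbol for `s`, or stores `s` and assigns a new one.
    pub fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.ids.get(s) {
            return sym;
        }
        let sym = Symbol(self.strings.len() as u32);
        // One allocation, shared by the map key and the lookup vector.
        let shared: Rc<str> = Rc::from(s);
        self.strings.push(Rc::clone(&shared));
        self.ids.insert(shared, sym);
        sym
    }

    /// Looks up the text for a symbol produced by this interner.
    pub fn resolve(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}
```

Interning the same text twice returns equal `Symbol` values, so later stages can compare identifiers by comparing integers instead of comparing string contents.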