diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index f60fee488..a41f78a1a 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -20,6 +20,7 @@
 - [Macro expansion](./macro-expansion.md)
 - [Name resolution](./name-resolution.md)
 - [The HIR (High-level IR)](./hir.md)
+ - [Lowering AST to HIR](./lowering.md)
 - [The `ty` module: representing types](./ty.md)
 - [Type inference](./type-inference.md)
 - [Trait solving (old-style)](./traits/resolution.md)
diff --git a/src/high-level-overview.md b/src/high-level-overview.md
index 60b9e80ea..733c523c9 100644
--- a/src/high-level-overview.md
+++ b/src/high-level-overview.md
@@ -105,7 +105,7 @@ take:
 3. **Lowering to HIR**
     - Once name resolution completes, we convert the AST into the HIR,
       or "[high-level intermediate representation]". The HIR is defined in
-      `src/librustc/hir/`; that module also includes the lowering code.
+      `src/librustc/hir/`; that module also includes the [lowering] code.
     - The HIR is a lightly desugared variant of the AST. It is more processed
       than the AST and more suitable for the analyses that follow.
       It is **not** required to match the syntax of the Rust language.
@@ -139,3 +139,4 @@ take:
 
 [query model]: query.html
 [high-level intermediate representation]: hir.html
+[lowering]: lowering.html
\ No newline at end of file
diff --git a/src/hir.md b/src/hir.md
index 2a11531ee..40a14dc25 100644
--- a/src/hir.md
+++ b/src/hir.md
@@ -1,12 +1,13 @@
 # The HIR
 
-The HIR – "High-Level Intermediate Representation" – is the primary IR used in
-most of rustc. It is a compiler-friendly representation of the abstract syntax
-tree (AST) that is generated after parsing, macro expansion, and name
-resolution. Many parts of HIR resemble Rust surface syntax quite closely, with
-the exception that some of Rust's expression forms have been desugared away. For
-example, `for` loops are converted into a `loop` and do not appear in the HIR.
-This makes HIR more amenable to analysis than a normal AST.
+The HIR – "High-Level Intermediate Representation" – is the primary IR used
+in most of rustc. It is a compiler-friendly representation of the abstract
+syntax tree (AST) that is generated after parsing, macro expansion, and name
+resolution (see [Lowering](./lowering.html) for how the HIR is created).
+Many parts of HIR resemble Rust surface syntax quite closely, with
+the exception that some of Rust's expression forms have been desugared away.
+For example, `for` loops are converted into a `loop` and do not appear in
+the HIR. This makes HIR more amenable to analysis than a normal AST.
 
 This chapter covers the main concepts of the HIR.
 
diff --git a/src/lowering.md b/src/lowering.md
new file mode 100644
index 000000000..eddc00af9
--- /dev/null
+++ b/src/lowering.md
@@ -0,0 +1,48 @@
+# Lowering
+
+The lowering step converts AST to [HIR](hir.html).
+This means many structures are removed if they are irrelevant
+for type analysis or similar syntax-agnostic analyses. Examples
+of such structures include, but are not limited to:
+
+* Parentheses
+    * Removed without replacement; the tree structure makes order explicit
+* `for` loops and `while (let)` loops
+    * Converted to `loop` + `match` and some `let` bindings
+* `if let`
+    * Converted to `match`
+* Universal `impl Trait`
+    * Converted to generic arguments
+      (but with some flags, to know that the user didn't write them)
+* Existential `impl Trait`
+    * Converted to a virtual `existential type` declaration
+
+Lowering needs to uphold several invariants in order to not trigger the
+sanity checks in `src/librustc/hir/map/hir_id_validator.rs`:
+
+1. A `HirId` must be used if created. So if you call `lower_node_id`,
+   you *must* use the resulting `NodeId` or `HirId` (either is fine, since
+   any `NodeId`s in the `HIR` are checked for existing `HirId`s).
+2. Lowering a `HirId` must be done in the scope of the *owning* item.
+   This means you need to use `with_hir_id_owner` if you are creating parts
+   of an item other than the one currently being lowered. This happens, for
+   example, during the lowering of existential `impl Trait`.
+3. A `NodeId` that will be placed into a HIR structure must be lowered,
+   even if its `HirId` is unused. Calling
+   `let _ = self.lower_node_id(node_id);` is perfectly legitimate.
+4. If you are creating new nodes that didn't exist in the `AST`, you *must*
+   create new ids for them. This is done by calling the `next_id` method,
+   which produces a new `NodeId` and automatically lowers it
+   for you, so you also get the `HirId`.
+
+If you are creating new `DefId`s, since each `DefId` needs to have a
+corresponding `NodeId`, it is advisable to add these `NodeId`s to the
+`AST` so you don't have to generate new ones during lowering. This has
+the advantage of creating a way to find the `DefId` of something via its
+`NodeId`. If lowering needs this `DefId` in multiple places, you can't
+generate a new `NodeId` in all those places because you'd also get a new
+`DefId` then. With a `NodeId` from the `AST` this is not an issue.
+
+Having the `NodeId` also allows the `DefCollector` to generate the `DefId`s
+instead of lowering having to do it on the fly. Centralizing the `DefId`
+generation in one place makes it easier to refactor and reason about.
\ No newline at end of file
diff --git a/src/name-resolution.md b/src/name-resolution.md
index 5095b750a..bba3142fc 100644
--- a/src/name-resolution.md
+++ b/src/name-resolution.md
@@ -36,9 +36,9 @@ hierarchy, it's types vs. values vs. macros.
 ## Scopes and ribs
 
 A name is visible only in certain area in the source code. This forms a
-hierarchical structure, but not necessarily a simple one ‒ if one scope is part
-of another, it doesn't mean the name visible in the outer one is also visible in
-the inner one, or that it refers to the same thing.
+hierarchical structure, but not necessarily a simple one ‒ if one scope is
+part of another, it doesn't mean the name visible in the outer one is also
+visible in the inner one, or that it refers to the same thing.
 
 To cope with that, the compiler introduces the concept of Ribs. This is
 abstraction of a scope. Every time the set of visible names potentially changes,
@@ -54,9 +54,9 @@ example:
 When searching for a name, the stack of ribs is traversed from the innermost
 outwards. This helps to find the closest meaning of the name (the one not
 shadowed by anything else). The transition to outer rib may also change the
-rules what names are usable ‒ if there are nested functions (not closures), the
-inner one can't access parameters and local bindings of the outer one, even
-though they should be visible by ordinary scoping rules. An example:
+rules what names are usable ‒ if there are nested functions (not closures),
+the inner one can't access parameters and local bindings of the outer one,
+even though they should be visible by ordinary scoping rules. An example:
 
 ```rust
 fn do_something(val: T) { // <- New rib in both types and values (1)