From 760af301e0c6e133b6a90b624a493290e86b51a5 Mon Sep 17 00:00:00 2001 From: chj <506933131@qq.com> Date: Tue, 16 Aug 2022 19:10:19 +0800 Subject: [PATCH 1/3] Improve doc of MIR queries & passes --- src/SUMMARY.md | 4 +- src/mir/passes.md | 214 ++++++++++++++++++++++++++++++++-------------- 2 files changed, 151 insertions(+), 67 deletions(-) diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 4f975d375..6b70038a3 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -110,8 +110,8 @@ - [The MIR (Mid-level IR)](./mir/index.md) - [MIR construction](./mir/construction.md) - [MIR visitor and traversal](./mir/visitor.md) - - [MIR passes: getting the MIR for a function](./mir/passes.md) -- [Identifiers in the compiler](./identifiers.md) + - [MIR queries and passes: getting the MIR](./mir/passes.md) +- [Identifiers in the Compiler](./identifiers.md) - [Closure expansion](./closure.md) - [Inline assembly](./asm.md) diff --git a/src/mir/passes.md b/src/mir/passes.md index 4c7feb04e..9fbba045d 100644 --- a/src/mir/passes.md +++ b/src/mir/passes.md @@ -1,100 +1,184 @@ -# MIR passes +# MIR queries and passes -If you would like to get the MIR for a function (or constant, etc), -you can use the `optimized_mir(def_id)` query. This will give you back -the final, optimized MIR. For foreign def-ids, we simply read the MIR +If you would like to get the MIR: + +- for a function - you can use the `optimized_mir(def_id)` query; +- for a promoted - you can use the `promoted_mir(def_id)` query. + +These will give you back the final, optimized MIR. For foreign def-ids, we simply read the MIR from the other crate's metadata. But for local def-ids, the query will -construct the MIR and then iteratively optimize it by applying a -series of passes. This section describes how those passes work and how -you can extend them. - -To produce the `optimized_mir(D)` for a given def-id `D`, the MIR -passes through several suites of optimizations, each represented by a -query. 
Each suite consists of multiple optimizations and -transformations. These suites represent useful intermediate points -where we want to access the MIR for type checking or other purposes: - -- `mir_build(D)` – not a query, but this constructs the initial MIR -- `mir_const(D)` – applies some simple transformations to make MIR ready for - constant evaluation; -- `mir_validated(D)` – applies some more transformations, making MIR ready for - borrow checking; -- `optimized_mir(D)` – the final state, after all optimizations have been - performed. - -### Implementing and registering a pass - -A `MirPass` is some bit of code that processes the MIR, typically – -but not always – transforming it along the way somehow. For example, -it might perform an optimization. The `MirPass` trait itself is found -in [the `rustc_mir_transform` crate][mirtransform], and it -basically consists of one method, `run_pass`, that simply gets an -`&mut Mir` (along with the tcx and some information about where it -came from). The MIR is therefore modified in place (which helps to -keep things efficient). +construct the optimized MIR by requesting a pipeline of upstream queries[^query]. +Each query will contain a series of passes. +This section describes how those queries and passes work and how you can extend them. + +To produce the optimized MIR for a given def-id `D`, `optimized_mir(D)` +goes through several suites of passes, each grouped by a +query. Each suite consists of passes which perform analysis, transformation or optimization. 
+Each query represent a useful intermediate point +where we can access the MIR dialect for type checking or other purposes: + +- `mir_built(D)` – it gives the initial MIR just after it's built; +- `mir_const(D)` – it applies some simple transformation passes to make MIR ready for + const qualification; +- `mir_promoted(D)` - it extracts promotable temps into separate MIR bodies, and also makes MIR + ready for borrow checking; +- `mir_drops_elaborated_and_const_checked(D)` - it performs borrow checking, runs major + transformation passes (such as drop elaboration) and makes MIR ready for optimization; +- `optimized_mir(D)` – it performs all enabled optimizations and reaches the final state. + +[^query]: See the [Queries](../query.md) chapter for the general concept of query. + +## Implementing and registering a pass + +A `MirPass` is some bit of code that processes the MIR, typically transforming it along the way +somehow. But it may also do other things like linting (e.g., [`CheckPackedRef`][lint1], +[`CheckConstItemMutation`][lint2], [`FunctionItemReferences`][lint3], which implement `MirLint`) or +optimization (e.g., [`SimplifyCfg`][opt1], [`RemoveUnneededDrops`][opt2]). While most MIR passes +are defined in the [`rustc_mir_transform`][mirtransform] crate, the `MirPass` trait itself is +[found][mirpass] in the `rustc_middle` crate, and it basically consists of one primary method, +`run_pass`, that simply gets an `&mut Body` (along with the `tcx`). +The MIR is therefore modified in place (which helps to keep things efficient). A basic example of a MIR pass is [`RemoveStorageMarkers`], which walks the MIR and removes all storage marks if they won't be emitted during codegen. As you can see from its source, a MIR pass is defined by first defining a -dummy type, a struct with no fields, something like: +dummy type, a struct with no fields: ```rust -struct MyPass; +pub struct RemoveStorageMarkers; ``` -for which you then implement the `MirPass` trait. 
You can then insert +for which we implement the `MirPass` trait. We can then insert this pass into the appropriate list of passes found in a query like -`optimized_mir`, `mir_validated`, etc. (If this is an optimization, it +`mir_built`, `optimized_mir`, etc. (If this is an optimization, it should go into the `optimized_mir` list.) +Another example of a simple MIR pass is [`CleanupNonCodegenStatements`][cleanup-pass], which walks +the MIR and removes all statements that are not relevant to code generation. As you can see from +its [source][cleanup-source], it is defined by first defining a dummy type, a struct with no +fields: + +```rust +pub struct CleanupNonCodegenStatements; +``` + +for which we then implement the `MirPass` trait: + +```rust +impl<'tcx> MirPass<'tcx> for CleanupNonCodegenStatements { + fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) { + ... + } +} +``` + +We [register][pass-register] this pass inside the `mir_drops_elaborated_and_const_checked` query. +(If this is an optimization, it should go into the `optimized_mir` list.) + If you are writing a pass, there's a good chance that you are going to want to use a [MIR visitor]. MIR visitors are a handy way to walk all the parts of the MIR, either to search for something or to make small edits. -### Stealing +## Stealing -The intermediate queries `mir_const()` and `mir_validated()` yield up -a `&'tcx Steal<Body<'tcx>>`, allocated using -`tcx.alloc_steal_mir()`. This indicates that the result may be -**stolen** by the next suite of optimizations – this is an +The intermediate queries `mir_const()` and `mir_promoted()` yield up +a `&'tcx Steal<Body<'tcx>>`, allocated using `tcx.alloc_steal_mir()`. +This indicates that the result may be **stolen** by a subsequent query – this is an optimization to avoid cloning the MIR. Attempting to use a stolen result will cause a panic in the compiler. 
Therefore, it is important -that you do not read directly from these intermediate queries except as -part of the MIR processing pipeline. +that you do not accidently read from these intermediate queries without +the consideration of the dependency in the MIR processing pipeline. -Because of this stealing mechanism, some care must also be taken to +Because of this stealing mechanism, some care must be taken to ensure that, before the MIR at a particular phase in the processing pipeline is stolen, anyone who may want to read from it has already -done so. Concretely, this means that if you have some query `foo(D)` +done so. + + +Concretely, this means that if you have a query `foo(D)` that wants to access the result of `mir_const(D)` or -`mir_validated(D)`, you need to have the successor pass "force" +`mir_promoted(D)`, you need to have the successor pass "force" `foo(D)` using `ty::queries::foo::force(...)`. This will force a query to execute even though you don't directly require its result. -As an example, consider MIR const qualification. It wants to read the -result produced by the `mir_const()` suite. However, that result will -be **stolen** by the `mir_validated()` suite. If nothing was done, -then `mir_const_qualif(D)` would succeed if it came before -`mir_validated(D)`, but fail otherwise. Therefore, `mir_validated(D)` -will **force** `mir_const_qualif` before it actually steals, thus -ensuring that the reads have already happened (remember that -[queries are memoized](../query.html), so executing a query twice -simply loads from a cache the second time): - -```text -mir_const(D) --read-by--> mir_const_qualif(D) - | ^ - stolen-by | - | (forces) - v | -mir_validated(D) ------------+ +> This mechanism is a bit dodgy. There is a discussion of more elegant +alternatives in [rust-lang/rust#41710]. 
+ +### Overview + +Below is an overview of the stealing dependency in the MIR processing pipeline[^part]: + +```mermaid +flowchart BT + mir_for_ctfe* --borrow--> id40 + id5 --steal--> id40 + + mir_borrowck* --borrow--> id3 + id41 --steal part 1--> id3 + id40 --steal part 0--> id3 + + mir_const_qualif* -- borrow --> id2 + id3 -- steal --> id2 + + id2 -- steal --> id1 + + id1([mir_built]) + id2([mir_const]) + id3([mir_promoted]) + id40([mir_drops_elaborated_and_const_checked]) + id41([promoted_mir]) + id5([optimized_mir]) + + style id1 fill:#bbf + style id2 fill:#bbf + style id3 fill:#bbf + style id40 fill:#bbf + style id41 fill:#bbf + style id5 fill:#bbf ``` -This mechanism is a bit dodgy. There is a discussion of more elegant -alternatives in [rust-lang/rust#41710]. +The stadium-shape queries (e.g., `mir_built`) with a deep color are the primary queries in the +pipeline, while the rectangle-shape queries (e.g., `mir_const_qualif*`[^star]) with a shallow color +are those subsequent queries that need to read the results from `&'tcx Steal<Body<'tcx>>`. With the +stealing mechanism, the rectangle-shape queries must be performed before any stadium-shape queries, +that have an equal or larger height in the dependency tree, ever do. + +[^part]: The `mir_promoted` query will yield up a tuple +`(&'tcx Steal<Body<'tcx>>, &'tcx Steal<IndexVec<Promoted, Body<'tcx>>>)`, `promoted_mir` will steal +part 1 (`&'tcx Steal<IndexVec<Promoted, Body<'tcx>>>`) and `mir_drops_elaborated_and_const_checked` +will steal part 0 (`&'tcx Steal<Body<'tcx>>`). The two steals are independent of each other, +i.e., can be performed separately. + +[^star]: Note that the `*` suffix in the queries represents a set of queries with the same prefix. +For example, `mir_borrowck*` represents `mir_borrowck`, `mir_borrowck_const_arg` and +`mir_borrowck_opt_const_arg`. + +### Example + +As an example, consider MIR const qualification. It wants to read the result produced by the +`mir_const` query. However, that result will be **stolen** by the `mir_promoted` query at some +time in the pipeline. 
Before `mir_promoted` is ever queried, calling the `mir_const_qualif` query +will succeed since `mir_const` will produce (if queried the first time) or cache (if queried +multiple times) the `Steal` result and the result is **not** stolen yet. After `mir_promoted` is +queried, the result would be stolen and calling the `mir_const_qualif` query to read the result +would cause a panic. + +Therefore, with this stealing mechanism, `mir_promoted` should guarantee any `mir_const_qualif*` +queries are called before it actually steals, thus ensuring that the reads have already happened +(remember that [queries are memoized](../query.html), so executing a query twice +simply loads from a cache the second time). [rust-lang/rust#41710]: https://github.com/rust-lang/rust/issues/41710 +[mirpass]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/trait.MirPass.html +[lint1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/check_packed_ref/struct.CheckPackedRef.html +[lint2]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/check_const_item_mutation/struct.CheckConstItemMutation.html +[lint3]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/function_item_references/struct.FunctionItemReferences.html +[opt1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/simplify/struct.SimplifyCfg.html +[opt2]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/remove_unneeded_drops/struct.RemoveUnneededDrops.html [mirtransform]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/ [`RemoveStorageMarkers`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/remove_storage_markers/struct.RemoveStorageMarkers.html +[cleanup-pass]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/cleanup_post_borrowck/struct.CleanupNonCodegenStatements.html +[cleanup-source]: 
https://github.com/rust-lang/rust/blob/e2b52ff73edc8b0b7c74bc28760d618187731fe8/compiler/rustc_mir_transform/src/cleanup_post_borrowck.rs#L27 +[pass-register]: https://github.com/rust-lang/rust/blob/e2b52ff73edc8b0b7c74bc28760d618187731fe8/compiler/rustc_mir_transform/src/lib.rs#L413 [MIR visitor]: ./visitor.html From 5d5eaba6575c8c3bc11f48369c62727214153baa Mon Sep 17 00:00:00 2001 From: Jaic1 <506933131@qq.com> Date: Mon, 22 Aug 2022 15:18:18 +0800 Subject: [PATCH 2/3] Typo in src/mir/passes.md accidently -> accidentally Co-authored-by: Tshepang Mbambo --- src/mir/passes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mir/passes.md b/src/mir/passes.md index 9fbba045d..d03da15f1 100644 --- a/src/mir/passes.md +++ b/src/mir/passes.md @@ -87,7 +87,7 @@ a `&'tcx Steal<Body<'tcx>>`, allocated using `tcx.alloc_steal_mir()`. This indicates that the result may be **stolen** by a subsequent query – this is an optimization to avoid cloning the MIR. Attempting to use a stolen result will cause a panic in the compiler. Therefore, it is important -that you do not accidently read from these intermediate queries without +that you do not accidentally read from these intermediate queries without the consideration of the dependency in the MIR processing pipeline. Because of this stealing mechanism, some care must be taken to From 1f9d4e29e733febd9a6a8ae7eda7aa8289faf490 Mon Sep 17 00:00:00 2001 From: Jaic1 <506933131@qq.com> Date: Sun, 30 Jun 2024 11:04:30 +0800 Subject: [PATCH 3/3] refine mir passes doc --- src/mir/passes.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/src/mir/passes.md b/src/mir/passes.md index d03da15f1..e67d9d93f 100644 --- a/src/mir/passes.md +++ b/src/mir/passes.md @@ -2,8 +2,8 @@ If you would like to get the MIR: -- for a function - you can use the `optimized_mir(def_id)` query; -- for a promoted - you can use the `promoted_mir(def_id)` query. 
+- for a function - you can use the `optimized_mir` query (typically used by codegen) or the `mir_for_ctfe` query (typically used by compile-time function evaluation, i.e., *CTFE*); +- for a promoted - you can use the `promoted_mir` query. These will give you back the final, optimized MIR. For foreign def-ids, we simply read the MIR from the other crate's metadata. But for local def-ids, the query will @@ -13,8 +13,8 @@ This section describes how those queries and passes work and how you can extend To produce the optimized MIR for a given def-id `D`, `optimized_mir(D)` goes through several suites of passes, each grouped by a -query. Each suite consists of passes which perform analysis, transformation or optimization. -Each query represent a useful intermediate point +query. Each suite consists of passes which perform linting, analysis, transformation or +optimization. Each query represents a useful intermediate point where we can access the MIR dialect for type checking or other purposes: - `mir_built(D)` – it gives the initial MIR just after it's built; @@ -62,7 +62,7 @@ fields: pub struct CleanupNonCodegenStatements; ``` -for which we then implement the `MirPass` trait: +for which we implement the `MirPass` trait: ```rust impl<'tcx> MirPass<'tcx> for CleanupNonCodegenStatements { @@ -95,11 +95,9 @@ ensure that, before the MIR at a particular phase in the processing pipeline is stolen, anyone who may want to read from it has already done so. - Concretely, this means that if you have a query `foo(D)` -that wants to access the result of `mir_const(D)` or -`mir_promoted(D)`, you need to have the successor pass "force" -`foo(D)` using `ty::queries::foo::force(...)`. This will force a query +that wants to access the result of `mir_promoted(D)`, you need to have `foo(D)` +call the `mir_const(D)` query first. This will force it to execute even though you don't directly require its result. > This mechanism is a bit dodgy. There is a discussion of more elegant