Commit 0fe44f7

Sl1mb0 authored and jyn514 committed
Docs: consolidated parallelism information
1 parent 71d88b3 commit 0fe44f7

3 files changed: +47 -47 lines changed


src/overview.md

Lines changed: 1 addition & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -297,12 +297,7 @@ Compiler performance is a problem that we would like to improve on
297297
(and are always working on). One aspect of that is parallelizing
298298
`rustc` itself.
299299

300-
Currently, there is only one part of rustc that is already parallel: codegen.
301-
During monomorphization, the compiler will split up all the code to be
302-
generated into smaller chunks called _codegen units_. These are then generated
303-
by independent instances of LLVM. Since they are independent, we can run them
304-
in parallel. At the end, the linker is run to combine all the codegen units
305-
together into one binary.
300+
Currently, there is only one part of rustc that is parallel by default: codegen.
306301

307302
However, the rest of the compiler is still not yet parallel. There have been
308303
lots of efforts spent on this, but it is generally a hard problem. The current

src/parallel-rustc.md

Lines changed: 46 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1,24 +1,54 @@
11
# Parallel Compilation
22

3-
Most of the compiler is not parallel. This represents an opportunity for
4-
improving compiler performance.
3+
As of <!-- date: 2021-09 --> September 2021, the only stage of the compiler
4+
that is already parallel is codegen. The nightly compiler implements parallel query evaluation,
5+
but there is a lot of correctness work that needs to be done. The lack of parallelism at other stages
6+
also represents an opportunity for improving compiler performance. One can try out the current
7+
parallel compiler work by enabling it in the `config.toml`.
58

6-
As of <!-- date: 2021-07 --> July 2021, work on explicitly parallelizing the
7-
compiler has stalled. There is a lot of design and correctness work that needs
8-
to be done.
9+
These next few sections describe where and how parallelism is currently used,
10+
and the current status of making parallel compilation the default in `rustc`.
11+
12+
The underlying thread-safe data structures used in the parallel compiler
13+
can be found in `rustc_data_structures/sync.rs`. Some of these data structures
14+
use the `parking_lot` API.
15+
16+
## Code Gen
17+
18+
During [monomorphization][monomorphization] the compiler splits up all the code to
19+
be generated into smaller chunks called _codegen units_. These are then generated by
20+
independent instances of LLVM running in parallel. At the end, the linker
21+
is run to combine all the codegen units together into one binary.
22+
23+
## Query System
924

10-
One can try out the current parallel compiler work by enabling it in the
11-
`config.toml`.
25+
The query model has some properties that make it actually feasible to evaluate
26+
multiple queries in parallel without too much of an effort:
1227

13-
There are a few basic ideas in this effort:
28+
- All data a query provider can access is accessed via the query context, so
29+
the query context can take care of synchronizing access.
30+
- Query results are required to be immutable so they can safely be used by
31+
different threads concurrently.
1432

15-
- There are a lot of loops in the compiler that just iterate over all items in
16-
a crate. These can possibly be parallelized.
17-
- We can use (a custom fork of) [`rayon`] to run tasks in parallel. The custom
18-
fork allows the execution of DAGs of tasks, not just trees.
19-
- There are currently a lot of global data structures that need to be made
20-
thread-safe. A key strategy here has been converting interior-mutable
21-
data-structures (e.g. `Cell`) into their thread-safe siblings (e.g. `Mutex`).
33+
34+
When a query `foo` is evaluated, the cache table for `foo` is locked.
35+
36+
- If there already is a result, we can clone it, release the lock and
37+
we are done.
38+
- If there is no cache entry and no other active query invocation computing the
39+
same result, we mark the key as being "in progress", release the lock and
40+
start evaluating.
41+
- If there *is* another query invocation for the same key in progress, we
42+
release the lock, and just block the thread until the other invocation has
43+
computed the result we are waiting for. This cannot deadlock because, as
44+
mentioned before, query invocations form a DAG. Some thread will always make
45+
progress.
46+
47+
## Current Status
48+
49+
As of <!-- date: 2021-07 --> July 2021, work on explicitly parallelizing the
50+
compiler has stalled. There is a lot of design and correctness work that needs
51+
to be done.
2252

2353
[`rayon`]: https://crates.io/crates/rayon
2454

@@ -45,3 +75,4 @@ are a bit out of date):
4575
[imlist]: https://github.com/nikomatsakis/rustc-parallelization/blob/master/interior-mutability-list.md
4676
[irlo1]: https://internals.rust-lang.org/t/help-test-parallel-rustc/11503
4777
[tracking]: https://github.com/rust-lang/rust/issues/48685
78+
[monomorphization]: https://rustc-dev-guide.rust-lang.org/backend/monomorph.html
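Review note: the new "Code Gen" section describes splitting the work into codegen units, generating them in parallel, and combining the results at link time. A minimal sketch of that shape follows. It is not how `rustc` or LLVM is actually driven: `codegen_unit`, `compile`, and the `<obj:...>` strings are hypothetical stand-ins, and "linking" is modeled as plain string concatenation.

```rust
use std::thread;

/// Hypothetical stand-in for generating code for one codegen unit.
fn codegen_unit(items: &[&str]) -> String {
    items.iter().map(|item| format!("<obj:{item}>")).collect()
}

/// Splits `items` into units of `unit_size` (must be non-zero), generates
/// each unit on its own scoped thread, then "links" the outputs in order.
fn compile(items: &[&str], unit_size: usize) -> String {
    let objects: Vec<String> = thread::scope(|s| {
        // Spawn one thread per unit; the units are independent of each other.
        let handles: Vec<_> = items
            .chunks(unit_size)
            .map(|unit| s.spawn(move || codegen_unit(unit)))
            .collect();
        // Join in spawn order so the combined output is deterministic.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // Stand-in for the linker step that combines all units into one binary.
    objects.concat()
}

fn main() {
    let binary = compile(&["a", "b", "c"], 2);
    println!("{binary}");
}
```

The units can run concurrently only because none of them reads another's output; the single sequential step is the final combination.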

src/queries/query-evaluation-model-in-detail.md

Lines changed: 0 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -211,29 +211,3 @@ much of a maintenance burden.
211211

212212
To summarize: "Steal queries" break some of the rules in a controlled way.
213213
There are checks in place that make sure that nothing can go silently wrong.
214-
215-
216-
## Parallel Query Execution
217-
218-
The query model has some properties that make it actually feasible to evaluate
219-
multiple queries in parallel without too much of an effort:
220-
221-
- All data a query provider can access is accessed via the query context, so
222-
the query context can take care of synchronizing access.
223-
- Query results are required to be immutable so they can safely be used by
224-
different threads concurrently.
225-
226-
The nightly compiler already implements parallel query evaluation as follows:
227-
228-
When a query `foo` is evaluated, the cache table for `foo` is locked.
229-
230-
- If there already is a result, we can clone it, release the lock and
231-
we are done.
232-
- If there is no cache entry and no other active query invocation computing the
233-
same result, we mark the key as being "in progress", release the lock and
234-
start evaluating.
235-
- If there *is* another query invocation for the same key in progress, we
236-
release the lock, and just block the thread until the other invocation has
237-
computed the result we are waiting for. This cannot deadlock because, as
238-
mentioned before, query invocations form a DAG. Some thread will always make
239-
progress.
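Review note: the three cache-lookup cases that this commit moves into `src/parallel-rustc.md` can be sketched with a mutex-protected table and a condition variable. This is an illustration under stated assumptions, not the compiler's implementation: `QueryCache`, `Entry`, and `get_or_compute` are hypothetical names, the standard library's `Mutex` and `Condvar` stand in for whatever `rustc_data_structures/sync.rs` provides, and a panicking provider is not handled.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// State of one key in the cache table (hypothetical names).
#[derive(Clone, Copy)]
enum Entry {
    InProgress,
    Done(u64),
}

/// A toy cache table for a single query `foo`.
struct QueryCache {
    table: Mutex<HashMap<u32, Entry>>,
    finished: Condvar,
}

impl QueryCache {
    fn new() -> Self {
        QueryCache { table: Mutex::new(HashMap::new()), finished: Condvar::new() }
    }

    fn get_or_compute(&self, key: u32, compute: impl FnOnce(u32) -> u64) -> u64 {
        // Lock the cache table for this query.
        let mut table = self.table.lock().unwrap();
        loop {
            let state = table.get(&key).copied();
            match state {
                // A result already exists: copy it; the lock is released on return.
                Some(Entry::Done(value)) => return value,
                // Another invocation is computing this key: release the lock
                // and block until some invocation finishes, then re-check.
                Some(Entry::InProgress) => {
                    table = self.finished.wait(table).unwrap();
                }
                // No entry and nobody computing: mark the key as in progress.
                None => {
                    table.insert(key, Entry::InProgress);
                    break;
                }
            }
        }
        // Release the lock before evaluating so other keys can proceed.
        drop(table);
        let value = compute(key);
        self.table.lock().unwrap().insert(key, Entry::Done(value));
        // Wake every waiter; each one re-checks its own key.
        self.finished.notify_all();
        value
    }
}

fn main() {
    let cache = Arc::new(QueryCache::new());
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || cache.get_or_compute(7, |k| u64::from(k) * 2))
        })
        .collect();
    for handle in handles {
        println!("{}", handle.join().unwrap());
    }
}
```

The sketch relies on the same argument as the text: a waiting thread can only be blocked on a key that some other thread is actively computing, so as long as invocations form a DAG, at least one thread is always able to finish.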
