11# Parallel Compilation
22
33As of <!-- date-check --> August 2022, the only stage of the compiler that
4- is already parallel is codegen. Some other parts of the nightly compiler
5- have parallel implementations, such as query evaluation, type check and
6- monomorphization, but there is still a lot of work to be done. The lack of
7- parallelism at other stages (for example, macro expansion) also represents
8- an opportunity for improving compiler performance.
4+ is already parallel is codegen. Some parts of the compiler already have
5+ parallel implementations, such as query evaluation, type check and
6+ monomorphization, but the general version of the compiler does not include
7+ these parallelization functions. **To try out the current parallel compiler**,
8+ one can install rustc from source code with `parallel-compiler = true` in
9+ the `config.toml`.
910
10- **To try out the current parallel compiler**, one can install rustc from
11- source code with `parallel-compiler = true` in the `config.toml`.
11+ The lack of parallelism at other stages (for example, macro expansion) also
12+ represents an opportunity for improving compiler performance.
1213
1314These next few sections describe where and how parallelism is currently used,
1415and the current status of making parallel compilation the default in `rustc`.
@@ -45,9 +46,15 @@ are implemented differently depending on whether `parallel-compiler` is true.
4546| MappedLockGuard | parking_lot::MappedMutexGuard | std::cell::RefMut |
4647| MetadataRef | [`OwningRef<Box<dyn Erased + Send + Sync>, [u8]>`][OwningRef] | [`OwningRef<Box<dyn Erased>, [u8]>`][OwningRef] |
4748
48- - There are currently a lot of global data structures that need to be made
49- thread-safe. A key strategy here has been converting interior-mutable
50- data-structures (e.g. `Cell`) into their thread-safe siblings (e.g. `Mutex`).
49+ - These thread-safe data structures interspersed during compilation can
50+ cause a lot of lock contention, which actually degrades performance as the
51+ number of threads increases beyond 4. This inspires us to audit the use
52+ of these data structures, leading to either refactoring to reduce use of
53+ shared state, or persistent documentation covering invariants, atomicity,
54+ and lock orderings.
55+
56+ - On the other hand, we still need to figure out what other invariants
57+ during compilation might not hold in parallel compilation.
5158
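The contention point above can be illustrated with a small standalone sketch. It is not code from `rustc`; the function names are hypothetical and only `std` is used. The first function funnels every update through one shared `Mutex`, while the second keeps state local to each thread and combines the results once at the end, which is one form the "reduce use of shared state" refactoring might take.

```rust
use std::sync::Mutex;
use std::thread;

/// Chunk size that splits `len` items across `threads` workers (never zero).
fn chunk_len(len: usize, threads: usize) -> usize {
    let threads = threads.max(1);
    ((len + threads - 1) / threads).max(1)
}

/// Hypothetical example: every single update takes the shared lock,
/// so the worker threads spend most of their time waiting on each other.
fn sum_with_shared_lock(items: &[u64], threads: usize) -> u64 {
    let total = Mutex::new(0u64);
    thread::scope(|s| {
        for chunk in items.chunks(chunk_len(items.len(), threads)) {
            let total = &total;
            s.spawn(move || {
                for &x in chunk {
                    *total.lock().unwrap() += x;
                }
            });
        }
    });
    total.into_inner().unwrap()
}

/// Hypothetical example: each worker accumulates privately and the
/// partial results are merged once, so no lock is needed at all.
fn sum_with_local_state(items: &[u64], threads: usize) -> u64 {
    thread::scope(|s| {
        let handles: Vec<_> = items
            .chunks(chunk_len(items.len(), threads))
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}
```

Both functions return the same result; the difference is only in how often threads have to synchronize. Whether a given compiler data structure can be restructured this way depends on invariants that the audit mentioned above would need to establish.
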
5259### WorkLocal
5360
@@ -64,10 +71,10 @@ can be accessed directly through `Deref::deref`.
6471
6572## Parallel Iterator
6673
67- The parallel iterators provided by the [`rayon`] crate are efficient
68- ways to achieve parallelization. The current nightly rustc uses (a custom
69- fork of) [`rayon`] to run tasks in parallel. The custom fork allows the
70- execution of DAGs of tasks, not just trees.
74+ The parallel iterators provided by the [`rayon`] crate are easy ways
75+ to implement parallelism. In the current implementation of the parallel
76+ compiler we use a custom fork of [`rayon`] to run tasks in parallel.
77+ *(more information wanted here)*
7178
7279Some iterator functions are implemented in the current nightly compiler to
7380run loops in parallel when `parallel-compiler` is true.
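
As a rough illustration of the idea, here is a minimal sketch of a loop helper that runs either serially or in parallel. It is an assumption-laden stand-in, not one of the compiler's helpers: the name `for_each_item` is hypothetical, it uses `std::thread::scope` rather than `rayon`, and it takes a runtime `parallel` flag where the compiler selects the implementation at build time.

```rust
use std::thread;

/// Hypothetical helper: call `f` on every item, splitting the slice across
/// `workers` scoped threads when `parallel` is true and falling back to a
/// plain serial loop otherwise.
fn for_each_item<T, F>(items: &[T], workers: usize, parallel: bool, f: F)
where
    T: Sync,
    F: Fn(&T) + Sync,
{
    if !parallel || workers <= 1 || items.len() <= 1 {
        // Serial path: the same closure, run on the calling thread.
        items.iter().for_each(|item| f(item));
        return;
    }
    // Round up so that every item lands in some chunk.
    let chunk_len = (items.len() + workers - 1) / workers;
    thread::scope(|s| {
        for chunk in items.chunks(chunk_len) {
            let f = &f;
            s.spawn(move || chunk.iter().for_each(|item| f(item)));
        }
    });
}
```

Callers write the loop body once, and the `Sync` bounds force that body to be safe to run from several threads even when the serial path is taken.
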
@@ -124,9 +131,11 @@ When a query `foo` is evaluated, the cache table for `foo` is locked.
124131 start evaluating.
125132- If there *is* another query invocation for the same key in progress, we
126133 release the lock, and just block the thread until the other invocation has
127- computed the result we are waiting for. **Deadlocks are possible**, in which
128- case `rustc_query_system::query::job::deadlock()` will be called to detect
129- and remove the deadlock and then return cycle error as the query result.
134+ computed the result we are waiting for. **Cycle error detection** in the parallel
135+ compiler requires more complex logic than in single-threaded mode. When
136+ worker threads in parallel queries stop making progress due to interdependence,
137+ the compiler uses an extra thread *(named deadlock handler)* to detect, remove and
138+ report the cycle error.
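
The "block until the other invocation finishes" behaviour can be sketched with a lock-protected table and a condition variable. This is a simplified illustration under stated assumptions, not the query system's implementation: `QueryCache`, `Entry`, and `get_or_compute` are hypothetical names, keys and values are plain integers, and there is no cycle detection, so a query that transitively waits on itself would block forever. That is the situation the deadlock handler exists to resolve.

```rust
use std::collections::HashMap;
use std::sync::{Condvar, Mutex};

/// State of one cache slot (hypothetical, for illustration only).
#[derive(Clone, Copy)]
enum Entry {
    InProgress,
    Done(u64),
}

struct QueryCache {
    table: Mutex<HashMap<u32, Entry>>,
    finished: Condvar,
}

impl QueryCache {
    fn new() -> Self {
        QueryCache { table: Mutex::new(HashMap::new()), finished: Condvar::new() }
    }

    /// Return the cached value for `key`, computing it at most once even
    /// when several threads ask for the same key at the same time.
    fn get_or_compute(&self, key: u32, compute: impl FnOnce(u32) -> u64) -> u64 {
        let mut table = self.table.lock().unwrap();
        loop {
            match table.get(&key).copied() {
                // Already computed: return the cached result.
                Some(Entry::Done(value)) => return value,
                // Another thread is computing it: `wait` releases the lock
                // and blocks until some invocation signals completion.
                Some(Entry::InProgress) => table = self.finished.wait(table).unwrap(),
                // Nobody has started yet: this thread will do the work.
                None => break,
            }
        }
        table.insert(key, Entry::InProgress);
        drop(table); // release the lock while the query runs

        let value = compute(key);

        self.table.lock().unwrap().insert(key, Entry::Done(value));
        self.finished.notify_all();
        value
    }
}
```

A single `Condvar` shared by all keys keeps the sketch short; waiters simply re-check their own key after each wake-up.
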
130139
131140Parallel query still has a lot of work to do, most of which is related to
132141the previous `Data Structures` and `Parallel Iterators`. See [this tracking issue][tracking].
@@ -137,22 +146,7 @@ As of <!-- date-check--> May 2022, there are still a number of steps
137146to complete before rustdoc rendering can be made parallel. More details on
138148this issue can be found [here][parallel-rustdoc].
139148
140- ## Current Status
141-
142- As of <!-- date-check --> May 2022, work on explicitly parallelizing the
143- compiler has stalled. There is a lot of design and correctness work that needs
144- to be done.
145-
146- As of <!-- date-check --> May 2022, much of this effort is on hold due
147- to lack of manpower. We have a working prototype with promising performance
148- gains in many cases. However, there are two blockers:
149-
150- - It's not clear what invariants need to be upheld that might not hold in the
151- face of concurrency. An auditing effort was underway, but seems to have
152- stalled at some point.
153-
154- - There is a lot of lock contention, which actually degrades performance as the
155- number of threads increases beyond 4.
149+ ## Resources
156150
157151Here are some resources that can be used to learn more (note that some of them
158152are a bit out of date):