You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: posts/inside-rust/2020-06-24-lto-improvements.md
+3-2Lines changed: 3 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -24,7 +24,7 @@ When compiling a library, `rustc` saves the output in an `rlib` file which is an
24
24
LTO is an optimization technique that can perform whole-program analysis. It analyzes all of the bitcode from every library at once, and performs optimizations and code generation. `rustc` supports several forms of LTO:
25
25
26
26
* Fat LTO. This performs "full" LTO, which can take a long time to complete and may require a significant amount of memory.
27
-
*[Thin LTO]. This is a lightweight version of "fat" LTO that can achieve similar performance improvements while taking much less time to complete.
27
+
*[Thin LTO]. This LTO variant supports much better parallelism than fat LTO. It can achieve similar performance improvements as fat LTO (sometimes even better!), while taking much less total time by taking advantage of more CPUs.
28
28
* Thin-local LTO. By default, `rustc` will split a crate into multiple "codegen units" so that they can be processed in parallel by LLVM. But this prevents some optimizations as code is separated into different codegen units, and is handled independently. Thin-local LTO will perform thin LTO across the codegen units within a single crate, bringing back some optimizations that would otherwise be lost by the separation. This is `rustc`'s default behavior if opt-level is greater than 0.
29
29
30
30
## What has changed
@@ -35,7 +35,7 @@ If the project is using LTO, then Cargo will instruct `rustc` to not place objec
35
35
36
36
Two `rustc` flags are now available to control how the rlib is constructed:
37
37
38
-
*[`-C linker-plugin-lto`] causes `rustc` to only place bitcode in the `.o` files, and skips code generation. Cargo uses this when the rlib is only intended for use with LTO. This can also be used when doing cross-language LTO.
38
+
*[`-C linker-plugin-lto`] causes `rustc` to only place bitcode in the `.o` files, and skips code generation. This flag was [originally added][linker-plugin-lto-track] to support cross-language LTO. Cargo now uses this when the rlib is only intended for use with LTO.
39
39
*[`-C embed-bitcode=no`] causes `rustc` to avoid placing bitcode in the rlib altogether. Cargo uses this when LTO is not being used, which reduces some disk space usage.
40
40
41
41
Additionally, the method in which bitcode is embedded in the rlib has changed. Previously, `rustc` would place compressed bitcode as a `.bc.z` file in the rlib archive. Now, the bitcode is placed as an uncompressed section within each `.o`[object file] in the rlib archive. This can sometimes be a small performance benefit, because it avoids cost of compressing the bitcode, and sometimes can be slower due to needing to write more data to disk. This change helped simplify the implementation, and also matches the behavior of clang's `-fembed-bitcode` option (typically used with Apple's iOS-based operating systems).
@@ -89,3 +89,4 @@ Although this is a conceptually simple change (LTO=bitcode, non-LTO=object code)
0 commit comments