
MSVC rustc is unnaturally slower than Linux rustc #66192

Open
@alexcrichton

While it's generally assumed that build systems on Windows are slower than build systems on Linux, I'm seeing a discrepancy of nearly 2x in per-crate compile times between a Windows machine and a Linux machine. These are personal machines I work on and they're not exactly equivalent, but I'm pretty surprised by the 2x difference and wanted to open an issue so we can investigate and get to the bottom of what's going on.

The specifications of the machines I have are:

  • Linux - Intel(R) Core(TM) i9-7940X CPU @ 3.10GHz, 14-core/28-thread, 64GB ram
  • Windows - Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz, 4-core/8-thread, 32GB ram

I don't know a ton about Intel CPUs, so I'm not sure whether it's expected for the i9 to be 2x faster than the i7. I wanted to write down some details, though, to see if others have thoughts. All Cargo commands were executed with -j4 to ensure that neither machine had an unfair parallelism advantage, and ideally to isolate the effect of hyperthreads.

I started out by building https://github.com/cranestation/wasmtime/tree/ab3cd945bc2f4626a2fae8eabf6c7108973ce1a5 and collected the full -Ztimings graph on each machine.

For the same project and the same compiler commit the Windows build is nearly 70% slower! I don't think that my CPUs have a 70% performance difference between them, and I don't have a perfect test environment for this, but 70% feels like a huge performance discrepancy between Linux and Windows.

Glancing at the slow-building crates (use the "min unit time" slider to see them more easily), I'm seeing that almost all crates are 2x slower on Windows than on Linux. This doesn't look like a "chalk it up to Windows being slow" issue; this is where I started thinking that a bug somewhere in rustc and/or LLVM was more likely.

Next up I wanted to try out -Z self-profile on a particular crate. One I wrote recently was the wast crate, which took 13.76s on Linux and 23.05s on Windows. I dug in a bit more building just that crate at https://github.com/alexcrichton/wat/tree/2288911124001d30de0a68e284db9ab010495536/crates/wast.

Here, sure enough, the command cargo +nightly build --release -p wast -j4 shows a huge discrepancy:

  • Linux - 5.18s
  • Windows - 8.58s
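
To put the two wast measurements on the same footing as the "nearly 70%" figure for the whole build, here is a minimal sketch of the arithmetic. The helper name `percent_slower` is made up for illustration; the timings are the ones quoted in this report.

```rust
/// Percentage by which `slow` exceeds `fast` (both in seconds).
/// Hypothetical helper, not part of any tool mentioned in this report.
fn percent_slower(fast: f64, slow: f64) -> f64 {
    (slow / fast - 1.0) * 100.0
}

fn main() {
    // (label, Linux seconds, Windows seconds), copied from the report.
    let measurements = [
        ("wast inside the wasmtime build", 13.76, 23.05),
        ("wast built on its own", 5.18, 8.58),
    ];

    for &(label, linux, windows) in measurements.iter() {
        println!(
            "{}: Windows is {:.1}% slower ({:.2}x)",
            label,
            percent_slower(linux, windows),
            windows / linux
        );
    }
}
```

Both cases come out at roughly two-thirds slower, in line with the whole-build figure.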

Next up I tried -Z self-profile and, using measureme, ran summarize diff and got this output, notably:

+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| Item                                        | Self Time     | Item count | Cache hits | Blocked time | Incremental load time |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_thin_lto_optimize                      | +3.86042516s  | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_module_optimize_module_passes          | +3.152410865s | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj                | +1.783877999s | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| codegen_crate                               | +1.021669947s | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_thin_lto_import                        | +245.950489ms | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| codegen_module                              | +220.253166ms | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_module_optimize_function_passes        | +134.256719ms | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_make_bitcode            | +111.530996ms | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
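
To see how much of the listed difference sits in LLVM-labelled items, the rows above can be summed mechanically. This is a rough sketch, not part of measureme: the function names are invented for illustration, and it only handles the row layout shown in this excerpt.

```rust
/// Rows copied from the excerpt above (only the listed items, not the full diff).
const DIFF: &str = "\
| LLVM_thin_lto_optimize                      | +3.86042516s  | +0 | +0 | +0ns | +0ns |
| LLVM_module_optimize_module_passes          | +3.152410865s | +0 | +0 | +0ns | +0ns |
| LLVM_module_codegen_emit_obj                | +1.783877999s | +0 | +0 | +0ns | +0ns |
| codegen_crate                               | +1.021669947s | +0 | +0 | +0ns | +0ns |
| LLVM_thin_lto_import                        | +245.950489ms | +0 | +0 | +0ns | +0ns |
| codegen_module                              | +220.253166ms | +0 | +0 | +0ns | +0ns |
| LLVM_module_optimize_function_passes        | +134.256719ms | +0 | +0 | +0ns | +0ns |
| LLVM_module_codegen_make_bitcode            | +111.530996ms | +0 | +0 | +0ns | +0ns |";

/// Parse a signed duration such as "+3.86042516s" or "+245.950489ms" into seconds.
fn parse_seconds(text: &str) -> Option<f64> {
    let text = text.trim().trim_start_matches('+');
    // Longer suffixes come first, because "ms" and "ns" also end in "s".
    let units = [("ns", 1e-9), ("µs", 1e-6), ("us", 1e-6), ("ms", 1e-3), ("s", 1.0)];
    for &(suffix, scale) in units.iter() {
        if let Some(number) = text.strip_suffix(suffix) {
            return number.trim().parse::<f64>().ok().map(|value| value * scale);
        }
    }
    None
}

/// Extract (item name, self-time delta in seconds) from one table row.
/// Border lines and the header row yield `None`.
fn parse_row(line: &str) -> Option<(String, f64)> {
    let mut cells = line.split('|').map(str::trim);
    cells.next()?; // text before the first '|', empty for data rows
    let item = cells.next()?;
    let self_time = parse_seconds(cells.next()?)?;
    Some((item.to_string(), self_time))
}

/// Sum the self-time deltas of all rows whose item name starts with `prefix`.
fn sum_self_time(table: &str, prefix: &str) -> f64 {
    table
        .lines()
        .filter_map(parse_row)
        .filter(|(item, _)| item.starts_with(prefix))
        .map(|(_, seconds)| seconds)
        .sum()
}

fn main() {
    let total = sum_self_time(DIFF, "");
    let llvm = sum_self_time(DIFF, "LLVM_");
    println!("listed rows: +{:.3}s, of which LLVM_*: +{:.3}s", total, llvm);
    println!("LLVM share of listed rows: {:.1}%", llvm / total * 100.0);
}
```

The summed self time is larger than the wall-clock gap between the two builds, most likely because self time is accumulated across threads that run in parallel.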

For whatever reason, it appears that LLVM is massively slower on Windows than it is on Linux.

It was at this point that I decided to write up the issue here and get this all down in a report. I suspect that this is either a build system problem with Windows or it's a compiler problem. We're using Clang on Linux but we're not using Clang on Windows yet, so it may be time to make the transition!

Labels

  • C-bug (Category: This is a bug.)
  • I-compiletime (Issue: Problems and improvements with respect to compile times.)
  • O-windows-msvc (Toolchain: MSVC, Operating system: Windows)
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
  • T-infra (Relevant to the infrastructure team, which will review and decide on the PR/issue.)