
Regression of compilation times on the latest nightly (2018-03-20) #49237

Closed
@newpavlov

For the keccak crate, I use simple loop-unrolling macros:

macro_rules! unroll5 {
    ($var:ident, $body:block) => {
        { const $var: usize = 0; $body; }
        { const $var: usize = 1; $body; }
        { const $var: usize = 2; $body; }
        { const $var: usize = 3; $body; }
        { const $var: usize = 4; $body; }
    };
}

Combined, these macros generate a lot of unrolled code. On the latest nightly, the crate takes much longer to build than it does on stable:

$ cargo clean; rustup run stable cargo build
   Compiling keccak v0.1.0 (...)
    Finished dev [unoptimized + debuginfo] target(s) in 3.9 secs
$ cargo clean; rustup run stable cargo check
   Compiling keccak v0.1.0 (...)
    Finished dev [unoptimized + debuginfo] target(s) in 2.30 secs
$ cargo clean; rustup run nightly cargo build
   Compiling keccak v0.1.0 (...)
    Finished dev [unoptimized + debuginfo] target(s) in 18.65 secs
$ cargo clean; rustup run nightly cargo check
   Compiling keccak v0.1.0 (...)
    Finished dev [unoptimized + debuginfo] target(s) in 2.32 secs

Here I've used only 10 iterations instead of 24 in the unroll24 macro; the full version takes several minutes to compile. Judging by the cargo check times, expansion takes approximately the same time on both toolchains, and the drastic difference comes from later stages.
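For readers unfamiliar with the pattern, here is a minimal sketch of how nesting such macros multiplies the generated code. Only `unroll5` is taken from the issue; `sum_products` and its arguments are hypothetical and are not part of the keccak crate.

```rust
// `unroll5` is copied from the issue; everything below it is an
// illustrative assumption, not code from the keccak crate.
macro_rules! unroll5 {
    ($var:ident, $body:block) => {
        { const $var: usize = 0; $body; }
        { const $var: usize = 1; $body; }
        { const $var: usize = 2; $body; }
        { const $var: usize = 3; $body; }
        { const $var: usize = 4; $body; }
    };
}

// Hypothetical helper: sums a[i] * b[j] over all index pairs.
fn sum_products(a: &[u64; 5], b: &[u64; 5]) -> u64 {
    let mut acc = 0u64;
    // Each unroll5 call pastes its body five times, so nesting two of
    // them expands the innermost statement 5 * 5 = 25 times, each copy
    // with its own pair of `const` items.
    unroll5!(I, {
        unroll5!(J, {
            acc += a[I] * b[J];
        });
    });
    acc
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    let b = [1, 1, 1, 1, 1];
    println!("{}", sum_products(&a, &b));
}
```

With deeper nesting or a larger macro such as unroll24, the number of expanded blocks and `const` items grows multiplicatively, which is presumably why the later compilation stages are so sensitive to it.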

EDIT: RUSTFLAGS="-Z time-passes" rustup run nightly cargo build produces the following result:

  time: 0.001; rss: 48MB	parsing
  time: 0.000; rss: 50MB	garbage collect incremental cache directory
  time: 0.000; rss: 50MB	recursion limit
  time: 0.000; rss: 50MB	crate injection
  time: 0.000; rss: 50MB	plugin loading
  time: 0.000; rss: 50MB	plugin registration
  time: 0.000; rss: 50MB	background load prev dep-graph
  time: 0.059; rss: 71MB	expansion
  time: 0.000; rss: 71MB	maybe building test harness
  time: 0.001; rss: 71MB	maybe creating a macro crate
  time: 0.003; rss: 71MB	creating allocators
  time: 0.002; rss: 71MB	AST validation
  time: 0.013; rss: 74MB	name resolution
  time: 0.002; rss: 74MB	complete gated feature checking
  time: 0.000; rss: 74MB	blocked while dep-graph loading finishes
  time: 0.016; rss: 80MB	lowering ast -> hir
  time: 0.008; rss: 80MB	early lint checks
  time: 0.021; rss: 83MB	indexing hir
  time: 0.000; rss: 79MB	load query result cache
  time: 0.000; rss: 79MB	looking for entry point
  time: 0.000; rss: 79MB	looking for plugin registrar
  time: 0.001; rss: 79MB	loop checking
  time: 0.000; rss: 81MB	attribute checking
  time: 0.006; rss: 84MB	stability checking
  time: 0.012; rss: 88MB	type collecting
  time: 0.000; rss: 88MB	outlives testing
  time: 0.000; rss: 88MB	impl wf inference
  time: 0.000; rss: 88MB	coherence checking
  time: 0.000; rss: 88MB	variance testing
  time: 0.046; rss: 106MB	wf checking
  time: 0.023; rss: 109MB	item-types checking
  time: 1.518; rss: 121MB	item-bodies checking
  time: 0.031; rss: 122MB	rvalue promotion
  time: 0.014; rss: 122MB	privacy checking
  time: 0.002; rss: 122MB	intrinsic checking
  time: 0.006; rss: 122MB	match checking
  time: 0.084; rss: 118MB	liveness checking
  time: 0.324; rss: 134MB	borrow checking
  time: 0.002; rss: 135MB	MIR borrow checking
  time: 0.000; rss: 135MB	MIR effect checking
  time: 0.003; rss: 135MB	death checking
  time: 0.000; rss: 135MB	unused lib feature checking
  time: 0.039; rss: 138MB	lint checking
  time: 0.000; rss: 138MB	dumping chalk-like clauses
  time: 0.000; rss: 138MB	resolving dependency formats
    time: 0.037; rss: 140MB	write metadata
    time: 14.945; rss: 143MB	translation item collection
    time: 0.000; rss: 143MB	codegen unit partitioning
    time: 0.142; rss: 151MB	translate to LLVM IR
    time: 0.000; rss: 151MB	assert dep graph
    time: 0.000; rss: 152MB	llvm function passes [2lyh15q6cjwzy18c]
    time: 0.000; rss: 152MB	llvm module passes [2lyh15q6cjwzy18c]
    time: 0.002; rss: 159MB	codegen passes [2lyh15q6cjwzy18c]
      time: 0.024; rss: 160MB	persist query result cache
    time: 0.030; rss: 162MB	llvm function passes [30rksvufw6ddw8se]
    time: 0.004; rss: 163MB	llvm module passes [30rksvufw6ddw8se]
      time: 0.011; rss: 161MB	persist dep-graph
    time: 0.034; rss: 161MB	serialize dep graph
  time: 15.160; rss: 161MB	translation
    time: 0.526; rss: 117MB	codegen passes [30rksvufw6ddw8se]
  time: 0.645; rss: 108MB	LLVM passes
  time: 0.000; rss: 108MB	serialize work products
  time: 0.001; rss: 108MB	linking
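To spot the dominating pass in output like the above without reading every line, a small helper can sort the entries by time. This is only a sketch: it assumes each line has the shape `time: <seconds>; rss: <n>MB<whitespace><pass name>` as shown above, and `parse_line` and `slowest_passes` are hypothetical names, not part of rustc or cargo.

```rust
/// Parse one time-passes line, e.g. "time: 0.059; rss: 71MB\texpansion",
/// into (seconds, pass name). Returns None for lines of another shape.
fn parse_line(line: &str) -> Option<(f64, String)> {
    let rest = line.trim_start().strip_prefix("time: ")?;
    let (secs, rest) = rest.split_once(';')?;
    let secs: f64 = secs.trim().parse().ok()?;
    // After "rss: <n>MB" comes whitespace (a tab in the output above)
    // followed by the pass name.
    let rest = rest.trim_start().strip_prefix("rss: ")?;
    let (_rss, name) = rest.split_once(char::is_whitespace)?;
    Some((secs, name.trim().to_string()))
}

/// Return the `n` slowest passes, slowest first.
fn slowest_passes(log: &str, n: usize) -> Vec<(f64, String)> {
    let mut passes: Vec<_> = log.lines().filter_map(parse_line).collect();
    passes.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
    passes.truncate(n);
    passes
}

fn main() {
    // A few lines copied from the output above.
    let log = "  time: 0.059; rss: 71MB\texpansion\n\
               \x20 time: 1.518; rss: 121MB\titem-bodies checking\n\
               \x20   time: 14.945; rss: 143MB\ttranslation item collection\n\
               \x20 time: 15.160; rss: 161MB\ttranslation\n";
    for (secs, name) in slowest_passes(log, 3) {
        println!("{:>8.3}s  {}", secs, name);
    }
}
```

Note that indented entries are sub-passes whose time is already included in their parent, so "translation item collection" (14.945 s) accounts for almost all of "translation" (15.160 s).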

Metadata

Labels

A-const-eval: Area: Constant evaluation, covers all const contexts (static, const fn, ...)
A-macros: Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..)
C-enhancement: Category: An issue proposing an enhancement or a PR with one.
I-compiletime: Issue: Problems and improvements with respect to compile times.
P-medium: Medium priority
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
regression-from-stable-to-stable: Performance or correctness regression from one stable version to another.
