
Compilation memory spikes during LTO #65431

Closed
@tiagolam


Test case:

static FILLER: [u8; 60 * 1024 * 1024] = [1; 60 * 1024 * 1024];

fn main() {
    println!("Hello, world! {}", FILLER[0]);

    tokio::runtime::current_thread::Runtime::new().unwrap();
}

(with the dependency tokio = "^0.2.0-alpha.3" added to Cargo.toml)

The FILLER is there so we can simulate a binary of a certain size.

Compiling with:

cargo build --release

This generates memory spikes of around 12 GiB, which sometimes causes the compilation to fail because the OS kills the process (signal: 9, SIGKILL: kill).

Further analysis suggests it has to do with LTO. Running:

cargo rustc --release -p spike -- -Ztime-passes

produces:

time: 0.000; rss: 51MB    parsing
  time: 0.000; rss: 51MB    attributes injection
  time: 0.000; rss: 51MB    recursion limit
  time: 0.000; rss: 51MB    plugin loading
  time: 0.000; rss: 51MB    plugin registration
  time: 0.000; rss: 51MB    pre-AST-expansion lint checks
  time: 0.000; rss: 54MB    crate injection
    time: 0.009; rss: 67MB  expand crate
    time: 0.000; rss: 67MB  check unused macros
  time: 0.009; rss: 67MB    expansion
  time: 0.000; rss: 67MB    maybe building test harness
  time: 0.000; rss: 67MB    AST validation
  time: 0.000; rss: 67MB    maybe creating a macro crate
  time: 0.013; rss: 85MB    name resolution
  time: 0.000; rss: 85MB    complete gated feature checking
  time: 0.000; rss: 87MB    lowering AST -> HIR
  time: 0.000; rss: 87MB    early lint checks
    time: 0.000; rss: 87MB  validate HIR map
  time: 0.000; rss: 87MB    indexing HIR
  time: 0.000; rss: 87MB    load query result cache
  time: 0.000; rss: 89MB    dep graph tcx init
    time: 0.000; rss: 89MB  looking for entry point
    time: 0.000; rss: 89MB  looking for plugin registrar
    time: 0.000; rss: 89MB  looking for derive registrar
  time: 0.000; rss: 89MB    misc checking 1
  time: 0.000; rss: 92MB    type collecting
  time: 0.000; rss: 92MB    impl wf inference
    time: 0.000; rss: 92MB  unsafety checking
    time: 0.000; rss: 92MB  orphan checking
  time: 0.000; rss: 92MB    coherence checking
  time: 0.009; rss: 112MB   wf checking
  time: 0.000; rss: 112MB   item-types checking
  time: 0.013; rss: 127MB   item-bodies checking
    time: 0.000; rss: 127MB match checking
    time: 0.000; rss: 127MB liveness checking + intrinsic checking
  time: 0.000; rss: 127MB   misc checking 2
  time: 0.002; rss: 127MB   MIR borrow checking
  time: 0.000; rss: 127MB   dumping Chalk-like clauses
  time: 0.000; rss: 127MB   MIR effect checking
  time: 0.000; rss: 127MB   layout testing
    time: 0.000; rss: 127MB privacy access levels
    time: 0.000; rss: 127MB private in public
    time: 0.000; rss: 127MB death checking
    time: 0.000; rss: 127MB unused lib feature checking
      time: 3.072; rss: 199MB   crate lints
      time: 0.000; rss: 199MB   module lints
    time: 3.072; rss: 199MB lint checking
    time: 0.000; rss: 199MB privacy checking modules
  time: 3.073; rss: 199MB   misc checking 3
  time: 0.000; rss: 199MB   metadata encoding and writing
      time: 0.000; rss: 199MB   collecting roots
      time: 0.169; rss: 214MB   collecting mono items
    time: 0.169; rss: 214MB monomorphization collection
    time: 0.001; rss: 214MB codegen unit partitioning
    time: 0.000; rss: 215MB write allocator module
    time: 0.003; rss: 221MB llvm function passes [what.i4xg1d4t-cgu.0]
    time: 0.003; rss: 225MB llvm function passes [what.i4xg1d4t-cgu.3]
    time: 0.003; rss: 227MB llvm function passes [what.i4xg1d4t-cgu.8]
    time: 0.002; rss: 228MB llvm function passes [what.i4xg1d4t-cgu.15]
    time: 0.002; rss: 229MB llvm function passes [what.i4xg1d4t-cgu.9]
    time: 0.002; rss: 231MB llvm function passes [what.i4xg1d4t-cgu.4]
    time: 0.003; rss: 233MB llvm function passes [what.i4xg1d4t-cgu.2]
    time: 0.023; rss: 233MB llvm module passes [what.i4xg1d4t-cgu.15]
    time: 0.002; rss: 234MB llvm function passes [what.i4xg1d4t-cgu.6]
    time: 0.002; rss: 235MB llvm function passes [what.i4xg1d4t-cgu.13]
    time: 0.023; rss: 235MB llvm module passes [what.i4xg1d4t-cgu.4]
    time: 0.039; rss: 235MB llvm module passes [what.i4xg1d4t-cgu.8]
    time: 0.031; rss: 235MB llvm module passes [what.i4xg1d4t-cgu.9]
    time: 0.060; rss: 236MB llvm module passes [what.i4xg1d4t-cgu.0]
    time: 0.055; rss: 237MB llvm module passes [what.i4xg1d4t-cgu.3]
    time: 0.026; rss: 237MB llvm module passes [what.i4xg1d4t-cgu.2]
    time: 0.014; rss: 237MB llvm module passes [what.i4xg1d4t-cgu.13]
    time: 0.025; rss: 238MB llvm module passes [what.i4xg1d4t-cgu.6]
    time: 0.002; rss: 301MB llvm function passes [what.i4xg1d4t-cgu.12]
    time: 0.001; rss: 301MB llvm function passes [what.i4xg1d4t-cgu.11]
    time: 0.001; rss: 301MB llvm function passes [what.i4xg1d4t-cgu.1]
    time: 0.001; rss: 301MB llvm function passes [what.i4xg1d4t-cgu.5]
    time: 0.001; rss: 301MB llvm function passes [what.i4xg1d4t-cgu.7]
    time: 0.001; rss: 300MB llvm function passes [what.i4xg1d4t-cgu.14]
    time: 0.160; rss: 300MB codegen to LLVM IR
    time: 0.000; rss: 300MB assert dep graph
    time: 0.000; rss: 300MB serialize dep graph
  time: 0.335; rss: 300MB   codegen
    time: 0.008; rss: 300MB llvm module passes [what.i4xg1d4t-cgu.11]
    time: 0.001; rss: 300MB llvm function passes [what.i4xg1d4t-cgu.10]
    time: 0.006; rss: 301MB llvm module passes [what.i4xg1d4t-cgu.7]
    time: 0.011; rss: 301MB llvm module passes [what.i4xg1d4t-cgu.1]
    time: 0.004; rss: 301MB llvm module passes [what.i4xg1d4t-cgu.10]
    time: 0.007; rss: 301MB llvm module passes [what.i4xg1d4t-cgu.14]
    time: 0.017; rss: 301MB llvm module passes [what.i4xg1d4t-cgu.5]
    time: 0.027; rss: 301MB llvm module passes [what.i4xg1d4t-cgu.12]
    time: 0.001; rss: 1359MB    LTO passes
    time: 0.002; rss: 1369MB    LTO passes
    time: 0.001; rss: 1371MB    codegen passes [what.i4xg1d4t-cgu.1]
    time: 0.003; rss: 1386MB    codegen passes [what.i4xg1d4t-cgu.10]
    time: 0.007; rss: 1397MB    LTO passes
    time: 0.004; rss: 1408MB    LTO passes
    time: 0.010; rss: 1416MB    LTO passes
    time: 0.008; rss: 1463MB    codegen passes [what.i4xg1d4t-cgu.14]
    time: 0.013; rss: 1480MB    codegen passes [what.i4xg1d4t-cgu.7]
    time: 0.015; rss: 1515MB    codegen passes [what.i4xg1d4t-cgu.4]
    time: 0.009; rss: 1561MB    LTO passes
    time: 0.001; rss: 1569MB    LTO passes
    time: 0.001; rss: 1575MB    codegen passes [what.i4xg1d4t-cgu.11]
    time: 0.006; rss: 1586MB    codegen passes [what.i4xg1d4t-cgu.5]
    time: 0.065; rss: 7193MB    LTO passes
    time: 0.222; rss: 8929MB    codegen passes [what.i4xg1d4t-cgu.12]
    time: 0.032; rss: 9334MB    LTO passes
    time: 0.034; rss: 9430MB    codegen passes [what.i4xg1d4t-cgu.0]
    time: 0.025; rss: 9572MB    LTO passes
    time: 0.021; rss: 9604MB    LTO passes
    time: 0.021; rss: 9620MB    codegen passes [what.i4xg1d4t-cgu.2]
    time: 0.028; rss: 9622MB    LTO passes
    time: 0.017; rss: 9668MB    codegen passes [what.i4xg1d4t-cgu.8]
    time: 0.026; rss: 9669MB    codegen passes [what.i4xg1d4t-cgu.3]
    time: 0.006; rss: 9756MB    LTO passes
    time: 0.007; rss: 9756MB    codegen passes [what.i4xg1d4t-cgu.13]
    time: 0.035; rss: 9756MB    LTO passes
    time: 0.019; rss: 9756MB    LTO passes
    time: 0.040; rss: 9756MB    LTO passes
    time: 0.013; rss: 9756MB    codegen passes [what.i4xg1d4t-cgu.9]
    time: 0.016; rss: 9756MB    codegen passes [what.i4xg1d4t-cgu.15]
    time: 0.048; rss: 9757MB    codegen passes [what.i4xg1d4t-cgu.6]
  time: 2.697; rss: 9757MB  LLVM passes
  time: 0.000; rss: 9757MB  serialize work products
    time: 0.493; rss: 9758MB    running linker
  time: 0.500; rss: 9758MB  linking
time: 6.535; rss: 9713MB        total
    Finished release [optimized] target(s) in 6.82s
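
To pinpoint where the RSS jumps in a log like the one above, the lines can be parsed and consecutive samples compared. The following is a minimal sketch, assuming every pass line has the form `time: <secs>; rss: <n>MB <name>` as shown above. `PassSample`, `parse_line` and `largest_jump` are hypothetical helper names introduced for illustration and are not part of rustc or cargo.

```rust
/// One parsed `-Ztime-passes` line: resident set size in MB and the pass name.
#[derive(Debug, PartialEq)]
struct PassSample {
    rss_mb: u64,
    name: String,
}

/// Parse a line of the form `time: 0.065; rss: 7193MB    LTO passes`.
/// Returns `None` for lines that do not match (e.g. cargo's `Finished` line).
fn parse_line(line: &str) -> Option<PassSample> {
    let rest = line.trim().strip_prefix("time:")?;
    let (_time, rest) = rest.split_once(';')?;
    let rest = rest.trim().strip_prefix("rss:")?;
    let (rss, name) = rest.trim_start().split_once("MB")?;
    Some(PassSample {
        rss_mb: rss.trim().parse().ok()?,
        name: name.trim().to_string(),
    })
}

/// Find the sample with the largest RSS increase over the previous sample.
/// Returns the increase in MB and the name of the pass reported on that line.
fn largest_jump(log: &str) -> Option<(u64, String)> {
    let samples: Vec<PassSample> = log.lines().filter_map(parse_line).collect();
    samples
        .windows(2)
        .map(|w| (w[1].rss_mb.saturating_sub(w[0].rss_mb), w[1].name.clone()))
        .max_by_key(|(delta, _)| *delta)
}

fn main() {
    // A few lines copied from the output above.
    let log = "\
    time: 0.027; rss: 301MB llvm module passes [what.i4xg1d4t-cgu.12]
    time: 0.001; rss: 1359MB    LTO passes
    time: 0.006; rss: 1586MB    codegen passes [what.i4xg1d4t-cgu.5]
    time: 0.065; rss: 7193MB    LTO passes
    time: 0.222; rss: 8929MB    codegen passes [what.i4xg1d4t-cgu.12]
    Finished release [optimized] target(s) in 6.82s";
    if let Some((delta, name)) = largest_jump(log) {
        println!("largest RSS jump: +{} MB at {:?}", delta, name);
    }
}
```

On the excerpt embedded in `main`, the largest step is the one from 1586MB to 7193MB, reported on an LTO passes line. The attribution is only approximate, since codegen units are processed concurrently and the log lines interleave.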

The issue can be worked around by either disabling LTO with RUSTFLAGS='-C lto=no' or changing FILLER to a static array of zeros:

static FILLER: [u8; 60 * 1024 * 1024] = [0; 60 * 1024 * 1024];


Labels

A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
C-bug: Category: This is a bug.
I-compilemem: Issue: Problems and improvements with respect to memory usage during compilation.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
