Cache dependencies between crates published by the same owner #1757

Open
@jyn514

Right now, when a large workspace (e.g. https://github.com/aptos-labs/aptos-core) publishes its crates, we spend a lot of time rebuilding the same dependencies over and over again. To reduce the load on docs.rs and speed up builds, we can cache those dependencies instead of rebuilding them.

We are only planning to share the cache between crates owned by the same crates.io owner, for security reasons; see https://discord.com/channels/442252698964721669/541978667522195476/996125244026794134 for a discussion of how sharing a cache causes issues between crates. In practice, this should still be a giant speedup, since most of our build time comes from large workspaces.

To keep time limits deterministic, we are going to count the time spent compiling a dependency against the time limit, even if we already have the dependency cached locally. To avoid running out of disk space, we'll use an LRU cache that deletes crates when we have less than ~20 GB free on the production server. For reproducibility, we are going to make the cache read-only and wipe the target/doc directory after each root crate we build.
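
The eviction policy described above could look roughly like the following. This is only a sketch: `CacheEntry`, `LruCache`, `touch`, and `evict_until_free` are hypothetical names, not existing docs.rs code, and the caller is assumed to supply the current free disk space and to delete the evicted artifacts itself.

```rust
use std::collections::VecDeque;

/// Hypothetical record for one cached dependency artifact.
#[derive(Debug, Clone, PartialEq)]
pub struct CacheEntry {
    pub name: String,
    pub size_bytes: u64,
}

/// Free-space threshold taken from the issue text (~20 GB).
pub const MIN_FREE_BYTES: u64 = 20 * 1024 * 1024 * 1024;

/// Minimal LRU bookkeeping: front = least recently used, back = most recent.
pub struct LruCache {
    entries: VecDeque<CacheEntry>,
}

impl LruCache {
    pub fn new() -> Self {
        LruCache { entries: VecDeque::new() }
    }

    /// Mark an entry as most recently used, inserting it if it is new.
    pub fn touch(&mut self, name: &str, size_bytes: u64) {
        if let Some(pos) = self.entries.iter().position(|e| e.name == name) {
            self.entries.remove(pos);
        }
        self.entries.push_back(CacheEntry {
            name: name.to_string(),
            size_bytes,
        });
    }

    /// Evict least recently used entries until `free_bytes` reaches
    /// `min_free`, or the cache is empty. Returns the evicted names so the
    /// caller can remove the corresponding files from disk.
    pub fn evict_until_free(&mut self, mut free_bytes: u64, min_free: u64) -> Vec<String> {
        let mut evicted = Vec::new();
        while free_bytes < min_free {
            match self.entries.pop_front() {
                Some(entry) => {
                    free_bytes = free_bytes.saturating_add(entry.size_bytes);
                    evicted.push(entry.name);
                }
                None => break,
            }
        }
        evicted
    }
}
```

A linear scan in `touch` is fine for a sketch; a real implementation would probably track last-use times in the database instead.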

Here is the rough approach we're imagining:

  1. Run cargo check --timings=json -Zunstable-options -p {dep} for each dependency of the crate we're building.
  2. If we can parse the JSON output, record the timings in a database. Otherwise, make a note to wipe the cache for everything we were unable to parse after this build finishes.
  3. Run chmod -w on everything in target (except for the top-level directory, so cargo can create target/doc).
  4. Subtract the timings of all dependencies from the time limit (even if some were already cached). If we were unable to parse the JSON, subtract the entire time needed in step 1.
  5. Run cargo doc and upload to S3.
  6. Delete target/doc.
  7. Repeat 1-6 for all new crates.io crates published by the same owner.
  8. Wipe the cache and the database table the next time we call rustwide.update_toolchain, since the new nightly will be unable to reuse the cache.
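
The time accounting in step 4 can be sketched as follows. The names `DepTimings` and `remaining_budget` are hypothetical, and the sketch assumes the recorded timings are available as plain durations; it does not parse cargo's JSON output or talk to a database.

```rust
use std::time::Duration;

/// Hypothetical timing record for one dependency.
pub enum DepTimings {
    /// The JSON output was parsed: per-unit compile times as recorded in the
    /// database. These are charged even if the dependency was already cached.
    Parsed(Vec<Duration>),
    /// The JSON output could not be parsed: fall back to the whole wall-clock
    /// time of the cargo check run.
    Unparsed { wall_clock: Duration },
}

/// Subtract the cost of every dependency from the build's time limit.
/// The result saturates at zero rather than underflowing.
pub fn remaining_budget(limit: Duration, deps: &[DepTimings]) -> Duration {
    let mut remaining = limit;
    for dep in deps {
        let cost = match dep {
            DepTimings::Parsed(units) => units.iter().sum::<Duration>(),
            DepTimings::Unparsed { wall_clock } => *wall_clock,
        };
        remaining = remaining.saturating_sub(cost);
    }
    remaining
}
```

Whatever is left over would then be the time limit applied to the cargo doc run in step 5. Because cached dependencies are charged their recorded cost, the remaining budget does not depend on what happens to be in the cache.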

Labels: A-builds (Area: Building the documentation for a crate), E-medium (Effort: This requires a fair amount of work)
