Performance bottlenecks in Invoker.invoked() #201

Closed
@JoshRosen

Description

I'm in the process of integrating Scoverage into Apache Spark's build and am currently blocked by performance issues in Invoker.invoked(): with coverage instrumentation enabled, some of my suites take up to 10x longer to run.

I profiled one suite in YourKit and found that Invoker.invoked() spends a huge amount of time computing hash codes to index into a TrieMap. For example, see the following profiler screenshot:

[Image: YourKit profiler screenshot of Invoker.invoked()]

This particular bottleneck is caused by the fact that Invoker has a map

private val ids = ThreadSafeMap.empty[(String, Int), Any]

where the keys are (coverageDirectory, id) pairs. Replacing this single-level map with nested maps (keyed first by coverageDirectory, then by id) removes the need to construct and hash a tuple on every invocation, which speeds things up massively.
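As a rough illustration of the nested-map idea, here is a minimal sketch. It assumes ThreadSafeMap is backed by scala.collection.concurrent.TrieMap; the object and method names (NestedIdsSketch, markInvoked) are hypothetical and are not the actual Invoker code.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical sketch, not scoverage's actual Invoker implementation.
object NestedIdsSketch {
  // Outer map: one entry per coverage data directory.
  // Inner map: statement ids already recorded for that directory.
  private val dataDirToIds = TrieMap.empty[String, TrieMap[Int, Any]]

  /** Returns true the first time (dataDir, id) is seen, false afterwards. */
  def markInvoked(id: Int, dataDir: String): Boolean = {
    // No tuple is allocated here, and java.lang.String caches its hash code,
    // so the outer lookup does not rehash the directory on every call.
    val ids = dataDirToIds.get(dataDir) match {
      case Some(existing) => existing
      case None =>
        val fresh = TrieMap.empty[Int, Any]
        // If another thread won the race, reuse its inner map instead of ours.
        dataDirToIds.putIfAbsent(dataDir, fresh).getOrElse(fresh)
    }
    // putIfAbsent returns None when the id was not present before.
    ids.putIfAbsent(id, ()).isEmpty
  }
}
```

The caller would only write a measurement when markInvoked returns true, so repeat invocations of an already recorded statement cost two map lookups and no tuple construction.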

After that optimization, this method ends up becoming bottlenecked on OutputStreamWriter.flush() calls. This bottleneck wasn't apparent before because it was masked by the hashCode issue. Here's profiling output showing this:

[Image: YourKit profiler screenshot showing time spent in OutputStreamWriter.flush()]

I see a few rationales for the current aggressive flush()ing behavior:

  1. We may not always have the opportunity to flush in a JVM exit hook (e.g. if SBT is running tests in non-fork mode).
  2. You may want to collect coverage data if the JVM exits in an unclean way (e.g. kill -9).

I think that (2) is less of a concern, but (1) is a problem in some environments.

For Spark, we always run tests in forked JVMs, so flushing only on JVM exit would be perfectly acceptable to us. I would therefore like to introduce an option to enable this behavior. I can't spot a clear mechanism to plumb configuration options from SBT to the Invoker, so I propose using a system property to control it.
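A minimal sketch of what such an option might look like follows. The property name scoverage.flushOnExit and the names FlushPolicySketch, MeasurementWriter, and install are assumptions made for illustration; none of them is an existing scoverage setting or API.

```scala
import java.io.Writer

// Hypothetical sketch; names and the property key are illustrative assumptions.
object FlushPolicySketch {
  // Assumed property name, e.g. passed to the forked JVM as -Dscoverage.flushOnExit=true
  val FlushOnExitProperty = "scoverage.flushOnExit"

  final class MeasurementWriter(underlying: Writer, flushEveryWrite: Boolean) {
    /** Appends one statement id per line, flushing immediately only if configured to. */
    def record(id: Int): Unit = underlying.synchronized {
      underlying.append(id.toString).append('\n')
      if (flushEveryWrite) underlying.flush()
    }

    def flush(): Unit = underlying.synchronized { underlying.flush() }
  }

  /** Chooses the flush policy from the system property and registers a hook if needed. */
  def install(underlying: Writer): MeasurementWriter = {
    // Boolean.getBoolean is true only when the property is set to "true".
    val deferred = java.lang.Boolean.getBoolean(FlushOnExitProperty)
    val writer = new MeasurementWriter(underlying, flushEveryWrite = !deferred)
    // Shutdown hooks run on normal exit, but not after kill -9.
    if (deferred) sys.addShutdownHook(writer.flush())
    writer
  }
}
```

With the property unset, this keeps the current flush-on-every-write behavior, so non-forked SBT runs would be unaffected. Deferring the flush only pays off if the underlying writer actually buffers, for example a BufferedWriter wrapped around the measurement file's writer.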

I plan to submit pull requests for both issues (the hashCode() optimization and the shutdown hook option).
