
CI: Continuous benchmarking #36860

Closed

Description

@dsaxton

I think it would be helpful if pandas had a set of performance benchmarks that run automatically on every CI run (it looks like there is some CI hook for the asv benchmarks, but it seems they almost never actually run?). This would reduce the friction involved in manually running the asv suite and pasting results as comments, and would also help prevent regressions from slipping through simply because no one thought to run the benchmarks.

There exists the pytest plugin pytest-benchmark, which seems even more lightweight than asv and appears to be what RAPIDS uses for their own benchmark tooling: https://github.com/rapidsai/benchmark. Could we have a GitHub Action that runs pytest-benchmark on every PR? I don't know the plugin well, but I assume it could be configured to report the delta between master and the PR branch, and possibly also between the branch and some baseline commit on master such as a major release. It may also be possible to cache the results from master somewhere so they aren't rerun every time.
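For illustration, a minimal sketch of what a pytest-benchmark test could look like (the test name and workload are hypothetical, not an existing pandas benchmark; `benchmark` is the fixture the plugin provides):

```python
import numpy as np
import pandas as pd


def test_groupby_sum(benchmark):
    # pytest-benchmark injects the `benchmark` fixture, which calls the
    # function repeatedly and records timing statistics (min/mean/stddev).
    df = pd.DataFrame(
        {
            "key": np.random.randint(0, 100, size=100_000),
            "value": np.random.rand(100_000),
        }
    )
    result = benchmark(lambda: df.groupby("key")["value"].sum())
    assert not result.empty
```

Running `pytest --benchmark-autosave` on master would store a baseline, and a PR run could then be compared against it with `--benchmark-compare`, which is roughly the master-vs-branch delta described above.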

What constitutes "failure" is another question; it may be best to configure things to only warn on a slowdown instead of making the whole run red (that would also help with flaky benchmarks). We would also presumably need fine-grained control over the hardware GitHub runs the job on to make sure it doesn't drift between runs, and I'm not sure that's possible.
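As a rough sketch of the warn-only idea (the job layout, baseline run ID, and 10% threshold are all made-up placeholders; `continue-on-error` is the existing Actions mechanism for letting a job fail without failing the run):

```yaml
# Hypothetical workflow fragment; not an existing pandas workflow.
on: pull_request

jobs:
  benchmarks:
    runs-on: ubuntu-latest
    # Warn-only: the job can go red without failing the whole workflow.
    continue-on-error: true
    steps:
      - uses: actions/checkout@v2
      - run: pip install pandas pytest pytest-benchmark
      - name: Compare against a baseline saved from master
        run: |
          # Assumes a baseline from master was saved earlier as run 0001
          # (e.g. with --benchmark-autosave). Fails, which here only means
          # warns, if the mean regresses by more than 10%.
          pytest benchmarks/ \
            --benchmark-compare=0001 \
            --benchmark-compare-fail=mean:10%
```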


Labels: Benchmark, CI, Performance
