BENCH: collect low-dependency asvs #39917
Merged
The idea here is to collect benchmarks in such a way that we can say "This PR doesn't change code in files [...] -> We can skip running asvs in files [...]". (I'm putting together a POC script to automate a bit of this; a rough sketch is below.)
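A minimal sketch of what such a POC script could do, assuming a hand-maintained mapping from benchmark files to the pandas source paths they exercise (the mapping, paths, and base branch here are hypothetical, not part of this PR):

```python
# Hypothetical sketch: decide which asv benchmark files can be skipped for a
# PR based on which source files it touches.  The dependency mapping and the
# file paths below are illustrative only.
import subprocess

# benchmark file -> pandas source paths it exercises (hypothetical mapping)
BENCH_DEPENDENCIES = {
    "asv_bench/benchmarks/algos/isin.py": ["pandas/core/algorithms.py"],
    "asv_bench/benchmarks/tslibs/": ["pandas/_libs/tslibs/"],
}


def changed_files(base="upstream/master"):
    # Files this PR touches, relative to the base branch
    result = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def benchmarks_to_run(changed):
    # Keep a benchmark file only if one of its dependencies was modified
    return [
        bench
        for bench, deps in BENCH_DEPENDENCIES.items()
        if any(path.startswith(dep) for dep in deps for path in changed)
    ]


if __name__ == "__main__":
    print(benchmarks_to_run(changed_files()))
```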
There is a cost to refactoring benchmarks, since we have to re-run the new benchmarks on older commits (and I'm not sure if that happens automatically or if some human intervention is needed cc @TomAugspurger ?). On the flip side, this is the kind of problem that we can throw hardware at, so I don't worry about it that much. Still, the bar for merging is higher than it would be for non-benchmark refactors.
With that in mind, the other refactors I'm looking at:
We have benchmarks for e.g. `isin` scattered around, benchmarking `algos.isin(x, y)`, `pd.array(x).isin(y)`, `Index(x).isin(y)`, `Series(x).isin(y)`. I'd like to collect these in one place and parametrize them (sketched below). Similarly for Index methods: we have a lot of similar benchmarks for the different Index subclasses that I'd like to collect and parametrize.
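For the `isin` case, a collected benchmark could look roughly like the following asv-style class, parametrized over the entry point (the class name, parameters, and data sizes are illustrative, not the final layout):

```python
# Illustrative sketch of a single parametrized isin benchmark covering the
# entry points listed above; names and data sizes are placeholders.
import numpy as np

import pandas as pd
from pandas.core import algorithms as algos


class Isin:
    params = ["algos", "array", "index", "series"]
    param_names = ["variant"]

    def setup(self, variant):
        self.values = np.random.randint(0, 10_000, size=100_000)
        self.targets = np.arange(1_000)

    def time_isin(self, variant):
        if variant == "algos":
            algos.isin(self.values, self.targets)
        elif variant == "array":
            pd.array(self.values).isin(self.targets)
        elif variant == "index":
            pd.Index(self.values).isin(self.targets)
        else:
            pd.Series(self.values).isin(self.targets)
```

With something like this, asv reports each variant as a case of the same benchmark, which keeps results comparable across the entry points and makes it easy to skip the whole file when the underlying code isn't touched.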