Is your feature request related to a problem?
I often have a list of (variable-size) intervals and want to aggregate multiple statistics over the rows that fall into each interval. Currently I loop over the intervals:
import numpy as np
import pandas as pd
data = pd.DataFrame({'a': [1]*100, 'b': [2]*100}, index=np.arange(0.0, 10, 0.1))
starts = [0.0, 2.5, 5, 7]
ends = [0.7, 3.8, 6.1, 9.5]
means = []
for start, end in zip(starts, ends):
    means.append(data.loc[start:end, :].mean())
Describe the solution you'd like
It would be cool (and probably faster) to do:
groups = pd.IntervalIndex.from_arrays(starts, ends)
data.groupby(groups).mean()  # currently raises ValueError: Grouper and axis must be same length
# or for multiple statistics
data.groupby(groups).aggregate(['mean', 'sum', lambda x: x.quantile(0.75) - x.quantile(0.25)])
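In the meantime, a possible workaround is to map each index value to its interval with `pd.cut` and group by the resulting labels. This is only a sketch under two assumptions: the intervals do not overlap (`pd.cut` rejects an overlapping `IntervalIndex`), and `closed='both'` is the right choice to mirror the inclusive endpoints of `.loc` slicing. The `labels` variable and the `iqr` helper are names made up for this example.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'a': [1]*100, 'b': [2]*100}, index=np.arange(0.0, 10, 0.1))
starts = [0.0, 2.5, 5, 7]
ends = [0.7, 3.8, 6.1, 9.5]

# closed='both' is an assumption: it mirrors .loc slicing, which includes both endpoints
groups = pd.IntervalIndex.from_arrays(starts, ends, closed='both')

# Assign each index value to the interval containing it;
# values that fall outside every interval get NaN and are dropped by groupby
labels = pd.cut(data.index, bins=groups)

def iqr(x):
    # Hypothetical helper: interquartile range of one column within a group
    return x.quantile(0.75) - x.quantile(0.25)

# One row per interval
means = data.groupby(labels, observed=False).mean()

# Several statistics at once; columns become a (column, statistic) MultiIndex
stats = data.groupby(labels, observed=False).agg(['mean', 'sum', iqr])
```

This avoids the Python-level loop, but it does not cover overlapping intervals, where a row would have to belong to more than one group.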