While profiling for #51722, I found a number of cases where operating group-by-group performed better than our cython implementation. Group-by-group iteration is itself expensive, so for that path to come out ahead, the non-iteration portion of the call must be fast. That portion goes through DataFrame.quantile, which in turn goes through np.percentile (in core.array_algos.quantile). This suggests that the np.percentile implementation may be doing something that we should try to port to group_quantile.
Copy/pasting from my notes-to-self at the time:
- Investigate numpy's percentile code
- Our nanmedian does casting and type inference in a way I think is unnecessary
- Profiling groupby.quantile (xref https://github.com/pandas-dev/pandas/pull/51722) suggests that numpy's percentile may just be much more performant than what we have
- https://github.com/numpy/numpy/blob/v1.24.0/numpy/lib/function_base.py#L3920-L4206
- https://github.com/numpy/numpy/blob/v1.24.0/numpy/lib/function_base.py#L3774-L3857
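For anyone who wants to reproduce the comparison, here is a rough sketch of the two paths being compared. It is not the benchmark used in #51722; the helper names (`make_frame`, `quantile_cython_path`, `quantile_group_by_group`, `compare_timings`), the data shape, and the column names are all made up for illustration, and the relative timings will depend on the number of groups, group sizes, and dtypes.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(n_rows=20_000, n_groups=50, seed=0):
    # Hypothetical test data: one integer key column and two float columns.
    rng = np.random.default_rng(seed)
    return pd.DataFrame(
        {
            "key": rng.integers(0, n_groups, size=n_rows),
            "a": rng.standard_normal(n_rows),
            "b": rng.standard_normal(n_rows),
        }
    )


def quantile_cython_path(df, key, q):
    # The regular groupby quantile, which dispatches to group_quantile.
    return df.groupby(key).quantile(q)


def quantile_group_by_group(df, key, q):
    # Iterate over groups in Python and call DataFrame.quantile on each one.
    pieces = {
        name: group.drop(columns=key).quantile(q)
        for name, group in df.groupby(key)
    }
    result = pd.DataFrame.from_dict(pieces, orient="index")
    result.index.name = key
    return result


def compare_timings(df, key="key", q=0.5, number=5):
    # Returns average seconds per call for each path.
    t_cython = timeit.timeit(lambda: quantile_cython_path(df, key, q), number=number)
    t_iter = timeit.timeit(lambda: quantile_group_by_group(df, key, q), number=number)
    return {"cython": t_cython / number, "group_by_group": t_iter / number}
```

Calling `compare_timings(make_frame())` gives a quick read on which path wins for a given shape; varying `n_rows` and `n_groups` should show where the per-group iteration overhead starts to dominate.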