Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas
df = pandas.DataFrame(
    {
        "A": [1, 2, 3, 4, 5] * 4,
        "B": [1, 2, 3, 4, 5] * 4,
        "C": [1, 2, 3, 4, 5] * 4,
    }
)
df.groupby(["A", "B"]).transform(lambda x: x)
Issue Description
👋 Since about February 27, the above snippet has been generating a segmentation fault in pandas main. As far as I can tell, this is coming from get_group_index_sorter() in pandas.core.sorting.
Based on the timing and git history, it may be related to #45953, though I've been unable to identify the source of the problem thus far.
A few observations:
- The length of the series seems to matter. If I shorten the sample df to have length 15, things work fine.
- It seems to matter whether I group by more than one field (just grouping by "A" works fine).
- The segfault only happens for transform. If I use apply, it works.
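For anyone who wants to check these observations without losing their interpreter, here is a rough sketch that runs each variant in a separate Python process with faulthandler enabled. The helper names (make_snippet, run_isolated) and the variant labels are made up for illustration and are not part of pandas. It assumes a POSIX system, where a negative return code means the child was killed by a signal (for example -11 for SIGSEGV); Windows reports crashes differently.

```python
import subprocess
import sys


def make_snippet(reps, keys, method):
    """Build a reproducer script (hypothetical helper, not a pandas API)."""
    return (
        "import pandas\n"
        f"df = pandas.DataFrame({{c: [1, 2, 3, 4, 5] * {reps} for c in 'ABC'}})\n"
        f"df.groupby({keys!r}).{method}(lambda x: x)\n"
    )


def run_isolated(snippet):
    """Run a snippet in a child interpreter so a crash cannot kill this one."""
    # -X faulthandler makes the child dump a Python traceback on a fatal signal.
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
    )


# One entry per observation in the list above.
VARIANTS = {
    "original (length 20, two keys, transform)": make_snippet(4, ["A", "B"], "transform"),
    "shorter frame (length 15)": make_snippet(3, ["A", "B"], "transform"),
    "single key": make_snippet(4, "A", "transform"),
    "apply instead of transform": make_snippet(4, ["A", "B"], "apply"),
}


def main():
    for name, snippet in VARIANTS.items():
        result = run_isolated(snippet)
        if result.returncode < 0:
            status = f"killed by signal {-result.returncode}"
        elif result.returncode == 0:
            status = "ok"
        else:
            status = f"exited with code {result.returncode}"
        print(f"{name}: {status}")
        if result.returncode != 0:
            # stderr holds the faulthandler dump or an ordinary traceback.
            print(result.stderr)


if __name__ == "__main__":
    main()
```

On an affected build I would expect only the first variant to be reported as killed by a signal, and the faulthandler dump on stderr should show which Python frames were active at the time of the crash.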
Expected Behavior
No segfault should occur.
Installed Versions
This shows up on pandas main.
Based on the nightly builds here, it seems like the first affected version was this one.