Skip to content

BUG: retain ordered Categorical dtype in SeriesGroupBy aggregations #41147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 30, 2021

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Apr 24, 2021

  • closes #xxxx
  • tests added / passed
  • Ensure all linting tests pass, see here for how to run them
  • whatsnew entry

ATM we correctly wrap them for DataFrameGroupBy but not for SeriesGroupBy. Doing this correctly for SeriesGroupBy has the benefit of avoiding the need to special-case it in DataFrameGroupBy

@jbrockmendel jbrockmendel added Bug Categorical Categorical Data Type Groupby labels Apr 25, 2021
result = type(values)._from_sequence(result.ravel(), dtype=values.dtype)
# Note this will have result.dtype == dtype from above

elif (
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in conjunction with #41086, cast_agg_result becomes a 1-liner and can be cleanly tacked onto the end of py_fallback

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just a question regarding possible future work

)
objvals = obj._values

if isinstance(objvals, Categorical):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to try to separate this out further upstream in the future, so that _cython_agg_general isn't calling pure-python methods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'll get this case handled within WrappedCythonOp._ea_wrap_cython_operation before too long, but im still troubleshooting that

@jreback jreback added this to the 1.3 milestone Apr 30, 2021
@jreback
Copy link
Contributor

jreback commented Apr 30, 2021

@jbrockmendel lgtm. was there a precursor to merge first?

@jbrockmendel
Copy link
Member Author

lgtm. was there a precursor to merge first?

already got #41111

@jreback jreback merged commit bddc683 into pandas-dev:master Apr 30, 2021
@jbrockmendel jbrockmendel deleted the bug-gb-cat branch April 30, 2021 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Categorical Categorical Data Type Groupby
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants