ENH: Better documentation or default behavior for GroupBy for columns with non-sortable values #57525

Open
@gabuzi

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

As of pandas 2.2.0, groupby raises a relatively cryptic error when the grouping column contains object values that are not sortable.

The default value of the sort argument in groupby is True, which understandably can't be honored for columns whose values can't be ordered against each other (e.g. values of different types).

Feature Description

Currently:

import pandas as pd

df = pd.DataFrame({'a': [False, 'string', True], 'b': [1, 2, 3]})
df.groupby('a').describe()  # works

df = pd.DataFrame({'a': [False, (1, 2,), True], 'b': [1, 2, 3]})
df.groupby('a').describe()  # fails!

# TypeError: '<' not supported between instances of 'tuple' and 'bool'

It would be nice if pandas simply fell back to not sorting the output. As the case with the 'string' value shows, pandas is already lenient: it sorts a mix of booleans and strings without complaining, which is not technically correct. It would be convenient if this leniency were extended to other types.
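The requested fallback can be approximated in user code today. The following is only a minimal sketch, not pandas' behavior or a proposed implementation: `grouped_sum_with_fallback` is a hypothetical helper name, and it assumes that a failed key comparison surfaces as a `TypeError`, as in the traceback above. It uses a plain sum rather than describe() to keep the example small.

```python
import pandas as pd


def grouped_sum_with_fallback(df, by, column):
    """Hypothetical helper: sum `column` per group, sorted when possible."""
    try:
        # Default behavior: group keys are sorted in the result.
        return df.groupby(by, sort=True)[column].sum()
    except TypeError:
        # The keys could not be compared with each other, so keep the
        # groups in order of first appearance instead.
        return df.groupby(by, sort=False)[column].sum()


df = pd.DataFrame({'a': [False, (1, 2), True], 'b': [1, 2, 3]})
result = grouped_sum_with_fallback(df, 'a', 'b')
print(result)
```

Catching `TypeError` this broadly can also hide unrelated errors raised during the aggregation, which is one argument for handling the fallback inside pandas rather than in a wrapper like this.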

Alternative Solutions

I realize that the suggested change may be contentious, since sort=True explicitly requests sorted output. For applications that rely on the output actually being sorted, it might be better to stay strict and keep raising an error. In that case, however, the error message should hint that sort=False would solve the issue. Then again, sorting is currently allowed between some values that are not really comparable (string vs. bool), so for consistency I would suggest that the sort argument be ignored when the values cannot be sorted.
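The stricter alternative, keeping the error but pointing at sort=False, could look roughly like this from the caller's side. This is again a sketch under assumptions: `grouped_sum_strict` is a hypothetical wrapper, the hint wording is made up for illustration, and pandas itself does not emit this message.

```python
import pandas as pd


def grouped_sum_strict(df, by, column):
    """Hypothetical wrapper: keep the error, but add a sort=False hint."""
    try:
        return df.groupby(by, sort=True)[column].sum()
    except TypeError as err:
        # Re-raise with a hint; the original error stays attached as __cause__.
        raise TypeError(
            f"{err}. The group keys in {by!r} could not be sorted; "
            "pass sort=False to group without sorting."
        ) from err


# Sortable keys are unaffected by the wrapper.
ok = pd.DataFrame({'a': ['x', 'y', 'x'], 'b': [1, 2, 3]})
sorted_sums = grouped_sum_strict(ok, 'a', 'b')
print(sorted_sums)
```

Callers that depend on sorted output still get an exception for unsortable keys, but the message now names the way out.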

Additional Context

No response

Metadata

    Labels

    Enhancement, Groupby, Needs Discussion (Requires discussion from core team before further action), Sorting (e.g. sort_index, sort_values)
