Type issue in empty groupby from DataFrame with categorical #9614

Closed
@xflr6

In a DataFrame without a categorical, the following comparisons work as expected:

import pandas as pd

df = pd.DataFrame({'id': [None] * 3, 'spam': [None] * 3})
df['spam'] == 'spam'
df.groupby('id').first()['spam'] == 'spam'

However, when a column is Categorical, a groupby on the all-null column behaves unexpectedly:

df['spam'] = df['spam'].astype('category')
df['spam'] == 'spam'  # works as expected
df.groupby('id').first()['spam'] == 'spam'  # raises TypeError: invalid type comparison

It looks like the groupby casts every column of the empty result to float64:

>>> df.groupby('id').first().info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 0 entries
Data columns (total 1 columns):
spam    0 non-null float64
dtypes: float64(1)
memory usage: 0.0 bytes
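
A possible workaround, sketched here under the assumption that casting the aggregated column back to the original categorical dtype is acceptable for the caller. The names `result`, `restored` and `mask` are only illustrative, and the sketch has not been checked against the pandas version from this report:

```python
import pandas as pd

df = pd.DataFrame({'id': [None] * 3, 'spam': [None] * 3})
df['spam'] = df['spam'].astype('category')

# Every group key is null, so the groupby result contains no rows.
result = df.groupby('id').first()

# Cast the aggregated column back to the original categorical dtype
# before comparing. Where the dtype is already preserved, this is a no-op.
restored = result['spam'].astype(df['spam'].dtype)

# The comparison now runs against a categorical column, as in the
# ungrouped case, and yields an empty boolean Series.
mask = restored == 'spam'
```

This only sidesteps the comparison error; it does not address why the empty result loses its dtype in the first place.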
