BUG: groupby with dropna=False and pa.dictionary drops NA values #60567

Closed
@rhshadrach

Description

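import pandas as pd
import pyarrow as pa

# Column "A" uses an Arrow dictionary-encoded string dtype and contains one NA.
df = pd.DataFrame({'A': ['a1', pd.NA]}, dtype=pd.ArrowDtype(pa.dictionary(pa.int32(), pa.utf8())))
print(df.groupby("A", dropna=False)[["A"]].first())
#      A
# A     
# a1  a1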

The result should have a second row for the NA group, since dropna=False.
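
For comparison, here is a minimal sketch of the behavior expected from dropna=False. It assumes the default object dtype rather than the Arrow dictionary dtype from the report. The value column "B" is hypothetical and is added only so the NA group has something to aggregate.

```python
import numpy as np
import pandas as pd

# Same grouping keys as the report, but stored with the default object dtype
# (assumption: no pyarrow involved). Column "B" is a made-up value column.
df = pd.DataFrame({"A": ["a1", np.nan], "B": [1, 2]})

# With dropna=False the missing key forms its own group.
result = df.groupby("A", dropna=False)["B"].first()
print(result)

# One group for "a1" and one for the missing key.
print(len(result))
```

With the object dtype, the missing key gets its own group, so the result has two rows. The Arrow dictionary case in the report returns only one.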

Labels: Arrow (pyarrow functionality), Groupby, Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)