BUG: DataFrame float reductions with object input #49618

Closed
@rhshadrach

Description

By "float reduction", I mean any reduction that would coerce bool or int to float - e.g. mean, std, skew, kurt.
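As a minimal sketch of that definition, the snippet below applies the four reductions named above to an int column and a bool column and collects the resulting dtypes. The frame and its column names are made up for illustration and are not taken from the report.

```python
import pandas as pd

# Hypothetical example frame: one int64 column and one bool column.
df = pd.DataFrame({"ints": [1, 2, 3, 4], "bools": [True, False, True, True]})

# Call each "float reduction" and record the dtype of the resulting Series.
float_reduction_dtypes = {
    name: getattr(df, name)().dtype for name in ["mean", "std", "skew", "kurt"]
}
print(float_reduction_dtypes)
```

Each entry should report float64, since neither int nor bool input keeps its dtype under these reductions.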

Along with bool and int, we also coerce object to float:

import pandas as pd

print(pd.DataFrame(data=[[1]], columns=["a"], dtype=object).mean(axis=0))
# a    1.0
# dtype: float64

The reason we cast bool / int to float is to make the resulting dtype not value-dependent. However, it's not clear to me whether this is the correct thing to do with object. I think I would have expected to get object dtype back.
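To illustrate what "not value-dependent" means for int input, here is a small sketch with two made-up frames: one whose mean happens to be a whole number and one whose mean is fractional. It only covers int input; it makes no claim about what object input should return.

```python
import pandas as pd

# Hypothetical data: the first mean is a whole number, the second is not.
whole = pd.DataFrame({"a": [1, 3]}).mean()       # mean of 1 and 3 is 2
fractional = pd.DataFrame({"a": [1, 2]}).mean()  # mean of 1 and 2 is 1.5

# Both results are float64, so the dtype is known without inspecting values.
print(whole.dtype, fractional.dtype)
```

If the first result came back as an integer dtype, callers would have to look at the data to know what dtype to expect.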

This was noticed because of the following inconsistency:

print(pd.DataFrame(columns=["a"], dtype=object).mean())
# a    NaN
# dtype: object

In the case of an empty frame with columns, we do end up with object dtype.

Metadata

    Labels

    Closing Candidate: May be closeable, needs more eyeballs
    DataFrame: DataFrame data structure
    Dtype Conversions: Unexpected or buggy dtype conversions
    Needs Discussion: Requires discussion from core team before further action
    Reduction Operations: sum, mean, min, max, etc.
