BUG: nanops on empty Series have wrong type #7869

Closed
@ischwabacher

In [1]: import pandas as pd

In [2]: pd.Series(dtype='m8[ns]').sum()
Out[2]: 0

In [3]: pd.Series(dtype='m8[ns]').mean()
Out[3]: 
0   NaT
dtype: timedelta64[ns]

In [4]: pd.Series(dtype='m8[ns]').median()
Out[4]: nan

In [5]: pd.Series(dtype=float).sum()
Out[5]: 0

In [6]: pd.Series([0], dtype=float).sum()
Out[6]: 0.0

It looks like _wrap_results is doing double duty: it converts an output to the right time dtype and also wraps that result in a Series (I don't know why it does this), and in some cases results aren't getting wrapped at all. The wrapper also ignores all other dtypes, which doesn't make sense to me. Aside from the unfortunate fact that numpy provides no _NA_integer equivalent, I as a user expect the dtype of a result to depend only on the dtype of the inputs, not on their shapes or values, so it would make sense to have a more general wrapper somewhere.
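A minimal sketch of the "result dtype depends only on input dtype" idea, written against plain NumPy rather than pandas internals. The helper name `typed_empty_result` and its `op` strings are hypothetical and are not part of nanops.py; the sketch only shows what a typed result for an empty input could look like.

```python
import numpy as np


def typed_empty_result(dtype, op):
    """Hypothetical helper: the value a reduction `op` could return for an
    empty input of `dtype`, chosen from the dtype alone."""
    dtype = np.dtype(dtype)
    if op == "sum":
        if dtype.kind == "M":
            # Adding datetimes together is not meaningful.
            raise TypeError("sum is not defined for datetime64")
        # Additive identity, expressed in the input's own dtype.
        return np.zeros((), dtype=dtype)[()]
    if op in ("mean", "median"):
        if dtype.kind in "mM":
            # NaT is the missing-value marker for timedelta64/datetime64.
            return np.array(["NaT"], dtype=dtype)[0]
        if dtype.kind == "f":
            return np.array([np.nan], dtype=dtype)[0]
        # NumPy integers have no NA value, so fall back to a float NaN.
        return np.float64("nan")
    raise ValueError(f"unsupported op: {op!r}")


empty_td_sum = typed_empty_result("m8[ns]", "sum")        # timedelta64 zero
empty_td_median = typed_empty_result("m8[ns]", "median")  # timedelta64 NaT
empty_float_sum = typed_empty_result(float, "sum")        # float64 zero
```

Under this assumption, the sum of an empty timedelta64 input is a timedelta64 zero rather than a plain integer, and its median is NaT rather than a float nan.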

I'm not sure how I'd implement such a wrapper, because ideally it would handle cases where the result dtype differs from the input dtype, such as the standard deviation of a Series of datetime64s being a timedelta64. Maybe that's a bit ambitious, though, since it's inching towards a general notion of symbolic units; in such cases the user can unwrap, operate, and wrap manually.
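For illustration, the manual "unwrap, operate, wrap" route for the standard deviation of datetime64 values might look like the following. This is a sketch using plain NumPy with made-up sample timestamps; it is not how nanops.py handles the case.

```python
import numpy as np

# Hypothetical sample data: three timestamps one day apart.
stamps = np.array(["2014-07-29", "2014-07-30", "2014-07-31"], dtype="M8[ns]")

# Unwrap: take offsets from the earliest timestamp as integer nanoseconds.
# The standard deviation is shift-invariant, and small offsets avoid the
# float64 precision loss that raw epoch nanoseconds would incur.
offsets_ns = (stamps - stamps.min()).astype("i8")

# Operate: an ordinary sample standard deviation, in nanoseconds.
std_ns = offsets_ns.std(ddof=1)

# Wrap: the spread of a set of datetimes is a duration, so the result
# is expressed as a timedelta64 rather than a datetime64.
std = np.timedelta64(int(round(std_ns)), "ns")
```

The wrap step is where the dtype changes from datetime64 to timedelta64, which is the part a general wrapper would have to know about for each operation.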

It looks like there are several points in pandas/core/nanops.py where a result is returned without wrapping. I could submit a narrow PR for just this issue, but I want to wait until I have time to really go over nanops.py and catch all of them.
