Description
In [1]: import pandas as pd
In [2]: pd.Series(dtype='m8[ns]').sum()
Out[2]: 0
In [3]: pd.Series(dtype='m8[ns]').mean()
Out[3]:
0 NaT
dtype: timedelta64[ns]
In [4]: pd.Series(dtype='m8[ns]').median()
Out[4]: nan
In [5]: pd.Series(dtype=float).sum()
Out[5]: 0
In [6]: pd.Series([0], dtype=float).sum()
Out[6]: 0.0
It looks like _wrap_results is doing double duty: it converts an output to the right time dtype and also wraps that result in a Series (I don't know why it does this), and in some cases results aren't getting wrapped at all. The wrapper also ignores every other dtype, which doesn't make sense to me. Aside from the unfortunate fact that numpy provides no _NA_integer equivalent, I as a user expect the dtype of a result to depend only on the dtype of the inputs, not on their shapes or values, so it would make sense to have a more general wrapper somewhere.
I'm not sure how I'd implement such a wrapper, because ideally it would handle cases such as the standard deviation of a Series of datetime64s being a timedelta64. Maybe that's a bit ambitious, though, since it's inching towards a general notion of symbolic units; in such cases the user can unwrap, operate, and wrap manually.
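To make the idea concrete, here is a rough numpy-only sketch of what a more general wrapper might look like. None of this is existing pandas code: result_dtype, wrap_result, datetime_std and the set of "spread" operations are hypothetical names invented for illustration, and the sketch deliberately leaves the integer-NA problem unsolved.

```python
import numpy as np

# Hypothetical: reductions whose result on datetime64 input is a duration.
SPREAD_OPS = {"std", "sem"}


def result_dtype(input_dtype, op):
    """Pick the result dtype from the input dtype and the operation only."""
    input_dtype = np.dtype(input_dtype)
    if input_dtype.kind == "M" and op in SPREAD_OPS:
        unit, _ = np.datetime_data(input_dtype)
        return np.dtype(f"m8[{unit}]")
    return input_dtype


def wrap_result(raw, input_dtype, op):
    """Coerce a raw scalar (computed on int64/float64 data) to the result dtype."""
    target = result_dtype(input_dtype, op)
    if target.kind in "mM":
        unit, _ = np.datetime_data(target)
        # A missing time-like result is NaT, never a float nan.
        if raw is None or (isinstance(raw, float) and np.isnan(raw)):
            return target.type("NaT", unit)
        return target.type(int(raw), unit)
    # Note: an integer target with a nan result would still raise here,
    # since numpy has no integer NA value.
    return target.type(raw)


def datetime_std(values):
    """Unwrap to integers, operate, then wrap: population std of datetimes."""
    values = np.asarray(values, dtype="M8[ns]")
    raw = values.view("i8").astype("f8").std()
    return wrap_result(int(round(float(raw))), values.dtype, "std")
```

With a wrapper along these lines, an empty timedelta sum that comes back as a raw 0 would be returned as a zero timedelta64, and a nan median would be returned as NaT, regardless of the shape or values of the input.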
It looks like there are several points in pandas/core/nanops.py where a result is returned without wrapping. I could submit a narrow PR for just this issue, but I want to wait until I have time to really go over nanops.py and get all of these.