Regression in DataFrame.sum with mixed datetime, numeric and missing values #30886

Closed
@TomAugspurger

On master

In [3]: df = pd.DataFrame({"A": pd.date_range("2000", periods=4), "B": [1, 2, 3, 4]}).reindex([2, 3, 4])

In [4]: df.sum()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: unsupported operand type(s) for +: 'Timestamp' and 'Timestamp'

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
<ipython-input-4-7e5fdb616c56> in <module>
----> 1 df.sum()

~/sandbox/pandas/pandas/core/generic.py in stat_func(self, axis, skipna, level, numeric_only, min_count, **kwargs)
  11035             skipna=skipna,
  11036             numeric_only=numeric_only,
> 11037             min_count=min_count,
  11038         )
  11039

~/sandbox/pandas/pandas/core/frame.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   7885             values = self.values
   7886             try:
-> 7887                 result = f(values)
   7888
   7889                 if filter_type == "bool" and is_object_dtype(values) and axis is None:

~/sandbox/pandas/pandas/core/frame.py in f(x)
   7843
   7844         def f(x):
-> 7845             return op(x, axis=axis, skipna=skipna, **kwds)
   7846
   7847         def _get_data(axis_matters):

~/sandbox/pandas/pandas/core/nanops.py in _f(*args, **kwargs)
     67             try:
     68                 with np.errstate(invalid="ignore"):
---> 69                     return f(*args, **kwargs)
     70             except ValueError as e:
     71                 # we want to transform an object array

~/sandbox/pandas/pandas/core/nanops.py in nansum(values, axis, skipna, min_count, mask)
    491     elif is_timedelta64_dtype(dtype):
    492         dtype_sum = np.float64
--> 493     the_sum = values.sum(axis, dtype=dtype_sum)
    494     the_sum = _maybe_null_out(the_sum, axis, mask, values.shape, min_count=min_count)
    495

~/Envs/pandas-dev/lib/python3.7/site-packages/numpy/core/_methods.py in _sum(a, axis, dtype, out, keepdims, initial, where)
     36 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,
     37          initial=_NoValue, where=True):
---> 38     return umr_sum(a, axis, dtype, out, keepdims, initial, where)
     39
     40 def _prod(a, axis=None, dtype=None, out=None, keepdims=False,

~/sandbox/pandas/pandas/_libs/tslibs/c_timestamp.pyx in pandas._libs.tslibs.c_timestamp._Timestamp.__add__()

~/sandbox/pandas/pandas/_libs/tslibs/c_timestamp.pyx in pandas._libs.tslibs.c_timestamp.integer_op_not_supported()

SystemError: <class 'TypeError'> returned a result with an error set

On 0.25.x

In [1]: import pandas as pd
In [2]: df = pd.DataFrame({"A": pd.date_range("2000", periods=4), "B": [1, 2, 3, 4]}).reindex([2, 3, 4])

In [3]: df.sum()
Out[3]:
B    7.0
dtype: float64
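
Not part of the original report: a minimal sketch of a possible workaround, assuming it is acceptable to restrict the reduction to numeric columns explicitly. It builds the same frame as above and passes `numeric_only=True` to `DataFrame.sum`, so the datetime column "A" is excluded up front rather than dropped implicitly.

```python
import pandas as pd

# Same frame as in the report: reindexing to label 4 introduces a missing row,
# so "A" holds a NaT and "B" is upcast to float64 with a NaN.
df = pd.DataFrame(
    {"A": pd.date_range("2000", periods=4), "B": [1, 2, 3, 4]}
).reindex([2, 3, 4])

# Assumption: only numeric columns matter for the sum.
# numeric_only=True keeps "B" and skips the datetime column "A".
result = df.sum(numeric_only=True)
print(result)
```

This should give the same `B    7.0` result as the 0.25.x output above, since the remaining values of "B" are 3.0, 4.0 and NaN, and NaN is skipped by default.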

cc @jbrockmendel

Labels

Datetime: Datetime data dtype
Numeric Operations: Arithmetic, Comparison, and Logical operations
Regression: Functionality that used to work in a prior pandas version
