Description
Cannot compute mean of timedelta in gropuby
In [1]: import pandas as pd
...: import numpy as np
...: print(pd.__version__)
...:
...: time = pd.to_timedelta(np.random.randint(100,1000, size=10), unit='s')
...: cat = np.random.choice(['A','B','C'], size=10)
...:
...: df = pd.DataFrame(dict(time=time, cat=cat))
...:
...: print(df['time'].mean())
...:
...: df.groupby('cat')['time'].mean()
...:
0.20.3
0 days 00:07:15.800000
---------------------------------------------------------------------------
DataError Traceback (most recent call last)
<ipython-input-1-b649ebd78333> in <module>()
10 print(df['time'].mean())
11
---> 12 df.groupby('cat')['time'].mean()
/Users/adefusco/Applications/anaconda3/4.3/envs/module-pandas/lib/python3.6/site-packages/pandas/core/groupby.py in mean(self, *args, **kwargs)
1035 nv.validate_groupby_func('mean', args, kwargs, ['numeric_only'])
1036 try:
-> 1037 return self._cython_agg_general('mean', **kwargs)
1038 except GroupByError:
1039 raise
/Users/adefusco/Applications/anaconda3/4.3/envs/module-pandas/lib/python3.6/site-packages/pandas/core/groupby.py in _cython_agg_general(self, how, alt, numeric_only)
834
835 if len(output) == 0:
--> 836 raise DataError('No numeric types to aggregate')
837
838 return self._wrap_aggregated_output(output, names)
DataError: No numeric types to aggregate
Problem description
The mean of TimeDelta
can be computed, but it is not working in GroupBy.
Expected Output
This is the output I would expect.
In [3]: def average_time(x):
...: s = x.dt.seconds
...: return pd.Timedelta(s.mean(), unit='s')
...:
...: df.groupby('cat')['time'].apply(average_time)
...:
Out[3]:
cat
A 00:09:27.666667
B 00:06:18.666667
C 00:06:19.750000
Name: time, dtype: timedelta64[ns]
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.0.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.2.1