Description
Code Sample, a copy-pastable example if possible
import pandas as pd
df = pd.DataFrame({
    'tag': [1, 1],
    'date': [
        pd.Timestamp('2018-01-01', tz='UTC'),
        pd.Timestamp('2018-01-02', tz='UTC')]
})
df.groupby('tag').agg({'date': lambda e: e.head(1)})
Problem description
The code above raises the following chain of exceptions:
Traceback (most recent call last):
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 2670, in agg_series
return self._aggregate_series_fast(obj, func)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 2689, in _aggregate_series_fast
dummy)
File "pandas/_libs/reduction.pyx", line 334, in pandas._libs.reduction.SeriesGrouper.__init__
File "pandas/_libs/reduction.pyx", line 347, in pandas._libs.reduction.SeriesGrouper._check_dummy
ValueError: Dummy array must be same dtype
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 3495, in aggregate
return self._python_agg_general(func_or_funcs, *args, **kwargs)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 1068, in _python_agg_general
result, counts = self.grouper.agg_series(obj, f)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 2672, in agg_series
return self._aggregate_series_pure_python(obj, func)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 2706, in _aggregate_series_pure_python
raise ValueError('Function does not reduce')
ValueError: Function does not reduce
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 4656, in aggregate
return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 4087, in aggregate
result, how = self._aggregate(arg, _level=_level, *args, **kwargs)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/base.py", line 490, in _aggregate
result = _agg(arg, _agg_1dim)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/base.py", line 441, in _agg
result[fname] = func(fname, agg_how)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/base.py", line 424, in _agg_1dim
return colg.aggregate(how, _level=(_level or 0) + 1)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 3497, in aggregate
result = self._aggregate_named(func_or_funcs, *args, **kwargs)
File "~/.pyenv/versions/anaconda3-5.3.0/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 3627, in _aggregate_named
raise Exception('Must produce aggregated value')
Exception: Must produce aggregated value
However, the same code works if I remove the tz from the timestamps, as shown here, so I guess this is a bug.
df = pd.DataFrame({
    'tag': [1, 1],
    'date': [
        pd.Timestamp('2018-01-01'),
        pd.Timestamp('2018-01-02')]
})
df.groupby('tag').agg({'date': lambda e: e.head(1)})
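The failing and the working frame differ only in the dtype of the date column, and in both cases the lambda returns a one-element Series rather than a scalar. Here is a minimal sketch to check both points outside of groupby. The names aware and naive are chosen for illustration and are not part of the report.

```python
import pandas as pd

# Same two dates as in the report, with and without a timezone.
aware = pd.Series([pd.Timestamp('2018-01-01', tz='UTC'),
                   pd.Timestamp('2018-01-02', tz='UTC')])
naive = pd.Series([pd.Timestamp('2018-01-01'),
                   pd.Timestamp('2018-01-02')])

# The tz-aware column uses a pandas dtype that carries a tz attribute;
# the tz-naive column uses a plain NumPy datetime64 dtype without one.
aware_tz = getattr(aware.dtype, 'tz', None)
naive_tz = getattr(naive.dtype, 'tz', None)
print(aware.dtype, naive.dtype)

# head(1) does not reduce: it returns a Series of length 1 in both cases.
aware_head = aware.head(1)
naive_head = naive.head(1)
print(type(aware_head).__name__, len(aware_head))
print(type(naive_head).__name__, len(naive_head))
```

This does not explain why only the tz-aware case fails. It only shows that the dtype is the one visible difference between the two inputs, which may be related to the "Dummy array must be same dtype" message in the first traceback.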
Expected Output
                         date
tag
1   2018-01-01 00:00:00+00:00
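A possible workaround, sketched under the assumption that the goal is simply the first date per tag: aggregate with a function that returns a scalar instead of a one-element Series. This has not been verified against pandas 0.23.4, so it may or may not sidestep the error on that version. The names result_first and result_iloc are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'tag': [1, 1],
    'date': [
        pd.Timestamp('2018-01-01', tz='UTC'),
        pd.Timestamp('2018-01-02', tz='UTC')]
})

# Built-in reducer: first non-null value of each group.
result_first = df.groupby('tag').agg({'date': 'first'})

# Lambda that returns a scalar Timestamp (iloc[0]) rather than a
# length-1 Series (head(1)), so the function actually reduces.
result_iloc = df.groupby('tag').agg({'date': lambda e: e.iloc[0]})

print(result_first)
print(result_iloc)
```

Note that 'first' skips missing values while iloc[0] takes the first row as-is, so the two can differ when a group starts with NaT.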
Output of pd.show_versions()
>>> pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-34-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.4
pytest: 3.8.0
pip: 18.1
setuptools: 40.2.0
Cython: 0.28.5
numpy: 1.15.1
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: 1.7.9
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.6
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.1.0
lxml: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.11
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None