data frame group by with extra categorical column aggregates datetime64 NaT column to float NaN #20520

Closed
@mathiasdiez

Description

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

# Minimal example: an all-NaT datetime64 column next to a categorical column
df = pd.DataFrame({'group': ['first', 'first', 'second', 'third', 'third'], 'time': 5*[np.datetime64('NaT')], 'categories': pd.Series(["a","b","c","a","b"], dtype="category")})
df.groupby('group').first()
# results in
#        categories  time
# group
# first           a   NaN
# second          c   NaN
# third           a   NaN

# while the same frame without the categorical column
df = pd.DataFrame({'group': ['first', 'first', 'second', 'third', 'third'], 'time': 5*[np.datetime64('NaT')]})
df.groupby('group').first()
# results in
#         time
# group
# first    NaT
# second   NaT
# third    NaT

Problem description

When a datetime64 column that contains only NaT is aggregated with groupby in a data frame that also has a categorical column, the aggregated column comes back as float NaN instead of datetime64 NaT. This does not happen when the categorical column is absent.
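Until the dtype is preserved by groupby itself, one possible workaround is to cast the affected columns back after aggregating. The following is only a minimal sketch: the helper name `first_preserving_datetimes` is hypothetical and not part of pandas, and it assumes that `pd.to_datetime` maps the float NaN values back to NaT.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def first_preserving_datetimes(df, by):
    """Group-wise first() that restores datetime64 columns if their dtype was lost.

    Hypothetical helper for illustration; not a pandas API.
    """
    result = df.groupby(by).first()
    for col in result.columns:
        was_datetime = is_datetime64_any_dtype(df[col])
        still_datetime = is_datetime64_any_dtype(result[col])
        if was_datetime and not still_datetime:
            # Float NaN (or object NaN) is converted back to NaT here.
            result[col] = pd.to_datetime(result[col])
    return result


df = pd.DataFrame({
    'group': ['first', 'first', 'second', 'third', 'third'],
    'time': pd.Series(5 * [np.datetime64('NaT', 'ns')], dtype='datetime64[ns]'),
    'categories': pd.Series(["a", "b", "c", "a", "b"], dtype="category"),
})

result = first_preserving_datetimes(df, 'group')
print(result.dtypes)
```

On versions where the aggregation already returns NaT, the helper leaves the result unchanged, so it should be safe to keep in place across upgrades.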

Expected Output

       categories time
group
first           a  NaT
second          c  NaT
third           a  NaT

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.9.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-0.bpo.4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.14.0
scipy: 1.0.0
pyarrow: 0.8.0
xarray: None
IPython: 5.5.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.2
pymysql: None
psycopg2: 2.7.3.2 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
