Groupby aggregation of date/datetime columns returns datetime64 rather than numeric type #11444

Closed
@michaelbilow

Description

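import pandas as pd
import datetime
# One timestamp per month of 2015 (months run from 1 to 12)
u = [datetime.datetime(2015, x, 1) for x in range(1, 13)]
v = list('aaabbbbbbccd')
df = pd.DataFrame({'X': v, 'Y': u})
# Count the rows in each group of X
df.groupby('X')['Y'].agg(len)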
## Returns the following:
X
a   1970-01-01 00:00:00.000000003
b   1970-01-01 00:00:00.000000006
c   1970-01-01 00:00:00.000000002
d   1970-01-01 00:00:00.000000001

You can work around the problem by casting the dates to strings before the groupby/agg, but if you try to cast the returned datetimes to ints instead, some versions of pandas raise an error. Either way, aggregating by length should always return an int. This may also be related to #11442, which was just posted.
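
For reference, here is a minimal sketch of the string-cast workaround described above, run on the same data. The `size()` call is an alternative that is not part of the original report; it is included on the assumption that a plain per-group row count is all that is needed. The variable names `counts_via_str` and `counts_via_size` are illustrative only.

```python
import datetime

import pandas as pd

# Same data as in the report: one timestamp per month of 2015.
u = [datetime.datetime(2015, m, 1) for m in range(1, 13)]
v = list('aaabbbbbbccd')
df = pd.DataFrame({'X': v, 'Y': u})

# Workaround from the report: cast the dates to strings before aggregating,
# so the result is no longer coerced back to a datetime dtype.
counts_via_str = df.assign(Y=df['Y'].astype(str)).groupby('X')['Y'].agg(len)

# Alternative (an assumption, not from the report): size() counts the rows
# in each group without inspecting the dtype of the values.
counts_via_size = df.groupby('X')['Y'].size()

print(counts_via_str)
print(counts_via_size)
```

Both results should hold the group counts 3, 6, 2 and 1 for a, b, c and d, rather than timestamps near the epoch.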
