Skip to content

BUG: Timezones are lost in pivot_table #14948

Closed
@gte620v

Description

@gte620v

Code Sample, a copy-pastable example if possible

# table to pivot
df = pd.DataFrame([{'uid': u'aa',
  'ts': pd.Timestamp('2016-08-12 13:00:00-0700', tz='US/Pacific')},
 {'uid': u'aa',
  'ts': pd.Timestamp('2016-08-12 08:00:00-0700', tz='US/Pacific')},
 {'uid': u'aa',
  'ts': pd.Timestamp('2016-08-12 14:00:00-0700', tz='US/Pacific')},
 {'uid': u'aa',
  'ts': pd.Timestamp('2016-08-25 11:00:00-0700', tz='US/Pacific')},
 {'uid': u'aa',
  'ts': pd.Timestamp('2016-08-25 13:00:00-0700', tz='US/Pacific')}])

# not sure why I need this, but it doesn't work without it....
df = df.set_index('ts').reset_index()

# Series of day timestamps we want to pivot over
mins = df.ts.map(lambda x: x.replace(hour=0, minute=0,
                                                    second=0, microsecond=0))

# pivot
pd.pivot_table(df.set_index('ts').reset_index(),
               values='ts', index=['uid'], columns=[mins], aggfunc=np.min) 
# prints in UTC time and discards timezones

# do the aggregation portion of the pivot
df.groupby(mins).agg(np.min) # works as expected where timezones are maintained

image

Problem description

pivot_table strips timezone information from the aggregated values.

I traced back in the code and it seems that the culprit is unstack (

table = agged.unstack(to_unstack)
).

Specifically, either of these numpy operations will strip the tz:

values = np.array(values)
values = values[:, np.newaxis]

Expected Output

Would expect aggregated values to maintain timezone like they do with groupby.

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 2.7.12.final.0 python-bits: 64 OS: Linux OS-release: 4.4.11-23.53.amzn1.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None

pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 32.1.0.post20161217
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.5.0
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext)
jinja2: 2.8
boto: 2.38.0
pandas_datareader: None

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugReshapingConcat, Merge/Join, Stack/Unstack, ExplodeTimezonesTimezone data dtype

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions