Closed
Original post, edited to correct the ISO 8601 formats. The bug named in the original title does not exist. There is, however, some unexpected behavior:
It looks like short-form ISO 8601 parsing uses the right sign after all:
In [1]: import pandas as pd
In [2]: t = pd.Timestamp('2013-11-03 1:30:00', tz='America/Havana')
In [3]: t.strftime('%Y-%m-%d %H:%M:%S %Z %z')
Out[3]: '2013-11-03 01:30:00 CST -0500'
In [4]: t.strftime('%Y%m%dT%H%M%S%z') # ISO 8601 short format
Out[4]: '20131103T013000-0500'
In [5]: pd.Timestamp(t.strftime('%Y%m%dT%H%M%S%z'))
Out[5]: Timestamp('2013-11-03 01:30:00-0500', tz='tzoffset(None, -18000)')
In [6]: t == _
Out[6]: True # Not a bug after all!
In [7]: t.strftime('%Y-%m-%dT%H:%M:%S%z') # ISO 8601 long format
Out[7]: '2013-11-03T01:30:00-0500'
In [8]: pd.Timestamp(t.strftime('%Y-%m-%dT%H:%M:%S%z'))
Out[8]: Timestamp('2013-11-03 01:30:00-0500', tz='pytz.FixedOffset(-300)') # why pytz?
In [9]: t == _
Out[9]: True
In [10]: t.strftime('%Y%m%dT%H%M%SZ%z') # Not a real ISO 8601 format (note the 'Z')
Out[10]: '20131103T013000Z-0500'
In [11]: pd.Timestamp(t.strftime('%Y%m%dT%H%M%SZ%z'))
Out[11]: Timestamp('2013-11-03 01:30:00+0500', tz='tzoffset(None, 18000)')
# This is the behavior that inspired the bug report. I don't know if this is a valid parse
# or not, but it sure is unexpected.
It would be awfully embarrassing if I filed a bug report because I couldn't read ISO 8601...
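As a cross-check that does not go through pandas at all, the two well-formed strings from Out[4] and Out[7] can be fed to the standard library's `datetime.strptime`. This is only a sketch: the variable names are mine, and it says nothing about which parser pandas uses internally. It just confirms that `-0500` should come out as five hours behind UTC in both the short and long forms.

```python
from datetime import datetime, timedelta, timezone

# Strings copied from Out[4] and Out[7] in the session above.
short_form = '20131103T013000-0500'
long_form = '2013-11-03T01:30:00-0500'

# %z reads a signed +HHMM / -HHMM offset and returns an aware datetime.
parsed_short = datetime.strptime(short_form, '%Y%m%dT%H%M%S%z')
parsed_long = datetime.strptime(long_form, '%Y-%m-%dT%H:%M:%S%z')

# 01:30 at UTC-05:00 is the same instant as 06:30 UTC.
expected = datetime(2013, 11, 3, 6, 30, tzinfo=timezone.utc)

short_offset = parsed_short.utcoffset()
same_instant = parsed_short == parsed_long == expected

print(short_offset == timedelta(hours=-5), same_instant)
```

Both values print as `True`, which agrees with Out[5] and Out[8]; only the `Z%z` string in In [10] produces the flipped sign.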
However, in the process of investigating this issue, I encountered the following, which is definitely a bug:
In [12]: ts = ['2013-11-%s3 %s1:30:00' % (x, y) for x in ['', '0'] for y in ['', '0']]
In [13]: ts
Out[13]:
['2013-11-3 1:30:00',
'2013-11-3 01:30:00',
'2013-11-03 1:30:00',
'2013-11-03 01:30:00'] # only this one is ISO 8601
In [14]: tzs = ['America/%s' % s for s in ['Chicago', 'New_York', 'Havana']]
In [15]: [[pd.Timestamp(t, tz=tz) for t in ts] for tz in tzs]
Out[15]:
[[Timestamp('2013-11-03 01:30:00-0600', tz='America/Chicago'),
Timestamp('2013-11-03 01:30:00-0600', tz='America/Chicago'),
Timestamp('2013-11-03 01:30:00-0600', tz='America/Chicago'),
Timestamp('2013-11-03 01:30:00-0500', tz='America/Chicago')], # DST
[Timestamp('2013-11-03 01:30:00-0500', tz='America/New_York'),
Timestamp('2013-11-03 01:30:00-0500', tz='America/New_York'),
Timestamp('2013-11-03 01:30:00-0500', tz='America/New_York'),
Timestamp('2013-11-03 01:30:00-0400', tz='America/New_York')], # DST
[Timestamp('2013-11-03 01:30:00-0500', tz='America/Havana'),
Timestamp('2013-11-03 01:30:00-0500', tz='America/Havana'),
Timestamp('2013-11-03 01:30:00-0500', tz='America/Havana'),
Timestamp('2013-11-03 00:30:00-0500', tz='America/Havana')]] # Just plain wrong
I'm not sure what's going on here.
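One reading that is at least consistent with the Havana row, though I have not confirmed it in the pandas source, is that the fully zero-padded string gets the daylight-saving offset attached before being converted back to standard time. The arithmetic can be checked with fixed offsets from the standard library. The names `HAVANA_STANDARD`, `HAVANA_DAYLIGHT`, and `reinterpret` below are hypothetical, and the -0400 daylight offset is my assumption (the -0500 standard offset matches the `CST -0500` in Out[3]).

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for illustration; these are not pandas or pytz objects.
HAVANA_STANDARD = timezone(timedelta(hours=-5), 'CST')
HAVANA_DAYLIGHT = timezone(timedelta(hours=-4), 'CDT')


def reinterpret(wall, assumed, displayed):
    """Attach `assumed` to a naive wall time, then express it in `displayed`."""
    return wall.replace(tzinfo=assumed).astimezone(displayed)


wall = datetime(2013, 11, 3, 1, 30)

# Treat 01:30 as daylight time, then show it in standard time.
shifted = reinterpret(wall, HAVANA_DAYLIGHT, HAVANA_STANDARD)
print(shifted.strftime('%Y-%m-%d %H:%M:%S%z'))  # 2013-11-03 00:30:00-0500

# Treat 01:30 as standard time from the start: nothing moves.
unshifted = reinterpret(wall, HAVANA_STANDARD, HAVANA_STANDARD)
print(unshifted.strftime('%Y-%m-%d %H:%M:%S%z'))  # 2013-11-03 01:30:00-0500
```

The first line reproduces the `00:30:00-0500` value in the last Havana entry, and the second matches the other three. For Chicago and New York, 01:30 occurs twice that morning, so either offset is defensible there and the only inconsistency is which one gets picked. As far as I can tell Havana's clocks fall back an hour earlier, so 01:30 should not be ambiguous at all, which would explain why only that row shows a shifted wall time.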