BUG: hash of Timestamp with fold=1 causes a segfault #33931

Closed
@hasB4K

Description

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandas.
  • (optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

import pandas as pd
america_chicago = "dateutil//usr/share/zoneinfo/America/Chicago"
transition_1 = pd.Timestamp(year=2013, month=11, day=3, hour=1, minute=0, tz=america_chicago)
transition_2 = pd.Timestamp(year=2013, month=11, day=3, hour=1, minute=0, fold=1, tz=america_chicago)

print(transition_1, transition_2)
print(hash(transition_1))
print(hash(transition_2))
$ python3 -q -X faulthandler hash_bug.py
2013-11-03 01:00:00-05:00 2013-11-03 01:00:00-06:00
780959649129526403
Fatal Python error: Segmentation fault

Current thread 0x00007f908caf1740 (most recent call first):
  File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 1814 in _datetime_to_timestamp
  File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 717 in _find_last_transition
  File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 809 in _resolve_ambiguous_time
  File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 739 in _find_ttinfo
  File "/usr/lib/python3.8/site-packages/dateutil/tz/tz.py", line 828 in utcoffset
  File "t.py", line 8 in <module>
[1]    2761996 segmentation fault (core dumped)  python3 -q -X faulthandler t.py
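
Because a segfault kills the whole interpreter, it can be handy to run a reproduction like the one above in a child process and inspect the exit status from the parent. The following is a minimal sketch, not part of the original report: `run_isolated` and `died_from_segfault` are hypothetical helper names, and the signal check assumes a POSIX system, where a negative return code means the child was killed by that signal.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Run `snippet` in a fresh interpreter with faulthandler enabled.

    Hypothetical helper: a crash in the child cannot take down the parent.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
    )


def died_from_segfault(result: subprocess.CompletedProcess) -> bool:
    # Assumption: POSIX semantics, where returncode == -N means "killed by signal N".
    return result.returncode == -signal.SIGSEGV


if __name__ == "__main__":
    # Harmless stand-in snippet; the code sample from this issue could be
    # passed as a string instead to check a given pandas build.
    outcome = run_isolated("print(hash(1))")
    print("return code:", outcome.returncode)
    print("segfault:", died_from_segfault(outcome))
```

With faulthandler enabled in the child, any traceback it dumps ends up in `outcome.stderr`, so the parent can log it without crashing itself.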

Problem description

hash() should return a valid hash value and must not segfault.
This causes problems whenever such a Timestamp is used as a dictionary key or stored in a set.

Expected Output

I would have expected the same behavior as datetime in Python:

import datetime as dt
from dateutil.tz import gettz

america_chicago = gettz("America/Chicago")
transition_1 = dt.datetime(year=2013, month=11, day=3, hour=1, minute=0, tzinfo=america_chicago)
transition_2 = dt.datetime(year=2013, month=11, day=3, hour=1, minute=0, fold=1, tzinfo=america_chicago)

print(transition_1, transition_2)
print(hash(transition_1))
print(hash(transition_2))
2013-11-03 01:00:00-05:00 2013-11-03 01:00:00-06:00
780959649129526403
780959649129526403

So it seems that this bug is coming from pandas.
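
The identical hashes in the datetime output follow from how the standard library treats fold: hashing an aware datetime uses the UTC offset computed with fold=0, and two aware datetimes that share the same tzinfo object are compared by their wall-clock fields only. The sketch below illustrates this with the standard library alone. `AmbiguousHourTZ` is a hypothetical stand-in for a real zone, hard-coding UTC-5 for fold=0 and UTC-6 for fold=1, so it needs neither dateutil nor system zone data.

```python
import datetime as dt


class AmbiguousHourTZ(dt.tzinfo):
    """Hypothetical zone for illustration: UTC-5 when fold=0, UTC-6 when fold=1."""

    def utcoffset(self, when):
        # `fold` selects which side of the repeated hour is meant.
        return dt.timedelta(hours=-6 if when.fold else -5)

    def dst(self, when):
        # The first pass through the repeated hour is still on DST.
        return dt.timedelta(0) if when.fold else dt.timedelta(hours=1)

    def tzname(self, when):
        return "STD" if when.fold else "DST"


zone = AmbiguousHourTZ()
first = dt.datetime(2013, 11, 3, 1, 0, tzinfo=zone)
second = dt.datetime(2013, 11, 3, 1, 0, fold=1, tzinfo=zone)

# The two values print with different UTC offsets ...
print(first, second)
# ... but hash() ignores fold, so both hashes are the same integer.
print(hash(first) == hash(second))
# Same tzinfo object: equality compares wall-clock fields only.
print(first == second)
# Consequently both collapse into a single set element / dict key.
print(len({first, second}))
```

This is the behavior the report expects a pandas Timestamp to mirror: hashing a fold=1 value should succeed and agree with its fold=0 counterpart.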

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : 1c88e6aff94cc9183909b7c110f554df42509073
python           : 3.8.2.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.5.13-arch2-1
Version          : #1 SMP PREEMPT Mon, 30 Mar 2020 20:42:41 +0000
machine          : x86_64
processor        : 
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.1.0.dev0+1446.g1c88e6aff
numpy            : 1.18.2
pytz             : 2019.3
dateutil         : 2.8.1
pip              : 20.0.2
setuptools       : 46.1.3
Cython           : 0.29.16
pytest           : 5.4.1
hypothesis       : None
sphinx           : 3.0.1
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.5.0
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : 7.13.0
pandas_datareader: None
bs4              : 4.8.2
bottleneck       : None
fastparquet      : None
gcsfs            : None
matplotlib       : 3.2.1
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pyxlsb           : None
s3fs             : None
scipy            : 1.4.1
sqlalchemy       : 1.3.16
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
numba            : None

Labels

Bug, Segfault (Non-Recoverable Error), Timezones (Timezone data dtype)
