Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample
import pandas as pd
pd.read_sas('dates_null.sas7bdat')
Problem description
When a datetime column contains null values, pd.read_sas raises a ValueError instead of returning a DataFrame.
Traceback
Traceback (most recent call last):
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/io/sas/sas7bdat.py", line 52, in _convert_datetimes
return pd.to_datetime(sas_datetimes, unit=unit, origin="1960-01-01")
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/core/tools/datetimes.py", line 805, in to_datetime
values = convert_listlike(arg._values, format)
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/core/tools/datetimes.py", line 345, in _convert_listlike_datetimes
result, tz_parsed = tslib.array_with_unit_to_datetime(
File "pandas/_libs/tslib.pyx", line 249, in pandas._libs.tslib.array_with_unit_to_datetime
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: cannot convert input with unit 'd'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/io/sas/sasreader.py", line 152, in read_sas
return reader.read()
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/io/sas/sas7bdat.py", line 723, in read
rslt = self._chunk_to_dataframe()
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/io/sas/sas7bdat.py", line 771, in _chunk_to_dataframe
rslt[name] = _convert_datetimes(rslt[name], "d")
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/io/sas/sas7bdat.py", line 59, in _convert_datetimes
return sas_datetimes.apply(
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/core/series.py", line 4135, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/lib.pyx", line 2467, in pandas._libs.lib.map_infer
File "/home/wertha/source/pandas/pandas/tests/io/sas/data/.test/lib/python3.9/site-packages/pandas/io/sas/sas7bdat.py", line 60, in <lambda>
lambda sas_float: datetime(1960, 1, 1) + timedelta(days=sas_float)
ValueError: cannot convert float NaN to integer
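Reading the traceback, the failure appears to happen in two stages: the vectorized pd.to_datetime call raises OutOfBoundsDatetime, and the per-value fallback then fails on NaN. The second stage can be reproduced without the SAS file. This is a minimal sketch using only the standard library; the function name fallback_convert is hypothetical and simply mirrors the lambda shown in the traceback.

```python
from datetime import datetime, timedelta


def fallback_convert(sas_float):
    # Mirrors the lambda from the traceback (sas7bdat.py, line 60).
    # fallback_convert is a hypothetical name used for illustration only.
    return datetime(1960, 1, 1) + timedelta(days=sas_float)


# A finite day offset converts without error.
print(fallback_convert(366.0))

# A NaN offset raises ValueError, because timedelta cannot build
# an integer number of days from NaN.
error_message = None
try:
    fallback_convert(float("nan"))
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

If this matches what happens inside the reader, the null values only become a problem once the fallback path is taken.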
Expected Output
A DataFrame in which null values in datetime columns are returned as NaT.
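One way the expected behavior could be obtained is to check for missing values before building the timedelta. The sketch below is not the pandas implementation or a proposed patch; convert_sas_days_nan_safe and SAS_EPOCH are hypothetical names, and the input Series stands in for a SAS date column that contains a null.

```python
from datetime import datetime, timedelta

import pandas as pd

# SAS dates count days from 1960-01-01 (the origin used in the traceback).
SAS_EPOCH = datetime(1960, 1, 1)


def convert_sas_days_nan_safe(sas_days: pd.Series) -> pd.Series:
    """Convert SAS day offsets to datetimes, mapping missing values to NaT."""

    def convert_one(sas_float):
        # Return NaT for missing values instead of passing NaN to timedelta.
        if pd.isna(sas_float):
            return pd.NaT
        return SAS_EPOCH + timedelta(days=sas_float)

    return sas_days.apply(convert_one)


# Stand-in for a date column with one null value.
days = pd.Series([0.0, float("nan"), 366.0])
converted = convert_sas_days_nan_safe(days)
print(converted)
```

The null entry comes back as NaT while the other rows convert as before, which is the output this report expects from pd.read_sas.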
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 7d32926
python : 3.9.1.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.14-arch1-1
Version : #1 SMP PREEMPT Sun, 07 Feb 2021 22:42:17 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.2.2
numpy : 1.20.1
pytz : 2021.1
dateutil : 2.8.1
pip : 20.2.3
setuptools : 49.2.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None