Description
Code Sample (a copy-pastable example if possible)
import pandas as pd
import io
t = io.StringIO('''\
event,timestamp
a,1556559573141592653
b,1556559573141592654
c,
d,1556559573141592655
''')
# Reading the timestamps as strings works fine
print("\nExpected output:")
print(pd.read_csv(t, dtype={'timestamp': object}))
# Now with Int64Dtype
t.seek(0)
print("\nActual output:")
print(pd.read_csv(t, dtype={'timestamp': pd.Int64Dtype()}))
Problem description
I would like to read CSV files containing nullable big integers into a DataFrame. The integers represent nanoseconds since the UNIX epoch (1970-01-01). Using the Int64Dtype introduced in 0.24.0 seems like the way to go; quoting the FAQ:

If you need to represent integers with possibly missing values, use one of the nullable-integer extension dtypes provided by pandas
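In the meantime, a workaround that preserves full precision (a sketch, assuming all non-missing values fit in int64): read the column as object first, then build the nullable Int64 array from exact Python ints, bypassing any float conversion:

```python
import io
import pandas as pd

t = io.StringIO("""\
event,timestamp
a,1556559573141592653
b,1556559573141592654
c,
d,1556559573141592655
""")

# Read the big integers as strings so no lossy float conversion happens.
df = pd.read_csv(t, dtype={"timestamp": object})

# Build the nullable Int64 column from exact Python ints; missing fields
# were read as NaN (a float), so anything that is not a string becomes
# a missing value.
df["timestamp"] = pd.array(
    [int(v) if isinstance(v, str) else None for v in df["timestamp"]],
    dtype="Int64",
)
print(df)
```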
Expected Output
  event            timestamp
0     a  1556559573141592653
1     b  1556559573141592654
2     c                  NaN
3     d  1556559573141592655
Actual Output
  event            timestamp
0     a  1556559573141592576
1     b  1556559573141592576
2     c                  NaN
3     d  1556559573141592576
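The repeated value 1556559573141592576 is exactly what these timestamps become after a round-trip through float64, which suggests the parser converts via float before casting to Int64. A float64 mantissa has 53 bits; near 1.5e18 the spacing between adjacent representable doubles is 2**8 = 256, so all three distinct nanosecond timestamps collapse to the same value:

```python
# All three distinct timestamps round to the same float64 value,
# reproducing the corrupted output above.
for ts in (1556559573141592653, 1556559573141592654, 1556559573141592655):
    print(int(float(ts)))  # prints 1556559573141592576 each time
```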
Output of pd.show_versions()
pandas: 0.24.2
pytest: None
pip: 19.1
setuptools: 41.0.1
Cython: None
numpy: 1.16.3
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None