Description
to_dict() extracts the elements of a Series as different types depending on whether the Series was constructed directly or obtained from a DataFrame, e.g. via loc[0]:
>>> s = pd.Series({'a': None, 'b': 99, 'c': 'hello'})
>>> df = pd.DataFrame([s])
>>> [type(v) for k, v in s.to_dict().items()]
[NoneType, str, int]
>>> [type(v) for k, v in df.loc[0].to_dict().items()]
[NoneType, str, numpy.int64]
Note that the number is a base int when extracted with s.to_dict(), but a numpy.int64 when extracted from df.loc[0]. The same inconsistency applies to tolist().
Is this inconsistency a feature or a bug? And if it's a feature, does anyone know how I can reliably extract the values of a DataFrame row as base Python types, using either to_dict() or tolist()?
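One possible workaround, offered as a sketch rather than an answer to the feature-or-bug question: unwrap any NumPy scalars with their item() method after calling to_dict() or tolist(), so the result does not depend on how the Series was obtained. The helper name to_builtin is hypothetical (it is not part of pandas), and the sketch assumes plain numeric or boolean NumPy scalars; datetime-like values may need separate handling.

```python
import numpy as np
import pandas as pd


def to_builtin(value):
    """Hypothetical helper: unwrap NumPy scalars, pass everything else through."""
    if isinstance(value, np.generic):
        # np.generic is the base class of NumPy scalar types such as numpy.int64;
        # item() returns the closest built-in Python object.
        return value.item()
    return value


s = pd.Series({'a': None, 'b': 99, 'c': 'hello'})
df = pd.DataFrame([s])

# Same calls as in the report, with each value passed through the helper.
row_dict = {k: to_builtin(v) for k, v in df.loc[0].to_dict().items()}
row_list = [to_builtin(v) for v in df.loc[0].tolist()]

print(type(row_dict['b']))  # <class 'int'>
```

Because values that are already base Python objects pass through unchanged, the same helper can be applied to s.to_dict() as well without affecting its result.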
output of pd.show_versions()
commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-56-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: None
pip: 1.5.6
setuptools: 12.2
Cython: 0.24
numpy: 1.11.0
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None