Inconsistent types in output of series.to_dict() and DataFrame([series]).loc[0].to_dict() #13830

Closed
@mikepqr

Description

to_dict() returns the elements of a Series as different types depending on whether the Series was created directly or obtained from a DataFrame, e.g. via loc[0]:

>>> s = pd.Series({'a': None, 'b': 99, 'c': 'hello'})
>>> df = pd.DataFrame([s])
>>> [type(v) for k, v in s.to_dict().items()]
[NoneType, str, int]
>>> [type(v) for k, v in df.loc[0].to_dict().items()]
[NoneType, str, numpy.int64]

Note that the number is a base int when extracted with s.to_dict(), but it's a numpy.int64 when extracted from df.loc[0]. The same inconsistency applies to tolist().

Is this inconsistency a feature or a bug? And if it's a feature, does anyone know how I can reliably extract the values of a DataFrame row as built-in Python types, using either to_dict() or tolist()?
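
One possible workaround, sketched here as an illustration and not as pandas' recommended approach: convert any NumPy scalar to its built-in equivalent after extraction, so the result does not depend on how the Series was obtained. The helper names to_builtin and row_to_builtin_dict are hypothetical and not part of pandas. The sketch relies on np.generic being the common base class of NumPy scalar types and on its item() method returning a built-in Python scalar.

```python
import numpy as np
import pandas as pd


def to_builtin(value):
    """Return a built-in Python scalar for NumPy scalars; pass anything else through."""
    # np.generic is the base class of NumPy scalar types (np.int64, np.float64, ...).
    if isinstance(value, np.generic):
        return value.item()
    return value


def row_to_builtin_dict(row):
    """Hypothetical helper: turn a Series into a dict of built-in Python values."""
    return {key: to_builtin(value) for key, value in row.items()}


s = pd.Series({'a': None, 'b': 99, 'c': 'hello'})
df = pd.DataFrame([s])

# Whether df.loc[0] holds an int or a numpy.int64 for 'b', the result is a plain int.
row = row_to_builtin_dict(df.loc[0])
print({key: type(value).__name__ for key, value in row.items()})
```

The same to_builtin function can be mapped over the result of tolist(). Values that are not NumPy scalars (strings, None, other Python objects) are returned unchanged, so this sketch does nothing about missing-value handling or datetime-like values.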

output of pd.show_versions()

commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-56-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: None
pip: 1.5.6
setuptools: 12.2
Cython: 0.24
numpy: 1.11.0
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None
