json_normalize has inconsistent behaviors while flattening nested array elements #21537

Closed
@vuminhle

Code Sample, a copy-pastable example if possible

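from pandas.io.json import json_normalize

# Records passed directly as the top-level list
df1 = json_normalize([{'A': {'B': 1}}])
# The same records nested under the 'dummy' key and selected via record_path
df2 = json_normalize({'dummy': [{'A': {'B': 1}}]}, 'dummy')
print(df1)
print(df2)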

Problem description

The above code produces:

   A.B
0    1
          A
0  {'B': 1}

It looks like json_normalize recursively flattens nested objects only when the records are passed as the top-level array (the first call).
In the second call, where the records are selected with record_path, the nested dict under A is left unflattened. I think the second call should behave the same as the first and produce the same output.

Expected Output

   A.B
0    1
   A.B
0    1
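
To make the expected behavior concrete, here is a minimal pure-Python sketch of the flattening I have in mind. The helper `flatten_record` and its `sep` and `prefix` parameters are hypothetical names for illustration only. This is not how pandas implements json_normalize, just a statement of the result I would expect for both input shapes.

```python
def flatten_record(record, sep=".", prefix=""):
    """Recursively flatten nested dicts into one level, joining keys with `sep`.

    Hypothetical helper for illustration; not part of pandas.
    """
    flat = {}
    for key, value in record.items():
        # Build the dotted column name, e.g. 'A' + '.' + 'B' -> 'A.B'
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Descend into nested objects regardless of where the record came from
            flat.update(flatten_record(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


top_level = [{'A': {'B': 1}}]
nested = {'dummy': [{'A': {'B': 1}}]}

# Records taken from the top-level list (first call in the report)
rows_top = [flatten_record(r) for r in top_level]
# Records taken from under the 'dummy' key (second call in the report)
rows_nested = [flatten_record(r) for r in nested['dummy']]

print(rows_top)
print(rows_nested)
```

Both lists contain the same flattened row with a single `A.B` key, which is the column layout shown in the expected output above.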

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.1
pytest: 3.6.1
pip: 10.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.14.2
scipy: None
pyarrow: None
xarray: None
IPython: 6.3.1
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
