TRACKER: Simple analysis of `IO JSON` open issues

### Research

- [X] I have searched the [[pandas] tag](https://stackoverflow.com/questions/tagged/pandas) on StackOverflow for similar questions.

- [X] I have asked my usage related question on [StackOverflow](https://stackoverflow.com).


### Link to question on StackOverflow

 `IO JSON` open issues

### Question about pandas

Below is a quick analysis of the open JSON issues:
- the third column is a (personal!) classification, mainly to identify 'type' or 'dtype' problems
- the fourth column is a sub-category, filled in only for the 'type' category
- the fifth column flags the issues that proposal PDEP0012 solves or for which it provides an alternative solution

My first summary is as follows:
- the first twelve issues would be impacted by PDEP0012
- five issues involve numeric column names, so maybe they can be grouped together
- three issues look to me like they can be closed (can anyone check?)
- eight issues concern None, NA, NaN or NaT values
- ten issues concern the json_normalize function


|Issue n°|Title|Category|Sub-category|Addressed by PDEP0012|
|----|----|----|----|----|
|16492|No way with to_json to write only date out of datetime|type|date|ok|
|49585|BUG: Series read_json tries to convert all column values to dates even when using keep_default_dates=True - if one column has an na value|type|datetime - NA|ok|
|12997|to_json converts to UTC when encoding ISO formatted datetimes|type|datetime - tz|ok|
|53252|ENH: simple - compact and reversible JSON interface|type|extend type|ok|
|14358|read_json Raises AttributeError with Valid JSON as Input|type|null - NA|ok|
|35464|BUG: Type mismatch in read_json|type|null - NA|ok|
|51375|BUG: to_json/read_json with orient="table" does not preserve types with pd.NA|type|null - NA|ok|
|36211|BUG: to_json for DataFrame containing Path objects crash with infinite recursion|type|Path|ok|
|50782|BUG: Complex Numbers Not Imported Correctly Under JSON Read|type|table - complex|ok|
|35420|to_json/read_json can't handle interval index|type|table - interval|ok|
|39537|Error when converting df to json table (utc timezone date time object causes the error)|type|table - tz|ok|
|52595|BUG: json that could be read by pandas 1.5.3 cannot be read by 2.0.0|type|table - tz|ok|
|16848|UnicodeDecodeError with html.table_schema = True|type|binary| |
|25336|[BUG?] pd.read_json does not convert date before 1971-01-01|type|datetime - conversion| |
|22317|Request to add more date formats in to_json method|type|datetime - format| |
|47930|ENH: Add new date_format option to_json matching datetime.isoformat exactly|type|datetime - tz| |
|21454|pd.read_json converts large floats to inf|type|float - conversion| |
|23328|inconsistent float rounding in to_json|type|float - conversion| |
|44684|BUG: the precision of big integer in read_json|type|int - conversion| |
|28609|OverflowError on using to_json to serialize NaN value with type Decimal|type|null - NA| |
|31801|to_json index with Null Value Broken in 1.0|type|null - NA| |
|44693|BUG: dtypes cast when reading JSON|type|null - NA| |
|46627|BUG: Pandas's ujson module incorrectly returns None when it reads NaN|type|null - NA| |
|20608|read_json reads large integers as strings incorrectly if dtype not explicitly mentioned|type|str - conversion| |
|42471|BUG: read_json converts Numeric Strings to Numbers|type|str - conversion| |
|29025|Incorrect json round-trip with orient='table' when dataframe contains duplicate index values|type|table - index| |
|19129|Raise ValueError for read_json and orient='table' With Numeric Column Names|type|table - int col| |
|38256|BUG: pandas to_json with orient "table" returns wrong schema & data string|type|table - int col| |
|40674|BUG: pd.read_json sets wrong value for numeric column names|type|table - int col| |
|46392|BUG: Integer column index breaks json roundtrip with orient=table|type|table - int col| |
|32037(44705)|JSON table orient not roundtripping extension types|type|table - int col| |
|26692|If tuples used as index pd.read_json( orient='split') does not read file saved by df.to_json(orient='split)|type|tuple index| |
|21140|Add Timedelta Support to JSON Reader with orient="table"|to close| | |
|23584|Series to_json Docstring Updates|to close| | |
|31917|to_json of Series with period dtype results in AttributeError|to close| | |
|37100|BUG: Series.to_json produces incorrect json format|to be completed| | |
|45959|QST: Why to_json defaults to force_ascii=True|question| | |
|27241|ENH: Ignore flattening certain keys in json_normalize|normalize| | |
|33414|ENH: Optionally pass dtypes as a dict into json_normalize|normalize| | |
|34028|What is the best way to normalize_json before read_json for the file with gigabytes size? |normalize| | |
|34465|BUG: unexpected behavior of json_normalize meta arg|normalize| | |
|36245|BUG: pd.json_normalize on a column loses rows that have an empty list for that column|normalize| | |
|42311|ENH: json_normalize flatten lists as well|normalize| | |
|44329|ENH: errors='ignore' should work for record_path for pandas.json_normalize function|normalize| | |
|51452|pd.json_normalize doesn't return data with index from series|normalize| | |
|53126|BUG: json_normalize does not parse nested lists consistently|normalize| | |
|54121|DOC: description of record_prefix param for json_normalize is wrong|normalize| | |
|29928|Using to_json/read_json with orient='table' on a DataFrame with a single level MultiIndex does not work|multiindex| | |
|50456|BUG: JSON serialization with orient split fails roundtrip with MultiIndex|multiindex| | |
|42582|ENH: col descriptions that'd save in df schemas - helping users avoid creating separate documentation?|metadata| | |
|51012|ENH: Include df.attrs in to_json output |metadata| | |
|19261|Standardize pandas metadata for table schema and parquet|internal| | |
|20599|OverflowError: Python int too large to convert to C long|internal| | |
|28180|to_iso methods for DatetimeLikeArray|internal| | |
|32326|Unexpected behaviour of df.to_json(compression="gzip")|internal| | |
|33014|to_json should make separators configurable (similar to json.dump)|internal| | |
|33877|BUG: weird interaction between pyslurm - ujson that changes function signature of ujson.dumps|internal| | |
|35279|pandas/tests/io/json/test_pandas.py::TestPandasContainer::test_read_json_large_numbers failing for 32-bit system|internal| | |
|39135|ENH: Add support for date_unit to be specified per column in to_json |internal| | |
|41521|ENH: Add support to read_json to encode character escape hex codes to utf-8 characters|internal| | |
|44881|ENH: change pd.read_json kwarg to rtype or return_type?|internal| | |
|49604|BUG/CLN: Vendored ujson Module|internal| | |
|54865|BUG: LSAN Detected Memory Leaks|internal| | |
|17220|Enhancement: to_json and read_json for DataFrame should have option to output/parse values by column|format| | |
|39913|ENH: new orient setting for read_json to support common API format|format| | |
|46571|ENH: Allow usage of custom library to serialize with to_json method|format| | |
|12286|Feature suggestion: flexible hierarchical data (json) importer (will implement if interest exists)|extension| | |
|22853|Add chunksize support to to_json|chunk| | |
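
For anyone who wants to recompute the summary counts, here is a minimal sketch in plain Python. It assumes the rows are pasted as pipe-delimited text in the same five-column layout as the table above. The names `SAMPLE_ROWS`, `parse_rows` and the dictionary keys are illustrative only, and just a handful of rows are included, so the numbers it produces describe the sample and not the full table.

```python
from collections import Counter

# A few rows copied from the table above (sample only, not the full list).
SAMPLE_ROWS = """\
|16492|No way with to_json to write only date out of datetime|type|date|ok|
|35464|BUG: Type mismatch in read_json|type|null - NA|ok|
|44693|BUG: dtypes cast when reading JSON|type|null - NA| |
|46392|BUG: Integer column index breaks json roundtrip with orient=table|type|table - int col| |
|23584|Series to_json Docstring Updates|to close| | |
|42311|ENH: json_normalize flatten lists as well|normalize| | |
|22853|Add chunksize support to to_json|chunk| | |
"""


def parse_rows(text):
    """Turn pipe-delimited table rows into a list of dicts (hypothetical helper)."""
    keys = ("issue", "title", "category", "sub_category", "pdep0012")
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines or anything that is not a table row
        # Drop the empty strings produced by the leading and trailing pipes.
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if len(cells) == len(keys):
            rows.append(dict(zip(keys, cells)))
    return rows


rows = parse_rows(SAMPLE_ROWS)

# Issues per category (third column).
by_category = Counter(row["category"] for row in rows)

# Issues whose sub-category mentions NA (fourth column).
na_related = sum(1 for row in rows if row["sub_category"].endswith("NA"))

# Issues flagged in the PDEP0012 column (fifth column).
pdep_count = sum(1 for row in rows if row["pdep0012"] == "ok")

print(by_category)
print("NA-related:", na_related)
print("Flagged for PDEP0012:", pdep_count)
```

Pasting the full table into `SAMPLE_ROWS` should give the per-category totals directly, which makes it easier to re-check the summary when issues are closed or reclassified.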
