Research
- I have searched the [pandas] tag on StackOverflow for similar questions.
- I have asked my usage related question on StackOverflow.
Link to question on StackOverflow
IO JSON
open issues
Question about pandas
Below is a quick analysis of the open JSON issues:
- the third column is a (personal!) classification, mainly intended to identify 'type' or 'dtype' problems
- the fourth column is a sub-category, used only for the 'type' category
- the fifth column marks the issues that proposal PDEP0012 solves or for which it provides an alternative solution
My first summary is as follows:
- the first twelve issues will be impacted by PDEP0012
- five issues involve numeric column names -> maybe they can be grouped together
- three issues seem resolved to me and could be closed (can anyone check?)
- eight issues concern None, NA, NaN or NaT values
- ten issues concern the json_normalize function
Issue n° | Title | Category | Sub-category | Covered by PDEP0012 |
---|---|---|---|---|
16492 | No way with to_json to write only date out of datetime | type | date | ok |
49585 | BUG: Series read_json tries to convert all column values to dates even when using keep_default_dates=True - if one column has an na value | type | datetime - NA | ok |
12997 | to_json converts to UTC when encoding ISO formatted datetimes | type | datetime - tz | ok |
53252 | ENH: simple - compact and reversible JSON interface | type | extend type | ok |
14358 | read_json Raises AttributeError with Valid JSON as Input | type | null - NA | ok |
35464 | BUG: Type mismatch in read_json | type | null - NA | ok |
51375 | BUG: to_json/read_json with orient="table" does not preserve types with pd.NA | type | null - NA | ok |
36211 | BUG: to_json for DataFrame containing Path objects crash with infinite recursion | type | Path | ok |
50782 | BUG: Complex Numbers Not Imported Correctly Under JSON Read | type | table - complex | ok |
35420 | to_json/read_json can't handle interval index | type | table - interval | ok |
39537 | Error when converting df to json table (utc timezone date time object causes the error) | type | table - tz | ok |
52595 | BUG: json that could be read by pandas 1.5.3 cannot be read by 2.0.0 | type | table - tz | ok |
16848 | UnicodeDecodeError with html.table_schema = True | type | binary | |
25336 | [BUG?] pd.read_json does not convert date before 1971-01-01 | type | datetime - conversion | |
22317 | Request to add more date formats in to_json method | type | datetime - format | |
47930 | ENH: Add new date_format option to_json matching datetime.isoformat exactly | type | datetime - tz | |
21454 | pd.read_json converts large floats to inf | type | float - conversion | |
23328 | inconsistent float rounding in to_json | type | float - conversion | |
44684 | BUG: the precision of big integer in read_json | type | int - conversion | |
28609 | OverflowError on using to_json to serialize NaN value with type Decimal | type | null - NA | |
31801 | to_json index with Null Value Broken in 1.0 | type | null - NA | |
44693 | BUG: dtypes cast when reading JSON | type | null - NA | |
46627 | BUG: Pandas's ujson module incorrectly returns None when it reads NaN | type | null - NA | |
20608 | read_json reads large integers as strings incorrectly if dtype not explicitly mentioned | type | str - conversion | |
42471 | BUG: read_json converts Numeric Strings to Numbers | type | str - conversion | |
29025 | Incorrect json round-trip with orient='table' when dataframe contains duplicate index values | type | table - index | |
19129 | Raise ValueError for read_json and orient='table' With Numeric Column Names | type | table - int col | |
38256 | BUG: pandas to_json with orient "table" returns wrong schema & data string | type | table - int col | |
40674 | BUG: pd.read_json sets wrong value for numeric column names | type | table - int col | |
46392 | BUG: Integer column index breaks json roundtrip with orient=table | type | table - int col | |
32037(44705) | JSON table orient not roundtripping extension types | type | table - int col | |
26692 | If tuples used as index pd.read_json( orient='split') does not read file saved by df.to_json(orient='split) | type | tuple index | |
21140 | Add Timedelta Support to JSON Reader with orient="table" | to close | ||
23584 | Series to_json Docstring Updates | to close | ||
31917 | to_json of Series with period dtype results in AttributeError | to close | ||
37100 | BUG: Series.to_json produces incorrect json format | to be completed | ||
45959 | QST: Why to_json defaults to force_ascii=True | question | ||
27241 | ENH: Ignore flattening certain keys in json_normalize | normalize | ||
33414 | ENH: Optionally pass dtypes as a dict into json_normalize | normalize | ||
34028 | What is the best way to normalize_json before read_json for the file with gigabytes size? | normalize | ||
34465 | BUG: unexpected behavior of json_normalize meta arg | normalize | ||
36245 | BUG: pd.json_normalize on a column loses rows that have an empty list for that column | normalize | ||
42311 | ENH: json_normalize flatten lists as well | normalize | ||
44329 | ENH: errors='ignore' should work for record_path for pandas.json_normalize function | normalize | ||
51452 | pd.json_normalize doesn't return data with index from series | normalize | ||
53126 | BUG: json_normalize does not parse nested lists consistently | normalize | ||
54121 | DOC: description of record_prefix param for json_normalize is wrong | normalize | ||
29928 | Using to_json/read_json with orient='table' on a DataFrame with a single level MultiIndex does not work | multiindex | ||
50456 | BUG: JSON serialization with orient split fails roundtrip with MultiIndex | multiindex | ||
42582 | ENH: col descriptions that'd save in df schemas - helping users avoid creating separate documentation? | metadata | ||
51012 | ENH: Include df.attrs in to_json output | metadata | ||
19261 | Standardize pandas metadata for table schema and parquet | internal | ||
20599 | OverflowError: Python int too large to convert to C long | internal | ||
28180 | to_iso methods for DatetimeLikeArray | internal | ||
32326 | Unexpected behaviour of df.to_json(compression="gzip") | internal | ||
33014 | to_json should make separators configurable (similar to json.dump) | internal | ||
33877 | BUG: weird interaction between pyslurm - ujson that changes function signature of ujson.dumps | internal | ||
35279 | pandas/tests/io/json/test_pandas.py::TestPandasContainer::test_read_json_large_numbers failing for 32-bit system | internal | ||
39135 | ENH: Add support for date_unit to be specified per column in to_json | internal | ||
41521 | ENH: Add support to read_json to encode character escape hex codes to utf-8 characters | internal | ||
44881 | ENH: change pd.read_json kwarg to rtype or return_type? | internal | ||
49604 | BUG/CLN: Vendored ujson Module | internal | ||
54865 | BUG: LSAN Detected Memory Leaks | internal | ||
17220 | Enhancement: to_json and read_json for DataFrame should have option to output/parse values by column | format | ||
39913 | ENH: new orient setting for read_json to support common API format | format | ||
46571 | ENH: Allow usage of custom library to serialize with to_json method | format | ||
12286 | Feature suggestion: flexible hierarchical data (json) importer (will implement if interest exists) | extension | ||
22853 | Add chunksize support to to_json | chunk | ||
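
For anyone who wants to double-check the counts in the summary, here is a minimal sketch of how the table could be tallied with the standard library. The helper name `parse_rows` and the four-row `SAMPLE` excerpt are only illustrative (they are not part of pandas or of the analysis itself); pasting the full table into `SAMPLE` should reproduce the figures given above.

```python
from collections import Counter


def parse_rows(table_text):
    """Yield [number, title, category, sub-category, PDEP0012 flag] per data row."""
    for line in table_text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Keep only data rows: they start with an issue number.
        # This skips the header, the ---|--- separator and blank lines.
        if not cells[0][:1].isdigit():
            continue
        # Pad short rows so that every row has the five columns of the table.
        cells += [""] * (5 - len(cells))
        yield cells[:5]


# Hypothetical excerpt: four rows copied from the table, for illustration only.
SAMPLE = """
16492 | No way with to_json to write only date out of datetime | type | date | ok |
14358 | read_json Raises AttributeError with Valid JSON as Input | type | null - NA | ok |
28609 | OverflowError on using to_json to serialize NaN value with type Decimal | type | null - NA | |
27241 | ENH: Ignore flattening certain keys in json_normalize | normalize | ||
"""

rows = list(parse_rows(SAMPLE))

# Number of issues per category (third column).
by_category = Counter(row[2] for row in rows)

# Issues flagged "ok" in the PDEP0012 column (fifth column).
pdep_count = sum(1 for row in rows if row[4] == "ok")

# Issues whose sub-category ends with "NA" (None, NA, NaN or NaT values).
na_count = sum(1 for row in rows if row[3].endswith("NA"))

print(by_category, pdep_count, na_count)
```

Note that the sketch splits on every `|`, so it assumes no issue title contains a pipe character, which holds for the rows listed here.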