BUG: Pandas 1.0.x read_json unable to convert lists with date values #33787

Closed
@reneeburton

Description

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [x] (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

import pandas as pd

# Works in 0.25.x and 1.0.x: a list column with no date-like values
json_line = pd.DataFrame([([1, 2], "hector")], columns=['accounts', 'name']).to_json(lines=True, orient='records')
data = pd.read_json(json_line, lines=True)

# Works in 0.25.x and 1.0.x: date conversion disabled with convert_dates=False (non-default behavior)
json_line = pd.DataFrame([([1, 2], ['2020-03-05', '2020-04-08T09:58:49+00:00'], "hector")], columns=['accounts', 'date', 'name']).to_json(lines=True, orient='records')
data = pd.read_json(json_line, lines=True, convert_dates=False)

# Works in 0.25.x but raises in 1.0.x with the default convert_dates=True (error: list is unhashable)
json_line = pd.DataFrame([([1, 2], ['2020-03-05', '2020-04-08T09:58:49+00:00'], "hector")], columns=['accounts', 'date', 'name']).to_json(lines=True, orient='records')
data = pd.read_json(json_line, lines=True)

Problem description

In pandas 0.25.x (and below), pd.read_json(path, lines=True) could read newline-delimited JSON files whose records contained lists of any type. In pandas 1.0.x, the same data raises an unhashable-type error. The error appears to come from lists that contain date-like values, which read_json tries to convert by default (convert_dates=True).

Pandas 0.25.x handles the lists of dates but does not convert their items; they are kept as strings.

Expected Output

A pandas DataFrame containing the lists, read without error.

df.to_json()

returns

'{"accounts":{"0":[1,2]},"event_time":{"0":["2020-04-08T09:50:49+00:00","2020-04-08T09:58:49+00:00"]},"name":{"0":"hector"}}'
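The expected behavior could be pinned down with a regression check along these lines. This is only a sketch: the function name `check_read_json_keeps_date_lists` is hypothetical, the JSON line is written by hand to match the code sample above, and the input is wrapped in `io.StringIO` so the same call also works on pandas versions that no longer accept a literal JSON string. Per this report, the check would fail on 1.0.x.

```python
import io

import pandas as pd


def check_read_json_keeps_date_lists():
    """Read one JSON line whose 'date' value is a list of date-like strings.

    Uses the default convert_dates=True, which is the call that this report
    says raises on pandas 1.0.x.
    """
    json_line = (
        '{"accounts":[1,2],'
        '"date":["2020-03-05","2020-04-08T09:58:49+00:00"],'
        '"name":"hector"}'
    )
    result = pd.read_json(io.StringIO(json_line), lines=True)

    # The lists should come back unchanged, with the dates left as strings.
    assert result.loc[0, "accounts"] == [1, 2]
    assert result.loc[0, "date"] == ["2020-03-05", "2020-04-08T09:58:49+00:00"]
    assert result.loc[0, "name"] == "hector"
    return result
```

Calling `check_read_json_keeps_date_lists()` returns the one-row DataFrame when the read succeeds and raises otherwise.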

Output of 1.0.3

Using pd.read_json(json_line, lines=True), the error varies slightly depending on context:

  • TypeError: unhashable type: 'list'
  • TypeError: <class 'list'> is not convertible to datetime. (in the small example above)

Using pd.read_json(json_line, lines=True, convert_dates=False) returns the expected output, consistent with pandas 0.25.x.
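A minimal sketch of that workaround, assuming the same record as in the code sample. The `record` dict and the `io.StringIO` wrapper are illustrative additions; the wrapper is there only so the snippet also runs on pandas versions that no longer accept a literal JSON string.

```python
import io
import json

import pandas as pd

# One newline-delimited JSON record with a list of date-like strings.
record = {
    "accounts": [1, 2],
    "date": ["2020-03-05", "2020-04-08T09:58:49+00:00"],
    "name": "hector",
}
json_line = json.dumps(record)

# convert_dates=False skips the date conversion step entirely,
# so the list values are returned as plain Python lists of strings.
data = pd.read_json(io.StringIO(json_line), lines=True, convert_dates=False)

print(data.loc[0, "date"])
```

The trade-off is that no column is parsed as a date, so any scalar date columns would need an explicit `pd.to_datetime` call afterwards.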

Output of 0.25.3

'{"accounts":{"0":[1,2]},"event_time":{"0":["2020-04-08T09:50:49+00:00","2020-04-08T09:58:49+00:00"]},"name":{"0":"hector"}}'

Labels

  • IO JSON: read_json, to_json, json_normalize
  • Needs Tests: unit test(s) needed to prevent regressions
  • good first issue