
read_json: ValueError: Value is too big #26068

Closed
@jmh045000


Reopening issue #14530. The reason given for closing it is incorrect: the JSON specification does not define any limit on integer size and instead leaves such limits to implementations.

From https://tools.ietf.org/html/rfc7159#section-6

This specification allows implementations to set limits on the range and precision of numbers accepted

The standard json library in Python parses integers of arbitrary size, so the language itself has no trouble with JSON containing these values.

Python 3.6.8 (default, Apr  7 2019, 21:09:51)
[GCC 5.3.1 20160406 (Red Hat 5.3.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> j = """{"id": 9253674967913938907}"""
>>> json.loads(j)
{'id': 9253674967913938907}

Loading a JSON file containing large integers (above 2^63 - 1, the signed 64-bit maximum) results in "Value is too big". I have tried changing the orient to "records" and also passing in dtype={'id': numpy.dtype('uint64')}. The error is the same in both cases.

import pandas
data = pandas.read_json('''{"id": 10254939386542155531}''')
print(data.describe())
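For reference, the id in this reproduction lies above the signed 64-bit maximum but still within the unsigned 64-bit range, which is why a uint64 dtype was attempted. A quick check with plain Python integers (the constant names are only illustrative):

```python
# The id from the reproduction above.
value = 10254939386542155531

INT64_MAX = 2**63 - 1   # largest signed 64-bit integer
UINT64_MAX = 2**64 - 1  # largest unsigned 64-bit integer

exceeds_int64 = value > INT64_MAX
fits_uint64 = value <= UINT64_MAX
print(exceeds_int64, fits_uint64)  # prints: True True
```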

Expected Output

                          id
count                      1
unique                     1
top     10254939386542155531
freq                       1

Actual Output (even with dtype passed in)

  File "./parse_dispatch_table.py", line 34, in <module>
    print(pandas.read_json('''{"id": 10254939386542155531}''', dtype=dtype_conversions).describe())
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 234, in read_json
    date_unit).parse()
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 302, in parse
    self._parse_no_numpy()
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 519, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Value is too big

No problem using read_csv:

import pandas
import io
print(pandas.read_csv(io.StringIO('''id\n10254939386542155531''')).describe())

Output using read_csv

                          id
count                      1
unique                     1
top     10254939386542155531
freq                       1
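A possible workaround, sketched here under the assumption that the payload is small enough to parse with the standard json module first: decode the text with json.loads and build the column as an explicit uint64 array. The helper name read_json_uint64 is hypothetical and not part of pandas, and this sketch handles only a single object or a list of flat objects.

```python
import json

import numpy as np
import pandas as pd


def read_json_uint64(text, column):
    """Hypothetical helper: parse JSON with the stdlib and keep `column` as uint64."""
    parsed = json.loads(text)
    # Accept either a single object or a list of objects.
    records = parsed if isinstance(parsed, list) else [parsed]
    # Build the column explicitly as uint64 so values above 2**63 - 1 are kept exactly.
    values = np.array([record[column] for record in records], dtype=np.uint64)
    return pd.DataFrame({column: values})


df = read_json_uint64('{"id": 10254939386542155531}', "id")
print(df.dtypes)
```

This sidesteps the pandas JSON parser entirely, so it does not address the underlying error; it only shows that the value can be held in a DataFrame once it has been parsed.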

Output of pd.show_versions()
