df.to_json() slower in 0.13.x vs 0.12.0 #5765

Closed
@acowlikeobject

The df.to_json() method seems consistently ~1.8x slower in version 0.13.x (and in a few older 0.12.x development versions from the master branch on git) than in 0.12.0.

Version 0.12.0:

Python 2.7.5+ (default, Sep 17 2013, 15:31:50) 
In [1]: import pandas as pd, numpy as np

In [2]: df = pd.DataFrame(np.random.rand(100000,10))

In [3]: %timeit df.to_json(orient='split')
10 loops, best of 3: 96.1 ms per loop

In [4]: pd.__version__, np.__version__
Out[4]: ('0.12.0', '1.7.1') 

Version 0.13.0rc1:

Python 2.7.5+ (default, Sep 17 2013, 15:31:50) 
In [1]: import pandas as pd, numpy as np

In [2]: df = pd.DataFrame(np.random.rand(100000,10))

In [3]: %timeit df.to_json(orient='split')
10 loops, best of 3: 172 ms per loop

In [4]: pd.__version__, np.__version__
Out[4]: ('0.13.0rc1-119-g2485e09', '1.8.0')

The 1.8x factor seems to hold on my machine across Python versions (2.7.5 vs 3.3.2), dataframe sizes, orient values and dtypes (I only tried floats and a DatetimeIndex).

Was there some change in to_json() or have I goofed something up in my environment?
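
For anyone who wants to repeat the comparison outside IPython, here is a rough sketch of a standalone script that could be run once in each environment. The helper names `best_to_json_time` and `benchmark` are made up for illustration (they are not part of pandas), and the list of orient values is an assumption about which ones are worth checking.

```python
import timeit

import numpy as np
import pandas as pd


def best_to_json_time(df, orient, repeat=3, number=1):
    """Return the best per-call wall time (in seconds) of df.to_json(orient=orient).

    Hypothetical helper: takes the minimum over `repeat` runs, similar in
    spirit to the "best of 3" figure that %timeit reports.
    """
    timings = timeit.repeat(
        lambda: df.to_json(orient=orient), repeat=repeat, number=number
    )
    return min(timings) / number


def benchmark(df, orients=("split", "records", "index", "columns", "values")):
    """Time to_json for each orient and return {orient: best seconds}."""
    return {orient: best_to_json_time(df, orient) for orient in orients}


if __name__ == "__main__":
    # Print versions first so results from different environments can be told apart.
    print(pd.__version__, np.__version__)
    df = pd.DataFrame(np.random.rand(100000, 10))
    for orient, seconds in benchmark(df).items():
        print("%-8s %.1f ms" % (orient, seconds * 1e3))
```

Running the same file under each installed pandas version and comparing the printed times per orient should show whether the slowdown is specific to one orient or applies across the board.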

Labels: IO JSON (read_json, to_json, json_normalize), Performance (memory or execution speed performance)