Closed
Hey all,
Just upgraded my pandas version from 0.12 to 0.13.1 and noticed a significant performance regression for indexing operations (get, set, and windowing).
Here is my test setup code:

    import pandas as pd
    ts1 = pd.TimeSeries(data=100.0, index=pd.date_range('2000-01-01', periods=1000))
    ts2 = pd.TimeSeries(data=200.0, index=pd.date_range('2000-01-01', periods=1000))
    ts3 = pd.TimeSeries(data=300.0, index=pd.date_range('2000-01-01', periods=1000))
    df = pd.DataFrame({'ts1': ts1, 'ts2': ts2, 'ts3': ts3})
    dt = ts1.index[500]

Here is a table showing the results of IPython's %timeit function.
| Test | 0.12 | 0.13.1 |
|---|---|---|
| `ts1[dt]` | 3.78 | 8.5 |
| `ts1.ix[dt]` | 11.8 | 30.7 |
| `ts1.loc[dt]` | 12.7 | 37.7 |
| `ts1[dt]=1` | 1.86 | 4.32 |
| `ts1.ix[dt]=1` | 12.5 | 65.9 |
| `ts1.loc[dt]=1` | 36.2 | 65.7 |
| `ts1[:dt]` | 78.2 | 101 |
| `ts1.ix[:dt]` | 53.1 | 106 |
| `ts1.loc[:dt]` | 59.5 | 101 |
| `df.ix[dt]` | 45.3 | 77.9 |
| `df.ix[:dt]` | 63.3 | 85.9 |
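
For anyone who wants to reproduce these numbers outside IPython, here is a rough sketch using the standard-library `timeit` module. It is not the exact harness behind the table: the helper `best_time_us` and the `cases` and `results` dicts are names made up for this example, the series is built with `pd.Series` rather than `pd.TimeSeries`, the `.ix` rows are left out, and reporting in microseconds is this sketch's own choice.

```python
import timeit

import pandas as pd

# Same shape as the setup above, but built with pd.Series (an assumption of
# this sketch) instead of pd.TimeSeries.
index = pd.date_range('2000-01-01', periods=1000)
ts1 = pd.Series(100.0, index=index)
dt = index[500]


def best_time_us(func, number=1000, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs."""
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return min(timings) / number * 1e6


def set_item():
    ts1[dt] = 1.0


def set_loc():
    ts1.loc[dt] = 1.0


# Hypothetical mapping from a label to the operation being timed.
cases = {
    'ts1[dt]': lambda: ts1[dt],
    'ts1.loc[dt]': lambda: ts1.loc[dt],
    'ts1[dt]=1': set_item,
    'ts1.loc[dt]=1': set_loc,
    'ts1[:dt]': lambda: ts1[:dt],
    'ts1.loc[:dt]': lambda: ts1.loc[:dt],
}

results = {name: best_time_us(func) for name, func in cases.items()}

print('pandas', pd.__version__)
for name, micros in results.items():
    print('%-14s %8.2f us' % (name, micros))
```

Running the same script under each pandas version and comparing the printed lines should show whether the slowdown reproduces on another machine.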
I did not see up-to-date data on http://pandas.pydata.org/pandas-docs/vbench/vb_indexing.html - am I looking at the right benchmark data? Most charts end in June 2012.
Can someone confirm this slowdown?
I am using numpy 1.8.1 by the way - let me know if you need any other version numbers.
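
In case more version details are useful, a small sketch like the following collects the obvious ones in one go. The `versions` dict is just a name chosen for this example.

```python
import sys

import numpy as np
import pandas as pd

# Gather the interpreter and library versions relevant to the timings.
versions = {
    'python': sys.version.split()[0],
    'numpy': np.__version__,
    'pandas': pd.__version__,
}

for name, version in versions.items():
    print('%s: %s' % (name, version))
```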
Thanks in advance!