FIX: Bug whereby array_equivalent was not correctly comparing Float64Ind... #6597

unutbu · 2014-03-11T12:26:58Z

Currently,

>>> import numpy as np
>>> import pandas.core.common as com
>>> from pandas import Float64Index
>>> com.array_equivalent(Float64Index([0, np.nan]), Float64Index([0, np.nan]))
False

Although the current pandas code base does not use array_equivalent to compare Float64Indexes, leaving array_equivalent in its current state may be a bug waiting to happen.

This PR attempts to fix the problem by using pd.isnull for all arrays of dtype object. In a previous PR I tried this and got terrible perf results. Since then I've discovered that my machine does not have enough memory to run the full perf test suite without page faults. If I rerun test_perf.sh for just a few benchmarks, I can avoid the page faults and get consistent results.
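To illustrate the idea, here is a minimal sketch of a NaN-aware comparison built on pd.isnull. It is not the actual array_equivalent implementation; the helper name nan_aware_equivalent and the sample arrays are made up for this example, and it uses plain pd.Index objects and NumPy arrays in place of Float64Index.

```python
import numpy as np
import pandas as pd


def nan_aware_equivalent(left, right):
    """Hypothetical helper: True if both arrays have the same shape,
    nulls in the same positions, and equal values everywhere else."""
    left, right = np.asarray(left), np.asarray(right)
    if left.shape != right.shape:
        return False
    # pd.isnull handles float and object arrays alike.
    left_null, right_null = pd.isnull(left), pd.isnull(right)
    if not np.array_equal(left_null, right_null):
        return False
    # Compare only the non-null positions, since NaN != NaN elementwise.
    return bool(np.all(left[~left_null] == right[~right_null]))


floats = np.array([0.0, np.nan])
objects = np.array([0.0, np.nan], dtype=object)

# A plain elementwise comparison fails because NaN never equals itself.
plain_floats = bool(np.all(floats == floats))

aware_floats = nan_aware_equivalent(floats, floats)
aware_objects = nan_aware_equivalent(objects, objects)
aware_index = nan_aware_equivalent(pd.Index([0, np.nan]), pd.Index([0, np.nan]))
aware_mismatch = nan_aware_equivalent([0.0, np.nan], [np.nan, 0.0])
```

Checking the null masks first means arrays with NaNs in different positions are rejected before any values are compared.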

Running /usr/bin/time -v ./test_perf.sh -b master -t fix-equivalent yielded two tests with ratio > 1.1.

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
reindex_fillna_pad                           |   0.5784 |   0.5034 |   1.1490 |
packers_write_pack                           |  15.2360 |   7.1851 |   2.1205 |
-------------------------------------------------------------------------------

which I believe were due to page faults. When I reran perf on just these tests using
/usr/bin/time -v ./test_perf.sh -b master -t fix-equivalent -r "reindex_fillna_pad|packers_write_pack"

I got

Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
reindex_fillna_pad_float32                   |   0.4633 |   0.4590 |   1.0093 |
packers_write_pack                           |   7.9544 |   7.8390 |   1.0147 |
reindex_fillna_pad                           |   0.7290 |   0.7180 |   1.0154 |
-------------------------------------------------------------------------------


jreback · 2014-03-11T12:56:45Z

thank you sir!

FIX: Bug whereby array_equivalent was not correctly comparing Float64Indexes with NaNs.

c567de4

jreback added a commit that referenced this pull request Mar 11, 2014

Merge pull request #6597 from unutbu/fix-equivalent

45009f0

FIX: Bug whereby array_equivalent was not correctly comparing Float64Ind...

jreback merged commit 45009f0 into pandas-dev:master Mar 11, 2014

jreback added Performance labels Mar 11, 2014

jreback added this to the 0.14.0 milestone Mar 11, 2014

unutbu deleted the fix-equivalent branch March 11, 2014 14:48
