saving on memory can cost both memory and performance #9073

Closed
@behzadnouri

xref: #8676 (comment)

memory cost:

with int16 labels:

$ python -m memory_profiler mem-profile.py 
Filename: mem-profile.py

Line #    Mem usage    Increment   Line Contents
================================================
     4   80.156 MiB    0.000 MiB   @profile
     5                             def ix(obj):
     6   87.809 MiB    7.652 MiB       obj.ix[999]

with int64 labels:

$ python -m memory_profiler mem-profile.py 
Filename: mem-profile.py

Line #    Mem usage    Increment   Line Contents
================================================
     4   79.387 MiB    0.000 MiB   @profile
     5                             def ix(obj):
     6   79.387 MiB    0.000 MiB       obj.ix[999]
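
A back-of-the-envelope check on the 7.652 MiB increment in the int16 run. This is only a sketch under an assumption the profiler output does not confirm: that the lookup materializes a temporary int64 copy of one level's labels (1,000,000 entries). The names `N_ROWS`, `ITEMSIZE` and `label_mib` are made up for illustration.

```python
# Assumption: the extra memory is one int64 copy of a single level's labels.
N_ROWS = 1000 * 1000                    # 1000 x 1000 tuples in the MultiIndex
ITEMSIZE = {"int16": 2, "int64": 8}     # bytes per label for each dtype


def label_mib(n_rows, dtype):
    """Size in MiB of one level's label array stored with the given dtype."""
    return n_rows * ITEMSIZE[dtype] / 2**20


compact = label_mib(N_ROWS, "int16")    # what the coerced labels occupy
widened = label_mib(N_ROWS, "int64")    # what a widened copy would occupy
reported = 7.652                        # increment from the int16 run above

print("int16 labels: %.3f MiB" % compact)
print("int64 copy:   %.3f MiB" % widened)
print("reported increment minus int64 copy: %.3f MiB" % (reported - widened))
```

If the assumption holds, the int16 labels save roughly 5.7 MiB per level at rest, but each lookup temporarily pays back more than that for the widened copy. With int64 labels no widening would be needed, which would be consistent with the 0.000 MiB increment in that run.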

performance cost:

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_xs_mi_ix                               |   8.5303 |   0.6206 |  13.7452 |
series_xs_mi_ix                              |   8.0659 |   0.5600 |  14.4023 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster than the baseline.
Seed used: 1234

Target [c11e75c] : PERF: set multiindex labels with coerced dtype (GH8456)
Base   [6bbb39e] : Merge pull request #8675 from pydata/setitem

the mem-profile.py used for memory profiling:

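# `profile` is injected as a builtin when run via `python -m memory_profiler`.
@profile
def ix(obj):
    # Select all rows under first-level key 999 (the line being profiled).
    obj.ix[999]

if __name__ == '__main__':
    import numpy as np
    from pandas import MultiIndex, Series
    # Two-level index of 1000 x 1000 tuples: 1,000,000 rows in total.
    mi = MultiIndex.from_tuples([(x, y) for x in range(1000) for y in range(1000)])
    ts = Series(np.random.randn(1000000), index=mi)
    ix(ts)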

Labels: Performance (memory or execution speed performance)
