
Optionally use sys.getsizeof in DataFrame.memory_usage #11595

Closed
@mrocklin

I would like to know how many bytes my dataframe takes up in memory. The standard way to do this is the memory_usage method:

df.memory_usage(index=True)

For object dtype columns this measures 8 bytes per element: the size of the reference, not the size of the full object. In some cases this significantly underestimates the size of the dataframe.
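To make the gap concrete, here is a small sketch comparing the two numbers for a single object dtype column. The column name `text` and the generated strings are made up for illustration, and the 8-byte figure assumes a 64-bit build.

```python
import sys

import pandas as pd

# Hypothetical data: 100 distinct strings stored in an object-dtype column.
values = [str(i) * 200 for i in range(100)]
df = pd.DataFrame({"text": pd.Series(values, dtype=object)})

# What memory_usage reports: one pointer per element.
shallow_bytes = int(df.memory_usage(index=False)["text"])

# What the string objects themselves occupy, according to sys.getsizeof.
object_bytes = sum(sys.getsizeof(v) for v in df["text"])

print("reported by memory_usage:", shallow_bytes)
print("held by the objects:     ", object_bytes)
```

On a 64-bit interpreter the first number should be 100 pointers' worth of bytes, while the second is much larger because every string carries its own header and character data.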

It might be nice to map sys.getsizeof over object dtype columns to get a better estimate of the size. If this ends up being expensive, it could be exposed as an optional keyword argument:

df.memory_usage(index=True, measure_object=True)
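A rough sketch of what this could look like as a standalone helper, for discussion only. `estimate_memory_usage` is a hypothetical function, not part of pandas, and `measure_object` is just the keyword name proposed above. It assumes unique column names.

```python
import sys

import pandas as pd


def estimate_memory_usage(df, index=True, measure_object=False):
    """Hypothetical helper: memory_usage plus per-object sizes on request."""
    # Start from the existing per-column figures (pointers only for object columns).
    usage = df.memory_usage(index=index)
    if measure_object:
        for name in df.columns:
            col = df[name]
            if pd.api.types.is_object_dtype(col):
                # Add the size of each referenced Python object.
                usage[name] += sum(sys.getsizeof(v) for v in col)
    return usage


# Example frame with one numeric and one object-dtype column.
frame = pd.DataFrame({
    "n": [1, 2, 3],
    "s": pd.Series(["a", "bb", "ccc"], dtype=object),
})
shallow = estimate_memory_usage(frame, index=False)
measured = estimate_memory_usage(frame, index=False, measure_object=True)
print(shallow)
print(measured)
```

Two caveats with this approach: sys.getsizeof only counts the object itself and not anything it refers to, so nested containers are still underestimated, and an object referenced from several rows is counted once per row.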
