Closed
I would like to know how many bytes my dataframe takes up in memory. The standard way to do this is the memory_usage method:
df.memory_usage(index=True)
For object dtype columns this counts only 8 bytes per element, the size of the reference rather than the size of the object it points to. In some cases this significantly underestimates the memory footprint of the dataframe.
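A small script illustrates the gap. For a column of long strings, memory_usage counts just the pointer array, while summing sys.getsizeof over the elements also counts the string objects themselves:

```python
import sys

import pandas as pd

# A column of fairly long strings stored as object dtype.
df = pd.DataFrame({"text": ["x" * 1000, "y" * 1000, "z" * 1000]})

# memory_usage counts only the 8-byte references: 3 * 8 = 24 bytes.
reported = df["text"].memory_usage(index=False)

# Summing sys.getsizeof over the elements counts the string objects too,
# so this comes out far larger (each 1000-char string is over 1000 bytes).
actual = sum(sys.getsizeof(v) for v in df["text"])

print(reported, actual)
```

Here `reported` is 24 bytes while `actual` is well over 3000, a difference of two orders of magnitude.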
It might be nice to optionally map sys.getsizeof over object dtype columns to get a better estimate of the size. Since this could be expensive, it would make sense to expose it as an optional keyword argument:
df.memory_usage(index=True, measure_object=True)
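A minimal sketch of what such a keyword could compute, assuming the simplest accounting (pointer array plus per-object sizes). The helper name and the exact bookkeeping are assumptions for illustration, not pandas API:

```python
import sys

import pandas as pd


def memory_usage_deep(df: pd.DataFrame, index: bool = True) -> pd.Series:
    """Hypothetical sketch of the proposed option: per-column memory
    usage that maps sys.getsizeof over object-dtype columns."""
    usage = {}
    if index:
        usage["Index"] = df.index.memory_usage()
    for col in df.columns:
        s = df[col]
        if s.dtype == object:
            # 8-byte reference per element plus the objects themselves.
            usage[col] = s.memory_usage(index=False) + int(
                s.map(sys.getsizeof).sum()
            )
        else:
            # Fixed-width dtypes are already measured correctly.
            usage[col] = s.memory_usage(index=False)
    return pd.Series(usage)
```

For what it is worth, pandas later shipped essentially this behaviour as memory_usage(deep=True), which introspects object dtype columns at the cost of a slower computation.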