Closed
Description
So it seems that by using a full int64 array for the codes, plus the categories, we are actually using MORE memory to store a Categorical than a plain object array. The int64 codes are the same size as the pointers in an object array, and the categories have to be stored on top of that.
So we need to change the codes storage to use a smaller integer dtype. Maybe switch this to a plain ndarray and use dtype=uint8. That would provide a lot of benefit.
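To put rough numbers on this, here is a minimal sketch using plain NumPy arrays rather than the actual Categorical internals. The array size, the three example categories, and the variable names are all made up for illustration. It assumes a 64-bit build, where an object array stores one 8-byte pointer per element.

```python
import numpy as np

n = 1_000_000  # hypothetical number of values
categories = np.array(["a", "b", "c"], dtype=object)  # hypothetical categories

# Codes stored as a full int64 array (8 bytes per element).
rng = np.random.default_rng(0)
codes_int64 = rng.integers(0, len(categories), size=n).astype(np.int64)

# The equivalent object array: one pointer per element.
object_values = categories[codes_int64]

# The same codes stored in the smallest unsigned dtype that can hold them.
small_dtype = np.min_scalar_type(len(categories) - 1)
codes_small = codes_int64.astype(small_dtype)

print("object array:", object_values.nbytes, "bytes")
print("int64 codes: ", codes_int64.nbytes, "bytes")
print(f"{small_dtype} codes:", codes_small.nbytes, "bytes")

# The smaller codes still reconstruct exactly the same values.
assert np.array_equal(categories[codes_small], object_values)
```

Note that uint8 can only index 256 categories and has no room for a negative sentinel, so a real implementation would presumably need to choose the width from the number of categories and decide how to represent missing values.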