Closed
Description
Is your feature request related to a problem?
In pandas 1.2.5, calling groupby.max() on a large DataFrame of 0/1 values stored as int8 makes pandas cast the data to int64, which fails with:
MemoryError: Unable to allocate 76.4 GiB for an array with shape (1915674, 5356) and data type int64
Traceback:
/python3.9/site-packages/pandas/core/dtypes/common.py in ensure_int_or_float(arr, copy)
    143     try:
    144         # error: Unexpected keyword argument "casting" for "astype"
--> 145         return arr.astype("int64", copy=copy, casting="safe")  # type: ignore[call-arg]
    146     except TypeError:
    147         pass
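The 76.4 GiB in the error message follows directly from the array shape and the 8-byte width of int64. The short calculation below is only an illustration of that arithmetic (the helper name `array_gib` is made up for this sketch); it also shows how much smaller the same array would be if it stayed int8.

```python
# Shape taken from the MemoryError message above.
ROWS, COLS = 1915674, 5356

# Bytes per element for the two dtypes involved.
ITEMSIZE = {"int8": 1, "int64": 8}


def array_gib(rows, cols, dtype):
    """Size in GiB of a dense rows x cols array of the given dtype (hypothetical helper)."""
    return rows * cols * ITEMSIZE[dtype] / 2**30


int64_gib = array_gib(ROWS, COLS, "int64")  # about 76.4 GiB, matching the error
int8_gib = array_gib(ROWS, COLS, "int8")    # about 9.6 GiB, one eighth of that

print(f"int64: {int64_gib:.1f} GiB, int8: {int8_gib:.1f} GiB")
```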
Describe the solution you'd like
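groupby.max() should keep the original dtype, int8 in this case, instead of casting to int64. The maximum of a group always fits in the input dtype, so no wider type is needed.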
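Until the dtype is preserved internally, one possible workaround is to run the aggregation on a limited number of columns at a time, so that any temporary int64 copy covers only one chunk rather than the whole frame. The sketch below is an assumption about how that could look, not something taken from pandas itself: `groupby_max_in_chunks`, `chunk_size`, and the toy data are all hypothetical, and whether the chunks actually bound peak memory depends on the pandas version.

```python
import numpy as np
import pandas as pd


def groupby_max_in_chunks(df, keys, chunk_size=1000):
    """Group-wise max computed chunk_size columns at a time (hypothetical helper)."""
    parts = [
        # Each slice is aggregated on its own, so any upcast touches
        # at most chunk_size columns.
        df.iloc[:, start:start + chunk_size].groupby(keys).max()
        for start in range(0, df.shape[1], chunk_size)
    ]
    # Stitch the per-chunk results back together in the original column order.
    return pd.concat(parts, axis=1)


# Small stand-in for the large 0/1 int8 matrix from the report.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 2, size=(12, 7), dtype=np.int8))
keys = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 0, 1, 2])

result = groupby_max_in_chunks(df, keys, chunk_size=3)
```

On small inputs the chunked result can be compared against a plain `df.groupby(keys).max()` to confirm that both give the same values.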