BUG: astyping Categorical to nullable integer dtype #39616

Closed
@arw2019

Calling astype on a Categorical works for int64 but raises a TypeError for the nullable Int64 dtype. Both casts should behave the same.

In [16]: import numpy as np
    ...: import pandas as pd
    ...: 
    ...: dtype = pd.CategoricalDtype([str(i) for i in range(5)])
    ...: arr = pd.Categorical.from_codes(np.random.randint(5, size=20), dtype=dtype)
    ...: arr
Out[16]: 
['4', '1', '0', '1', '4', ..., '3', '0', '1', '1', '0']
Length: 20
Categories (5, object): ['0', '1', '2', '3', '4']

In [17]: arr.astype("int64")
Out[17]: array([4, 1, 0, 1, 4, 0, 1, 0, 3, 0, 3, 0, 1, 4, 2, 3, 0, 1, 1, 0])

In [18]: arr.astype("Int64")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-18-440524b22c6b> in <module>
----> 1 arr.astype("Int64")

~/repos/pandas/pandas/core/arrays/categorical.py in astype(self, dtype, copy)
    456         # TODO: consolidate with ndarray case?
    457         elif is_extension_array_dtype(dtype):
--> 458             result = array(self, dtype=dtype, copy=copy)
    459 
    460         elif is_integer_dtype(dtype) and self.isna().any():

~/repos/pandas/pandas/core/construction.py in array(data, dtype, copy)
    292     if is_extension_array_dtype(dtype):
    293         cls = cast(ExtensionDtype, dtype).construct_array_type()
--> 294         return cls._from_sequence(data, dtype=dtype, copy=copy)
    295 
    296     if dtype is None:

~/repos/pandas/pandas/core/arrays/integer.py in _from_sequence(cls, scalars, dtype, copy)
    306         cls, scalars, *, dtype: Optional[Dtype] = None, copy: bool = False
    307     ) -> IntegerArray:
--> 308         values, mask = coerce_to_array(scalars, dtype=dtype, copy=copy)
    309         return IntegerArray(values, mask)
    310 

~/repos/pandas/pandas/core/arrays/integer.py in coerce_to_array(values, dtype, mask, copy)
    170             "mixed-integer-float",
    171         ]:
--> 172             raise TypeError(f"{values.dtype} cannot be converted to an IntegerDtype")
    173 
    174     elif is_bool_dtype(values) and is_integer_dtype(dtype):

TypeError: object cannot be converted to an IntegerDtype
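
Until the cast works directly, one possible workaround is to build the nullable array by hand from the categorical's codes. The sketch below is not the pandas fix: categorical_to_nullable_int is a hypothetical helper name, and it assumes every category can be parsed as an integer (as in the example above). Missing entries, which have code -1, are carried over through the mask.

```python
import numpy as np
import pandas as pd


def categorical_to_nullable_int(cat: pd.Categorical) -> pd.arrays.IntegerArray:
    """Hypothetical helper: convert a Categorical to a nullable Int64 array.

    Assumes every category can be parsed as an integer.
    """
    # Convert the categories once (e.g. '0'..'4' -> 0..4).
    cat_values = np.asarray(cat.categories, dtype=object).astype("int64")

    # Missing entries in a Categorical have code -1.
    codes = cat.codes
    mask = codes == -1

    # Look up the integer value for each code. Positions with code -1 pick up
    # an arbitrary value here, which is hidden by the mask.
    values = cat_values[codes]

    return pd.arrays.IntegerArray(values, mask)


dtype = pd.CategoricalDtype([str(i) for i in range(5)])
arr = pd.Categorical.from_codes([4, 1, 0, -1], dtype=dtype)
result = categorical_to_nullable_int(arr)
print(result)
```

Unlike the int64 path, this keeps missing values as pd.NA instead of requiring the input to be free of them.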

Labels

Bug
ExtensionArray (Extending pandas with custom dtypes or arrays.)
NA - MaskedArrays (Related to pd.NA and nullable extension arrays)
