API: ExtensionArrays and conversion to "native" types (eg in tolist, to_dict, iteration, ..) #29738

Open
@jorisvandenbossche

We try to consistently return Python objects (instead of NumPy scalars) from certain functions such as tolist, to_dict, itertuples/items, etc. (we have had quite a few issues about fixing this in several cases).

However, currently we don't do that for extension dtypes (and don't have any mechanism to ask for this):

In [33]: type(pd.Series([1, 2], dtype='int64').tolist()[0]) 
Out[33]: int

In [34]: type(pd.Series([1, 2], dtype='Int64').tolist()[0])  
Out[34]: numpy.int64

In [36]: type(pd.Series([1, 2], dtype='int64').to_dict()[0]) 
Out[36]: int

In [37]: type(pd.Series([1, 2], dtype='Int64').to_dict()[0])
Out[37]: numpy.int64

In [45]: s = pd.Series([1, 2], dtype='int64')

In [46]: type(list(s.iteritems())[0][1])  
Out[46]: int

In [47]: s = pd.Series([1, 2], dtype='Int64')      

In [48]: type(list(s.iteritems())[0][1])  
Out[48]: numpy.int64

Should we add some API to ExtensionArray to provide this? E.g., a method to iterate through the elements that returns "native" Python objects?
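To make the question concrete, here is a rough sketch of what "iterate and return native objects" could mean. It is not an existing pandas or ExtensionArray API: `to_native` and `iter_native` are hypothetical names, and `FakeInt64` is only a stand-in for `numpy.int64` so the snippet runs without NumPy installed. It relies on NumPy scalars exposing an `.item()` method that returns the equivalent built-in Python object.

```python
from typing import Any, Iterable, Iterator


def to_native(value: Any) -> Any:
    """Return a built-in Python object for NumPy-style scalars, else the value unchanged."""
    item = getattr(value, "item", None)
    # NumPy scalars (e.g. numpy.int64) expose .item(), which returns the Python equivalent.
    if callable(item):
        return item()
    # Anything without .item() (None, str, missing-value sentinels, ...) passes through.
    return value


def iter_native(values: Iterable[Any]) -> Iterator[Any]:
    """Hypothetical helper: iterate over elements, yielding native Python objects."""
    for value in values:
        yield to_native(value)


class FakeInt64:
    """Stand-in for numpy.int64 (assumption), so the sketch has no third-party dependency."""

    def __init__(self, value: int) -> None:
        self._value = value

    def item(self) -> int:
        return self._value


native = list(iter_native([FakeInt64(1), FakeInt64(2), None]))
print([type(v).__name__ for v in native])
```

An ExtensionArray-level method along these lines would let tolist, to_dict and the iteration helpers share one conversion path, though whether that belongs on the array or on the dtype is an open design question.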

Labels: Bug; Dtype Conversions (unexpected or buggy dtype conversions); ExtensionArray (extending pandas with custom dtypes or arrays)