ENH: bool upcast to Int64 #57298

Open
@VladimirFokow

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I wish I could upcast bool to Int64 when assigning into an Int64 Series.

Feature Description

I have data which can be either bool or NaN.

So a simple int(val) wouldn't work (due to NaNs producing an error).

I don't want a boolean output array, I want Int64. Is there a reason not to allow casting of bools to Int64?

# Toy example
import numpy as np  # needed for np.arange below
import pandas as pd

data = [True, False, pd.NA]
out = pd.Series(dtype='Int64', index=np.arange(5))

# Desired:
for val in data:
    out[0] = val  # TypeError: Invalid value 'True' for dtype Int64

Alternative Solutions

# Current workaround:
for val in data:
    out[0] = pd.Series(val, dtype='Int64')[0]
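A lighter-weight variant of this workaround, sketched here as an assumption rather than anything pandas provides: convert each scalar by hand before assigning. The helper name `bool_to_nullable_int` is hypothetical; it maps `True`/`False` to `1`/`0` and any missing value to `pd.NA`, which sidesteps the error that a bare `int(val)` would raise on a missing value.

```python
import numpy as np
import pandas as pd


def bool_to_nullable_int(val):
    """Hypothetical helper: True/False -> 1/0, any missing value -> pd.NA."""
    # pd.isna handles pd.NA, None and float NaN alike.
    return pd.NA if pd.isna(val) else int(val)


data = [True, False, pd.NA]
out = pd.Series(dtype="Int64", index=np.arange(5))

# Assign plain ints (or pd.NA), which an Int64 Series accepts.
for i, val in enumerate(data):
    out[i] = bool_to_nullable_int(val)
```

This avoids building a one-element Series per value, but it is still a manual conversion rather than the upcast requested above.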

Additional Context

  • Would the integration with Arrow be a good opportunity to implement these?

  • because here it says:

    Upcasting is always according to the NumPy rules. If two different dtypes are involved in an operation, then the more general one will be used as the result of the operation.

    Maybe upcasting will no longer be "always according to the NumPy rules" but will also follow some Arrow rules?

  • Are there any other upcasting rules that are not implemented but should be?

However, this proves that such upcasting already works, so I'm puzzled why my example above doesn't:

s = pd.Series(data)
s.astype('Int64')
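For comparison, a minimal sketch of the same whole-array conversion routed explicitly through the nullable `boolean` dtype. The variable names are illustrative only; the point is that the bool-to-Int64 cast is available at the array level even though scalar assignment rejects it.

```python
import pandas as pd

data = [True, False, pd.NA]

# Build a nullable boolean array first, then cast it to nullable integers.
as_bool = pd.array(data, dtype="boolean")
as_int = as_bool.astype("Int64")  # True -> 1, False -> 0, pd.NA stays missing
```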

Is the tag BUG: more appropriate here?

    Labels

      • Enhancement
      • NA - MaskedArrays: Related to pd.NA and nullable extension arrays
      • Needs Discussion: Requires discussion from core team before further action
      • PDEP6-related: related to PDEP6 (not upcasting during setitem-like Series operations)