REGR: casting datetime strings with offset to tz-naive datetime64 fails #50140

Closed
@jorisvandenbossche

On pandas 1.5 (so no deprecation warning):

>>> pd.Index(['2021-01-01 00:00:00+01:00', '2021-01-02 00:00:00+01:00']).astype("datetime64[ns]")
DatetimeIndex(['2020-12-31 23:00:00', '2021-01-01 23:00:00'], dtype='datetime64[ns]', freq=None)

On the main branch:

>>> pd.Index(['2021-01-01 00:00:00+01:00', '2021-01-02 00:00:00+01:00']).astype("datetime64[ns]")
...
TypeError: Cannot use .astype to convert from timezone-aware dtype to timezone-naive dtype. 
Use obj.tz_localize(None) or obj.tz_convert('UTC').tz_localize(None) instead.
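
For reference, the pandas 1.5 result can be reproduced without relying on `.astype` by parsing the strings as UTC and then dropping the timezone, which mirrors the `tz_convert('UTC').tz_localize(None)` suggestion in the error message. This is only a sketch of a workaround using `pd.to_datetime`, not a statement about what the cast should do internally; the variable names are illustrative.

```python
import pandas as pd

# The same offset-carrying strings as in the report above.
strings = pd.Index(["2021-01-01 00:00:00+01:00", "2021-01-02 00:00:00+01:00"])

# Parse to a tz-aware (UTC) DatetimeIndex, honouring the +01:00 offsets.
aware_utc = pd.to_datetime(strings, utc=True)

# Drop the timezone, keeping the UTC wall-clock values (tz-naive result).
naive_utc = aware_utc.tz_localize(None)
print(naive_utc)
```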

This started to fail a few days ago on pyarrow's CI. It comes up when you roundtrip a pandas DataFrame whose columns are a tz-aware DatetimeIndex: in Arrow the column labels become strings, and in the Arrow->pandas conversion we try to cast those strings to datetime64 and then localize. We should probably cast directly to the tz-aware dtype instead.
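
A rough sketch of the "go directly to tz-aware" idea, assuming the labels arrive as offset-carrying strings and the target timezone is known from metadata. This is not pyarrow's actual conversion code; `column_labels` and `target_tz` are hypothetical names, and a fixed +01:00 offset stands in for the original index's timezone.

```python
from datetime import timedelta, timezone

import pandas as pd

# Hypothetical inputs: stringified column labels plus the timezone that the
# original DatetimeIndex had (a fixed +01:00 offset, chosen for illustration).
column_labels = pd.Index(["2021-01-01 00:00:00+01:00", "2021-01-02 00:00:00+01:00"])
target_tz = timezone(timedelta(hours=1))

# Two-step route: go through tz-naive UTC values, then localize and convert.
naive_utc = pd.to_datetime(column_labels, utc=True).tz_localize(None)
two_step = naive_utc.tz_localize("UTC").tz_convert(target_tz)

# Direct route: parse as tz-aware and convert, never creating a naive index.
direct = pd.to_datetime(column_labels, utc=True).tz_convert(target_tz)

print(direct)
```

Both routes represent the same instants; the direct one simply never passes through a tz-naive intermediate, so it avoids the cast that now raises.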

From a quick look at the recent commits and the code path this takes, this might be caused by #50015 (cc @jbrockmendel).

    Labels

    Astype; Blocker (Blocking issue or pull request for an upcoming release); Regression (Functionality that used to work in a prior pandas version)
