Performance regression in gil.ParallelDatetimeFields.time_period_to_datetime #33919

Closed
@TomAugspurger

Description

import pandas as pd
import numpy as np
from pandas._testing import test_parallel

N = 10 ** 6
dti = pd.date_range("1900-01-01", periods=N, freq="T")
period = dti.to_period("D")



@test_parallel(num_threads=2)
def run(period):
    period.to_timestamp()

%timeit run(period)
# 1.0.2
96.8 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# master
129 ms ± 1.28 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

The asv benchmark history at https://pandas.pydata.org/speed/pandas/index.html#gil.ParallelDatetimeFields.time_period_to_datetime?commits=d106b81ce532bc71ec6cced944ddb751a4b0e5a3-577de1c6b5cda7f5ae0e4832c2bc3f97ca186e9b points to the commit range d106b81...577de1c, so the regression was perhaps introduced by #33491 or #33047 (cc @jbrockmendel).
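The snippet above relies on IPython's `%timeit` magic and on the private `pandas._testing.test_parallel` helper, which may not be available in every environment. As a rough stand-alone sketch, the same measurement can be approximated with only `threading` and `timeit`. The names `run_in_threads` and `benchmark_period_to_timestamp` are hypothetical and not part of pandas. The sketch assumes that `test_parallel` simply runs the decorated function once in each thread, and it reports the best run rather than the mean, so absolute numbers will not match the figures above exactly.

```python
import threading
import timeit


def run_in_threads(func, num_threads=2):
    """Call func() once in each of num_threads threads and wait for all of them."""
    threads = [threading.Thread(target=func) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def benchmark_period_to_timestamp(n=10 ** 6, num_threads=2, repeat=7, number=10):
    """Return the best per-call time (seconds) of a threaded to_timestamp()."""
    # Imported here so that run_in_threads stays usable without pandas.
    import pandas as pd

    # "min" is the minute alias; the original snippet uses the older "T".
    dti = pd.date_range("1900-01-01", periods=n, freq="min")
    period = dti.to_period("D")

    # Each timed call converts the PeriodIndex once per thread.
    timings = timeit.repeat(
        lambda: run_in_threads(period.to_timestamp, num_threads),
        repeat=repeat,
        number=number,
    )
    return min(timings) / number
```

Calling `benchmark_period_to_timestamp()` under pandas 1.0.2 and under master should give two numbers that can be compared in the same way as the `%timeit` results.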

Labels: Datetime, Performance, Regression