PERF: performance problem when comparing Timestamp to DatetimeIndex  #52080

Closed
@phofl

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this issue exists on the latest version of pandas.

  • I have confirmed this issue exists on the main branch of pandas.

Reproducible Example

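import pandas as pd

# DatetimeIndex at nanosecond resolution
rg = pd.date_range("2020-01-01", periods=100_000, freq="s")

# Same instant written two ways: with fractional digits (nanosecond resolution)
# and as a plain date (second resolution)
ts_ns = pd.Timestamp("1996-01-01 00:00:00.00000000000")
ts_s = pd.Timestamp("1996-01-01")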

This gives the following timings:

%timeit rg < ts_s
2.27 ms ± 44.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit rg < ts_ns
108 µs ± 572 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

I suspect many users will define timestamps at a resolution coarser than nanoseconds and hence end up with mismatched resolutions, which causes a large slowdown (about 20x in the timings above). Can we fix this somehow for 2.0?
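Until this is fixed, one possible user-side workaround is to cast the timestamp to the index's resolution before comparing. The following is a minimal sketch, assuming pandas >= 2.0 (where `Timestamp.as_unit` and `DatetimeIndex.unit` are available). The name `ts_matched` is introduced here for illustration only, and no timing is claimed for it.

```python
import pandas as pd

rg = pd.date_range("2020-01-01", periods=100_000, freq="s")
ts_s = pd.Timestamp("1996-01-01")

# Assumption: pandas >= 2.0. Cast the scalar to the index's resolution so
# both sides of the comparison share the same unit.
ts_matched = ts_s.as_unit(rg.unit)

# The cast does not change the instant, so the comparison result is the same.
result_mismatched = rg < ts_s
result_matched = rg < ts_matched
same_result = bool((result_mismatched == result_matched).all())
```

This only sidesteps the mismatched-resolution path from user code; it does not address the slowdown inside pandas itself.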

Time is almost exclusively spent in

{pandas._libs.tslibs.np_datetime.compare_mismatched_resolutions}
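For anyone who wants to check where the time goes on their own build, the sketch below profiles the slow comparison with the standard-library `cProfile` module. The loop count and the number of rows printed are arbitrary choices, and the function names that show up will depend on the pandas version installed.

```python
import cProfile
import io
import pstats

import pandas as pd

rg = pd.date_range("2020-01-01", periods=100_000, freq="s")
ts_s = pd.Timestamp("1996-01-01")

profiler = cProfile.Profile()
profiler.enable()
for _ in range(100):  # repeat so the comparison dominates the profile
    _ = rg < ts_s
profiler.disable()

# Collect the five entries with the largest own time into a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```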

cc @jbrockmendel @MarcoGorelli

Installed Versions

main

Prior Performance

No response

Labels

  • Non-Nano: datetime64/timedelta64 with non-nanosecond resolution

  • Performance: Memory or execution speed performance

  • Timestamp: pd.Timestamp and associated methods