Refactor DatetimeArray._generate_range #24562

Closed
@TomAugspurger

Currently, DatetimeArray._generate_range calls DatetimeArray._simple_new 4 times: 3 times with i8 values and once with M8[ns] values.

diff --git a/pandas/core/arrays/datetimes.py b/pandas/core/arrays/datetimes.py
index 8b0565a36..a7f8e303a 100644
--- a/pandas/core/arrays/datetimes.py
+++ b/pandas/core/arrays/datetimes.py
@@ -213,12 +213,16 @@ class DatetimeArrayMixin(dtl.DatetimeLikeArrayMixin,
     _dtype = None  # type: Union[np.dtype, DatetimeTZDtype]
     _freq = None
 
+    i = 0
+
     @classmethod
     def _simple_new(cls, values, freq=None, tz=None):
         """
         we require the we have a dtype compat for the values
         if we are passed a non-dtype compat, then coerce using the constructor
         """
+        cls.i += 1
+        print(f"DTA._simple_new: {cls.i}")
         assert isinstance(values, np.ndarray), type(values)
         if values.dtype == 'i8':
             # for compat with datetime/timedelta/period shared methods,
diff --git a/pandas/core/indexes/datetimes.py b/pandas/core/indexes/datetimes.py
index 690a3db28..ab08bbf6f 100644
--- a/pandas/core/indexes/datetimes.py
+++ b/pandas/core/indexes/datetimes.py
@@ -273,6 +273,7 @@ class DatetimeIndex(DatetimeIndexOpsMixin, Int64Index, DatetimeDelegateMixin):
                 dayfirst=False, yearfirst=False, dtype=None,
                 copy=False, name=None, verify_integrity=None):
 
+        print("hi")
         if verify_integrity is not None:
             warnings.warn("The 'verify_integrity' argument is deprecated, "
                           "will be removed in a future version.",

With this debugging patch applied, a single date_range call prints the counter four times:

In [2]: idx = pd.date_range('2014-01-02', '2014-04-30', freq='M', tz='UTC')
DTA._simple_new: 1
DTA._simple_new: 2
DTA._simple_new: 3
DTA._simple_new: 4

I'm not familiar with this code, but I would naively hope for a function that

  1. Extracts the correct freq / tz from all the arguments (start, end, etc.)
  2. Generates the correct i8 values for start, end, tz
  3. Wraps those i8 values in a DatetimeArray._simple_new at the end.
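
To make the three steps concrete, here is a rough, stdlib-only sketch of the shape I have in mind. Everything in it is hypothetical: ToyDatetimeArray, to_i8 and generate_range are stand-ins invented for illustration, not pandas APIs, and it only handles a fixed frequency given in nanoseconds (so not calendar offsets like 'M'). The point is just that all intermediate work happens on plain i8 values and the wrapper is called once.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class ToyDatetimeArray:
    """Hypothetical stand-in for DatetimeArray; NOT the pandas class."""

    n_wraps = 0  # counts _simple_new calls, like the debug counter in the diff

    def __init__(self, i8values, freq_ns, tz):
        self.i8values = i8values
        self.freq_ns = freq_ns
        self.tz = tz

    @classmethod
    def _simple_new(cls, i8values, freq_ns=None, tz=None):
        cls.n_wraps += 1
        return cls(i8values, freq_ns, tz)


def to_i8(ts, tz):
    """Convert a datetime to integer nanoseconds since the Unix epoch."""
    if ts.tzinfo is None:
        # Assumption: naive inputs are interpreted in the requested tz.
        ts = ts.replace(tzinfo=tz)
    delta = ts - EPOCH
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * NS_PER_SECOND + delta.microseconds * 1_000


def generate_range(start, end, freq_ns, tz=timezone.utc):
    # Step 1: settle freq / tz from the arguments up front.
    if freq_ns <= 0:
        raise ValueError("freq_ns must be a positive number of nanoseconds")
    # Step 2: compute everything as plain i8 values, with no wrapping yet.
    start_i8 = to_i8(start, tz)
    end_i8 = to_i8(end, tz)
    i8values = list(range(start_i8, end_i8 + 1, freq_ns))  # end is inclusive
    # Step 3: wrap exactly once, at the very end.
    return ToyDatetimeArray._simple_new(i8values, freq_ns=freq_ns, tz=tz)


day_ns = 86_400 * NS_PER_SECOND
arr = generate_range(datetime(2014, 1, 2), datetime(2014, 1, 5), day_ns)
print(len(arr.i8values), ToyDatetimeArray.n_wraps)  # prints: 4 1
```

The real method has to deal with offsets, closed endpoints and tz localization, so this is only meant to show the call structure, not a drop-in replacement.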

I'm investigating if this can be done.

I'm not sure if this applies to timedelta as well.

Labels

Datetime (Datetime data dtype), Performance (Memory or execution speed performance), Refactor (Internal refactoring of code)