BUG: Cannot add non-vectorized DateOffset to empty DatetimeIndex #12724

Closed
@zhengyu-inboc

Description

After the holidays refactor in pandas 0.17.0+, I'm seeing errors when calling the AbstractHolidayCalendar.holidays() method with a date range that precedes the start date declared on a Holiday rule. In the example below, I declare a custom MLK holiday for the CME equities market with a start date of 1998-01-01 and a non-vectorised DateOffset (third Monday of January).
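For reference, here is a minimal sketch of what that offset does when applied to a single Timestamp, which works fine. The helper name `third_monday_of_january` is made up for illustration and is not part of pandas.

```python
import pandas as pd
from dateutil.relativedelta import MO
from pandas.tseries.offsets import DateOffset


def third_monday_of_january(year):
    """Hypothetical helper: apply the rule's offset to January 1 of `year`."""
    # weekday=MO(3) moves forward to the third Monday on or after the date.
    return pd.Timestamp(year=year, month=1, day=1) + DateOffset(weekday=MO(3))


print(third_monday_of_january(2015))  # 2015-01-19 00:00:00
```

The scalar case is not where the problem lies; the failure only shows up when the same offset is added to an empty DatetimeIndex.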

If I call calendar.holidays(start='1995-01-01', end='1995-12-31'), it blows up with an exception:

Traceback (most recent call last):
  File "/tests/test_pandas_features.py", line 35, in test_no_holidays_generated_if_not_in_range
    calendar.holidays(start='1995-01-01', end='1995-12-31'))
  File "/python3/lib/python3.5/site-packages/pandas/tseries/holiday.py", line 377, in holidays
    rule_holidays = rule.dates(start, end, return_name=True)
  File "/python3/lib/python3.5/site-packages/pandas/tseries/holiday.py", line 209, in dates
    holiday_dates = self._apply_rule(dates)
  File "/python3/lib/python3.5/site-packages/pandas/tseries/holiday.py", line 278, in _apply_rule
    dates += offset
  File "/python3/lib/python3.5/site-packages/pandas/tseries/base.py", line 412, in __add__
    return self._add_delta(other)
  File "/python3/lib/python3.5/site-packages/pandas/tseries/index.py", line 736, in _add_delta
    result = DatetimeIndex(new_values, tz=tz, name=name, freq='infer')
  File "/python3/lib/python3.5/site-packages/pandas/util/decorators.py", line 89, in wrapper
    return func(*args, **kwargs)
  File "/python3/lib/python3.5/site-packages/pandas/tseries/index.py", line 231, in __new__
    raise ValueError("Must provide freq argument if no data is "
ValueError: Must provide freq argument if no data is supplied

After some digging, it appears that when the supplied date range falls entirely before the declared holiday rule start date, the DatetimeIndex constructed in tseries/holiday.py is an empty index (a reasonable optimization). When a non-vectorized DateOffset is then applied to that empty DatetimeIndex, the _add_offset method in tseries/index.py returns a plain empty Index rather than a DatetimeIndex. The subsequent .asi8 call on that Index returns None instead of an empty numpy i8 array.

So the underlying problem is that adding a non-vectorized DateOffset to an empty DatetimeIndex raises instead of returning an empty DatetimeIndex.
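Until this is fixed, one possible user-side workaround is to skip the addition when the index is empty. This is only a sketch under that assumption: `add_offset_guarded` is a hypothetical helper, not a pandas API, and it does not patch the code path inside tseries/holiday.py.

```python
import warnings

import pandas as pd
from dateutil.relativedelta import MO
from pandas.tseries.offsets import DateOffset


def add_offset_guarded(index, offset):
    """Hypothetical workaround: skip the addition when there is nothing to shift."""
    if len(index) == 0:
        # Shifting zero dates yields zero dates, so hand back an empty copy
        # rather than going through DatetimeIndex.__add__.
        return index.copy()
    with warnings.catch_warnings():
        # Non-vectorized offsets may emit a PerformanceWarning; silence it here.
        warnings.simplefilter("ignore")
        return index + offset


offset = DateOffset(weekday=MO(3))
empty_result = add_offset_guarded(pd.DatetimeIndex([]), offset)
shifted = add_offset_guarded(pd.DatetimeIndex(["2015-01-01"]), offset)
print(len(empty_result), shifted[0])
```

The guard relies on the observation that an offset applied to an empty index can only ever produce an empty index, so the result type and length are known without calling into the failing code.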

The test case below reproduces the issue.

Test case

import unittest

import pandas as pd
import pandas.util.testing as pdt
from dateutil.relativedelta import MO
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday
from pandas.tseries.offsets import DateOffset

CME1998StartUSMartinLutherKingJr = Holiday(
    'Dr. Martin Luther King Jr.',
    month=1,
    day=1,
    offset=DateOffset(weekday=MO(3)),
    start_date='1998-01-01',
)


class CustomHolidayCalendar(AbstractHolidayCalendar):
    rules = [CME1998StartUSMartinLutherKingJr]


class TestPandasIndexFeatures(unittest.TestCase):
    def test_empty_datetime_index_added_to_non_vectorized_date_offset(self):
        empty_ts_index = pd.DatetimeIndex([], freq='infer')
        # This blows up
        new_ts_index = empty_ts_index + DateOffset(weekday=MO(3))
        self.assertEqual(0, len(new_ts_index))

    def test_no_holidays_generated_if_not_in_range(self):
        calendar = CustomHolidayCalendar()

        # This blows up
        pdt.assert_index_equal(
            pd.DatetimeIndex([]),
            calendar.holidays(start='1995-01-01', end='1995-12-31'))

    def test_holidays_generated_if_in_range(self):
        calendar = CustomHolidayCalendar()

        pdt.assert_index_equal(
            pd.DatetimeIndex(['2015-01-19']),
            calendar.holidays(start='2015-01-01', end='2015-12-31'))

Output

The following tests fail:

  • test_empty_datetime_index_added_to_non_vectorized_date_offset
  • test_no_holidays_generated_if_not_in_range

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: 5870731f32ae569e01e3c0a8972cdd2c6e0301f8
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8

pandas: 0.18.0+44.g5870731.dirty
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3.1
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: None
patsy: 0.4.1
dateutil: 2.5.1
pytz: 2016.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: None
