Code Sample, a copy-pastable example if possible
Basic example of the issue, specific to TimedeltaIndex, xref #20408 (comment)
In [2]: s = pd.Series(list('abcde'), pd.timedelta_range(0, 4, freq='ns'))
In [3]: s.loc[True]
Out[3]: 'b'
In [4]: s.loc[False:True]
Out[4]:
00:00:00           a
00:00:00.000000    b
Freq: N, dtype: object
Indexing with both boolean labels and slices was successful, which doesn't seem right.
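One plausible contributing factor, offered as an assumption rather than a confirmed diagnosis of the pandas internals: in Python, bool is a subclass of int, so True and False compare and hash equal to 1 and 0. The plain-Python sketch below shows that coercion, using a hypothetical dictionary named labels as a stand-in for an integer-like index.

```python
# bool is a subclass of int, so True/False behave like 1/0 in lookups.
labels = {0: 'a', 1: 'b', 2: 'c'}  # hypothetical stand-in for an integer-like index

assert isinstance(True, int)
assert True == 1 and hash(True) == hash(1)

# A boolean key resolves to the entry stored under the equal integer key.
label_hit = labels[True]

# Boolean slice bounds are accepted as 0 and 1. Note that positional slicing
# excludes the stop value, whereas .loc label slicing includes it.
slice_hit = list('abcde')[False:True]
```

Any code path that forwards the key to an integer-based lookup without first checking for bool would accept it silently in this way.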
I investigated this same behavior across various index types for both Series and DataFrame, and produced the summary below.
Summary
- 'raises' column indicates if the indexing operation raised an exception
- 'exception' column indicates the type of exception raised
                                  raises  exception
CategoricalIndex DataFrame label    True   KeyError
                           slice    True   KeyError
                 Series    label   False        NaN
                           slice    True   KeyError
DatetimeIndex    DataFrame label   False        NaN
                           slice   False        NaN
                 Series    label   False        NaN
                           slice   False        NaN
Float64Index     DataFrame label   False        NaN
                           slice   False        NaN
                 Series    label   False        NaN
                           slice   False        NaN
Index            DataFrame label   False        NaN
                           slice   False        NaN
                 Series    label   False        NaN
                           slice   False        NaN
Int64Index       DataFrame label    True   KeyError
                           slice   False        NaN
                 Series    label    True   KeyError
                           slice   False        NaN
IntervalIndex    DataFrame label    True  TypeError
                           slice    True  TypeError
                 Series    label    True  TypeError
                           slice    True  TypeError
MultiIndex       DataFrame label    True   KeyError
                           slice    True   KeyError
                 Series    label    True   KeyError
                           slice    True   KeyError
PeriodIndex      DataFrame label    True   KeyError
                           slice   False        NaN
                 Series    label    True   KeyError
                           slice   False        NaN
RangeIndex       DataFrame label    True   KeyError
                           slice   False        NaN
                 Series    label    True   KeyError
                           slice   False        NaN
TimedeltaIndex   DataFrame label   False        NaN
                           slice   False        NaN
                 Series    label   False        NaN
                           slice   False        NaN
UInt64Index      DataFrame label    True   KeyError
                           slice   False        NaN
                 Series    label    True   KeyError
                           slice   False        NaN
Code to produce summary
import pandas as pd

# one representative of each index type
indexes = [
    pd.RangeIndex(4),
    pd.Int64Index(range(4)),
    pd.UInt64Index(range(4)),
    pd.Float64Index(range(4)),
    pd.CategoricalIndex(range(4)),
    pd.date_range(0, periods=4, freq='ns'),
    pd.timedelta_range(0, periods=4, freq='ns'),
    pd.interval_range(0, periods=4),
    pd.Index([0, 1, 2, 3], dtype=object),
    pd.MultiIndex.from_product([[0, 1], [0, 1]]),
    pd.period_range('2018Q1', freq='Q', periods=4),  # need better example here
]

# maps (index type, object type, key kind) -> outcome of the .loc call
result = {}
for index in indexes:
    index_name = type(index).__name__
    s = pd.Series(list('abcd'), index=index)
    for obj in (s, s.to_frame()):
        obj_name = type(obj).__name__

        # check single label
        key = (index_name, obj_name, 'label')
        try:
            obj.loc[True]
            result[key] = {'raises': False}
        except Exception as e:
            result[key] = {'raises': True, 'exception': type(e).__name__}

        # check slice
        key = (index_name, obj_name, 'slice')
        try:
            obj.loc[False:True]
            result[key] = {'raises': False}
        except Exception as e:
            result[key] = {'raises': True, 'exception': type(e).__name__}

result = pd.DataFrame.from_dict(result, orient='index')
Expected Output
I'd generally expect all of these operations to raise a KeyError, with a couple of potential exceptions:
- I'd be open to an argument for numeric indexes casting booleans to their integer equivalent. This should at least be consistent between labels and slices, which it currently is not.
- Maybe we should allow conversion for the object dtype Index?
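
The expectation above could be sketched as a key check that runs before any label lookup. This is a minimal illustration only: reject_bool_key is a hypothetical helper, not a pandas function, and it ignores both of the exceptions listed above (numeric casting and object dtype).

```python
def reject_bool_key(key):
    """Raise KeyError for a boolean label or a slice with a boolean bound.

    Hypothetical helper for illustration; not part of pandas.
    """
    bounds = (key.start, key.stop) if isinstance(key, slice) else (key,)
    for bound in bounds:
        # Test for bool explicitly: isinstance(True, int) is also True,
        # so an integer check alone would let booleans through.
        if isinstance(bound, bool):
            raise KeyError(key)
    return key
```

With this check, reject_bool_key(True) and reject_bool_key(slice(False, True)) both raise KeyError, while integer keys such as 1 or slice(0, 1) pass through unchanged. NumPy boolean scalars are not instances of bool, so they would need a separate check.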