Closed
Found a bug in a corner case while working on expanded test coverage in #23167:
>>> # the following three have the same result
>>> pd.Series(['a', 'b', 'aa'], dtype='category').str.extractall(r'(a)')
>>> pd.Series(['a', 'b', 'aa']).str.extractall(r'(a)')
>>> pd.Index(['a', 'b', 'aa']).str.extractall(r'(a)')
         0
  match
0 0      a
2 0      a
  1      a
>>> pd.Index(['a', 'b', 'aa'], dtype='category').str.extractall(r'(a)')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\ProgramData\Miniconda3\envs\pandas-dev\lib\site-packages\pandas\core\strings.py", line 2567, in extractall
    return str_extractall(self._orig, pat, flags=flags)
  File "C:\ProgramData\Miniconda3\envs\pandas-dev\lib\site-packages\pandas\core\strings.py", line 1012, in str_extractall
    is_mi = arr.index.nlevels > 1
AttributeError: 'CategoricalIndex' object has no attribute 'index'
So Series, categorical Series and Index work with extractall, but not CategoricalIndex.
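For reference, the rows that the failing call would presumably be expected to return can be sketched with the standard library alone. This is not how pandas implements extractall; `extractall_rows` is a hypothetical helper, and it numbers elements by position, which coincides with the default integer labels in the examples above.

```python
import re


def extractall_rows(values, pat, flags=0):
    """Hypothetical helper (not a pandas API): one row per regex match.

    Each row is ((position, match_number), captured_groups).
    """
    regex = re.compile(pat, flags)
    rows = []
    for pos, value in enumerate(values):
        # Elements without any match contribute no rows, like 'b' above.
        for match_no, m in enumerate(regex.finditer(value)):
            rows.append(((pos, match_no), m.groups()))
    return rows


for key, groups in extractall_rows(['a', 'b', 'aa'], r'(a)'):
    print(key, groups)
```

The three printed keys, (0, 0), (2, 0) and (2, 1), correspond to the index of the DataFrame shown for the working cases, so the same rows would be the natural result for a CategoricalIndex holding the same values.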