
BUG: reset_index of MultiIndex with CategoricalIndex levels with missing values fails #24206

Closed
@jorisvandenbossche


With a MultiIndex whose levels are categorical and that has no missing values, reset_index works:

In [23]: idx = pd.MultiIndex([pd.CategoricalIndex(['A', 'B']), pd.CategoricalIndex(['a', 'b'])], [[0, 0, 1, 1], [0, 1, 0, 1]])


In [25]: df = pd.DataFrame({'col': range(len(idx))}, index=idx)

In [26]: df
Out[26]:
     col
A a    0
  b    1
B a    2
  b    3

In [28]: df.reset_index()
Out[28]:
  level_0 level_1  col
0       A       a    0
1       A       b    1
2       B       a    2
3       B       b    3
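For readers on newer pandas, here is a minimal sketch of the same construction with keyword arguments. It assumes a pandas version where the second argument is called `codes` (it was renamed from `labels` in 0.24); the variable names are only illustrative. Each code is a position into the corresponding level, which is how the four rows above are produced.

```python
import pandas as pd

# Two categorical levels, as in the example above.
levels = [pd.CategoricalIndex(["A", "B"]), pd.CategoricalIndex(["a", "b"])]

# One list of integer codes per level; code i selects levels[k][i].
codes = [[0, 0, 1, 1], [0, 1, 0, 1]]

# Assumption: pandas >= 0.24, where the keyword is `codes` rather than `labels`.
idx = pd.MultiIndex(levels=levels, codes=codes)

# Materialise the per-row values of each level.
outer = list(idx.get_level_values(0))
inner = list(idx.get_level_values(1))
print(outer, inner)
```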

Now with a missing value (the only difference is the final -1 in the labels):

In [29]: idx = pd.MultiIndex([pd.CategoricalIndex(['A', 'B']), pd.CategoricalIndex(['a', 'b'])], [[0, 0, 1, 1], [0, 1, 0, -1]])

In [30]: df = pd.DataFrame({'col': range(len(idx))}, index=idx)

In [31]: df
Out[31]:
       col
A a      0
  b      1
B a      2
  NaN    3

In [32]: df.reset_index()
/home/joris/miniconda3/lib/python3.5/site-packages/pandas/core/frame.py:4091: FutureWarning: Interpreting negative values in 'indexer' as missing values.
In the future, this will change to meaning positional indicies
from the right.

Use 'allow_fill=True' to retain the previous behavior and silence this
warning.

Use 'allow_fill=False' to accept the new behavior.
  values = values.take(labels)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/miniconda3/lib/python3.5/site-packages/pandas/core/dtypes/cast.py in maybe_upcast_putmask(result, mask, other)
    249         try:
--> 250             np.place(result, mask, other)
    251         except Exception:

~/miniconda3/lib/python3.5/site-packages/numpy/lib/function_base.py in place(arr, mask, vals)
   2371         raise TypeError("argument 1 must be numpy.ndarray, "
-> 2372                         "not {name}".format(name=type(arr).__name__))
   2373

TypeError: argument 1 must be numpy.ndarray, not Categorical

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-32-6983677cc901> in <module>()
----> 1 df.reset_index()

~/miniconda3/lib/python3.5/site-packages/pandas/core/frame.py in reset_index(self, level, drop, inplace, col_level, col_fill)
   4136                     name = tuple(name_lst)
   4137                 # to ndarray and maybe infer different dtype
-> 4138                 level_values = _maybe_casted_values(lev, lab)
   4139                 new_obj.insert(0, name, level_values)
   4140

~/miniconda3/lib/python3.5/site-packages/pandas/core/frame.py in _maybe_casted_values(index, labels)
   4092                     if mask.any():
   4093                         values, changed = maybe_upcast_putmask(
-> 4094                             values, mask, np.nan)
   4095             return values
   4096

~/miniconda3/lib/python3.5/site-packages/pandas/core/dtypes/cast.py in maybe_upcast_putmask(result, mask, other)
    250             np.place(result, mask, other)
    251         except Exception:
--> 252             return changeit()
    253
    254     return result, False

~/miniconda3/lib/python3.5/site-packages/pandas/core/dtypes/cast.py in changeit()
    222             # isn't compatible
    223             r, _ = maybe_upcast(result, fill_value=other, copy=True)
--> 224             np.place(r, mask, other)
    225
    226             return r, True

~/miniconda3/lib/python3.5/site-packages/numpy/lib/function_base.py in place(arr, mask, vals)
   2370     if not isinstance(arr, np.ndarray):
   2371         raise TypeError("argument 1 must be numpy.ndarray, "
-> 2372                         "not {name}".format(name=type(arr).__name__))
   2373
   2374     return _insert(arr, mask, vals)

TypeError: argument 1 must be numpy.ndarray, not Categorical
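The traceback shows np.place being handed a Categorical, which it rejects because it only accepts a numpy.ndarray. As a rough illustration of the alternative the FutureWarning hints at, the sketch below calls `Categorical.take` with `allow_fill=True`, so that -1 is treated as missing and no putmask step is needed. This is only a sketch of the idea, not necessarily how reset_index should be (or was) fixed; the variable names are illustrative.

```python
import pandas as pd

# The values of the second level and the codes from the failing example.
level = pd.Categorical(["a", "b"])
codes = [0, 1, 0, -1]

# With allow_fill=True, -1 is interpreted as "missing" and filled with NaN,
# and the result stays a Categorical with the same categories.
taken = level.take(codes, allow_fill=True)

# Boolean mask of missing positions in the result.
missing = list(pd.isna(taken))
print(taken)
print(missing)
```

Only the last position is missing here, matching the NaN shown in Out[31].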
