Description
make_axis_dummies behaves inconsistently when an axis contains a CategoricalIndex with unused categories.
Code Sample, a copy-pastable example if possible
import pandas as pd
from pandas.core.reshape import make_axis_dummies

# category `z` is declared but never used
cidx = pd.CategoricalIndex(list("xy"), categories=list("xyz"))
df = pd.DataFrame([[10, 11]], columns=cidx)
# long-format frame indexed by (major, minor)
ldf = pd.Panel({'A': df, 'B': df}).to_frame()
make_axis_dummies(ldf)
Out[9]:
minor          x    y
major minor
0     x      1.0  0.0
      y      0.0  1.0
make_axis_dummies(ldf, transform=lambda x: x)
Out[10]:
               x    y    z
major minor
0     x      1.0  0.0  0.0
      y      0.0  1.0  0.0
Expected Output
I believe make_axis_dummies(ldf) and make_axis_dummies(ldf, transform=lambda x: x) should be equal.
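To make the mismatch concrete, here is a minimal sketch that rebuilds the two printed results by hand and compares them. The names out9 and out10 are hypothetical stand-ins typed in from the outputs above, not values returned by make_axis_dummies, and the reindex step is only an illustration of what "equal" would require, not a proposed fix.

```python
import pandas as pd

# Hand-built copies of the two printed results (assumption: values and
# labels are taken from Out[9] and Out[10]; the column-axis name is ignored).
index = pd.MultiIndex.from_tuples(
    [(0, "x"), (0, "y")], names=["major", "minor"]
)
out9 = pd.DataFrame(
    [[1.0, 0.0], [0.0, 1.0]], index=index, columns=["x", "y"]
)
out10 = pd.DataFrame(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], index=index, columns=["x", "y", "z"]
)

# The frames differ only by the column for the unused category `z`.
same_before = out9.equals(out10)

# Adding the missing column, filled with zeros, makes them match.
aligned = out9.reindex(columns=list("xyz"), fill_value=0.0)
same_after = aligned.equals(out10)

print(same_before, same_after)
```

The only difference between the two results is the all-zero column for the unused category.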
output of pd.show_versions()
pd.show_versions()
INSTALLED VERSIONS
------------------
commit: 5d791cc7d955c0b074ad602eb03fa32bd3e17503
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.1.20-1
machine: x86_64
processor: Intel(R)_Core(TM)_i5-2520M_CPU_@_2.50GHz
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.18.1+368.g5d791cc
nose: 1.3.7
pip: 8.1.2
setuptools: 21.0.0
Cython: 0.24
numpy: 1.11.0
...
In fact, this may be an issue with Panel.to_frame() rather than make_axis_dummies:
ldf.index.levels[1]
Out[13]: CategoricalIndex(['x', 'y'], categories=['x', 'y', 'z'], ordered=False, name='minor', dtype='category')
I'd expect this level to contain all categories, CategoricalIndex(['x', 'y', 'z'], categories=['x', 'y', 'z'], ...), even though 'z' is not used. If it did, both outputs of make_axis_dummies would look like Out[10].
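The distinction between observed values and declared categories can be checked without Panel. The sketch below is an illustration only: it uses pd.get_dummies as a stand-in to show how the result depends on which of the two the dummy columns are derived from, and it makes no claim about how make_axis_dummies or Panel.to_frame() are implemented. The variable names are hypothetical.

```python
import pandas as pd

cidx = pd.CategoricalIndex(list("xy"), categories=list("xyz"))

# Values that actually occur vs. categories that are declared.
observed = list(cidx.unique())
declared = list(cidx.categories)
unused = [c for c in declared if c not in observed]

# Dummies built from categorical data get one column per declared category,
# including the unused `z`.
from_categorical = pd.get_dummies(pd.Series(cidx))

# Dummies built from the plain values only cover what was observed.
from_values = pd.get_dummies(pd.Series(list(cidx)))

print(unused)
print(list(from_categorical.columns))
print(list(from_values.columns))
```

Which of these two behaviours a caller gets depends on whether the categorical information survives up to the point where the dummy columns are built, which is the concern with the level shown in Out[13].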
(Somewhat related to #13854.)