Description
Let's say we have two Series, s1 and s2, which could be the output of the pd.value_counts() function, and we want to combine them into one DataFrame:
import pandas as pd

s1 = pd.Series([39, 6, 4], index=pd.CategoricalIndex(['female', 'male', 'unknown']))
s2 = pd.Series([2, 152, 2, 242, 150], index=pd.CategoricalIndex(['f', 'female', 'm', 'male', 'unknown']))
pd.DataFrame([s1, s2])
The result is:
   female   male  unknown      f      m
0     NaN   39.0      NaN    6.0    4.0
1     2.0  152.0      2.0  242.0  150.0
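Here the values follow the sorted label order (f, female, m, male, unknown) but the column header does not, so values land under the wrong labels: for example, the 39 from s1 appears under male instead of female.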
The current workaround is to rebuild each Series with a plain (non-categorical) index:
pd.DataFrame([
    pd.Series(s1.values, index=s1.index.astype(object)),
    pd.Series(s2.values, index=s2.index.astype(object)),
])
This gives the correct result:
     f  female    m   male  unknown
0  NaN    39.0  NaN    6.0      4.0
1  2.0   152.0  2.0  242.0    150.0
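For more than two Series, the same workaround can be wrapped in a small helper. The following is only a sketch: `to_plain_index` is a hypothetical name, not a pandas function, and it assumes that converting the CategoricalIndex to an object-dtype Index is acceptable for the data at hand. The checks look values up by label, so they do not depend on the column order the constructor chooses.

```python
import pandas as pd


def to_plain_index(s):
    """Hypothetical helper: copy `s` onto a plain object-dtype index."""
    return pd.Series(s.values, index=s.index.astype(object))


s1 = pd.Series([39, 6, 4],
               index=pd.CategoricalIndex(['female', 'male', 'unknown']))
s2 = pd.Series([2, 152, 2, 242, 150],
               index=pd.CategoricalIndex(['f', 'female', 'm', 'male', 'unknown']))

# Strip the categorical index from every Series before combining them.
combined = pd.DataFrame([to_plain_index(s) for s in (s1, s2)])

# Verify alignment by label rather than by position.
assert combined.loc[0, 'female'] == 39
assert combined.loc[1, 'male'] == 242
assert pd.isna(combined.loc[0, 'f'])
```

Label-based checks like these can also be run against the unmodified pd.DataFrame([s1, s2]) call to see whether a given pandas version is affected.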
Output of pd.show_versions()
pandas: 0.19.0
nose: None
pip: 8.1.2
setuptools: 27.2.0
Cython: None
numpy: 1.11.2
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None