pd.concat reordering categorical levels lexically #7864

Closed
@has2k1

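After pd.concat, the categorical levels of the result come back in lexical order instead of the order set with reorder_levels. Look at dfx: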

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"id":[1,2,3,4,5,6], "raw_grade":['a', 'b', 'b', 'a', 'a', 'e']})
   ...: df["grade"] = pd.Categorical(df["raw_grade"])
   ...: df['grade'].cat.reorder_levels(['e', 'a', 'b'])
   ...: 

In [3]: df1 = df[0:3]
   ...: df2 = df[3:]
   ...: 

In [4]: df['grade'].cat.levels
Out[4]: Index([u'e', u'a', u'b'], dtype='object')

In [5]: df1['grade'].cat.levels
Out[5]: Index([u'e', u'a', u'b'], dtype='object')

In [6]: df2['grade'].cat.levels
Out[6]: Index([u'e', u'a', u'b'], dtype='object')

In [7]: dfx = pd.concat([df1, df2])

In [8]: dfx['grade'].cat.levels
Out[8]: Index([u'a', u'b', u'e'], dtype='object')

version: pandas: 0.14.1-78-g24b309f

This is still the case with either PR #7768 or PR #7850 applied.
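
For anyone reproducing this on a later pandas release, here is a minimal sketch of the same session. It assumes the later accessor names, where cat.levels became cat.categories and reorder_levels became reorder_categories, and it assumes reorder_categories returns a new Series rather than modifying in place, so the result is assigned back. The variable categories_after_concat is just an illustrative name, not something from the original report.

```python
import pandas as pd

# Same data as in the report above.
df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6],
                   "raw_grade": ['a', 'b', 'b', 'a', 'a', 'e']})
df["grade"] = pd.Categorical(df["raw_grade"])

# Assumption: on later releases this returns a new Series,
# so assign it back instead of relying on in-place mutation.
df["grade"] = df["grade"].cat.reorder_categories(['e', 'a', 'b'])

df1 = df[0:3]
df2 = df[3:]

# Both slices share the same categorical dtype (same categories, same order).
dfx = pd.concat([df1, df2])

# Collect the category order after concatenation for inspection.
categories_after_concat = list(dfx["grade"].cat.categories)
print(categories_after_concat)
```

If the printed order is e, a, b, the concat kept the custom ordering; a, b, e would indicate the lexical reordering described in this issue.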
