Skip to content

What to do with future Ambiguity Error about overlapping names when sorting? #21081

Closed
@TomAugspurger

Description

@TomAugspurger

xref #17361

In [24]: df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=pd.Index([1, 2], name='a'))

In [25]: df.sort_values(['a', 'b'])
/Users/taugspurger/.virtualenvs/pandas-dev/bin/ipython:1: FutureWarning: 'a' is both an index level and a column label.
Defaulting to column, but this will raise an ambiguity error in a future version
  #!/Users/taugspurger/Envs/pandas-dev/bin/python3
Out[25]:
   a  b
a
1  1  3
2  2  4

What should the user do in this situation? Should we provide a keyword to disambiguate? A literal like pd.ColumnLabel('a') or pd.IndexName('a')? Or do we require that they rename an index or column? Right now, they're essentially stuck with the last one. If we want to discourage that, then I suppose that's OK. But it's somewhat common to end up with overlapping names, from e.g. a groupby.

cc @jmmease

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions