BUG: Index with duplicate labels raises ValueError in DataFrame.query #51815

Open
@ZhaJiMan

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd

data = [1, 2, 3]
index = [1, 1, 2]
df = pd.Series(data, index=index, name='col').to_frame()
df.query('index == 1 and col == 2')  # raises ValueError

Issue Description

When a DataFrame's index has duplicate labels, a DataFrame.query expression that combines a condition on index with another condition raises an error:

ValueError: cannot reindex on an axis with duplicate labels

Expected Behavior

The dataframe looks like this:

    col
1     1
1     2
2     3

The query should return

    col
1     2

Note that query with engine='python' and df.loc[(df.index == 1) & (df['col'] == 2)] both give the expected result.
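For reference, the two working alternatives mentioned above can be sketched as follows. This is only a minimal illustration built from the reproducible example. The variable names `by_python`, `mask`, and `by_mask` are made up here, and the behavior is as observed on the versions listed under Installed Versions; it may differ on other versions.

```python
import pandas as pd

# Same frame as in the reproducible example: the index label 1 appears twice.
df = pd.Series([1, 2, 3], index=[1, 1, 2], name="col").to_frame()

# Alternative 1: evaluate the query expression with the Python engine
# instead of the default one.
by_python = df.query("index == 1 and col == 2", engine="python")

# Alternative 2: build the boolean mask by hand and select with .loc.
mask = (df.index == 1) & (df["col"] == 2)
by_mask = df.loc[mask]

print(by_python)
print(by_mask)
```

Both selections should contain the single row with index label 1 and col equal to 2.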

Installed Versions

pandas : 1.5.1
numpy : 1.23.4
numexpr : 2.8.4
