BUG: in _nsorted for frame with duplicated values index #13412

Closed
@Tux1

Description

The function below is implemented incorrectly. If the frame's index contains duplicated values, the result has more than n rows and is not properly sorted, so nsmallest and nlargest on a DataFrame do not return a correct frame in this case.

def _nsorted(self, columns, n, method, keep):
    if not com.is_list_like(columns):
        columns = [columns]
    columns = list(columns)
    ser = getattr(self[columns[0]], method)(n, keep=keep)
    ascending = dict(nlargest=False, nsmallest=True)[method]
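    # Problem: .loc selects by label, so every row that shares a label with
    # ser.index is returned, which can be more than n rows.
    return self.loc[ser.index].sort_values(columns, ascending=ascending,
                                           kind='mergesort')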
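
To make the report easier to reproduce, here is a minimal sketch of the row-count problem, assuming a small frame with a duplicated integer index. The frame, the column name `a`, and the positional selection at the end are illustrative assumptions, not code from pandas, and the positional variant is only one possible workaround (single column, no `keep` or NaN handling), not necessarily the fix that was adopted.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: labels 0 and 1 each appear twice in the index.
df = pd.DataFrame({"a": [3, 1, 2, 5]}, index=[0, 0, 1, 1])
n = 2

# Same first step as _nsorted: pick the n smallest values of the column.
ser = df["a"].nsmallest(n)

# Label-based lookup, as in _nsorted: each label matches two rows,
# so the result has more than n rows.
by_label = df.loc[ser.index]
print(len(by_label))

# Assumed workaround: select rows by position instead of by label.
positions = np.argsort(df["a"].to_numpy(), kind="stable")[:n]
by_position = df.iloc[positions]
print(by_position)
```

Selecting by position sidesteps the ambiguity because each position identifies exactly one row, whereas a duplicated label identifies several.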
