BUG: Runtime warning with groupby/tail when None appears in group column #46814

Closed
@ian-r-rose

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas
df = pandas.DataFrame(
    [
        ["a", 1],
        ["a", 2],
        [None, 0],
        ["b", 2],
        ["b", 3],
    ],
    columns=["x", "y"],
)
df.groupby("x", dropna=False).tail(1)   # Succeeds
df.groupby("x", dropna=True).tail(1)   # Produces warning

Issue Description

👋 In pandas main, calling groupby/tail on a DataFrame whose grouping column contains nulls emits a RuntimeWarning, which suggests the indexing logic mishandles this case:

/lib/python3.8/site-packages/pandas/core/groupby/indexing.py:217: RuntimeWarning: invalid value encountered in remainder
  mask &= offset_array % step == 0

If I set dropna=False, the snippet succeeds without producing a warning.

Based on the warning location and git blame, it may be related to #42947

Expected Behavior

The above groupby should succeed without producing a warning.
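One way to check this expectation is to record any warnings raised while the snippet runs. The following is a minimal sketch, not part of pandas: the helper name `collect_runtime_warnings` is hypothetical, and whether a warning is actually reported may depend on the installed pandas and NumPy versions.

```python
import warnings

import pandas as pd


def collect_runtime_warnings(func):
    """Call func() and return (result, messages of any RuntimeWarning raised).

    Hypothetical helper for illustration only; it is not a pandas API.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = func()
    messages = [
        str(w.message) for w in caught if issubclass(w.category, RuntimeWarning)
    ]
    return result, messages


# Same data as the reproducible example above.
df = pd.DataFrame(
    [["a", 1], ["a", 2], [None, 0], ["b", 2], ["b", 3]],
    columns=["x", "y"],
)

for dropna in (False, True):
    result, messages = collect_runtime_warnings(
        lambda: df.groupby("x", dropna=dropna).tail(1)
    )
    print(f"dropna={dropna}: rows={list(result.index)}, RuntimeWarnings={messages}")
```

With the expected behavior, the list of RuntimeWarning messages would be empty for both values of dropna.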

Installed Versions

Pandas main, as well as 1.4.

Metadata

Labels

  • Bug

  • Closing Candidate: May be closeable, needs more eyeballs

  • Groupby

  • Upstream issue: Issue related to pandas dependency

  • Warnings: Warnings that appear or should be added to pandas
