
ENH: series.is_constant #54033

Closed
@sbrugman

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Pandas Series have a property is_unique that returns true iff all values are distinct.

In practice, one often needs to know whether a Series is constant. When the Series contains no NaNs, a working check is straightforward: (series.values[0] == series.values).all(). In the wild, however, users often simply write series.nunique() == 1. That approach is more costly for larger Series, because it first computes the number of unique values before comparing.
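A minimal sketch of the two checks described above, assuming a non-empty Series without NaNs. The function names are hypothetical and used only for illustration; to_numpy() stands in for the .values access used in the text.

```python
import pandas as pd


def is_constant_nunique(series: pd.Series) -> bool:
    # Counts every distinct value first, then compares the count to 1.
    return series.nunique() == 1


def is_constant_compare(series: pd.Series) -> bool:
    # Compares each element against the first one.
    # Assumes the Series is non-empty and contains no NaNs.
    values = series.to_numpy()
    return bool((values[0] == values).all())


constant = pd.Series([7, 7, 7, 7])
varying = pd.Series([7, 7, 7, 8])

print(is_constant_nunique(constant), is_constant_compare(constant))  # True True
print(is_constant_nunique(varying), is_constant_compare(varying))    # False False
```

Both functions agree on these inputs; the difference the issue is concerned with is cost on large Series, not the result.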

My assumption is that introducing a dedicated is_constant property will help users choose the more performant option. Another assumption is that it's acceptable for all-NaN Series to be somewhat slower.

Figure: perf_short_circuit (benchmark plot; setup below)

def setup(n):
    return pd.Series(list(range(n)))

Figure: perf_worst (benchmark plot; setup below)

def setup(n):
    return pd.Series([1] * (n - 1) + [2])
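The two setups above could be timed with a small harness like the following. This is only a sketch: the tool used for the original plots is not stated in the issue, and the helper names (setup_short_circuit, setup_worst, time_check), the Series size, and the repeat count are assumptions chosen for illustration.

```python
import timeit

import pandas as pd


def setup_short_circuit(n):
    # All values distinct: the Series is already non-constant at the second element.
    return pd.Series(list(range(n)))


def setup_worst(n):
    # Constant except for the last element: every element has to be inspected.
    return pd.Series([1] * (n - 1) + [2])


def time_check(check, series, repeat=5):
    # Best wall-clock time in seconds over `repeat` single calls.
    return min(timeit.repeat(lambda: check(series), number=1, repeat=repeat))


checks = {
    "nunique": lambda s: s.nunique() == 1,
    "compare": lambda s: bool((s.values[0] == s.values).all()),
}

for case, make in [("short_circuit", setup_short_circuit), ("worst", setup_worst)]:
    series = make(100_000)
    for label, check in checks.items():
        print(f"{case:>13} {label:>8}: {time_check(check, series):.6f} s")
```

Absolute timings depend on the machine and pandas version, so no expected numbers are given here.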

Feature Description

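@property
def is_constant(self):
    # Underlying array of the Series
    v = self.values
    # An empty Series is not considered constant
    if v.shape[0] == 0:
        return False
    # Constant if every element equals the first one
    return (v[0] == v).all()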

(based on is_unique and nunique, in the absence of NaNs)

To extend to NA values:
If dropna=True: add v = remove_na_arraylike(v)
If dropna=False: append "or not pd.notna(v).any()" to the return expression (note: unlike np.unique, this does not distinguish between pd.NA and np.nan in a Series that mixes the two)
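The NA handling above might look roughly like this as a standalone function. It is a sketch, not the proposed pandas implementation: it filters with a boolean mask from pd.isna instead of calling remove_na_arraylike, and for dropna=False it returns early on missing values rather than appending the "or" clause, which is intended to give the same results for the cases shown here. The dropna keyword on a free function is an assumption for illustration.

```python
import numpy as np
import pandas as pd


def is_constant(series: pd.Series, dropna: bool = True) -> bool:
    values = series.to_numpy()
    missing = pd.isna(values)  # True for np.nan, None and pd.NA alike

    if dropna:
        # Ignore missing values entirely.
        values = values[~missing]
    elif missing.any():
        # With dropna=False, only an all-missing Series counts as constant;
        # a mix of missing and present values does not.
        return bool(missing.all())

    # Empty (or emptied by dropna) Series are not considered constant.
    if values.shape[0] == 0:
        return False
    return bool((values[0] == values).all())


print(is_constant(pd.Series([1.0, np.nan, 1.0])))                # True
print(is_constant(pd.Series([1.0, np.nan, 1.0]), dropna=False))  # False
print(is_constant(pd.Series([np.nan, np.nan]), dropna=False))    # True
print(is_constant(pd.Series([np.nan, np.nan])))                  # False
```

As in the note above, pd.NA and np.nan are both treated simply as missing, so a Series mixing the two is reported as constant when dropna=False.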

Alternative Solutions

Recommend the approach above for series.nunique() == 1 patterns in the documentation (here)

Additional Context

No response
