Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
Pandas Series have a property `is_unique` that returns `True` iff all values are distinct.
In practice, one often needs to know whether a Series is constant. When the Series contains no NaNs, a working check is straightforward: `(series.values[0] == series.values).all()`. However, in the wild we observe that users often simply write `series.nunique() == 1`. This approach is more costly for larger Series, as it first computes the number of unique values before comparing.
My assumption is that introducing a dedicated `is_constant` property will help users choose the more performant option. Another assumption is that it's acceptable for all-NaN Series to be somewhat slower.
    import pandas as pd

    # Case 1: all values distinct
    def setup(n):
        return pd.Series(list(range(n)))

    # Case 2: constant except for the last value
    def setup(n):
        return pd.Series([1] * (n - 1) + [2])
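A rough way to compare the two checks on the inputs above is sketched below. The helper names `constant_by_nunique` and `constant_by_comparison`, the size `n`, and the repetition count are assumptions made for illustration. Timings will vary by machine and pandas version, so no numbers are claimed here.

```python
import timeit

import pandas as pd


def constant_by_nunique(series):
    # Pattern often seen in user code: count the unique values first.
    return series.nunique() == 1


def constant_by_comparison(series):
    # Pattern from the problem description; assumes a non-empty Series without NaNs.
    values = series.values
    return bool((values[0] == values).all())


n = 100_000  # illustrative size, not taken from the issue
all_distinct = pd.Series(list(range(n)))
almost_constant = pd.Series([1] * (n - 1) + [2])

for name, series in [("all distinct", all_distinct),
                     ("almost constant", almost_constant)]:
    t_nunique = timeit.timeit(lambda: constant_by_nunique(series), number=20)
    t_compare = timeit.timeit(lambda: constant_by_comparison(series), number=20)
    print(f"{name}: nunique={t_nunique:.4f}s, comparison={t_compare:.4f}s")
```

Both helpers return `False` for these two inputs, so only the running time differs.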
Feature Description
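    @property
    def is_constant(self):
        # Assumption: v is the Series' underlying array of values
        v = self.values
        # An empty Series is not considered constant
        if v.shape[0] == 0:
            return False
        # Constant iff every value equals the first one
        return (v[0] == v).all()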
(based on `is_unique` and `nunique`, in the absence of NaNs)
To extend to NA values:
- If `dropna=True`: add `v = remove_na_arraylike(v)`
- If `dropna=False`: add `or not pd.notna(v).any()`
(note: differs from `np.unique` for Series mixing `pd.NA` and `np.nan`, because the two are not disambiguated)
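Put together, the NA handling described above could look roughly like the standalone sketch below. The free function `is_constant` is hypothetical and not part of pandas. It uses the public `Series.dropna()` and `pd.isna()` in place of the internal `remove_na_arraylike`, and it checks for missing values before the equality comparison so that `pd.NA` never reaches it.

```python
import numpy as np
import pandas as pd


def is_constant(series: pd.Series, dropna: bool = True) -> bool:
    """Hypothetical helper: True if all (remaining) values are equal."""
    values = series.dropna() if dropna else series
    # Empty input (including all-NA input with dropna=True) is not constant.
    if len(values) == 0:
        return False
    arr = values.to_numpy()
    if not dropna:
        na_mask = pd.isna(arr)
        # With NAs kept, the Series is constant only if every value is NA;
        # pd.NA and np.nan are not distinguished here.
        if na_mask.any():
            return bool(na_mask.all())
    # No missing values remain: compare everything against the first value.
    return bool((arr[0] == arr).all())


print(is_constant(pd.Series([1, np.nan, 1])))                # NaN dropped
print(is_constant(pd.Series([1, np.nan, 1]), dropna=False))  # NaN kept
```

In this sketch an all-NA Series counts as constant only when `dropna=False`, which mirrors the `or not pd.notna(v).any()` clause.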
Alternative Solutions
Recommend the approach above for `series.nunique() == 1` patterns in the documentation (here)
Additional Context
No response