Closed
Description
Is your feature request related to a problem?
With pd.Series.value_counts() it is possible to specify dropna=False, but that argument does not exist in pd.DataFrame.value_counts(). As a consequence, all rows that contain at least one NA element are dropped when using df.value_counts().
Describe the solution you'd like
It should be possible to call df.value_counts() with dropna=False and get a count for each unique row, including rows that have NAs in them.
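For illustration, the requested behavior can be approximated today with a groupby. This is only a minimal sketch, not a proposed implementation: row_value_counts is a hypothetical helper name, it assumes a pandas version where DataFrame.groupby accepts dropna (1.1 or later), and it uses float columns with NaN rather than pd.NA to keep the example simple.

```python
import pandas as pd


def row_value_counts(df: pd.DataFrame, dropna: bool = True) -> pd.Series:
    """Hypothetical helper: count unique rows of ``df``.

    With dropna=False, rows containing missing values are kept as their
    own groups instead of being discarded.
    """
    # Group on every column; groupby's dropna flag controls whether
    # keys containing NA form groups.
    counts = df.groupby(list(df.columns), dropna=dropna).size()
    # Mimic value_counts ordering: most frequent rows first.
    return counts.sort_values(ascending=False)


# Same values as the example in this issue, with None standing in for NA.
df = pd.DataFrame(
    {"s1": [1, 2, 3, None, 3], "s2": [None, 1, None, 4, 2]}
)

with_na = row_value_counts(df, dropna=False)    # keeps rows that contain NA
without_na = row_value_counts(df, dropna=True)  # drops rows that contain NA
```

Here with_na should have one entry per distinct row, including the three rows that contain a missing value, while without_na only covers the two fully populated rows.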
API breaking implications
As with pd.Series.value_counts(), the default should be dropna=True. This keeps the two implementations consistent and leaves current behavior unchanged.
Describe alternatives you've considered
Additional context
>>> import pandas as pd
>>> s1 = pd.Series([1, 2, 3, pd.NA, 3])
>>> s2 = pd.Series([pd.NA, 1, pd.NA, 4, 2])
>>> s1.value_counts(dropna=False)
3.0    2
NaN    1
1.0    1
2.0    1
dtype: int64
>>> df = pd.DataFrame(zip(s1, s2), columns=['s1', 's2'])
>>> df
     s1    s2
0     1  <NA>
1     2     1
2     3  <NA>
3  <NA>     4
4     3     2
>>> df.value_counts()
s1  s2
2   1     1
3   2     1
dtype: int64