Description
Originally found in a Stack Overflow post.
import pandas as pd
df = pd.DataFrame([[0, 0], [1, 100], [0, 100], [2, 0], [3, 100], [4, 100], [4, 100], [4, 100], [1, 100], [3, 100]], columns=['key', 'score'])
df.groupby('key')['score'].value_counts(bins=[0, 20, 40, 60, 80, 100])
Problem description
The counts are output with the wrong index labels. Note: the data only falls in the 0-20 and 80-100 bins, yet counts for the scores of 100 are reported under (20.0, 40.0].
key  score
0    (-0.001, 20.0]    1
     (20.0, 40.0]      1
     (40.0, 60.0]      0
     (60.0, 80.0]      0
     (80.0, 100.0]     0
1    (20.0, 40.0]      2
     (-0.001, 20.0]    0
     (40.0, 60.0]      0
     (60.0, 80.0]      0
     (80.0, 100.0]     0
2    (-0.001, 20.0]    1
     (20.0, 40.0]      0
     (40.0, 60.0]      0
     (60.0, 80.0]      0
     (80.0, 100.0]     0
3    (20.0, 40.0]      2
     (-0.001, 20.0]    0
     (40.0, 60.0]      0
     (60.0, 80.0]      0
     (80.0, 100.0]     0
4    (20.0, 40.0]      3
     (-0.001, 20.0]    0
     (40.0, 60.0]      0
     (60.0, 80.0]      0
     (80.0, 100.0]     0
Name: score, dtype: int64
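As a cross-check on which bins actually hold data, the scores can be binned directly with `pd.cut`, independently of `groupby(...).value_counts`. This is a minimal sketch, not part of the original report: the names `bin_idx` and `nonzero` are illustrative, and `include_lowest=True` is an assumption made so that a score of 0 lands in the first bin, consistent with the `(-0.001, 20.0]` label shown above.

```python
import pandas as pd

df = pd.DataFrame(
    [[0, 0], [1, 100], [0, 100], [2, 0], [3, 100],
     [4, 100], [4, 100], [4, 100], [1, 100], [3, 100]],
    columns=["key", "score"],
)
bins = [0, 20, 40, 60, 80, 100]

# labels=False returns the integer position of each score's bin
# (0 is the lowest bin, 4 is the highest). include_lowest=True is
# an assumption here so that score == 0 falls into bin 0.
bin_idx = pd.cut(df["score"], bins=bins, include_lowest=True, labels=False)

# Count rows per (key, bin position); only occupied bins appear.
nonzero = df.assign(bin=bin_idx).groupby(["key", "bin"]).size()
print(nonzero)
```

Only bin positions 0 and 4 are occupied, that is, the lowest and highest bins, so any non-zero count listed under `(20.0, 40.0]` points to a labelling problem rather than a counting one.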
Expected Output
df.groupby('key')['score'].apply(pd.Series.value_counts, bins=[0,20,40,60,80,100])
key
0    (80.0, 100.0]     1
     (-0.001, 20.0]    1
     (60.0, 80.0]      0
     (40.0, 60.0]      0
     (20.0, 40.0]      0
1    (80.0, 100.0]     2
     (60.0, 80.0]      0
     (40.0, 60.0]      0
     (20.0, 40.0]      0
     (-0.001, 20.0]    0
2    (-0.001, 20.0]    1
     (80.0, 100.0]     0
     (60.0, 80.0]      0
     (40.0, 60.0]      0
     (20.0, 40.0]      0
3    (80.0, 100.0]     2
     (60.0, 80.0]      0
     (40.0, 60.0]      0
     (20.0, 40.0]      0
     (-0.001, 20.0]    0
4    (80.0, 100.0]     3
     (60.0, 80.0]      0
     (40.0, 60.0]      0
     (20.0, 40.0]      0
     (-0.001, 20.0]    0
Name: score, dtype: int64
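Because the two results order the bins differently within each group, comparing them by eye is error-prone. One possible way to make the expected result easier to compare is to flatten it into a plain dict keyed by `(key, right bin edge)`. The helper name `counts_by_edge` is hypothetical, and the sketch assumes the inner index level holds `pd.Interval` objects, as the labels above suggest.

```python
import pandas as pd

df = pd.DataFrame(
    [[0, 0], [1, 100], [0, 100], [2, 0], [3, 100],
     [4, 100], [4, 100], [4, 100], [1, 100], [3, 100]],
    columns=["key", "score"],
)
bins = [0, 20, 40, 60, 80, 100]


def counts_by_edge(result):
    """Flatten a (key, interval)-indexed Series into {(key, right_edge): count}.

    Hypothetical helper; assumes each index entry is a (key, pd.Interval) pair.
    """
    return {(int(k), float(iv.right)): int(c) for (k, iv), c in result.items()}


# The per-group apply from the report, used here as the reference result.
expected = df.groupby("key")["score"].apply(pd.Series.value_counts, bins=bins)
flat = counts_by_edge(expected)
print(flat[(1, 100.0)])  # count for key 1 in the bin ending at 100
```

Passing the result of `df.groupby('key')['score'].value_counts(bins=bins)` through the same helper and comparing the two dicts should indicate whether a given pandas version is affected, although that use is untested here.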
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 1.0.1
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 41.0.1
Cython : 0.29.12
pytest : 5.0.1
hypothesis : None
sphinx : 2.1.2
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : 4.3.4
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.7.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.3.4
matplotlib : 3.1.0
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.0.1
pyxlsb : None
s3fs : None
scipy : 1.3.0
sqlalchemy : 1.3.5
tables : 3.5.2
tabulate : 0.8.6
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.8
numba : 0.45.0