
BUG: cut() precision at the left end does not appear as specified (3 digits by default) #33912

Open
@nnworkspace

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

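...

bins = 10

# Bin fatality_rate into 10 equal-width intervals; include_lowest=True makes the first interval contain the minimum
df_bins = pd.cut(fatalities_asc.fatality_rate, bins, include_lowest=True)
fatalities_asc['fatality_bin'] = df_bins.values

# Count the rows that fall into each bin
df_counts = fatalities_asc.groupby('fatality_bin', as_index=False).count()
df_counts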

You can find the complete code here:
https://github.com/nnworkspace/covid19-insight/blob/master/covid19-insight.ipynb
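
Since the snippet above depends on the notebook's fatalities_asc DataFrame, here is a self-contained sketch for checking the behaviour locally. The data is made up (uniform random values in roughly the same range as the reported bins), so it is only a stand-in for the real fatality_rate column, and whether the long left bound shows up may depend on the data and the pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for fatalities_asc.fatality_rate (NOT the original data)
rng = np.random.default_rng(0)
fatality_rate = pd.Series(rng.uniform(0.07, 15.652, size=145), name="fatality_rate")

# Same call shape as in the report: 10 equal-width bins, lowest value included
binned = pd.cut(fatality_rate, 10, include_lowest=True)

# Inspect the first interval; repr() shows the left bound without any rounding
first = binned.cat.categories[0]
print(repr(first.left))
print(binned.cat.categories)
```

If the printed left bound has many more than 3 decimal digits, the installed version shows the behaviour described in this report.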

Problem description

I expected every bin bound to be displayed as a float rounded to 3 digits of precision (the default). Instead, the code above produces the following output (note the left bound of the first interval):

   fatality_bin                    Confirmed  Deaths  Recovered  Active  fatality_rate
0  (0.057999999999999996, 1.632]   39         39      39         39      39
1  (1.632, 3.19]                   33         33      33         33      33
2  (3.19, 4.748]                   27         27      27         27      27
3  (4.748, 6.305]                  18         18      18         18      18
4  (6.305, 7.863]                  12         12      12         12      12
5  (7.863, 9.421]                  2          2       2          2       2
6  (9.421, 10.978]                 3          3       3          3       3
7  (10.978, 12.536]                6          6       6          6       6
8  (12.536, 14.094]                2          2       2          2       2
9  (14.094, 15.652]                3          3       3          3       3

When the lower and upper bounds are used as the labels of a plot, it looks like this (note the first x-tick label):

image

Expected Output

The first interval should be displayed as (0.058, 1.632]. Its left bound should be rounded to 3 digits like all the other bounds, not shown as the long decimal 0.057999999999999996.
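
Until this is resolved, one possible workaround is to build the labels by hand from the bin edges. The sketch below is not a fix for cut() itself. It assumes made-up sample values (the rates series is hypothetical) and uses the existing retbins and labels parameters of pd.cut to get the numeric edges and attach explicitly rounded string labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, only for illustration
rates = pd.Series([0.0736, 1.2, 3.3, 4.9, 7.9, 10.1, 12.6, 15.652])

# First pass: let cut() compute the 10 equal-width bins and return the edges
_, edges = pd.cut(rates, bins=10, include_lowest=True, retbins=True)

# Build one string label per bin from edges rounded to 3 decimal places
rounded = np.round(edges, 3)
labels = [f"({lo:.3f}, {hi:.3f}]" for lo, hi in zip(rounded[:-1], rounded[1:])]

# Second pass: reuse the exact edges, but attach the hand-made labels
binned_labeled = pd.cut(rates, bins=edges, labels=labels, include_lowest=True)
print(binned_labeled.cat.categories[0])
```

The trade-off is that the categories become plain strings rather than Interval objects, which is fine for tick labels but loses interval operations such as .left and .right. The fixed `.3f` format also pads trailing zeros (for example 3.190 instead of 3.19).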

Output of pd.show_versions()

pandas version: 1.0.3
