BUG: groupby-nunique modifies null values #31950

Closed
@thomas-reineking-by

Code Sample, a copy-pastable example if possible

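import numpy as np
import pandas as pd

# A single group whose float column contains one NaN
df = pd.DataFrame({"GROUP": 0, "VALUE": [1.0, np.nan]})
# The result is discarded; only the side effect on df matters here
df.groupby("GROUP")["VALUE"].nunique()
print(df)  # on pandas 1.0.1, VALUE in row 1 is no longer NaN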

Problem description

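After the call, the original DataFrame has been modified in place: the NaN in VALUE has been overwritten with a large negative number: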

   GROUP         VALUE
0      0  1.000000e+00
1      0 -9.223372e+18
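
The overwritten value is not arbitrary: it is what the smallest 64-bit signed integer, -2**63, looks like when shown as a float. The short check below illustrates that arithmetic. The variable names are only for illustration. The same integer is what NumPy and pandas use to store NaT, which may hint at the origin of the value, though the report itself does not establish the cause.

```python
import numpy as np

# Smallest value representable as a 64-bit signed integer
int64_min = np.iinfo(np.int64).min  # equals -2**63

# Converting it to a float and printing in scientific notation
as_float = float(int64_min)
print(f"{as_float:e}")  # -9.223372e+18, the value seen in the VALUE column
```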

The issue seems to have been introduced in version 1.0.0; version 0.25.3 works as expected.

Expected Output

The original DataFrame should not be modified: after the call, VALUE in row 1 should still be NaN.
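
One way to turn this expectation into a repeatable check is sketched below. The helper `is_unchanged_by` is hypothetical (it is not part of pandas): it snapshots the frame, runs an operation, and compares the two with `DataFrame.equals`, which treats NaN values in the same position as equal.

```python
import numpy as np
import pandas as pd


def is_unchanged_by(df, operation):
    """Return True if calling operation(df) leaves df's contents intact.

    Hypothetical helper for illustration only.
    """
    snapshot = df.copy(deep=True)
    operation(df)
    # DataFrame.equals considers NaNs in matching positions to be equal
    return df.equals(snapshot)


df = pd.DataFrame({"GROUP": 0, "VALUE": [1.0, np.nan]})
unchanged = is_unchanged_by(df, lambda d: d.groupby("GROUP")["VALUE"].nunique())
print(unchanged)  # True is the expected behaviour; the report implies False on 1.0.1
```

Comparing against a deep copy rather than checking for a specific sentinel keeps the check independent of whatever value the NaN happens to be replaced with.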

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.6.6.final.0
python-bits : 64
OS : Linux
OS-release : 4.9.87-linuxkit-aufs
machine : x86_64
processor :
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.1
numpy : 1.17.2
pytz : 2019.3
dateutil : 2.8.1
pip : 19.2.3
setuptools : 41.2.0
Cython : 0.29.13
pytest : 5.2.1
hypothesis : 4.23.0
sphinx : 1.7.9
blosc : None
feather : None
xlsxwriter : 1.2.2
lxml.etree : 4.4.1
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.8.2 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : 7.9.0
pandas_datareader: None
bs4 : None
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.13.0
pytables : None
pytest : 5.2.1
pyxlsb : None
s3fs : None
scipy : 1.2.1
sqlalchemy : 1.3.11
tables : None
tabulate : 0.8.5
xarray : None
xlrd : None
xlwt : None
xlsxwriter : 1.2.2
numba : 0.45.1

Labels: Bug, Regression (functionality that used to work in a prior pandas version)
