Code Sample
import pandas as pd
import numpy as np
df = pd.DataFrame(data={
    "x": [1, 1, 1],
    "y": [np.nan, np.nan, np.nan],
    "z": [1, 2, 3],
})
# Works on SeriesGroupBy
df.groupby(["x", "y"])["z"].quantile(0.5)
# Segfault on DataFrameGroupBy
df.groupby(["x", "y"])[["z"]].quantile(0.5)
Problem description
Hello all,
I just noticed that there is still an issue with the quantile function when it is used on a DataFrameGroupBy object with Pandas 0.25.1. When one of the groupby key columns contains only NaN values (column "y" in the sample), a segfault occurs. The code sample above reproduces the bug. This issue did not occur with Pandas 0.24.2.
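Because a segfault kills the interpreter, it can help to run the reproduction in a child process and inspect its return code. The following is only a sketch: `REPRO`, `run_isolated`, and `describe` are hypothetical names introduced here, not part of pandas, and the child process is assumed to use the same Python environment as the parent.

```python
import subprocess
import sys

# The reproduction from the code sample, kept as a string so it can be
# executed in a separate interpreter (REPRO is a name chosen for this sketch).
REPRO = """
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, 1, 1], "y": [np.nan, np.nan, np.nan], "z": [1, 2, 3]})
print(df.groupby(["x", "y"])[["z"]].quantile(0.5))
"""


def run_isolated(code, timeout=60):
    """Run `code` in a fresh interpreter; return (returncode, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # works on Python 3.6, unlike text=True
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def describe(returncode):
    """Hypothetical helper: turn a return code into a short label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was
        # terminated by the signal with that number.
        return "killed by signal {}".format(-returncode)
    return "failed with exit code {}".format(returncode)


if __name__ == "__main__":
    code, out, err = run_isolated(REPRO)
    print(describe(code))
    print(out or err)
```

On Linux, SIGSEGV is signal 11, so an affected install would presumably report "killed by signal 11", while an unaffected one should print the result and report "ok".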
Expected Output
Pandas 0.24.2 gives this output:
Empty DataFrame
Columns: []
Index: []
Output of pd.show_versions()
My computer is running Ubuntu 18.04. Below are the installed packages (in a clean virtual env). The bug doesn't seem to be related to numpy, as it also occurred with version 1.16.4.
>>> pd.show_versions()
INSTALLED VERSIONS
------------------
commit : None
python : 3.6.8.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-58-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.1
numpy : 1.17.1
pytz : 2019.2
dateutil : 2.8.0
pip : 19.2.3
setuptools : 41.2.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None