Description
Code Sample, a copy-pastable example if possible
Code sample:
# Import packages
import pandas as pd
import numpy as np
# Set up test DataFrame
test_array = np.arange(50) + 100
test_matrix = test_array.reshape((10,5))
test_df = pd.DataFrame(test_matrix).rename(columns={0:'shouldnt be index'})
# Make groupby data more grouped (.loc slices include both endpoints, so row 5 ends up as 4)
test_df.loc[0:5,'shouldnt be index'] = 3
test_df.loc[5:8,'shouldnt be index'] = 4
# groupby and agg
end_result = test_df.groupby('shouldnt be index',as_index=False).agg(["min", "max", "count"])
print(end_result)
execution:
>>> # Import packages
... import pandas as pd
>>> import numpy as np
>>> # Set up test DataFrame
... test_array = np.arange(50) + 100
>>> test_matrix = test_array.reshape((10,5))
>>> test_df = pd.DataFrame(test_matrix).rename(columns={0:'shouldnt be index'})
>>> # Make groupby data more grouped
... test_df.loc[0:5,'shouldnt be index'] = 3
>>> test_df.loc[5:8,'shouldnt be index'] = 4
>>> # groupby and agg
... end_result = test_df.groupby('shouldnt be index',as_index=False).agg(["min", "max", "count"])
>>> print(end_result)
                     1               2               3               4
                   min  max count  min  max count  min  max count  min  max count
shouldnt be index
3                  101  121     5  102  122     5  103  123     5  104  124     5
4                  126  141     4  127  142     4  128  143     4  129  144     4
145                146  146     1  147  147     1  148  148     1  149  149     1
Problem description
I'm trying to use groupby with as_index=False and then aggregate with a list of functions. In the example above, the groupby variable ends up as the index rather than staying as a column. My understanding is that as_index=False should leave the groupby variable as a column and not an index (as in the expected output below), but perhaps I am mistaken.
This is my first time creating an issue, so my apologies if this is operator error or I didn't include important information. Please let me know if this is the case.
Maybe this is related to #22546?
Expected Output
Calling reset_index on the result shows what I expected to get:
>>> end_result.reset_index()
   shouldnt be index    1               2               3               4
                      min  max count  min  max count  min  max count  min  max count
0                  3  101  121     5  102  122     5  103  123     5  104  124     5
1                  4  126  141     4  127  142     4  128  143     4  129  144     4
2                145  146  146     1  147  147     1  148  148     1  149  149     1
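For anyone who hits the same thing, the reset_index step above can be folded into the original chain as a workaround. This is only a minimal sketch and does not rely on as_index=False at all; the name `workaround` is mine, and the DataFrame is rebuilt the same way as in the code sample.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the code sample
test_df = pd.DataFrame(np.arange(50).reshape((10, 5)) + 100).rename(
    columns={0: "shouldnt be index"}
)
test_df.loc[0:5, "shouldnt be index"] = 3
test_df.loc[5:8, "shouldnt be index"] = 4

# Group with the default as_index=True, aggregate, then move the key
# out of the index and back into a regular column.
workaround = (
    test_df.groupby("shouldnt be index")
    .agg(["min", "max", "count"])
    .reset_index()
)

# The key is now the first column. Because the columns are a MultiIndex,
# its label is a tuple whose second level is an empty string.
print(workaround.columns[0])
print(workaround.iloc[:, 0].tolist())
```

Because the aggregated columns are two-level, the group key has to be selected with the full tuple label, e.g. `workaround[("shouldnt be index", "")]`.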
Output of pd.show_versions()
pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.24.0
pytest: 3.8.0
pip: 19.0.1
setuptools: 40.2.0
Cython: 0.28.5
numpy: 1.15.1
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: 1.7.9
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.6
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.1.0
lxml.etree: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.11
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None