BUG: groupby.transform passing Series to transformation #13543

Closed
@bashtage

Code Sample, a copy-pastable example if possible

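import numpy as np
import pandas as pd

# Written against pandas 0.18.1; pd.Panel was removed in pandas 0.25.
np.random.seed(12345)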
panel = pd.Panel(np.random.randn(125,200,10))
panel.iloc[:,:,0] = np.round(panel.iloc[:,:,0])
panel.iloc[:,:,1] = np.round(panel.iloc[:,:,1])
x = panel
cols = [0,1]
_x = x.swapaxes(0, 2).to_frame()
numeric_cols = []
for df_col in _x:
    if df_col not in cols and pd.core.common.is_numeric_dtype(_x[df_col].dtype):
        numeric_cols.append(df_col)

calls = 0


def _print_type(df):
    print(type(df))
    return df

index = _x.index
_x.index = pd.RangeIndex(0, _x.shape[0])
groups = _x.groupby(cols)
out = groups.transform(_print_type)

Comment

The slow_path operates Series by Series rather than on the group DataFrame. Once the slow path is accepted, transform operates on the group DataFrames. My example has 59 groups with 8 columns, so the function runs 8 times with Series from the first group DataFrame and then, once happy, runs 58 more times on the DataFrames.

The description says that it only operates on the group DataFrames (which is the correct behavior, IMO).
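A smaller sketch that does not depend on pd.Panel may make the behavior described above easier to check on other pandas versions. The frame, the column names (k0, k1, a, b, c) and the helper record_type are hypothetical and are not part of the original report. The function records the type of each object it receives, so the order of Series and DataFrame calls can be inspected afterwards. The exact sequence may differ between pandas versions.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(12345)

# Hypothetical small frame: two integer key columns and three numeric columns.
df = pd.DataFrame({
    "k0": rng.randint(0, 2, size=20),
    "k1": rng.randint(0, 2, size=20),
    "a": rng.randn(20),
    "b": rng.randn(20),
    "c": rng.randn(20),
})

seen = []  # type name of every object handed to the function


def record_type(obj):
    # Identity transformation that only records what it was called with.
    seen.append(type(obj).__name__)
    return obj


out = df.groupby(["k0", "k1"]).transform(record_type)

# Inspect the call sequence; Series entries mean the function was
# called column by column rather than on a whole group DataFrame.
print(seen)
```

If the list starts with Series entries and then switches to DataFrame, that matches the pattern in the Actual Output section below.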

Expected Output

Many lines of

<class 'pandas.core.frame.DataFrame'>

Actual Output

<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
...
<class 'pandas.core.frame.DataFrame'>

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 23.0.0
Cython: 0.24
numpy: 1.11.0
scipy: 0.17.1
statsmodels: 0.8.0.dev0+f060948
xarray: 0.7.2
IPython: 4.2.0
sphinx: 1.3.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.6.0
matplotlib: 1.5.1
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: 0.2.1
