
REGR: MultiIndex level names RuntimeError in groupby.apply #31068

Closed
@jorisvandenbossche

Description

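import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': np.arange(10), 'B': [1, 2] * 5,
    'C': np.random.rand(10), 'D': np.random.rand(10)}
).set_index(['A', 'B'])
# 'A' and 'B' are both index levels; group by the level 'B' only
df.groupby('B').apply(lambda x: x.sum())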

On master this gives an error:

In [40]: df.groupby('B').apply(lambda x: x.sum())
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-40-75bc1ff12251> in <module>
----> 1 df.groupby('B').apply(lambda x: x.sum())

~/scipy/pandas/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs)
    733         with option_context("mode.chained_assignment", None):
    734             try:
--> 735                 result = self._python_apply_general(f)
    736             except TypeError:
    737                 # gh-20949

~/scipy/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f)
    752 
    753         return self._wrap_applied_output(
--> 754             keys, values, not_indexed_same=mutated or self.mutated
    755         )
    756 

~/scipy/pandas/pandas/core/groupby/generic.py in _wrap_applied_output(self, keys, values, not_indexed_same)
   1200                 if len(keys) == ping.ngroups:
   1201                     key_index = ping.group_index
-> 1202                     key_index.name = key_names[0]
   1203 
   1204                     key_lookup = Index(keys)

~/scipy/pandas/pandas/core/indexes/base.py in name(self, value)
   1171             # Used in MultiIndex.levels to avoid silently ignoring name updates.
   1172             raise RuntimeError(
-> 1173                 "Cannot set name on a level of a MultiIndex. Use "
   1174                 "'MultiIndex.set_names' instead."
   1175             )

RuntimeError: Cannot set name on a level of a MultiIndex. Use 'MultiIndex.set_names' instead.

On 0.25.3 this works:

In [10]:  df.groupby('B').apply(lambda x: x.sum()) 
Out[10]: 
          C         D
B                    
1  2.761792  3.963817
2  1.040950  3.578762
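For comparison, here is a minimal sketch of the same per-group sums computed without `apply`, assuming the goal is only the column totals per value of 'B'. The data is swapped for fixed values (an assumption, so the numbers are reproducible); it is not a fix for the regression itself.

```python
import numpy as np
import pandas as pd

# Same layout as the report, but with deterministic values instead of
# np.random.rand (assumption made for reproducibility).
df = pd.DataFrame(
    {"A": np.arange(10), "B": [1, 2] * 5, "C": np.arange(10.0), "D": np.ones(10)}
).set_index(["A", "B"])

# Aggregate directly on the index level 'B'; this does not go through
# groupby.apply.
expected = df.groupby(level="B").sum()
print(expected)
```

The result is indexed by 'B' with columns 'C' and 'D', the same shape as the 0.25.3 output above.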

It seems to be caused by the additional MultiIndex level that is not used for grouping (['A', 'B'] are both index levels, but we only group by 'B').
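The check that raises in the traceback can be seen in isolation with a small sketch: it tries to set `.name` on an Index taken from `MultiIndex.levels`, then renames through `MultiIndex.set_names` as the error message suggests. The index values here are made up for illustration, and whether the direct assignment raises depends on the pandas version, so the outcome is captured rather than assumed.

```python
import pandas as pd

# Hypothetical small MultiIndex with the same level names as in the report.
mi = pd.MultiIndex.from_arrays([[0, 1, 2, 3], [1, 2, 1, 2]], names=["A", "B"])

# An Index taken from MultiIndex.levels, analogous to the level that
# _wrap_applied_output tries to rename in the traceback.
level_b = mi.levels[1]

try:
    level_b.name = "renamed"
    outcome = "assignment allowed"
except RuntimeError as exc:
    outcome = f"RuntimeError: {exc}"
print(outcome)

# The route named in the error message: set_names returns a new MultiIndex
# and leaves the original untouched.
renamed = mi.set_names("renamed", level="B")
print(list(renamed.names))
```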

Labels: Groupby, MultiIndex, Regression (functionality that used to work in a prior pandas version)
