Closed
Description
Code Sample, a copy-pastable example if possible
This is adapted from the docs, just replacing column 'a' with a list of tuples:
import pandas as pd

df = pd.DataFrame({
    'a': [(0,), (0,), (0,), (0,), (1,), (1,), (1,), (1,), (2,), (2,), (2,), (2,)],
    'b': [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    'c': [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    'd': [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1],
})

def compute_metrics(x):
    # Aggregate one group into a small Series of named metrics.
    result = {'b_sum': x['b'].sum(), 'c_mean': x['c'].mean()}
    return pd.Series(result, name='metrics')

df.groupby('a').apply(compute_metrics)
Problem description
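Without the modification (scalar values in column 'a'), the return value is a DataFrame built by concatenating the Series objects returned from apply, one row per group. With the modification (1-tuples in column 'a'), the return value is instead a Series whose elements are the individual Series objects.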
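For comparison, here is a minimal sketch of the unmodified baseline, with scalar keys in column 'a'. The names `df_scalar` and `baseline` are mine, and I select `['b', 'c']` explicitly before calling apply (a small deviation from the snippet above) so the example does not depend on how a given pandas version treats the grouping column inside apply.

```python
import pandas as pd

# Same data as in the report, but with plain integers in column 'a'.
df_scalar = pd.DataFrame({
    'a': [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    'b': [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    'c': [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    'd': [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1],
})

def compute_metrics(x):
    # Aggregate one group into a small Series of named metrics.
    result = {'b_sum': x['b'].sum(), 'c_mean': x['c'].mean()}
    return pd.Series(result, name='metrics')

# Explicit column selection is an assumption added for portability.
baseline = df_scalar.groupby('a')[['b', 'c']].apply(compute_metrics)

# One row per group, one column per metric.
print(type(baseline).__name__)
print(baseline)
```

This is the shape I would expect for the tuple-keyed frame as well.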
Expected Output
The same behavior with and without modification.
Background
The divergence in behavior is caused by code in pandas/core/index.py that was introduced in #10703 in response to #10697. Simply commenting out the block guarded by if all(isinstance(e, tuple) for e in data): resolves the issue.
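To illustrate why only the modified frame hits that branch, here is the quoted condition evaluated on both kinds of keys. This is a standalone sketch, not the pandas source; `looks_like_tuples`, `scalar_keys` and `tuple_keys` are hypothetical names used only for this example.

```python
# Group keys as they would appear with and without the modification.
scalar_keys = [0, 1, 2]
tuple_keys = [(0,), (1,), (2,)]

def looks_like_tuples(data):
    # Mirrors the condition quoted above: true only if every element is a tuple.
    return all(isinstance(e, tuple) for e in data)

print(looks_like_tuples(scalar_keys))  # False
print(looks_like_tuples(tuple_keys))   # True
```

So, assuming the group keys are what reaches this check, only the tuple-keyed case takes the special path.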