DOC: OrderedDict example in groupby aggregation

Hello,

It's me, again. ;-) I would like to bring to your attention another possible improvement you could implement.

Let me start by saying that 90% of the time, data processing in my field involves some form of groupby, whether at the time frequency, the panel frequency, etc.

So aggregating data efficiently is both important and very useful.

I clearly prefer pandas' `groupby` over Stata's `collapse` (or others) because it is so much faster.
However, a key piece of functionality seems to be missing in pandas. Usually, people want to apply functions to several columns and be able to rename the output. In Stata, you would write something like

`collapse (firstnm) jreback=pandas, by(time)`

to create a variable named `jreback` that contains the first non-missing value of the column `pandas` for every group in `time`.
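For comparison, here is a minimal sketch of what I understand to be the closest single-column equivalent in pandas. The data frame and its values are made up for illustration; only the column names `time`, `pandas` and `jreback` come from the Stata line above. It relies on `GroupBy.first()` skipping missing values, and on `Series.rename` to name the result.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups in `time`, with missing values in `pandas`.
df = pd.DataFrame({
    "time": [1, 1, 2, 2],
    "pandas": [np.nan, 3.0, 5.0, np.nan],
})

# first() returns the first non-null entry per group;
# rename() gives the resulting Series the desired output name.
result = df.groupby("time")["pandas"].first().rename("jreback")
```

This works for one column and one function, but it does not extend naturally to several columns with several functions each, which is the case described next.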

In pandas, a similar process seems unnecessarily complex.

I can only use the syntax `group = df.groupby('group').agg({'A': 'mean', 'B': ['mean', 'sum']})`,
which has a **major disadvantage**:
- **no predictability over the sorting order of the columns**. That is, there is no guarantee that in `group` the first column will be A and the second one will be B. I need to call `group.columns.tolist()` manually to understand which column corresponds to what. That is clearly not efficient (or maybe I am missing something here).

It would therefore be useful to add an option `column_names` that allows the user to choose column names at the `agg` level, with the guarantee that the first name corresponds to the first column and so on. For instance, in the example above I could specify `column_names=['mean_A', 'mean_B', 'this is a sum']`.
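In the meantime, one possible workaround, in line with the title of this issue, is to pass an `OrderedDict` to `agg` so that the column order follows the order of the keys, and then assign the names by position. This is only a sketch: the sample data is invented, and it assumes that `agg` emits the output columns in the order in which the mapping is iterated.

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical data with the column names used in the example above.
df = pd.DataFrame({
    "group": ["x", "x", "y"],
    "A": [1.0, 3.0, 5.0],
    "B": [10.0, 20.0, 30.0],
})

# An OrderedDict makes the order of the aggregations explicit.
spec = OrderedDict([("A", "mean"), ("B", ["mean", "sum"])])
group = df.groupby("group").agg(spec)

# The result has MultiIndex columns such as ('A', 'mean'); keep them for reference.
original_columns = group.columns.tolist()

# Assign flat names by position, matching the order of `spec`.
group.columns = ["mean_A", "mean_B", "this is a sum"]
```

Inspecting `original_columns` before renaming is a cheap way to confirm that each name lands on the intended column.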

What do you think?


DOC: OrderedDict example in groupby aggregation #12879
