ENH: Method for selecting columns from DataFrameGroupBy (and maybe DataFrame)

#### Is your feature request related to a problem?

When using a "fluent"/"method chaining" style of programming with pandas, a common hiccup is the lack of a method for selecting columns from a `DataFrameGroupBy` object. For example, take this DataFrame:

```python
import pandas as pd
df = pd.DataFrame(dict(
    x=[1, 2, 3, 4, 5, 6],
    y=[2, 4, 6, 1, 2, 3],
    a=["a", "b", "c", "a", "b", "c"],
    b=["x", "x", "y", "z", "z", "z"],
))
```

To group by one column and then return the mean of another column within each group, one has to use multiple syntactical constructs:


```python
(
    df
    .groupby("a")
    ["x"]
    .mean()
)
```

There is also dot-attribute access to columns:

```python
(
    df
    .groupby("a")
    .x
    .mean()
)
```

but that is not fully general: it won't work if your column name collides with an existing method, and it won't work if the column is defined by a variable.
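To make the collision case concrete, here is a minimal sketch. The frame and its `count` column are made up for illustration; the point is only that attribute access finds the existing `count` method before it ever looks at the columns.

```python
import pandas as pd

# Illustrative data: the column name "count" collides with GroupBy.count
collide = pd.DataFrame({
    "a": ["a", "b", "a", "b"],
    "count": [1, 2, 3, 4],
})
gb = collide.groupby("a")

# Attribute access resolves to the bound method, not to the column
attr = gb.count
print(callable(attr))

# Bracket access is the only way to reach the column itself
col_sum = gb["count"].sum()

# Likewise, a column name held in a variable needs brackets (or getattr)
col = "count"
via_var = gb[col].sum()
```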

Another option is to select the column first, then group by a Series rather than a column name:

```python
(
    df["x"]
    .groupby(df["a"])
    .mean()
)
```

This is not so bad. Well, ideally each step in the pipeline would be on a new line, and it requires some duplicated typing. But the bigger issue is that it fails when you want to group by more than one column:

```python
(
    df["x"]
    .groupby(df[["a", "b"]])
    .mean()
)
```

<details><summary>Traceback</summary>

```python-traceback
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-edc38a1f02b1> in <module>
      1 (
----> 2     df["x"]
      3     .groupby(df[["a", "b"]])
      4     .mean()
      5 )

~/miniconda3/envs/py39/lib/python3.9/site-packages/pandas/core/series.py in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze, observed, dropna)
   1689         axis = self._get_axis_number(axis)
   1690 
-> 1691         return SeriesGroupBy(
   1692             obj=self,
   1693             keys=by,

~/miniconda3/envs/py39/lib/python3.9/site-packages/pandas/core/groupby/groupby.py in __init__(self, obj, keys, axis, level, grouper, exclusions, selection, as_index, sort, group_keys, squeeze, observed, mutated, dropna)
    558             from pandas.core.groupby.grouper import get_grouper
    559 
--> 560             grouper, exclusions, obj = get_grouper(
    561                 obj,
    562                 keys,

~/miniconda3/envs/py39/lib/python3.9/site-packages/pandas/core/groupby/grouper.py in get_grouper(obj, key, axis, level, sort, observed, mutated, validate, dropna)
    826         # allow us to passing the actual Grouping as the gpr
    827         ping = (
--> 828             Grouping(
    829                 group_axis,
    830                 gpr,

~/miniconda3/envs/py39/lib/python3.9/site-packages/pandas/core/groupby/grouper.py in __init__(self, index, grouper, obj, name, level, sort, observed, in_axis, dropna)
    541                 if getattr(self.grouper, "ndim", 1) != 1:
    542                     t = self.name or str(type(self.grouper))
--> 543                     raise ValueError(f"Grouper for '{t}' not 1-dimensional")
    544                 self.grouper = self.index.map(self.grouper)
    545                 if not (

ValueError: Grouper for '<class 'pandas.core.frame.DataFrame'>' not 1-dimensional
```

</details>

(I actually feel like this really ought to work, independently of whether it's the best solution to this Feature Request, but let's leave it aside for now).
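For completeness, passing a *list* of Series (rather than a two-column DataFrame) does work as a grouper, as far as I can tell. The sketch below reuses the example frame from above and checks that the result matches the bracket-selection form. It still repeats `df` three times, so it doesn't really address the readability concern.

```python
import pandas as pd

df = pd.DataFrame(dict(
    x=[1, 2, 3, 4, 5, 6],
    y=[2, 4, 6, 1, 2, 3],
    a=["a", "b", "c", "a", "b", "c"],
    b=["x", "x", "y", "z", "z", "z"],
))

# Group a Series by a list of Series instead of a DataFrame
by_series = (
    df["x"]
    .groupby([df["a"], df["b"]])
    .mean()
)

# The usual form: group by column names, then select with brackets
by_name = df.groupby(["a", "b"])["x"].mean()

# Both routes should give the same result
pd.testing.assert_series_equal(by_series, by_name)
```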

**So in summary:** There's no fully-general, non-awkward method for selecting a column from `DataFrameGroupBy`. While certainly not fatal, it does make fluent pandas harder to read[1] and write[2]. And this is not an esoteric application: groupby/select/apply must be one of the most common pandas workflows.

[1] The great advantage of this method chaining approach for understanding code is that you can read off the sequence of verbs that comprise the operation. Introducing the `[column]` syntax requires your brain to shift from comprehension mode to production mode — you need to generate a verb for what's happening — which is a costly operation that likely (on the margin) impairs comprehension.

[2] I really like fluent pandas but it currently can be a little bit annoying to fight with autoindent/autoformat tools when writing it. It likely would be easier to improve that tooling if one could assume each pipeline step begins with a dotted method access.

#### Describe the solution you'd like

The `DataFrameGroupBy` object could have a column selection method.

Possible names:

##### `DataFrameGroupBy.get`

This exists in `DataFrame` as a thin wrapper around `self[arg]`. Possibly adds confusion with `DataFrameGroupBy.get_group`
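For reference, this is how the existing `DataFrame.get` behaves (a small sketch with a made-up frame). The `None` default for a missing key is the main difference from plain bracket indexing, and whether a groupby version should copy that behavior is an open question.

```python
import pandas as pd

frame = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

# For an existing column, .get returns the same thing as bracket indexing
same = frame.get("x").equals(frame["x"])

# For a missing column, .get returns the default (None) instead of raising
missing = frame.get("not_a_column")
```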

##### `DataFrameGroupBy.select` 

This is the method I have reached for more than once. I'm broadly aware of the history of [`NDFrame.select`](https://github.com/pandas-dev/pandas/pull/17633); this name should now be available, though it could cause some confusion to reintroduce it with different semantics (but xref https://github.com/pandas-dev/pandas/issues/26642). 

It probably wouldn't make sense to add `.select` *only* to `DataFrameGroupBy` and not to `DataFrame`/`NDFrame`. So this would be more work (but also more benefit?)

In code, that might look like

```python
(
    df
    .groupby("a")
    .select("x")
    .mean()
)
```
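Something close to this can be approximated today with `DataFrameGroupBy.pipe`. In the sketch below, `select` is a hypothetical helper I'm defining for illustration, not an existing pandas function; it just wraps bracket indexing so that every step in the chain starts with a dot.

```python
import pandas as pd

def select(*columns):
    """Hypothetical helper: return a function that indexes its argument.

    One name selects a single column; several names select a list of columns.
    """
    key = columns[0] if len(columns) == 1 else list(columns)
    return lambda obj: obj[key]

df = pd.DataFrame(dict(
    x=[1, 2, 3, 4, 5, 6],
    y=[2, 4, 6, 1, 2, 3],
    a=["a", "b", "c", "a", "b", "c"],
    b=["x", "x", "y", "z", "z", "z"],
))

# .pipe passes the groupby object to the function returned by select
result = (
    df
    .groupby("a")
    .pipe(select("x"))
    .mean()
)

# Selecting several columns goes through the same helper
result_xy = df.groupby("a").pipe(select("x", "y")).mean()
```

This works, but `.pipe(select(...))` is noisier than a dedicated method and needs the helper to be importable from somewhere, which is part of why a built-in method seems preferable.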

#### API breaking implications

No breakage, unless someone has very old code using `DataFrame.select` that completely missed the original deprecation cycle and gets resurrected now.

Adding new methods (especially new ways of indexing pandas objects) definitely has an API complexity cost. But I would argue that, by converging on "one way to do it" in method chains, it could provide a net reduction in complexity from the user perspective.


#### Additional context

This originally got some discussion on [twitter](https://twitter.com/michaelwaskom/status/1369037397040726020), note especially @TomAugspurger's [response](https://twitter.com/TomAugspurger/status/1369073811035983879).

Other relevant context would be the [select verb](https://dplyr.tidyverse.org/reference/select.html) in `dplyr`. 
