DOC: Getting started example on using groupby().mean() alongside pivot throws error #55599

Closed
@ragibson

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/dev/getting_started/intro_tutorials/07_reshape_table_layout.html

Documentation problem

Running the following line, which the tutorial uses to illustrate the connection between pivot tables and groupby(),

air_quality.groupby(["parameter", "location"]).mean()

raises a TypeError: the aggregation function fails while trying to convert string values to numbers.

Suggested fix for documentation

Restricting the aggregation to numeric columns gives the intended result, which matches the pivot table. That is,

print(air_quality.pivot_table(
    values="value",
    index="location",
    columns="parameter",
    aggfunc="mean",
    margins=True
))

print(air_quality.groupby(["parameter", "location"]).mean(numeric_only=True))

shows

parameter                 no2       pm25        All
location                                           
BETR801             26.950920  23.169492  24.982353
FR04014             29.374284        NaN  29.374284
London Westminster  29.740050  13.443568  21.491708
All                 29.430316  14.386849  24.222743
                                  value
parameter location                     
no2       BETR801             26.950920
          FR04014             29.374284
          London Westminster  29.740050
pm25      BETR801             23.169492
          London Westminster  13.443568
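
For anyone who wants to check the correspondence without downloading the tutorial data, here is a minimal sketch on a small made-up frame. The column names mirror the tutorial's air_quality table, but the rows and values are invented for illustration, and "city" stands in for the non-numeric columns. It compares numeric_only=True with selecting the "value" column explicitly, then reshapes the groupby result into the pivot table layout (without margins).

```python
import pandas as pd

# Synthetic stand-in for the tutorial's air_quality table (values are made up).
air_quality = pd.DataFrame(
    {
        "city": ["Antwerpen", "Antwerpen", "Paris", "London", "London", "London"],
        "location": [
            "BETR801", "BETR801", "FR04014",
            "London Westminster", "London Westminster", "London Westminster",
        ],
        "parameter": ["no2", "pm25", "no2", "no2", "no2", "pm25"],
        "value": [20.0, 10.0, 30.0, 40.0, 50.0, 15.0],
    }
)

groups = air_quality.groupby(["parameter", "location"])

# Option 1: skip the non-numeric columns (here, "city") during aggregation.
by_flag = groups.mean(numeric_only=True)

# Option 2: select the numeric column explicitly before aggregating.
by_column = groups[["value"]].mean()

# The same aggregation expressed as a pivot table (no margins).
pivoted = air_quality.pivot_table(
    values="value", index="location", columns="parameter", aggfunc="mean"
)

# Moving the "parameter" level into the columns gives the pivot layout.
reshaped = by_flag["value"].unstack("parameter")

print(by_flag)
print(reshaped)
```

Selecting the column explicitly may be the more readable choice for a getting-started tutorial, since it does not depend on how mean() treats non-numeric columns.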

Labels

Docs, Groupby, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
