DOC: Inconsistencies in pandas.DataFrame.pivot_table parameter descriptions #53351

Closed
@tpaxman

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main

Location of the documentation

https://pandas.pydata.org/docs/dev/reference/api/pandas.DataFrame.pivot_table.html

Documentation problem

In the pandas.DataFrame.pivot_table documentation, the descriptions of the index, columns, and aggfunc parameters have grammar and sentence-structure problems that may confuse users. The issues are described below:

  • index:
    • The grammar in the phrase "it is being used as the same manner as column values" is not correct.
    • It also uses a different tense than the other descriptions: "is being used" rather than "will be", which is what the other parameter descriptions use (see margins, dropna, and margins_name).
    • The sentence "Keys to group by on the pivot table index" seems like it should be the first sentence in the paragraph (i.e., before the 'if' condition descriptions).
    • There are two conditions for 'If an array is passed...'; however, they are split up in the paragraph as the first and last sentence. These could be combined for clarity.
    • The description for list input is unclear about which list it refers to: it says "The list can contain...", in contrast with the specific phrasing used for array input ("If an array is passed...").
  • columns:
    • Same issues as index.
  • aggfunc:
    • Indefinite articles are not used before data types, which is inconsistent with the descriptions for other parameters.
    • For example, it uses "If list of functions passed" and "If dict passed", whereas the descriptions for index and columns use phrasing such as "If an array is passed".
    • A period is missing after "(inferred from the function objects themselves)".

Suggested fix for documentation

The following fixes are proposed for the descriptions of the index, columns, and aggfunc parameters:

index

Correct grammar and tense to be consistent with other parameter descriptions; rearrange sentence order to be more consistent and clear.

OLD:

If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table index. If an array is passed, it is being used as the same manner as column values.

NEW:

Keys to group by on the pivot table index. If a list is passed, it can contain any of the other types (except list). If an array is passed, it must be the same length as the data and will be used in the same manner as column values.

columns

Correct grammar and tense to be consistent with other parameter descriptions; rearrange sentence order to be more consistent and clear.

OLD:

If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table column. If an array is passed, it is being used as the same manner as column values.

NEW:

Keys to group by on the pivot table column. If a list is passed, it can contain any of the other types (except list). If an array is passed, it must be the same length as the data and will be used in the same manner as column values.
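
For readers checking the proposed wording against actual behavior, here is a minimal sketch of the two input forms that the index and columns descriptions mention. The DataFrame, its column names, and the `half` array are made up for illustration and do not come from the pandas docs.

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only to illustrate the parameter descriptions.
df = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west"],
        "product": ["a", "b", "a", "b"],
        "sales": [10, 20, 30, 40],
    }
)

# An array passed as a key must have one entry per row of `df`;
# it is grouped on in the same manner as a column's values.
half = np.array(["h1", "h2", "h1", "h2"])

# Array as `index`, column name as `columns`.
by_array = df.pivot_table(values="sales", index=half, columns="region", aggfunc="sum")

# A list may mix the other accepted types (here a column name and an array),
# but it may not contain another list.
by_list = df.pivot_table(values="sales", index=["region", half], aggfunc="sum")

print(by_array)
print(by_list)
```

In this sketch the array behaves like an extra, unnamed column: the resulting index level built from `half` has no name, whereas the level built from "region" keeps its column name.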

aggfunc

Add indefinite articles to be consistent with other parameter descriptions.

OLD:

If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves) If dict is passed, the key is column to aggregate and value is function or list of functions. If margin=True, aggfunc will be used to calculate the partial aggregates.

NEW:

If a list of functions is passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves). If a dict is passed, the key is column to aggregate and the value is function or list of functions. If margin=True, aggfunc will be used to calculate the partial aggregates.
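
The three cases in the aggfunc description can be sketched as follows. The data is again hypothetical, and the aggregations are given as string names for brevity; note that the keyword argument accepted by pivot_table is spelled `margins`.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the aggfunc description.
df = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west"],
        "sales": [10, 20, 30, 40],
        "units": [1, 2, 3, 4],
    }
)

# A list of functions: the result has hierarchical columns whose
# top level holds the function names.
multi = df.pivot_table(index="region", values="sales", aggfunc=["sum", "mean"])

# A dict: each key is a column to aggregate, each value is a function
# (or a list of functions) to apply to that column.
per_column = df.pivot_table(index="region", aggfunc={"sales": "sum", "units": "max"})

# With margins enabled, aggfunc is also used to compute the partial
# aggregates, shown in the row labelled "All" by default.
with_margins = df.pivot_table(index="region", values="sales", aggfunc="sum", margins=True)

print(multi)
print(per_column)
print(with_margins)
```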
