DOC: pd.concat description of ordering when passing a mapping is unclear #58516

Closed
@wence-

Description

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/dev/reference/api/pandas.concat.html

Documentation problem

The objs parameter is described as:

objs: an iterable or mapping of Series or DataFrame objects

    If a mapping is passed, the sorted keys will be used as the keys argument, 
    unless it is passed, in which case the values will be selected (see below). 
    Any None objects will be dropped silently unless they are all None in 
    which case a ValueError will be raised.

My reading of this is that if I pass a dictionary of objects, then the order of the result of the concatenation will be given by sorted(objs.keys()) (if I don't pass keys=). However, it appears that the order is just objs.keys().

Example:

import pandas as pd
dfa = pd.DataFrame({"A": ["A"]})
dfb = pd.DataFrame({"B": ["B"]})

objs = {"Z": dfa, "Y": dfb}

# Expect these two to be the same
pd.concat(objs, axis=1)
#    Z  Y
#    A  B
# 0  A  B

pd.concat(objs, axis=1, keys=sorted(objs.keys()))
#    Y  Z
#    B  A
# 0  B  A

Suggested fix for documentation

I think the statement of sortedness should be removed, not least because there is no requirement that the keys of the mapping input admit a total order.
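
The observation above can be checked with a short script. This is a minimal sketch, assuming a pandas version that shows the behavior reported in this issue; the helper `top_level_keys` is a hypothetical name introduced only for this check. It also shows, in plain Python, why sorting cannot be assumed for arbitrary mapping keys.

```python
import pandas as pd

dfa = pd.DataFrame({"A": ["A"]})
dfb = pd.DataFrame({"B": ["B"]})
objs = {"Z": dfa, "Y": dfb}


def top_level_keys(frame):
    # Hypothetical helper: outer level of the column MultiIndex built by concat.
    return list(frame.columns.get_level_values(0))


# Without keys=, the observed order follows the mapping's own key order.
implicit = pd.concat(objs, axis=1)
# Passing the keys in mapping order is expected to give the same result.
explicit = pd.concat(objs, axis=1, keys=list(objs))
# A sorted order has to be requested explicitly.
sorted_result = pd.concat(objs, axis=1, keys=sorted(objs))

print(top_level_keys(implicit))
print(top_level_keys(sorted_result))
print(implicit.equals(explicit))

# Mapping keys need not be mutually comparable, so sorting them can fail.
try:
    sorted([1, "a"])
    mixed_keys_sortable = True
except TypeError:
    mixed_keys_sortable = False
print(mixed_keys_sortable)
```

If the first two printed lists differ, the result order comes from the mapping itself and not from its sorted keys, which is the mismatch with the current wording of the docs.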

Labels: Docs, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
