API: Remove 'codes' parameter from MultiIndex signature, and add 'data' #24323

Closed
@topper-123

In #23752, the MultiIndex signature was changed. Compared to 0.23.4, the only change is that the labels parameter has been renamed to codes.

Now that the signature is being changed anyway, I've started to wonder whether this is even the right signature.
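For reference, the constructor's parameters can be listed with inspect.signature. This is a minimal sketch, assuming a pandas version that already includes the rename from #23752 (0.24 or later); the exact parameter list varies between versions, so no output is shown for it.

```python
import inspect

import pandas as pd

# Parameter names of the MultiIndex constructor, in declaration order.
# Assumption: pandas >= 0.24, where `labels` has been renamed to `codes`.
params = list(inspect.signature(pd.MultiIndex).parameters)
print(params)

# With the renamed keyword, each list of integer codes indexes into
# the level at the same position.
mi = pd.MultiIndex(levels=[['a', 'b'], ['x', 'y']],
                   codes=[[1, 0, 1, 0], [0, 1, 0, 1]])
print(mi.tolist())
```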

I think codes should actually be an implementation detail, and that an improved signature would take data as its first parameter, similar to how data is the first parameter in the signature of CategoricalIndex. I think a signature like this would be better:

>>> inspect.signature(pd.MultiIndex)  # proposed signature
<Signature (data=None, levels=None, sortorder=None, names=None, dtype=None, copy=False, name=None, verify_integrity=True, _set_identity=True)>

data could then accept codes, but it could also accept other types of data that can be used to construct a MultiIndex. For example:

>>> pd.MultiIndex(data=[[1, 0, 1, 0], [0, 1, 0, 1]], levels=[['a', 'b'], ['x', 'y']])
MultiIndex([('b', 'x'),  # repr after #22511
            ('a', 'y'),
            ('b', 'x'),
            ('a', 'y')],
           )
>>> pd.MultiIndex({'a': [1, 2, 3], 'v': ['a', 'd', 'q']})
MultiIndex([(1, 'a'),
            (2, 'd'),
            (3, 'q')],
           names=['a', 'v']
           )

In the first example, I use the current initialisation method (codes plus levels), and in the second I show an initialisation with a dict, similar to how a DataFrame is initialised with a dict.
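Both behaviours can be approximated today with existing constructors. The sketch below is only an illustration of the idea, not the proposed implementation: multiindex_from_data is a hypothetical helper name, and the dispatch rules (a dict means named arrays, data together with levels means codes) are my assumptions about how a data parameter might behave. It assumes pandas 0.24 or later for the codes keyword.

```python
import pandas as pd


def multiindex_from_data(data, levels=None, names=None):
    """Hypothetical helper mimicking the proposed `data` parameter."""
    if isinstance(data, dict):
        # Dict keys become the level names, dict values the per-level arrays.
        return pd.MultiIndex.from_arrays(list(data.values()),
                                         names=list(data.keys()))
    if levels is not None:
        # With explicit levels, interpret `data` as integer codes.
        return pd.MultiIndex(levels=levels, codes=data, names=names)
    # Otherwise interpret `data` as a sequence of arrays, one per level.
    return pd.MultiIndex.from_arrays(data, names=names)


from_codes = multiindex_from_data([[1, 0, 1, 0], [0, 1, 0, 1]],
                                  levels=[['a', 'b'], ['x', 'y']])
from_dict = multiindex_from_data({'a': [1, 2, 3], 'v': ['a', 'd', 'q']})

print(from_codes.tolist())
print(from_dict.tolist(), list(from_dict.names))
```

A real implementation would have to decide how to tell codes apart from plain arrays when levels is omitted; the helper sidesteps that by treating data as codes only when levels is given.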

I think this could make the initialisation of MultiIndex more consistent with that of the other pandas objects, and make MultiIndexes friendlier for users.
