BUG: concat of MultiIndex with names passed #15787

Closed
@seth-a

Code Sample, a copy-pastable example if possible

    import numpy as np
    import pandas as pd

    res = []
    for _ in range(2):
        res1 = []
        # Only occurs when dataframe is used with measure
        data = np.zeros((30, 21))
        idx = np.random.randint(0, 5, 30)
        df = pd.DataFrame(data, index=idx).loc[3]
        #df = pd.DataFrame(data[::5, :])  # Uncomment for example of correct behavior

        res1.append(pd.DataFrame(sum(data.dot(df.T))))
        tmp = pd.concat(res1, keys=[1], names=['level1'])

        res.append(tmp)
    final = pd.concat(res, keys=[i for i in range(2)], names=['level2'])
    print(final)

Problem description

In Python, data types generally shouldn't matter: a DataFrame is a DataFrame. As the example code shows, however, concatenating DataFrames built with an explicit index does not behave the same as concatenating DataFrames built without one. The label for one level of the index (level1) is dropped. That on its own is a small bug. Run the example several times (10-12 runs seems to do it) and you will see a much more worrisome issue: on occasion, the label is not dropped. Yes, the output of concat appears to be random.
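The only random input in the sample is how many rows carry the label 3, so the run-to-run variation may come from the inner frames having different lengths rather than from concat itself. The sketch below is a deterministic variant under that assumption; it is not taken from the original report. The helper `concat_index_names` and its `row_counts` parameter are hypothetical names introduced for illustration, and the row counts (6 and 6, 6 and 4) mirror the two outputs shown in this issue.

```python
import numpy as np
import pandas as pd


def concat_index_names(row_counts):
    """Return the index names produced by the two-step concat.

    `row_counts` is a hypothetical parameter: the number of rows in each
    inner frame, standing in for the random row count of the sample above.
    """
    pieces = []
    for n_rows in row_counts:
        frame = pd.DataFrame(np.zeros(n_rows))
        # Inner concat: adds a named outer level, giving a two-level MultiIndex.
        pieces.append(pd.concat([frame], keys=[1], names=["level1"]))
    # Outer concat: adds a second named level on top of the existing ones.
    final = pd.concat(pieces, keys=list(range(len(pieces))), names=["level2"])
    return list(final.index.names)


# Inner frames with identical indexes versus inner frames of different length.
names_equal_lengths = concat_index_names([6, 6])
names_unequal_lengths = concat_index_names([6, 4])
print(names_equal_lengths)
print(names_unequal_lengths)
```

If the assumption holds, an affected pandas version should print the level1 name in the first case and lose it in the second. On a version that includes a fix, both lines should list the same names.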

Expected Output

level2 level1       
0      1      0  0.0
              1  0.0
              2  0.0
              3  0.0
              4  0.0
              5  0.0
1      1      0  0.0
              1  0.0
              2  0.0
              3  0.0
              4  0.0
              5  0.0

Actual Output

level2         
0      1 0  0.0
         1  0.0
         2  0.0
         3  0.0
         4  0.0
         5  0.0
1      1 0  0.0
         1  0.0
         2  0.0
         3  0.0
