Skip to content

BUG: read_excel() doesn't handle Multiindex correctly #47487

Closed
@0nelight

Description

@0nelight

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

# Could not provide a copy-pastable Reproducible Example because of empty line bug in ".to_excel" https://github.com/pandas-dev/pandas/issues/27772
# Please use either Screenshot or attached Excel-File.

import pandas as pd

df = pd.read_excel('sample.xlsx', index_col=[0, 1, 2], header=[0, 1, 2])

print(df.index)

MultiIndex([('A', 'AA', 'AAA'),
            (nan, 'BB', 'BBB'),
            (nan, 'CC', 'CCC'),
            (nan, 'DD', 'DDD'),
            (nan, 'EE', 'EEE')],
           )

Issue Description

When importing a Excel-XSLX File (sample.xlsx) into pandas with .read_excel() the Multiindex is not handled properly.

Instead of copying the value of the first Index-Column-Cell into the Cells beneath it, it is pasting "NAN"s there.
This also doesn't change when merging the area to a merged cell.

Example Data in Excel/Calc:
pandas_bug

Expected Behavior

It should copy the first Value in the Index-Column into the cells beneath them:

    print(df.index)

    MultiIndex([('A', 'AA', 'AAA'),
                ('A', 'BB', 'BBB'),
                ('A', 'CC', 'CCC'),
                ('A', 'DD', 'DDD'),
                ('A', 'EE', 'EEE')],
               )

When commenting out the changes made with Issue:34673 it is working correctly:

# GH34673: if MultiIndex names present and not defined in the header,
# offset needs to be incremented so that forward filling starts
# from the first MI value instead of the name
if has_index_names:
offset += 1

Installed Versions

'1.4.2'

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions