read_excel in version 0.25.0rc0 treats empty columns differently #27252

Closed
@snordhausen


I'm using this code to load an Excel file.

import pandas

df = pandas.read_excel(
    "data.xlsx",
    sheet_name="sheet1",
    usecols=[0, 1],  # columns A and B, selected by position
    header=None,  # the sheet has no header row
    names=["foo", "bar"]
)

print(df.head())

The Excel file has only three cells filled in: A7=1, A8=2, A9=3. Everything else is empty.

With pandas 0.24.2 I get this:

   foo  bar
0    1  NaN
1    2  NaN
2    3  NaN

With pandas 0.25.0rc0 I get:

Traceback (most recent call last):
  File "tester.py", line 8, in <module>
    names=["foo", "bar"]
  File "/home/me/.env/lib/python3.7/site-packages/pandas/util/_decorators.py", line 196, in wrapper
    return func(*args, **kwargs)
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 334, in read_excel
    **kwds
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 877, in parse
    **kwds
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 507, in parse
    **kwds
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/parsers.py", line 2218, in TextParser
    return TextFileReader(*args, **kwds)
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/parsers.py", line 1147, in _make_engine
    self._engine = klass(self.f, **self.options)
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/parsers.py", line 2305, in __init__
    ) = self._infer_columns()
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/parsers.py", line 2712, in _infer_columns
    _validate_usecols_names(self.usecols, range(ncols))
  File "/home/me/.env/lib/python3.7/site-packages/pandas/io/parsers.py", line 1255, in _validate_usecols_names
    "columns expected but not found: {missing}".format(missing=missing)
ValueError: Usecols do not match columns, columns expected but not found: [1]

The problem happens because the bar column (column B) does not contain any data. As soon as I put a value into it, both versions produce the same output.
I'm using Python 3.7.3 on Ubuntu 19.04.
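
A possible workaround, sketched here and not tested against 0.25.0rc0: read the sheet without usecols and names, then select the columns by position afterwards. The helper name select_columns_by_position is hypothetical, and the sketch assumes that read_excel with header=None and no usecols returns only column A, labelled 0, for this file. An in-memory DataFrame stands in for that result so the example runs without an Excel file.

```python
import pandas as pd


def select_columns_by_position(raw, positions, names):
    """Keep the columns at `positions`, adding all-NaN columns for any
    position missing from `raw`, then rename them to `names`."""
    # reindex creates a NaN-filled column for every requested label
    # that is not present in the frame.
    out = raw.reindex(columns=positions)
    out.columns = names
    return out


# Stand-in (assumption) for:
#   pandas.read_excel("data.xlsx", sheet_name="sheet1", header=None)
# when only column A holds data, i.e. a single column labelled 0.
raw = pd.DataFrame({0: [1, 2, 3]})

df = select_columns_by_position(raw, [0, 1], ["foo", "bar"])
print(df.head())
```

With this approach the empty bar column is created explicitly, so the result should not depend on whether the parser keeps trailing empty columns.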

Labels: IO Excel (read_excel, to_excel)
