read_excel failed with specific configuration #15835

Closed
@fortooon

Description

Code Sample to reproduce bug

from pandas import read_excel

# Keyword arguments that trigger the failure
pre_kwargs = {
    "sheetname": "Sheet1",
    "parse_cols": [0, 3],
    "keep_default_na": False,
    "header": 0,
}

file = "/path_to_file..../test12.xlsx"
header_v = read_excel(file, **pre_kwargs).columns.values

Use the attached file: test12.xlsx

Problem description

I can't read a row as the header when it contains an empty cell that has number formatting applied.
The parser fails with a TypeError:

Traceback (most recent call last):
  File "/home/alexey.lisicyn/testPand.py", line 22, in <module>
    header_v = read_excel(file, **pre_kwargs).columns.values
  File "/tmp/opt/linux-CentOS_4.4-x64/P7/python-2.7.7-dbg/lib/python2.7/site-packages/pandas/io/excel.py", line 170, in read_excel
    skip_footer=skip_footer, converters=converters, **kwds)
  File "/tmp/opt/linux-CentOS_4.4-x64/P7/python-2.7.7-dbg/lib/python2.7/site-packages/pandas/io/excel.py", line 438, in _parse_excel
    output[asheetname] = parser.read()
  File "/tmp/opt/linux-CentOS_4.4-x64/P7/python-2.7.7-dbg/lib/python2.7/site-packages/pandas/io/parsers.py", line 747, in read
    ret = self._engine.read(nrows)
  File "/tmp/opt/linux-CentOS_4.4-x64/P7/python-2.7.7-dbg/lib/python2.7/site-packages/pandas/io/parsers.py", line 1611, in read
    index, columns = self._make_index(data, alldata, columns, indexnamerow)
  File "/tmp/opt/linux-CentOS_4.4-x64/P7/python-2.7.7-dbg/lib/python2.7/site-packages/pandas/io/parsers.py", line 920, in _make_index
    index = self._agg_index(index)
  File "/tmp/opt/linux-CentOS_4.4-x64/P7/python-2.7.7-dbg/lib/python2.7/site-packages/pandas/io/parsers.py", line 1012, in _agg_index
    arr, _ = self._convert_types(arr, col_na_values | col_na_fvalues)
TypeError: unsupported operand type(s) for |: 'list' and 'set'
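
The last frame of the traceback fails on `col_na_values | col_na_fvalues`, where one operand is a list and the other a set. The sketch below reproduces that error in isolation, without pandas or the Excel file. The sample values are hypothetical, and converting the list to a set is only one way to make the union work; it is not necessarily how pandas resolved this.

```python
# Hypothetical stand-ins for the two operands named in the traceback.
col_na_values = ["", "NA"]   # a list, as the error message reports
col_na_fvalues = set()       # a set

# The | operator is not defined between a list and a set.
try:
    col_na_values | col_na_fvalues
    message = None
except TypeError as exc:
    message = str(exc)

# Converting the list to a set first makes the union well defined.
combined = set(col_na_values) | col_na_fvalues

print(message)
print(sorted(combined))
```

This suggests the failure comes from the type of the NA-values container reaching `_agg_index`, not from the spreadsheet contents as such.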

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-66-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 7.0.3
setuptools: 1.1.6
Cython: None
numpy: 1.7.1
scipy: 0.14.0
statsmodels: None
IPython: 0.13.2
sphinx: None
patsy: None
dateutil: 2.3
pytz: 2014.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: 2.3.0
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

Metadata

Labels

Bug; IO CSV (read_csv, to_csv); IO Excel (read_excel, to_excel)
