Skip to content

BUG: Inconsistent behaviour between excel engines when writing multiple times using the same ExcelWriter #40289

Open
@mirober

Description

@mirober
  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

import pandas as pd
import io

df = pd.DataFrame({"A": [0, 1], "B": [10, 11]})

for engine in ["openpyxl", "xlsxwriter", "odf"]:
    excel = io.BytesIO()
    with pd.ExcelWriter(excel, engine=engine) as writer:
        df.to_excel(writer, sheet_name="foo", columns=["A"], index=False)
        df.to_excel(writer, sheet_name="foo", columns=["B"], startcol=1, index=False)

    result = pd.read_excel(excel, "foo")
    print(engine)
    print(result)
    print()

Produces:

openpyxl
   A   B
0  0  10
1  1  11

xlsxwriter
   A   B
0  0  10
1  1  11

odf
     A Unnamed: 1
0  0.0        NaN
1  1.0        NaN
2  NaN          B
3  NaN         10
4  NaN         11

Adding this test to test_writers.py (line 1327) will also catch the issue:

    def test_excel_multiple_writes_one_writer(self, path):
        df = DataFrame({"A": [0, 1], "B": [10, 11]})

        with ExcelWriter(path) as writer:
            df.to_excel(writer, columns=["A"], index=False)
            df.to_excel(writer, columns=["B"], startcol=1, index=False)

        result = pd.read_excel(path)
        tm.assert_frame_equal(result, df)

Problem description

This came up when implementing #40231, the behaviour between the engines here seems to be inconsistent.

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 6baec67
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : None.None

pandas : 1.3.0.dev0+979.g6baec67643.dirty
numpy : 1.20.0
pytz : 2019.1
dateutil : 2.8.1
pip : 20.3.3
setuptools : 41.4.0
Cython : 0.29.21
pytest : 6.2.2
hypothesis : 6.1.1
sphinx : 2.4.0
blosc : None
feather : 0.4.0
xlsxwriter : 1.3.7
lxml.etree : 4.4.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.5.0
pandas_datareader: None
bs4 : 4.8.0
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.1.0
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.6
pandas_gbq : None
pyarrow : 0.13.0
pyxlsb : None
s3fs : None
scipy : 1.3.0
sqlalchemy : 1.3.13
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
numba : None

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions