DEPR: Deprecate using xlrd engine for read_excel #35029
Merged: jreback merged 16 commits into pandas-dev:master from roberthdevries:fix-28547-deprecate-xlrd on Dec 1, 2020
Commits
3a76a36 Deprecate using `xlrd` engine and change default engine to read excel… (cruzzoe)
101aa97 Revert all changes related to switching to openpyxl as the default (roberthdevries)
081ecf8 Reword whatsnew message for the benefit of end users. (roberthdevries)
ada4354 Merge branch 'master' of https://github.com/pandas-dev/pandas into 35029 (rhshadrach)
3233381 Fixed FutureWarning emitting logic, reverted openpyxl workaround (rhshadrach)
0f4c8a1 Revert change to stacklevel (rhshadrach)
499f9a0 - (rhshadrach)
825c61c Merge branch 'master' of https://github.com/pandas-dev/pandas into 35029 (rhshadrach)
88093f6 Changes from review (rhshadrach)
44f157b DeprecationWarning -> FutureWarning; added warning to io.rst/whatsnew (rhshadrach)
fffbacb "to suppress this warning." -> "to avoid raising a FutureWarning." (rhshadrach)
bb53725 Changed engine=None to mostly using openpyxl (rhshadrach)
d8dcb04 Minor doc touchups (rhshadrach)
f9876dd Re-added tests, minor doc touchups (rhshadrach)
bc3ec47 Test for no warning as well (rhshadrach)
fe10a89 Doc tweaks (rhshadrach)
@@ -1,14 +1,17 @@
 import abc
 import datetime
+import inspect
 from io import BufferedIOBase, BytesIO, RawIOBase
 import os
 from textwrap import fill
 from typing import Any, Dict, Mapping, Union, cast
+import warnings
 
 from pandas._config import config
 
 from pandas._libs.parsers import STR_NA_VALUES
 from pandas._typing import Buffer, FilePathOrBuffer, StorageOptions
+from pandas.compat._optional import import_optional_dependency
 from pandas.errors import EmptyDataError
 from pandas.util._decorators import Appender, deprecate_nonkeyword_arguments
 
@@ -99,12 +102,32 @@
     of dtype conversion.
 engine : str, default None
     If io is not a buffer or path, this must be set to identify io.
-    Supported engines: "xlrd", "openpyxl", "odf", "pyxlsb", default "xlrd".
+    Supported engines: "xlrd", "openpyxl", "odf", "pyxlsb".
     Engine compatibility :
+
     - "xlrd" supports most old/new Excel file formats.
     - "openpyxl" supports newer Excel file formats.
     - "odf" supports OpenDocument file formats (.odf, .ods, .odt).
     - "pyxlsb" supports Binary Excel files.
+
+    .. versionchanged:: 1.2.0
+        The engine `xlrd <https://xlrd.readthedocs.io/en/latest/>`_
+        is no longer maintained, and is not supported with
+        python >= 3.9. When ``engine=None``, the following logic will be
+        used to determine the engine.
+
+        - If ``path_or_buffer`` is an OpenDocument format (.odf, .ods, .odt),
+          then `odf <https://pypi.org/project/odfpy/>`_ will be used.
+        - Otherwise if ``path_or_buffer`` is a bytes stream, the file has the
+          extension ``.xls``, or is an ``xlrd`` Book instance, then ``xlrd`` will
+          be used.
+        - Otherwise if `openpyxl <https://pypi.org/project/openpyxl/>`_ is installed,
+          then ``openpyxl`` will be used.
+        - Otherwise ``xlrd`` will be used and a ``FutureWarning`` will be raised.
+
+        Specifying ``engine="xlrd"`` will continue to be allowed for the
+        indefinite future.
+
 converters : dict, default None
     Dict of functions for converting values in certain columns. Keys can
     either be integers or column labels, values are functions that take one
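
The fallback order described in the ``versionchanged`` note can be sketched as a small pure function. This is only an illustration of the documented order, not the pandas implementation: ``resolve_engine``, its parameters, and ``ODF_EXTENSIONS`` are hypothetical names, and ODS detection on streams is left out.

```python
from typing import Optional, Tuple

# Extensions the docstring lists as OpenDocument formats.
ODF_EXTENSIONS = {".odf", ".ods", ".odt"}


def resolve_engine(
    ext: Optional[str],
    *,
    is_xlrd_book: bool = False,
    openpyxl_installed: bool = True,
) -> Tuple[str, bool]:
    """Return (engine, emits_future_warning) for engine=None.

    ``ext`` is the file extension, or None for a bytes stream.
    All names here are hypothetical and chosen for illustration.
    """
    if ext in ODF_EXTENSIONS:
        return "odf", False
    if ext is None or ext == ".xls" or is_xlrd_book:
        return "xlrd", False
    if openpyxl_installed:
        return "openpyxl", False
    # Last resort: xlrd is still used, but a FutureWarning is emitted.
    return "xlrd", True
```

For example, ``resolve_engine(".xlsx", openpyxl_installed=False)`` yields ``("xlrd", True)``, which corresponds to the last bullet of the note.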
@@ -877,13 +900,32 @@ class ExcelFile:
         .xls, .xlsx, .xlsb, .xlsm, .odf, .ods, or .odt file.
     engine : str, default None
         If io is not a buffer or path, this must be set to identify io.
-        Supported engines: ``xlrd``, ``openpyxl``, ``odf``, ``pyxlsb``,
-        default ``xlrd``.
+        Supported engines: ``xlrd``, ``openpyxl``, ``odf``, ``pyxlsb``
         Engine compatibility :
+
         - ``xlrd`` supports most old/new Excel file formats.
         - ``openpyxl`` supports newer Excel file formats.
         - ``odf`` supports OpenDocument file formats (.odf, .ods, .odt).
         - ``pyxlsb`` supports Binary Excel files.
+
+        .. versionchanged:: 1.2.0
+
+           The engine `xlrd <https://xlrd.readthedocs.io/en/latest/>`_
+           is no longer maintained, and is not supported with
+           python >= 3.9. When ``engine=None``, the following logic will be
+           used to determine the engine.
+
+           - If ``path_or_buffer`` is an OpenDocument format (.odf, .ods, .odt),
+             then `odf <https://pypi.org/project/odfpy/>`_ will be used.
+           - Otherwise if ``path_or_buffer`` is a bytes stream, the file has the
+             extension ``.xls``, or is an ``xlrd`` Book instance, then ``xlrd``
+             will be used.
+           - Otherwise if `openpyxl <https://pypi.org/project/openpyxl/>`_ is installed,
+             then ``openpyxl`` will be used.
+           - Otherwise ``xlrd`` will be used and a ``FutureWarning`` will be raised.
+
+           Specifying ``engine="xlrd"`` will continue to be allowed for the
+           indefinite future.
     """
 
     from pandas.io.excel._odfreader import ODFReader

Review comment (on the "- If ``path_or_buffer`` is an OpenDocument format" line): obviously as much of the formatting you can do here as well
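
Whether the "openpyxl is installed" branch applies in a given environment can be checked without importing the package. The sketch below uses the standard library's ``importlib.util.find_spec`` as a rough stand-in for the ``import_optional_dependency(..., raise_on_missing=False)`` call in the diff; ``is_installed`` is a hypothetical helper, and it ignores the version checks that the pandas helper can perform.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: for a non-.xls path with engine=None, this mirrors the choice
# between the openpyxl branch and the xlrd fallback described above.
preferred_engine = "openpyxl" if is_installed("openpyxl") else "xlrd"
```

Passing ``engine`` explicitly to ``read_excel`` or ``ExcelFile`` sidesteps the detection entirely.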
@@ -902,14 +944,59 @@ def __init__(
         self, path_or_buffer, engine=None, storage_options: StorageOptions = None
     ):
         if engine is None:
-            engine = "xlrd"
+            # Determine ext and use odf for ods stream/file
             if isinstance(path_or_buffer, (BufferedIOBase, RawIOBase)):
+                ext = None
                 if _is_ods_stream(path_or_buffer):
                     engine = "odf"
             else:
                 ext = os.path.splitext(str(path_or_buffer))[-1]
                 if ext == ".ods":
                     engine = "odf"
+
+            if (
+                import_optional_dependency(
+                    "xlrd", raise_on_missing=False, on_version="ignore"
+                )
+                is not None
+            ):
+                from xlrd import Book
+
+                if isinstance(path_or_buffer, Book):
+                    engine = "xlrd"
+
+            # GH 35029 - Prefer openpyxl except for xls files
+            if engine is None:
+                if ext is None or isinstance(path_or_buffer, bytes) or ext == ".xls":
+                    engine = "xlrd"
+                elif (
+                    import_optional_dependency(
+                        "openpyxl", raise_on_missing=False, on_version="ignore"
+                    )
+                    is not None
+                ):
+                    engine = "openpyxl"
+                else:
+                    caller = inspect.stack()[1]
+                    if (
+                        caller.filename.endswith("pandas/io/excel/_base.py")
+                        and caller.function == "read_excel"
+                    ):
+                        stacklevel = 4
+                    else:
+                        stacklevel = 2
+                    warnings.warn(
+                        "The xlrd engine is no longer maintained and is not "
+                        "supported when using pandas with python >= 3.9. However, "
+                        "the engine xlrd will continue to be allowed for the "
+                        "indefinite future. Beginning with pandas 1.2.0, the "
+                        "openpyxl engine will be used if it is installed and the "
+                        "engine argument is not specified. Either install openpyxl "
+                        "or specify engine='xlrd' to silence this warning.",
+                        FutureWarning,
+                        stacklevel=stacklevel,
+                    )
+                    engine = "xlrd"
         if engine not in self._engines:
             raise ValueError(f"Unknown engine: {engine}")
 

WillAyd marked this conversation as resolved.
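
The ``inspect.stack()`` check above picks a ``stacklevel`` so that the warning is attributed to user code whether ``ExcelFile`` is constructed directly or reached through ``read_excel``. A minimal standalone sketch of that idea follows; ``open_book`` and ``read_book`` are hypothetical stand-ins, and the sketch needs 3 rather than 4 because its wrapper adds only one frame (the PR's value of 4 presumably accounts for an extra decorator frame around ``read_excel``).

```python
import inspect
import warnings


def open_book(path, engine=None):
    """Hypothetical reader that falls back to a legacy engine with a warning."""
    if engine is None:
        caller = inspect.stack()[1]
        # Reached via read_book(): skip one extra frame so the warning
        # points at the user's call site instead of at read_book().
        stacklevel = 3 if caller.function == "read_book" else 2
        warnings.warn(
            "No engine given; falling back to the legacy engine.",
            FutureWarning,
            stacklevel=stacklevel,
        )
        engine = "legacy"
    return engine


def read_book(path):
    """Hypothetical convenience wrapper around open_book()."""
    return open_book(path)
```

Passing ``engine`` explicitly skips the warning branch altogether, which matches the "specify engine='xlrd'" advice in the message.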