
BUG: read_csv raises an error when both prefix and names are set to None #42387

Closed
@lhoestq

Description

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Hi everyone, I've been running into this issue since pandas 1.3.0:

Code Sample, a copy-pastable example

import pandas as pd

pd.read_csv("path/to/any/csv", names=None, prefix=None)

raises

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-681798f605ab> in <module>()
----> 1 pd.read_csv("/content/sample_data/mnist_test.csv", names=None, prefix=None)

2 frames
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers/readers.py in _refine_defaults_read(dialect, delimiter, delim_whitespace, engine, sep, error_bad_lines, warn_bad_lines, on_bad_lines, names, prefix, defaults)
   1304 
   1305     if names is not lib.no_default and prefix is not lib.no_default:
-> 1306         raise ValueError("Specified named and prefix; you can only specify one.")
   1307 
   1308     kwds["names"] = None if names is lib.no_default else names

ValueError: Specified named and prefix; you can only specify one.

Problem description

With names=None and prefix=None, those parameters shouldn't be considered specified, and the code should run as if they had not been passed as keyword arguments.

This is caused by PR #41446, which changed the default values of those two parameters from None to lib.no_default. As a result, an explicit None no longer matches the default and is treated as a specified value.
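To make the mechanism concrete, here is a minimal standalone sketch of the sentinel check. It does not use pandas: `_NO_DEFAULT`, `check_strict`, and `check_lenient` are hypothetical names made up for illustration, and `check_lenient` is only one possible way to treat None as "not specified", not necessarily how pandas would fix it.

```python
# Hypothetical stand-in for pandas' lib.no_default sentinel.
_NO_DEFAULT = object()

MESSAGE = "Specified named and prefix; you can only specify one."


def check_strict(names=_NO_DEFAULT, prefix=_NO_DEFAULT):
    """Mirror the check from the traceback: anything other than the
    sentinel, including an explicit None, counts as specified."""
    if names is not _NO_DEFAULT and prefix is not _NO_DEFAULT:
        raise ValueError(MESSAGE)
    return {
        "names": None if names is _NO_DEFAULT else names,
        "prefix": None if prefix is _NO_DEFAULT else prefix,
    }


def check_lenient(names=_NO_DEFAULT, prefix=_NO_DEFAULT):
    """One possible alternative: map the sentinel to None first, then
    only reject the call when both values are actually set."""
    names = None if names is _NO_DEFAULT else names
    prefix = None if prefix is _NO_DEFAULT else prefix
    if names is not None and prefix is not None:
        raise ValueError(MESSAGE)
    return {"names": names, "prefix": prefix}


try:
    check_strict(names=None, prefix=None)
except ValueError as exc:
    print("strict:", exc)

print("lenient:", check_lenient(names=None, prefix=None))
```

With the strict check, passing None explicitly for both arguments raises, while omitting them does not. The lenient variant behaves the same in both cases.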

Expected Output

The code should load the CSV using the default behavior, as if names and prefix had not been passed as keyword arguments.
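Until that behavior is restored, a caller-side workaround might look like the sketch below. It is an assumption on my side rather than an official recommendation: `drop_unset` is a hypothetical helper that removes names and prefix from the keyword arguments when they are None, so read_csv falls back to its own defaults. Other None values (such as header=None, which is meaningful) are left untouched.

```python
def drop_unset(kwargs, keys=("names", "prefix")):
    """Return a copy of kwargs without the listed keys when their value is None."""
    return {
        key: value
        for key, value in kwargs.items()
        if not (key in keys and value is None)
    }


# Hypothetical options assembled by calling code.
options = {"names": None, "prefix": None, "header": None, "sep": ","}
clean = drop_unset(options)
print(clean)

# Intended use (assumes pandas is installed and the file exists):
# import pandas as pd
# df = pd.read_csv("path/to/any/csv", **clean)
```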

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : f00ed8f
python : 3.7.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.104+
Version : #1 SMP Sat Jun 5 09:50:34 PDT 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.0
numpy : 1.19.5
pytz : 2018.9
dateutil : 2.8.1
pip : 19.3.1
setuptools : 57.0.0
Cython : 0.29.23
pytest : 3.6.4
hypothesis : None
sphinx : 1.8.5
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.2.6
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.7.6.1 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 5.5.0
pandas_datareader: 0.9.0
bs4 : 4.6.3
bottleneck : 1.3.2
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.2.2
numexpr : 2.7.3
odfpy : None
openpyxl : 2.5.9
pandas_gbq : 0.13.3
pyarrow : 3.0.0
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.4.18
tables : 3.4.4
tabulate : 0.8.9
xarray : 0.18.2
xlrd : 1.1.0
xlwt : 1.3.0
numba : 0.51.2

Labels: Bug; IO CSV (read_csv, to_csv); Regression (functionality that used to work in a prior pandas version)
