Description
Code Sample
import pandas as pd

if __name__ == "__main__":
    with open('test.csv', 'w') as f:
        f.write('1,2,3\n4,5,6')

    # 1. C engine (default), text mode
    with open('test.csv', 'rt') as f:
        pd.read_csv(f, header=None)
    # 2. C engine (default), binary mode
    with open('test.csv', 'rb') as f:
        pd.read_csv(f, header=None)
    # 3. Python engine, text mode
    with open('test.csv', 'rt') as f:
        pd.read_csv(f, header=None, engine='python')
    # 4. Python engine, binary mode: raises ParserError
    with open('test.csv', 'rb') as f:
        pd.read_csv(f, header=None, engine='python')
Problem description
The second read_csv call (C engine, file opened in binary mode) reads the CSV correctly. The fourth read_csv call (Python engine, file opened in binary mode) raises an exception saying the file needs to be opened in text mode:
pandas.errors.ParserError: iterator should return strings, not bytes (did you open the file in text mode?)
Perhaps this is intended behavior, but I found the difference between the two engines surprising. I was also surprised that binary mode is accepted at all.
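In case it helps anyone hitting the same error, here is a minimal sketch of a possible workaround: decode the binary handle with io.TextIOWrapper before passing it to the Python engine. It assumes the data is UTF-8 encoded (my assumption; the test file above is plain ASCII), and it uses an in-memory io.BytesIO as a stand-in for a file opened with 'rb' so the snippet is self-contained.

```python
import io

import pandas as pd

# Stand-in for a file opened in binary mode, e.g. open('test.csv', 'rb').
raw = io.BytesIO(b"1,2,3\n4,5,6")

# Wrap the byte stream so the Python engine receives str, not bytes.
# The UTF-8 encoding is an assumption about the file contents.
text = io.TextIOWrapper(raw, encoding="utf-8", newline="")

df = pd.read_csv(text, header=None, engine="python")
print(df)
```

This only sidesteps the error on the caller's side; it does not change the inconsistency between the engines described above.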
Expected Output
Either the C engine should reject files opened in binary mode, or the Python engine should accept them.
Output of pd.show_versions()
pandas: 0.23.4
pytest: None
pip: 10.0.1
setuptools: 39.1.0
Cython: None
numpy: 1.15.4
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None