read_fwf/table on py3 has trouble with BytesIO #4785

Closed
@ghost

Description

>>> import pandas as pd
>>> from io import BytesIO
>>> pd.read_fwf(BytesIO("שלום".encode('utf8')), widths=[2])
  pandas/io/parsers.py", line 1944, in <listcomp>
    for (fromm, to) in self.colspecs]
TypeError: Type str doesn't support the buffer API
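The TypeError above is the kind Python 3 raises when bytes and str are mixed, which suggests the lines read from the BytesIO are still bytes when the fixed-width slicing happens. The following is only a sketch under that assumption and uses the standard library alone, not pandas internals. Wrapping the buffer in io.TextIOWrapper is a possible caller-side workaround, not the fix adopted in pandas, and the names raw, text and first_field are illustrative.

```python
from io import BytesIO, TextIOWrapper

raw = BytesIO("שלום".encode("utf8"))

# Lines read from a binary buffer are bytes on Python 3.
line = raw.readline()

# Passing a str argument to a bytes method raises TypeError
# (assumed here to be the same class of failure as in the traceback).
try:
    line.strip(" ")
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Possible workaround (an assumption, not the pandas fix):
# decode the stream before handing it to a text-oriented parser.
raw.seek(0)
text = TextIOWrapper(raw, encoding="utf8")
decoded_line = text.readline()

# With widths=[2], the first field is the first two characters.
first_field = decoded_line[0:2]
```

Slicing the decoded line counts characters rather than bytes, which matters here because each Hebrew letter occupies two bytes in UTF-8.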

By another path:

>>> from io import BytesIO
>>> pd.read_table(BytesIO("שלום::1234\n".encode('cp1255')),sep="::", engine='python', encoding='cp1255')
  File "/usr/local/lib/python3.3/dist-packages/pandas-0.12.0_357_g218f334-py3.3-linux-x86_64.egg/pandas/io/parsers.py", line 1324, in _read
    yield pat.split(line.decode('utf-8').strip())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf9 in position 0: invalid start byte

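The read_table call above is also broken: the traceback shows the line being decoded as 'utf-8' even though encoding='cp1255' was passed. Note that a separator longer than one character (len(sep) > 1) currently activates the Python engine anyway, so engine='python' is not required to hit this path.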
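The decode failure can be reproduced without pandas, which may help when writing a regression test. The sketch below uses only the standard library. It shows that the cp1255 payload begins with byte 0xf9, that decoding it as UTF-8 fails, and that decoding with the caller's encoding before splitting on the separator gives the expected fields. The variable names are illustrative, and the splitting step is a stand-in for the parser, not the pandas code itself.

```python
import re
from io import BytesIO, TextIOWrapper

payload = "שלום::1234\n".encode("cp1255")

# The first byte of the cp1255 payload, as reported in the traceback.
first_byte = payload[0]

# Decoding cp1255 bytes with a hard-coded 'utf-8' fails.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the encoding the caller supplied, then splitting on the
# multi-character separator, yields the two expected fields.
text = TextIOWrapper(BytesIO(payload), encoding="cp1255")
fields = [re.split("::", line.strip()) for line in text]
```

Until the parser honours the encoding argument, passing an already-decoded text stream like the one above is one way a caller might sidestep the problem.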

Related: #4784

Edit: fixed incorrect encoding and updated error
Edit: Updated examples
