Closed
Problem description
The `pandas.read_parquet` and `DataFrame.to_parquet` methods defer operation to either `pyarrow` (first priority) or `fastparquet`. The issue is that these libraries have slightly different interfaces:
For reading:
- `pyarrow` accepts `IOBase` and not `bytes`
- `fastparquet` accepts `bytes` and not `IOBase`
Pandas should support one or both and do the conversion automatically.
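
A minimal sketch of the kind of conversion meant here, using only the standard library. The helper names `to_file_like` and `to_bytes` are hypothetical and are not part of pandas, pyarrow, or fastparquet; they only illustrate normalizing the input before handing it to whichever engine is selected.

```python
import io


def to_file_like(source):
    """Return a binary file-like object for bytes-like or file-like input.

    Hypothetical helper: suited to an engine that expects an IOBase-style object.
    """
    if isinstance(source, (bytes, bytearray, memoryview)):
        # Wrap raw bytes in an in-memory stream.
        return io.BytesIO(bytes(source))
    if hasattr(source, "read"):
        # Already file-like; pass it through unchanged.
        return source
    raise TypeError(f"unsupported source type: {type(source).__name__}")


def to_bytes(source):
    """Return raw bytes for bytes-like or file-like input.

    Hypothetical helper: suited to an engine that expects bytes.
    """
    if isinstance(source, (bytes, bytearray, memoryview)):
        return bytes(source)
    if hasattr(source, "read"):
        # Read from the stream's current position to the end.
        data = source.read()
        if not isinstance(data, bytes):
            raise TypeError("stream must be opened in binary mode")
        return data
    raise TypeError(f"unsupported source type: {type(source).__name__}")
```

With something like this in place, the engine wrapper could call `to_file_like(...)` or `to_bytes(...)` depending on the backend, so callers would not need to know which input form each library wants.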
For writing:
- `pyarrow` accepts `IOBase`
- `fastparquet` doesn't really support writing to an ephemeral buffer because the stream is closed when using the `open_with` argument (see "FR: Accept a file-like object in addition to a path in `fastparquet.write`", dask/fastparquet#408)
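
One possible workaround for the closed-stream problem, sketched with the standard library only: hand the writer a buffer whose `close()` is a no-op, then read the contents back afterwards. `NonClosingBytesIO` and `write_and_close` are hypothetical names; `write_and_close` is only a stand-in for a writer that closes its stream, and the `open_with(path, mode)` calling convention is an assumption about how such a callback is invoked.

```python
import io


class NonClosingBytesIO(io.BytesIO):
    """In-memory buffer that survives close() calls made by a writer."""

    def close(self):
        # Deliberately do nothing so the contents stay readable.
        pass

    def really_close(self):
        # Release the buffer once the caller has finished with it.
        super().close()


def write_and_close(open_with, payload):
    """Hypothetical stand-in for a writer that closes the stream it opens."""
    with open_with("ignored-path", "wb") as f:
        f.write(payload)


buffer = NonClosingBytesIO()
# The callback ignores the path and always returns the same in-memory buffer.
write_and_close(lambda path, mode: buffer, b"example payload")
data = buffer.getvalue()  # still works because close() was a no-op
```

A plain `io.BytesIO` would raise `ValueError` on `getvalue()` at this point, since the `with` block has already closed it.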