Interface mismatch between fastparquet and pyarrow for reading and writing Parquet #30081

Closed
@malthe

Problem description

pandas.read_parquet and DataFrame.to_parquet delegate to either pyarrow (tried first) or fastparquet. The issue is that these libraries have slightly different interfaces:

For reading:

  • pyarrow accepts IOBase and not bytes
  • fastparquet accepts bytes and not IOBase

pandas should support one or both input types and convert automatically to whatever the selected engine expects.
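A minimal sketch of the kind of conversion being requested, using only the standard library. The helper names `as_file_like` and `as_bytes` are hypothetical and are not part of pandas, pyarrow, or fastparquet; they only illustrate normalizing the input before handing it to an engine.

```python
import io


def as_file_like(source):
    """Wrap raw bytes in a BytesIO; pass file-like objects through unchanged.

    Hypothetical helper: useful for an engine that expects an IOBase object.
    """
    if isinstance(source, (bytes, bytearray, memoryview)):
        return io.BytesIO(source)
    return source


def as_bytes(source):
    """Return raw bytes; read file-like objects from their current position.

    Hypothetical helper: useful for an engine that expects bytes.
    """
    if isinstance(source, (bytes, bytearray, memoryview)):
        return bytes(source)
    return source.read()
```

With something like this in place, a caller could write `pandas.read_parquet(as_file_like(data), engine="pyarrow")` regardless of whether `data` arrived as bytes or as an open binary file. Whether the conversion belongs in pandas or in user code is the open question of this issue.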

For writing:
