ENH: read parquet files in chunks using to_parquet and chunksize #55973

Open
@match-gabeflores

Description

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Similar to how read_csv has a chunksize parameter, could the read_parquet function also accept a chunksize?

It seems possible using pyarrow via iter_batches:
https://stackoverflow.com/questions/59098785/is-it-possible-to-read-parquet-files-in-chunks

Is this feasible within pandas?

Feature Description

Add a new chunksize parameter to read_parquet.

Alternative Solutions

Use pyarrow's iter_batches directly.

Additional Context

No response
