DOC: Provide examples of using read_parquet #49739

Open
@wjones127

Description

Pandas version checks

  • I have checked that the issue still exists on the latest version of the docs on main

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_parquet.html

Documentation problem

For the pyarrow engine, some important features are available through the kwargs but aren't described here, and it might not be obvious to users where to look in the PyArrow documentation. For example:

  • Using filters, users can prune which files and/or row groups are read.
  • Using filesystem, users can configure a filesystem such as S3.

Suggested fix for documentation

At the very least, we should document for each engine where those kwargs are passed. But it might even be worthwhile to provide examples of filters, reading partitioned datasets, and configuring remote filesystems. Does that seem reasonable?
