DOC/TST: update the parquet (pyarrow >= 0.15) docs and tests regarding Categorical support #27955

Wes is doing great work in Apache Arrow on parquet's categorical support, which means that roundtripping to parquet with `to_parquet`/`read_parquet` will preserve categorical dtypes (and with much better performance than before).

See https://issues.apache.org/jira/browse/ARROW-3246 (and linked issues), https://github.com/apache/arrow/pull/5110

We will need to:

- update the tests for pyarrow to test this faithful roundtrip (depending on the pyarrow version): https://github.com/pandas-dev/pandas/blob/802f67046bbae0a815b2fe9d20d2217485bbc942/pandas/tests/io/test_parquet.py#L409, https://github.com/pandas-dev/pandas/blob/802f67046bbae0a815b2fe9d20d2217485bbc942/pandas/tests/io/test_parquet.py#L451
- update the documentation, e.g. the caveats section at https://dev.pandas.io/user_guide/io.html#parquet
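
A rough sketch of what the version-dependent test could look like. The helper names `preserves_categorical` and `roundtrip_dtype` are hypothetical and not part of pandas or pyarrow. The 0.15 cutoff comes from the title of this issue. The assumption that older pyarrow hands the column back as `object` is mine and should be checked against the existing test.

```python
import os
import re
import tempfile


def preserves_categorical(pyarrow_version):
    """Return True if this pyarrow version is assumed to roundtrip categoricals.

    Only the major and minor components are compared, so suffixes such as
    ".dev123" or "rc1" are ignored. The (0, 15) threshold is taken from the
    issue title, not from pyarrow itself.
    """
    parts = []
    for piece in pyarrow_version.split(".")[:2]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts) >= (0, 15)


def roundtrip_dtype(values):
    """Write a categorical column to parquet and return the dtype read back.

    pandas and pyarrow are imported lazily so that the version helper above
    can be used without them installed.
    """
    import pandas as pd

    df = pd.DataFrame({"a": pd.Categorical(values)})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "categorical.parquet")
        df.to_parquet(path, engine="pyarrow")
        result = pd.read_parquet(path, engine="pyarrow")
    return str(result["a"].dtype)


def expected_dtype(pyarrow_version):
    # Assumption: older pyarrow returns the column as plain object dtype.
    return "category" if preserves_categorical(pyarrow_version) else "object"
```

In a test this could be used along the lines of `assert roundtrip_dtype(list("abcabc")) == expected_dtype(pyarrow.__version__)`, so the same test passes on both sides of the version boundary.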

