ENH: Allow to_parquet to save the metadata from DataFrame.attrs and load it back #54321

Closed
@xiki-tempula

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Pandas allows one to store metadata in DataFrame.attrs. Similarly, pyarrow allows one to store metadata via the metadata option in pa.schema.
However, if one uses DataFrame.to_parquet to write a pandas DataFrame to a parquet file via pyarrow and then loads it back, the metadata is lost.

Feature Description

import pandas as pd

df = pd.DataFrame(data={1: [1]})
df.attrs = {1: 1}
df.to_parquet('test.p')
new_df = pd.read_parquet('test.p')
new_df.attrs == {1: 1}  # desired result: True, i.e. attrs survive the round trip

Alternative Solutions

N/A

Additional Context

No response
