
BUG: wrong alignment of column names in the DataFrame repr when using pyarrow-backed string dtype #54797

Closed
@jorisvandenbossche

When enabling the future string options, I see a wrong alignment of the column names:

In [10]: pd.options.future.infer_string = True

In [11]: df = pd.DataFrame({"long_column_name": [1, 2, 3], "col2": [4, 5, 6]})

In [12]: df
Out[12]: 
   long_column_name  col2            
0                 1                 4
1                 2                 5
2                 3                 6

The names seem to be left-aligned instead of right-aligned, and every column uses the width needed for the widest column.

This happens when the Index object backing the columns uses the string dtype.
It seems this also happens for the other string dtype instantiations (e.g. after doing df.columns = df.columns.astype("string[python]"), i.e. it is not Arrow-specific), but in general it's harder to get a DataFrame with such column names even when using pd.options.mode.string_storage, since you still need to specify the dtype manually.
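To check the non-Arrow case, a minimal sketch along these lines might help. It assumes only that pandas is installed (pyarrow is not needed for "string[python]"); the variable names baseline_lines and string_lines are made up for illustration, and the printed repr will depend on the pandas version, so no expected output is shown.

```python
import pandas as pd

# Build the frame with whatever column-label dtype this pandas version infers.
df = pd.DataFrame({"long_column_name": [1, 2, 3], "col2": [4, 5, 6]})
baseline_lines = repr(df).splitlines()

# Cast only the column Index to the Python-backed string dtype
# (this is the manual step mentioned above; it does not require pyarrow).
df.columns = df.columns.astype("string[python]")
string_lines = repr(df).splitlines()

print(df.columns.dtype)
print(repr(df))

# If the header line changes after the cast, the repr alignment depends
# on the dtype of the column Index rather than on the data itself.
print("header unchanged:", baseline_lines[0] == string_lines[0])
```

Comparing the header line before and after the cast isolates the column Index dtype as the only variable, since the data in the frame stays the same.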

cc @phofl

Labels: Bug, Strings (String extension data type and string data)
