Handling of duplicate columns in pandas.io.sql.read_frame #2738

Closed
@eingerman

Calling pandas.io.sql.read_frame can result in a data frame with duplicate column names, for example when the SQL query joins tables that share column names.
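A minimal sketch of how the duplicates arise, using only the standard-library sqlite3 module. The tables `orders` and `customers` are hypothetical and not taken from the report; the point is only that the result set of a `SELECT *` join reports the column name `id` twice, and a frame built from those names inherits the duplicates.

```python
import sqlite3

# Hypothetical schema: both tables have a column called "id".
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    CREATE TABLE customers (id INTEGER, name TEXT);
    INSERT INTO orders VALUES (10, 1);
    INSERT INTO customers VALUES (1, 'alice');
    """
)

cur = conn.execute(
    "SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id"
)

# cursor.description holds one 7-tuple per result column; item 0 is the name.
column_names = [col[0] for col in cur.description]
rows = cur.fetchall()
conn.close()

print(column_names)
print(rows)
```

Aliasing the clashing columns in the query itself (`SELECT orders.id AS order_id, ...`) avoids the problem at the source, but that is not possible when the query uses `SELECT *` or comes from elsewhere.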

Data frames with duplicate column names cause errors in many pandas functions. I can't even rename the columns, because df.columns = new_columns raises an error.

I think the correct behavior would be for pandas.io.sql.read_frame to have an option to "deduplicate" column names (for example by appending a number), or to raise an error when the result contains duplicate column names.
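A rough sketch of what such an option could do, written as a standalone helper rather than as part of pandas. The function name `dedupe_column_names`, the `on_duplicate` parameter, and the `_1`, `_2` suffix scheme are all assumptions made for illustration; none of them is an existing pandas API.

```python
from collections import Counter


def dedupe_column_names(names, on_duplicate="rename"):
    """Return a list of unique column names (hypothetical helper).

    on_duplicate="rename": keep the first occurrence, suffix later ones
                           with _1, _2, ... skipping suffixes already in use.
    on_duplicate="raise":  raise ValueError if any name repeats.
    """
    if on_duplicate not in ("rename", "raise"):
        raise ValueError("on_duplicate must be 'rename' or 'raise'")

    names = list(names)
    duplicated = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicated and on_duplicate == "raise":
        raise ValueError(f"duplicate column names: {duplicated}")

    taken = set(names)   # every name that is already in use
    last_suffix = {}     # name -> last numeric suffix handed out
    result = []
    for name in names:
        if name not in last_suffix:
            # First occurrence keeps its original name.
            last_suffix[name] = 0
            result.append(name)
            continue
        # Later occurrence: find the next suffix that does not collide.
        n = last_suffix[name] + 1
        candidate = f"{name}_{n}"
        while candidate in taken:
            n += 1
            candidate = f"{name}_{n}"
        last_suffix[name] = n
        taken.add(candidate)
        result.append(candidate)
    return result


unique_names = dedupe_column_names(["id", "customer_id", "id", "name"])
print(unique_names)
```

The returned list could then be passed as the column labels when the frame is constructed from the fetched rows, so the duplicates never reach the data frame. Whether renaming or raising is the better default is a design choice the issue leaves open.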

Labels: Enhancement, IO Data, IO SQL