BUG: read_csv with trailing comma, specified header names and usecols gives confusing error #29042

Closed
@jorisvandenbossche


I encountered a somewhat strange case. When a file is malformed with trailing commas, we typically set the first column as the index. But if you also pass custom names together with usecols, you get a cryptic error message:

import io
import pandas as pd

s = """a, b, c, d
1,2,3,4,
5,6,7,8,"""
>>> pd.read_csv(io.StringIO(s), header=0, names=['A', 'B', 'C', 'D'], usecols=[2, 3])
...
ValueError: Passed header names mismatches usecols

I am not fully sure what the behaviour should be, but the current error message is not very helpful: the usecols argument contains only integers, so the selection is positional and should not need to match the header names.
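For context, the implicit-index behaviour for trailing commas can be seen in isolation with a small sketch. It assumes a simplified header without the extra spaces, and the variable names (`raw`, `default_df`, `flat_df`) are only illustrative. It does not reproduce or fix the error above; it only shows how the first column is consumed as the index by default, and how the documented `index_col=False` option turns that off.

```python
import io

import pandas as pd

# Simplified variant of the data above: 4 header fields, but every data
# row ends with a trailing comma and so has 5 fields.
raw = "a,b,c,d\n1,2,3,4,\n5,6,7,8,"

# Default: with one more data field than header names, the first data
# column is used as the index, so the values shift left under the header.
default_df = pd.read_csv(io.StringIO(raw))
print(default_df)

# index_col=False tells read_csv not to use the first column as the index,
# so the four header names line up with the first four data fields.
flat_df = pd.read_csv(io.StringIO(raw), index_col=False)
print(flat_df)
```

Whether the `names` plus `usecols` case should follow the first or the second interpretation is exactly the open question here.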

Labels: Error Reporting (incorrect or improved errors from pandas), IO CSV (read_csv, to_csv)
