Dates are parsed with read_csv thousand separator #4678

Closed
@hayd

When reading a csv with a date column, the date is sometimes parsed as a number:

In [1]: s = '06.02.2013;13:00;1.000,215;0,215;0,185;0,205;0,00'

In [2]: pd.read_csv(StringIO(s), sep=';', header=None, parse_dates={'Dates': [0, 1]}, index_col=0, decimal=',', thousands='.')
Out[2]:
                        2      3      4      5  6
Dates
6022013 13:00   1.000,215  0.215  0.185  0.205  0

Here 06.02.2013 is read as the number 6022013 (the . is stripped as a thousands separator) before the date is parsed, so the date parsing fails. I think dates are sometimes written this way in continental Europe (along with . as the thousands separator).
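To make the failure mode concrete, here is a rough pure-Python model of what seems to be happening. The helper `naive_numeric_parse` is hypothetical and is not pandas' actual parser code; it only assumes that the thousands separator is stripped from a field before an integer conversion is attempted.

```python
def naive_numeric_parse(field, thousands):
    """Hypothetical model: strip the thousands separator, then try int()."""
    stripped = field.replace(thousands, "")
    try:
        # int() accepts leading zeros in a string, so "06022013" -> 6022013
        return int(stripped)
    except ValueError:
        # Not numeric even after stripping: keep the original text
        return field


print(naive_numeric_parse("06.02.2013", "."))  # 6022013
print(naive_numeric_parse("06-02-2013", "-"))  # 6022013
print(naive_numeric_parse("13:00", "."))       # 13:00
```

Under that assumption the date field is already an integer by the time the date parser sees it, which matches the 6022013 in the output above.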

This was found in #4322 (though that issue was more about . being ignored). I guess another test case would be with - as the thousands separator:

In [3]: s = '06-02-2013;13:00;1.000,215;0,215;0,185;0,205;0,00'

In [4]: pd.read_csv(StringIO(s), sep=';', header=None, parse_dates={'Dates': [0, 1]}, decimal=',', thousands='-')
Out[4]: 
           Dates          2      3      4      5  6
0  6022013 13:00  1.000,215  0.215  0.185  0.205  0

@jreback suggests:

but it should ignore dates columns entirely (for thousands parsing...)
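Until date columns are skipped by the thousands handling, one possible workaround (a sketch, not a fix in read_csv itself) is to read the date and time columns as plain strings via `dtype` and parse them afterwards with `pd.to_datetime`. The format string below is an assumption based on the sample row.

```python
from io import StringIO

import pandas as pd

s = '06.02.2013;13:00;1.000,215;0,215;0,185;0,205;0,00'

# Read columns 0 and 1 as strings so no numeric conversion
# (and therefore no thousands stripping) is applied to them.
df = pd.read_csv(StringIO(s), sep=';', header=None,
                 dtype={0: str, 1: str}, decimal=',', thousands='.')

# Parse the dates explicitly; the format is assumed from the sample data.
df['Dates'] = pd.to_datetime(df[0] + ' ' + df[1], format='%d.%m.%Y %H:%M')
df = df.drop(columns=[0, 1]).set_index('Dates')
```

This keeps `thousands='.'` in effect for the remaining columns while the date text reaches the date parser untouched.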

cc #4598 @guyrt

Labels: Bug, IO CSV (read_csv, to_csv), IO Data (IO issues that don't fit into a more specific label)
