BUG: Skipfooter disables decimal parameter #6971

Closed
@GHPS

I ran into a bug in the read_csv importer when trying to read a file with
European-style decimal encoding (e.g. 8.1 -> 8,1). Setting the decimal parameter
appropriately should make this easy, but in my case pandas refused to produce any data type other than plain object.

After a few attempts with various files and snippets of code, I narrowed the problem down to the skipfooter parameter. As far as I can tell, skipfooter causes the decimal parameter to be ignored. Take the following example:

In [44]:
data = 'a;b;c\n1,1;2,2;3,3\n4;5;6\n7;8;9'
data

Out[44]:
'a;b;c\n1,1;2,2;3,3\n4;5;6\n7;8;9'

In [45]:
df = pd.read_csv(io.StringIO(data), sep=";", decimal=",", dtype=np.float64)
df

Out[45]:
     a    b    c
0  1.1  2.2  3.3
1  4.0  5.0  6.0
2  7.0  8.0  9.0

3 rows × 3 columns

In [46]:
df.dtypes

Out[46]:
a    float64
b    float64
c    float64
dtype: object

Perfect - the behaviour I expected. Now let's append an arbitrary single-line footer and skip that line during the import.

In [47]:
data = data+'\nFooter'
data

Out[47]:
'a;b;c\n1,1;2,2;3,3\n4;5;6\n7;8;9\nFooter'

In [48]:
df = pd.read_csv(io.StringIO(data), sep=";", decimal=",", dtype=np.float64, skipfooter=1)
df

Out[48]:
     a    b    c
0  1,1  2,2  3,3
1    4    5    6
2    7    8    9

3 rows × 3 columns

In [49]:
df.dtypes

Out[49]:
a    object
b    object
c    object
dtype: object

Now all data type information is lost, presumably because the conversion from comma-separated to dot-separated decimals failed. Adding an explicit converter to the import (converters={'Rate': lambda x: float(x.replace('.','').replace(',','.'))}) works around the problem, which makes it more likely that the skipfooter code path is at fault.
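
For reference, here is a minimal sketch of that converter workaround applied to the sample data from this report rather than to the 'Rate' column. The helper name parse_european_decimal is an illustrative assumption, not part of pandas, and engine="python" is passed explicitly because skipfooter is not supported by the C parser.

```python
import io

import pandas as pd


def parse_european_decimal(text):
    """Turn text such as '1.234,5' into a float.

    Hypothetical helper: drops '.' thousands separators, then
    replaces the ',' decimal mark with '.' before calling float().
    """
    return float(text.replace(".", "").replace(",", "."))


# Sample data from this report, including the footer line to be skipped.
data = "a;b;c\n1,1;2,2;3,3\n4;5;6\n7;8;9\nFooter"

df = pd.read_csv(
    io.StringIO(data),
    sep=";",
    skipfooter=1,
    engine="python",  # skipfooter is only available in the Python engine
    # Apply the converter to every column instead of relying on decimal=","
    converters={name: parse_european_decimal for name in ("a", "b", "c")},
)

print(df)
print(df.dtypes)
```

If the workaround applies to your pandas version, df.dtypes should no longer report object for these columns. Note that the converter strips every '.', so it is only safe when '.' is never used as a decimal mark in the input.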

System: IPython 2.0.0, Python 3.3.5, pandas 0.13.0
