PythonParser::_check_thousands appears broken #4596

Closed
@cancan101

This code appears broken:

    def _check_thousands(self, lines):
        if self.thousands is None:
            return lines
        nonnum = re.compile('[^-^0-9^%s^.]+' % self.thousands)
        ret = []
        for l in lines:
            rl = []
            for x in l:
                if (not isinstance(x, compat.string_types) or
                    self.thousands not in x or
                        nonnum.search(x.strip())):
                    rl.append(x)
                else:
                    rl.append(x.replace(',', ''))
            ret.append(rl)
        return ret

It looks like the thousands argument to the class is used to check whether the value is non-numeric, but a hard-coded comma is then used when actually performing the cleaning (x.replace(',', '')). With any separator other than a comma, the separator is never removed.

In addition to fixing this, I would recommend factoring out this method so that it can be used elsewhere.
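
As a rough illustration of both suggestions, here is a minimal standalone sketch, not the actual pandas fix. The names strip_thousands and check_thousands are hypothetical, plain str stands in for compat.string_types (assuming Python 3), and re.escape is added so that separators such as "." are treated literally inside the character class.

```python
import re


def strip_thousands(value, thousands, nonnum):
    """Hypothetical helper: drop the separator from a numeric-looking string."""
    if (not isinstance(value, str)
            or thousands not in value
            or nonnum.search(value.strip())):
        # Not a string, no separator present, or contains non-numeric text
        return value
    # Use the configured separator instead of a hard-coded comma
    return value.replace(thousands, '')


def check_thousands(lines, thousands):
    """Hypothetical free-function version of _check_thousands."""
    if thousands is None:
        return lines
    # Anything other than '-', digits, '.', or the separator marks a
    # value as non-numeric; escape the separator so it is matched literally
    nonnum = re.compile(r'[^-0-9.%s]+' % re.escape(thousands))
    return [[strip_thousands(x, thousands, nonnum) for x in row]
            for row in lines]
```

Because the function no longer depends on self, it could be called from other parsers by passing the separator explicitly, for example check_thousands(rows, ' ') for space-separated groups.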

Labels: Bug, IO CSV, IO Data
