On master (commit 40b4bb4):
>>> from io import StringIO
>>> from pandas import read_csv
>>> data = """A
1
CAT
3"""
>>> f = lambda x: x
>>> read_csv(StringIO(data), na_values='CAT', converters={'A': f}, engine='c')
     A
0    1
1  CAT
2    3
>>> read_csv(StringIO(data), na_values='CAT', converters={'A': f}, engine='python')
     A
0    1
1  NaN
2    3
I expect both engines to give the same output. I believe the Python engine's output is more correct because, unlike the C engine, it respects na_values. I thought the simple fix would be to remove the continue statement here, but that causes test failures, so a more involved refactoring is probably needed to align the order of converter application, NaN value conversion, and dtype conversion.
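To make the ordering question concrete, here is a minimal pure-Python sketch of the two behaviours. It is a toy model, not pandas code: `convert_column` and its `skip_na_after_converter` flag are hypothetical names invented for illustration, and the real engines may order these steps differently internally. It only reproduces the two outputs shown above.

```python
import math


def convert_column(values, na_values, converter, skip_na_after_converter):
    """Toy model of converting one column (hypothetical, not pandas internals)."""
    # Step 1: apply the user-supplied converter to every raw string.
    converted = [converter(v) for v in values]
    if skip_na_after_converter:
        # Reproduces the C-engine output above: once a converter has run,
        # NA matching is skipped, so 'CAT' survives.
        return converted
    # Reproduces the Python-engine output above: NA matching still runs
    # on the converted values, so 'CAT' becomes NaN.
    return [math.nan if v in na_values else v for v in converted]


raw = ["1", "CAT", "3"]
identity = lambda x: x

c_like = convert_column(raw, {"CAT"}, identity, skip_na_after_converter=True)
python_like = convert_column(raw, {"CAT"}, identity, skip_na_after_converter=False)

print(c_like)
print(python_like)
```

In this model the only difference between the two results is whether NA matching runs after the converter, which is the step a refactoring would need to align, together with where dtype conversion fits.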
IMO this should be added to #12686, as this is a difference in behaviour between the two engines.
xref #5232