read_csv() ignores na_filter=False for index columns #7518

Using pandas 0.14.0. With `na_filter=False`, `pandas.io.parsers.read_csv` is supposed to skip NA detection and leave blank-looking values as they are, but it **does not do this** for `index_col` columns.

foo.csv:

```
fruit,size,sugar
apples,medium,2
pear,medium,3
grape,small,4
durian,,1
```

The default behavior gives a dataframe with a NaN in place of the empty value from this last row:

```
import pandas as pd

df = pd.io.parsers.read_csv("foo.csv")
```

This gives the same dataframe with a blank string instead of a NaN. So far so good:

```
df = pd.io.parsers.read_csv("foo.csv", na_filter=False)
```

My expectation was that this next version would give a dataframe with no NaN values in the index, but it does **not**:

```
df = pd.io.parsers.read_csv("foo.csv", index_col=['fruit','size'], na_filter=False)
print(df)
# =>                sugar
#    fruit  size
#    apples medium      2
#    pear   medium      3
#    grape  small       4
#    durian NaN         1
```

Because it unexpectedly includes NaNs, I've been fighting with [issue 4862 in `unstack`](//github.com/pydata/pandas/issues/4862) for hours :-(.
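
For anyone trying to confirm whether their own index picked up NaNs, here is a minimal sketch of a check. The helper name `index_levels_with_nan` is hypothetical (not a pandas function), and the small frame is built by hand to mimic the output above rather than read from `foo.csv`:

```python
import pandas as pd


def index_levels_with_nan(df):
    """Return the names (or positions) of index levels that contain NaN.

    Hypothetical helper for illustration; not part of pandas.
    """
    idx = df.index
    bad = []
    for pos in range(idx.nlevels):
        level_values = idx.get_level_values(pos)
        if level_values.isna().any():
            name = idx.names[pos]
            bad.append(name if name is not None else pos)
    return bad


# Hand-built frame that mimics the printed result above:
# the 'size' level holds a NaN for the durian row.
example = pd.DataFrame(
    {"sugar": [2, 1]},
    index=pd.MultiIndex.from_tuples(
        [("apples", "medium"), ("durian", float("nan"))],
        names=["fruit", "size"],
    ),
)

print(index_levels_with_nan(example))
```

An empty list would mean the index is free of NaNs, which is what I expected from `na_filter=False`.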

In order to get the desired behavior, a DF with no NaNs in the index, I have to read the data without a multi-index, then `set_index` afterwards:

```
df = pd.io.parsers.read_csv("foo.csv", na_filter=False)
# set_index returns a new DataFrame, so keep the result
df = df.set_index(['fruit','size'])
```
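
A self-contained version of the workaround, as a sketch: it reads the same data from an in-memory string instead of `foo.csv` (the `CSV_TEXT` name and the use of `io.StringIO` are my assumptions, chosen so the snippet runs without a file on disk), then checks that the `size` level keeps the empty string rather than a NaN.

```python
import io

import pandas as pd

# Same contents as foo.csv, inlined so the example needs no file.
CSV_TEXT = """fruit,size,sugar
apples,medium,2
pear,medium,3
grape,small,4
durian,,1
"""

# Step 1: read without index_col so na_filter=False applies to every column.
flat = pd.read_csv(io.StringIO(CSV_TEXT), na_filter=False)

# Step 2: build the multi-index afterwards.
indexed = flat.set_index(["fruit", "size"])

# The 'size' level should hold an empty string for durian, not NaN.
sizes = indexed.index.get_level_values("size")
print(sizes.isna().any())
print(repr(sizes[-1]))
```

This only demonstrates the two-step workaround; it says nothing about how `index_col` combined with `na_filter=False` behaves in any particular pandas version.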

As a temporary fix, perhaps the documentation ought to clarify the behavior of `na_filter` with respect to `index_col`.

