MultiIndex.dropna() does not always drop NANs #19387

Closed
@jzwinck

A MultiIndex "label" of -1 means NaN (though I have found no documentation of this). For example:

>>> import numpy as np
>>> import pandas as pd
>>> pd.MultiIndex.from_arrays([['a', 'a'], ['x', np.nan]])
MultiIndex(levels=[['a'], ['x']],
           labels=[[0, 0], [0, -1]])

A MultiIndex can also be constructed with NaN values in its levels, so that an ordinary non-negative label points at NaN:

>>> pd.MultiIndex(levels=[['a'], ['x', np.nan]], labels=[[0, 0], [0, 1]])
MultiIndex(levels=[['a'], ['x', nan]],
           labels=[[0, 0], [0, 1]])

MultiIndex.dropna() works for the first case, but does nothing for the second:

>>> pd.MultiIndex(levels=[['a'], ['x', np.nan]], labels=[[0, 0], [0, 1]]).dropna()
MultiIndex(levels=[['a'], ['x', nan]],
           labels=[[0, 0], [0, 1]])

It appears that MultiIndex.dropna() drops only rows whose label is -1, and not rows whose label points to a level value that is itself NaN. It should drop both kinds of rows, so the result should be:

MultiIndex(levels=[['a'], ['x']],
           labels=[[0], [0]])
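
In the meantime, a workaround along the following lines may help. It is only a minimal sketch: `drop_nan_rows` is a hypothetical helper name, not part of pandas. It tests the values returned by `get_level_values()` rather than the labels, on the assumption that both a -1 label and a NaN stored in a level surface there as NaN. It then calls `remove_unused_levels()` so that a leftover NaN does not stay in the levels.

```python
import numpy as np
import pandas as pd


def drop_nan_rows(mi):
    """Drop every row of a MultiIndex that has a missing value in any level.

    Hypothetical helper, not a pandas API. It inspects the materialized
    level values, so it does not depend on how the NaN is encoded.
    """
    keep = np.ones(len(mi), dtype=bool)
    for i in range(mi.nlevels):
        # get_level_values() yields the per-row values for level i.
        values = mi.get_level_values(i)
        keep &= ~np.asarray(pd.isnull(values))
    # Boolean indexing keeps the surviving rows; remove_unused_levels()
    # then discards level entries that no row refers to any more.
    return mi[keep].remove_unused_levels()


mi = pd.MultiIndex.from_arrays([['a', 'a'], ['x', np.nan]])
print(drop_nan_rows(mi).tolist())  # [('a', 'x')]
```

Because the helper never touches the labels directly, it should behave the same for both constructions shown above, though I have only reasoned about it for the cases in this report.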

I am using Pandas 0.20.3, NumPy 1.13.1, and Python 3.5.
