Closed
Description
When at least one element in a MultiIndex contains a NaN, has_duplicates returns True even though no entries are duplicated:
>>> import numpy as np
>>> import pandas as pd
>>> idx = pd.MultiIndex.from_arrays([[101, 102], [3.5, np.nan]])
>>> idx
MultiIndex
[(101, 3.5), (102, nan)]
>>> idx.has_duplicates
True
>>> idx.get_duplicates()
[]
I would expect has_duplicates to return False here, because the two entries already differ in their first level (101 vs. 102), and get_duplicates() itself reports no duplicates.
I would also expect it to return False for the MultiIndex
MultiIndex
[(101, 3.5), (101, nan)]
since 3.5 != NaN, but this case is more debatable.
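To make the expectation concrete, here is a minimal pure-Python sketch of the duplicate check I have in mind. The helper name has_duplicate_tuples and its nan_equal flag are hypothetical, chosen only for illustration; this is not how pandas implements has_duplicates. The flag covers the debatable part: whether two NaNs in the same position should count as equal.

```python
import math

# Shared sentinel that stands in for "some NaN" when NaNs are treated as equal.
_NAN = object()


def has_duplicate_tuples(tuples, nan_equal=True):
    """Return True if any tuple occurs more than once.

    Hypothetical helper, not part of pandas. With nan_equal=True every NaN
    is mapped to one shared sentinel, so (101, nan) matches (101, nan).
    With nan_equal=False each NaN gets a fresh object and never matches.
    """
    seen = set()
    for tup in tuples:
        key = tuple(
            (_NAN if nan_equal else object())
            if isinstance(v, float) and math.isnan(v)
            else v
            for v in tup
        )
        if key in seen:
            return True
        seen.add(key)
    return False


nan = float("nan")
print(has_duplicate_tuples([(101, 3.5), (102, nan)]))  # first level differs
print(has_duplicate_tuples([(101, 3.5), (101, nan)]))  # 3.5 vs. NaN
print(has_duplicate_tuples([(101, nan), (101, nan)]))  # depends on nan_equal
```

Under either setting of nan_equal, both index examples from this report come out as having no duplicates; the flag only matters when two entries have NaN in the same position and agree everywhere else.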
This matters because .unstack() cannot be called on a Series whose MultiIndex reports has_duplicates as True, even when the MultiIndex has many levels and the levels containing the NaN(s) are not involved in the operation.
This is with pandas 0.12.0.