Description
import pandas as pd
arr = pd.array(["a", "b", pd.NA], dtype="string")
s = pd.Series(["a", "b", "c"])
print(s.isin(arr))
# 0 True
# 1 True
# 2 False
# dtype: bool
print(pd.Series(arr).isin(["a", "b"]))
# 0 True
# 1 True
# 2 False
# dtype: bool
I think a case could be made that the actual output is incorrect in both cases, and that both should return the nullable boolean `pd.Series(pd.array([True, True, pd.NA]))`. In the first case, `arr` contains `pd.NA`, which could stand for `"c"`, so we don't know that `"c"` is not in `arr`. In the second case, we don't know whether `pd.NA` happens to be `"a"` or `"b"`, so again the result should be `pd.NA`.
This is obviously an edge case, but it may be worth considering for the sake of consistency with the other three-valued logic operations, since `isin` is essentially an "or" over element-wise `==` comparisons.
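To make the proposal concrete, here is a minimal sketch of the behavior I have in mind. `kleene_isin` is a hypothetical helper written for this issue, not an existing pandas function, and it assumes string data only. It spells out `isin` as an "or" over `==` comparisons using the nullable `boolean` dtype, so that the usual Kleene rules (`False | NA` is `NA`, `True | NA` is `True`) decide the result.

```python
import pandas as pd


def kleene_isin(values, candidates):
    """Hypothetical three-valued isin: an 'or' over == comparisons.

    Illustration only; this is not how pandas implements Series.isin.
    """
    values = pd.Series(values, dtype="string")
    # Start from all-False, the identity element for "or".
    result = pd.Series(False, index=values.index, dtype="boolean")
    for candidate in candidates:
        if candidate is pd.NA:
            # An unknown candidate might equal any element.
            comparison = pd.Series(pd.NA, index=values.index, dtype="boolean")
        else:
            # Nullable comparison: NA elements in `values` compare as NA.
            comparison = values == candidate
        # `|` on the boolean dtype follows Kleene logic.
        result = result | comparison
    return result


arr = pd.array(["a", "b", pd.NA], dtype="string")
s = pd.Series(["a", "b", "c"])

first = kleene_isin(s, arr)            # NA among the candidates
second = kleene_isin(arr, ["a", "b"])  # NA among the values
```

Under these assumptions both `first` and `second` come out as `True, True, <NA>` with dtype `boolean`, which is the output argued for above.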