Description
From #8946:
If cat > scalar is allowed, and cat == list is also allowed because it basically compares each row as if it were the scalar case, then by that logic cat > list should also be allowed: each row in that comparison would treat the element from the list as a scalar.
On the other hand, a scalar comparison with a categorical only makes sense if the scalar can be treated as a category (for any other value it is basically a "not of the same type" comparison, which would raise on Python 3), so the scalar must be in categories, and this should not work:
In [4]: df = pd.DataFrame({"a": [1, 3, 3, 3, np.nan]})
In [6]: df["b"] = df.a.astype("category")
In [7]: df.b
Out[7]:
0      1
1      3
2      3
3      3
4    NaN
Name: b, dtype: category
Categories (2, float64): [1 < 3]
In [8]: df.b > 2
Out[8]:
0    False
1     True
2     True
3     True
4    False
Name: b, dtype: bool
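The rule proposed above (an ordering comparison against a scalar is only meaningful when the scalar is one of the categories) can be sketched in plain Python. This is a toy model, not pandas' implementation: ordered_gt and its arguments are hypothetical names, and missing values are represented as None and assumed to compare as False, matching the output shown above.

```python
def ordered_gt(values, categories, scalar):
    """Toy model of `cat > scalar` for an ordered categorical.

    values     -- list of observations, None marks a missing value
    categories -- list of categories in ascending order
    scalar     -- right-hand side of the comparison
    """
    if scalar not in categories:
        # Proposed behaviour: the scalar is not a category, so it is treated
        # as a "different type" and the ordering comparison is refused.
        raise TypeError(f"{scalar!r} is not one of the categories {categories!r}")
    # Compare positions in the category order rather than the raw values.
    rank = {cat: i for i, cat in enumerate(categories)}
    threshold = rank[scalar]
    # Assumption: a missing value never compares greater than anything.
    return [v is not None and rank[v] > threshold for v in values]


values = [1, 3, 3, 3, None]
categories = [1, 3]

# 1 is a category, so the comparison is well defined.
result_in_categories = ordered_gt(values, categories, 1)

# 2 is not a category, so under the proposal this raises.
try:
    ordered_gt(values, categories, 2)
    raised = False
except TypeError:
    raised = True
```

Under this model the df.b > 2 call above would raise instead of returning a boolean Series, because 2 is not in the categories [1, 3].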
Oh, one more thing: by that reasoning, df.b == 2 (the "equality" case) should also NOT work, because 2 is not in categories and is therefore a "different type".
Current code results in this:
In [5]: df.b == 2
Out[5]:
0    False
1    False
2    False
3    False
4    False
Name: b, dtype: bool
This is actually consistent (i.e. it returns False). A comparison like this shouldn't raise, so this is a reasonable result. I think this is de facto the same as the following, and it is useful.
In [7]: Series(['a', 'b', 'c']) == 2
Out[7]:
0    False
1    False
2    False
dtype: bool
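The equality behaviour argued for here can be modelled the same way. This is only an illustrative sketch in plain Python, not pandas code: categorical_eq is a hypothetical helper, and None stands in for a missing value.

```python
def categorical_eq(values, categories, scalar):
    """Toy model of `cat == scalar`: never raises, even for unknown scalars."""
    if scalar not in categories:
        # A scalar outside the categories cannot equal any row, so every
        # row is False, just as comparing strings to an int is all False.
        return [False] * len(values)
    # Assumption: a missing value is never equal to anything.
    return [v is not None and v == scalar for v in values]


values = [1, 3, 3, 3, None]
categories = [1, 3]

eq_missing = categorical_eq(values, categories, 2)  # 2 is not a category
eq_present = categorical_eq(values, categories, 3)  # 3 is a category
```

The explicit branch is redundant for the result, but it makes the design choice visible: equality with a value outside the categories is answered with False rather than an error.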