From 703d547ccc4d33cc5553609bd0971637a0d5561c Mon Sep 17 00:00:00 2001
From: GYHHAHA <1801214626@qq.com>
Date: Sat, 25 Apr 2020 11:01:00 +0800
Subject: [PATCH 1/3] fix doc for crosstab with Categorical data input

---
 doc/source/user_guide/reshaping.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/doc/source/user_guide/reshaping.rst b/doc/source/user_guide/reshaping.rst
index 7e890962d8da1..a987ee258d76d 100644
--- a/doc/source/user_guide/reshaping.rst
+++ b/doc/source/user_guide/reshaping.rst
@@ -472,14 +472,15 @@ If ``crosstab`` receives only two Series, it will provide a frequency table.
    pd.crosstab(df['A'], df['B'])
 
 Any input passed containing ``Categorical`` data will have **all** of its
-categories included in the cross-tabulation, even if the actual data does
-not contain any instances of a particular category.
+categories included in the cross-tabulation while setting ``dropna=False``,
+even if the actual data does not contain any instances of a particular category.
 
 .. ipython:: python
 
    foo = pd.Categorical(['a', 'b'], categories=['a', 'b', 'c'])
    bar = pd.Categorical(['d', 'e'], categories=['d', 'e', 'f'])
    pd.crosstab(foo, bar)
+   pd.crosstab(foo, bar, dropna=False)
 
 Normalization
 ~~~~~~~~~~~~~

From 4d8611e815cf74ed527b8f5e384416811a1e3d8a Mon Sep 17 00:00:00 2001
From: GYHHAHA <1801214626@qq.com>
Date: Sun, 26 Apr 2020 10:52:28 +0800
Subject: [PATCH 2/3] fix doc for crosstab with Categorical data input

---
 doc/source/user_guide/reshaping.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/user_guide/reshaping.rst b/doc/source/user_guide/reshaping.rst
index a987ee258d76d..28e3f3bd0aeca 100644
--- a/doc/source/user_guide/reshaping.rst
+++ b/doc/source/user_guide/reshaping.rst
@@ -472,7 +472,7 @@ If ``crosstab`` receives only two Series, it will provide a frequency table.
    pd.crosstab(df['A'], df['B'])
 
 Any input passed containing ``Categorical`` data will have **all** of its
-categories included in the cross-tabulation while setting ``dropna=False``,
+categories included in the cross-tabulation while setting ``dropna=False``,
 even if the actual data does not contain any instances of a particular category.
 
 .. ipython:: python

From c905e18ce329b2cc89b8629643029707271e37f6 Mon Sep 17 00:00:00 2001
From: GYHHAHA <1801214626@qq.com>
Date: Mon, 27 Apr 2020 08:41:36 +0800
Subject: [PATCH 3/3] put dropna=False in a separate ipython block

---
 doc/source/user_guide/reshaping.rst | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/doc/source/user_guide/reshaping.rst b/doc/source/user_guide/reshaping.rst
index 28e3f3bd0aeca..c476e33b8ddde 100644
--- a/doc/source/user_guide/reshaping.rst
+++ b/doc/source/user_guide/reshaping.rst
@@ -471,15 +471,22 @@ If ``crosstab`` receives only two Series, it will provide a frequency table.
 
    pd.crosstab(df['A'], df['B'])
 
-Any input passed containing ``Categorical`` data will have **all** of its
-categories included in the cross-tabulation while setting ``dropna=False``,
-even if the actual data does not contain any instances of a particular category.
+``crosstab`` can also be implemented
+to ``Categorical`` data.
 
 .. ipython:: python
 
    foo = pd.Categorical(['a', 'b'], categories=['a', 'b', 'c'])
    bar = pd.Categorical(['d', 'e'], categories=['d', 'e', 'f'])
    pd.crosstab(foo, bar)
+
+If you want to include **all** of data categories even if the actual data does
+not contain any instances of a particular category, you should set ``dropna=False``.
+
+For example:
+
+.. ipython:: python
+
    pd.crosstab(foo, bar, dropna=False)
 
 Normalization