From 705e1a583f70d84b509ddb53e02ba01e766f08e6 Mon Sep 17 00:00:00 2001
From: Amol
Date: Wed, 25 May 2016 10:10:03 +0530
Subject: [PATCH 1/6] DOC: Added an example of pitfalls when using astype

---
 doc/source/basics.rst | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/doc/source/basics.rst b/doc/source/basics.rst
index e3b0915cd571d..f61c88b85ceb9 100644
--- a/doc/source/basics.rst
+++ b/doc/source/basics.rst
@@ -1726,6 +1726,23 @@ then the more *general* one will be used as the result of the operation.
    # conversion of dtypes
    df3.astype('float32').dtypes
 
+When trying to convert a subset of columns to a specified type using :meth:`~DataFrame.astype` and :meth:`~numpy.ndarray.loc`, utilizing **:** as mask, upcasting occurs.
+:meth:`~numpy.ndarray.loc` tries to fit in what we are assigning to the current dtypes, while [ ] will overwrite them taking the dtype from the right hand side.
+
+.. ipython:: python
+
+   df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
+   print df.loc[:, ['a', 'b']].astype(np.uint8).dtypes
+   df.loc[:, ['a', 'b']] = df.loc[:, ['a', 'b']].astype(np.uint8)
+
+To avoid this please take the following approach.
+
+.. ipython:: python
+
+   df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
+   df[['a','b']] = df[['a','b']].astype(np.uint8)
+   df
+   df.dtypes
 object conversion
 ~~~~~~~~~~~~~~~~~
 

From f394045f6f7d2bcb3d8573e0e18760138e750698 Mon Sep 17 00:00:00 2001
From: Amol
Date: Wed, 25 May 2016 15:44:42 +0530
Subject: [PATCH 2/6] DOC: Cleaned up the documentation

---
 doc/source/basics.rst | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/doc/source/basics.rst b/doc/source/basics.rst
index f61c88b85ceb9..d80d14001e7ee 100644
--- a/doc/source/basics.rst
+++ b/doc/source/basics.rst
@@ -1726,13 +1726,13 @@ then the more *general* one will be used as the result of the operation.
    # conversion of dtypes
    df3.astype('float32').dtypes
 
-When trying to convert a subset of columns to a specified type using :meth:`~DataFrame.astype` and :meth:`~numpy.ndarray.loc`, utilizing **:** as mask, upcasting occurs.
-:meth:`~numpy.ndarray.loc` tries to fit in what we are assigning to the current dtypes, while [ ] will overwrite them taking the dtype from the right hand side.
+When trying to convert a subset of columns to a specified type using :meth:`~DataFrame.astype` and :meth:`~DataFrame.loc`, utilizing **:** as mask, upcasting occurs.
+:meth:`~DataFrame.loc` tries to fit in what we are assigning to the current dtypes, while [ ] will overwrite them taking the dtype from the right hand side.
 
 .. ipython:: python
 
    df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
-   print df.loc[:, ['a', 'b']].astype(np.uint8).dtypes
+   df.loc[:, ['a', 'b']].astype(np.uint8).dtypes
    df.loc[:, ['a', 'b']] = df.loc[:, ['a', 'b']].astype(np.uint8)
 
 To avoid this please take the following approach.
@@ -1743,6 +1743,7 @@ To avoid this please take the following approach.
    df[['a','b']] = df[['a','b']].astype(np.uint8)
    df
    df.dtypes
+
 object conversion
 ~~~~~~~~~~~~~~~~~
 

From e1877bfe410d47b8e685ffb866664b3c6a36c344 Mon Sep 17 00:00:00 2001
From: Amol
Date: Wed, 25 May 2016 20:11:06 +0530
Subject: [PATCH 3/6] DOC: Restructured the documentation

---
 doc/source/basics.rst | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/doc/source/basics.rst b/doc/source/basics.rst
index d80d14001e7ee..f7e311af89f3d 100644
--- a/doc/source/basics.rst
+++ b/doc/source/basics.rst
@@ -1726,22 +1726,26 @@ then the more *general* one will be used as the result of the operation.
    # conversion of dtypes
    df3.astype('float32').dtypes
 
-When trying to convert a subset of columns to a specified type using :meth:`~DataFrame.astype` and :meth:`~DataFrame.loc`, utilizing **:** as mask, upcasting occurs.
-:meth:`~DataFrame.loc` tries to fit in what we are assigning to the current dtypes, while [ ] will overwrite them taking the dtype from the right hand side.
+Convert a subset of columns to a specified type using :meth:`~DataFrame.astype`
 
 .. ipython:: python
 
    df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
-   df.loc[:, ['a', 'b']].astype(np.uint8).dtypes
-   df.loc[:, ['a', 'b']] = df.loc[:, ['a', 'b']].astype(np.uint8)
+   df[['a','b']] = df[['a','b']].astype(np.uint8)
+   df
+   df.dtypes
+
+.. note::
+
+   When trying to convert a subset of columns to a specified type using :meth:`~DataFrame.astype` and :meth:`~DataFrame.loc`, utilizing **:** as mask, upcasting occurs.
 
-To avoid this please take the following approach.
+   :meth:`~DataFrame.loc` tries to fit in what we are assigning to the current dtypes, while [ ] will overwrite them taking the dtype from the right hand side. Therefore the following piece of code produces the unintended result.
 
 .. ipython:: python
 
    df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
-   df[['a','b']] = df[['a','b']].astype(np.uint8)
-   df
+   df.loc[:, ['a', 'b']].astype(np.uint8).dtypes
+   df.loc[:, ['a', 'b']] = df.loc[:, ['a', 'b']].astype(np.uint8)
    df.dtypes
 
 object conversion

From 278e922ecc250fc8769486c7dbbc502742e3a90b Mon Sep 17 00:00:00 2001
From: Amol
Date: Wed, 25 May 2016 23:40:42 +0530
Subject: [PATCH 4/6] DOC: Some cleaning up

---
 doc/source/basics.rst | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/doc/source/basics.rst b/doc/source/basics.rst
index f7e311af89f3d..428fb0d3c0f2e 100644
--- a/doc/source/basics.rst
+++ b/doc/source/basics.rst
@@ -1730,23 +1730,23 @@ Convert a subset of columns to a specified type using :meth:`~DataFrame.astype`
 
 .. ipython:: python
 
-   df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
-   df[['a','b']] = df[['a','b']].astype(np.uint8)
-   df
-   df.dtypes
+   dft = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
+   dft[['a','b']] = dft[['a','b']].astype(np.uint8)
+   dft
+   dft.dtypes
 
 .. note::
 
    When trying to convert a subset of columns to a specified type using :meth:`~DataFrame.astype` and :meth:`~DataFrame.loc`, utilizing **:** as mask, upcasting occurs.
 
-   :meth:`~DataFrame.loc` tries to fit in what we are assigning to the current dtypes, while [ ] will overwrite them taking the dtype from the right hand side. Therefore the following piece of code produces the unintended result.
+   :meth:`~DataFrame.loc` tries to fit in what we are assigning to the current dtypes, while ``[]`` will overwrite them taking the dtype from the right hand side. Therefore the following piece of code produces the unintended result.
 
 .. ipython:: python
 
-   df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
-   df.loc[:, ['a', 'b']].astype(np.uint8).dtypes
-   df.loc[:, ['a', 'b']] = df.loc[:, ['a', 'b']].astype(np.uint8)
-   df.dtypes
+   dft = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
+   dft.loc[:, ['a', 'b']].astype(np.uint8).dtypes
+   dft.loc[:, ['a', 'b']] = dft.loc[:, ['a', 'b']].astype(np.uint8)
+   dft.dtypes
 
 object conversion
 ~~~~~~~~~~~~~~~~~

From c30209d3abf4c4669ee40c959c361be6e053fdb6 Mon Sep 17 00:00:00 2001
From: Amol
Date: Thu, 26 May 2016 22:41:31 +0530
Subject: [PATCH 5/6] DOC: Cleaning up

---
 doc/source/basics.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/basics.rst b/doc/source/basics.rst
index 428fb0d3c0f2e..c18816fa3f5e1 100644
--- a/doc/source/basics.rst
+++ b/doc/source/basics.rst
@@ -1737,7 +1737,7 @@ Convert a subset of columns to a specified type using :meth:`~DataFrame.astype`
 
 .. note::
 
-   When trying to convert a subset of columns to a specified type using :meth:`~DataFrame.astype` and :meth:`~DataFrame.loc`, utilizing **:** as mask, upcasting occurs.
+   When trying to convert a subset of columns to a specified type using :meth:`~DataFrame.astype` and :meth:`~DataFrame.loc`, upcasting occurs.
 
    :meth:`~DataFrame.loc` tries to fit in what we are assigning to the current dtypes, while ``[]`` will overwrite them taking the dtype from the right hand side. Therefore the following piece of code produces the unintended result.
 

From 035a17751a9fc2242933052152daf03bf8d566f8 Mon Sep 17 00:00:00 2001
From: Amol
Date: Thu, 26 May 2016 23:23:27 +0530
Subject: [PATCH 6/6] DOC: Final touches

---
 doc/source/basics.rst | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/doc/source/basics.rst b/doc/source/basics.rst
index c18816fa3f5e1..917d2f2bb8b04 100644
--- a/doc/source/basics.rst
+++ b/doc/source/basics.rst
@@ -1741,12 +1741,12 @@ Convert a subset of columns to a specified type using :meth:`~DataFrame.astype`
 
    :meth:`~DataFrame.loc` tries to fit in what we are assigning to the current dtypes, while ``[]`` will overwrite them taking the dtype from the right hand side. Therefore the following piece of code produces the unintended result.
 
-.. ipython:: python
+   .. ipython:: python
 
-   dft = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
-   dft.loc[:, ['a', 'b']].astype(np.uint8).dtypes
-   dft.loc[:, ['a', 'b']] = dft.loc[:, ['a', 'b']].astype(np.uint8)
-   dft.dtypes
+      dft = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})
+      dft.loc[:, ['a', 'b']].astype(np.uint8).dtypes
+      dft.loc[:, ['a', 'b']] = dft.loc[:, ['a', 'b']].astype(np.uint8)
+      dft.dtypes
 
 object conversion
 ~~~~~~~~~~~~~~~~~
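
For anyone reviewing the series outside a docs build, here is a minimal standalone sketch of the behaviour the note describes. It assumes pandas and NumPy are installed. The names `via_getitem`, `via_loc` and `original_dtype` are illustrative only and do not appear in the patches. The dtypes that `.loc` assignment leaves behind may differ between pandas versions, so the script prints them and does not assume a particular result.

```python
import numpy as np
import pandas as pd

# Same frame as the one used in the patched documentation.
dft = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})
original_dtype = dft['c'].dtype  # platform default integer dtype

# Assignment through [] replaces the selected columns, so the dtype
# of the right-hand side (uint8) is what ends up in the frame.
via_getitem = dft.copy()
via_getitem[['a', 'b']] = via_getitem[['a', 'b']].astype(np.uint8)
print(via_getitem.dtypes)

# Assignment through .loc writes the values into the existing columns.
# The note in the patch says the existing dtypes win here; the result
# is printed rather than asserted because it can vary by version.
via_loc = dft.copy()
via_loc.loc[:, ['a', 'b']] = via_loc.loc[:, ['a', 'b']].astype(np.uint8)
print(via_loc.dtypes)
```

Comparing the two printed `dtypes` listings shows whether the installed pandas still exhibits the pitfall documented by this series.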