From 6857e0a20d2001cf6585bb312b7c5667bc690f39 Mon Sep 17 00:00:00 2001
From: Patrick Hoefler <patrick_hoefler@gmx.net>
Date: Sat, 30 Sep 2023 00:13:41 +0100
Subject: [PATCH 1/5] DOC: Adjust user guide for CoW docs

---
 doc/source/user_guide/copy_on_write.rst | 128 ++++++++++++++----------
 1 file changed, 73 insertions(+), 55 deletions(-)

diff --git a/doc/source/user_guide/copy_on_write.rst b/doc/source/user_guide/copy_on_write.rst
index 59bdb1926895f..88acd0bd43b49 100644
--- a/doc/source/user_guide/copy_on_write.rst
+++ b/doc/source/user_guide/copy_on_write.rst
@@ -7,8 +7,8 @@ Copy-on-Write (CoW)
 *******************
 
 Copy-on-Write was first introduced in version 1.5.0. Starting from version 2.0 most of the
-optimizations that become possible through CoW are implemented and supported. A complete list
-can be found at :ref:`Copy-on-Write optimizations <copy_on_write.optimizations>`.
+optimizations that become possible through CoW are implemented and supported. All possible
+optimizations are supported starting from pandas 2.1.
 
 We expect that CoW will be enabled by default in version 3.0.
 
@@ -154,6 +154,75 @@ With copy on write this can be done by using ``loc``.
 
     df.loc[df["bar"] > 5, "foo"] = 100
 
+Read-only NumPy arrays
+----------------------
+
+Accessing the underlying NumPy array of a DataFrame will return a read-only array if the array
+shares data with the initial DataFrame:
+
+The array is a copy if the initial DataFrame consists of more than one array:
+
+
+.. ipython:: python
+
+    df = pd.DataFrame({"a": [1, 2], "b": [1.5, 2.5]})
+    df.to_numpy()
+
+The array shares data with the DataFrame if the DataFrame consists of only one NumPy array:
+
+.. ipython:: python
+
+    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
+    df.to_numpy()
+
+This array is read-only, which means that it can't be modified inplace:
+
+.. ipython:: python
+    :okexcept:
+
+    arr = df.to_numpy()
+    arr[0, 0] = 100
+
+There are two potential solution to this:
+
+- Trigger a copy manually if you want to avoid updating DataFrames that share memory with your array.
+- Make the array writeable. This is a more performant solution but circumvents Copy-on-Write rules, so
+  it should be used with caution.
+
+.. ipython:: python
+
+    arr = df.to_numpy()
+    arr.flags.writeable = True
+    arr[0, 0] = 100
+    arr
+
+Patterns to avoid
+-----------------
+
+No defensiv copy will be performed if two objects share the same data while
+you are modifying one object inplace.
+
+.. ipython:: python
+
+    df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
+    df2 = df.reset_index()
+    df2.iloc[0, 0] = 100
+
+This creates two objects that share data and thus the setitem operation will trigger a
+copy. This is not necessary if you the initial object ``df`` isn't needed anymore.
+Simply reassigning to the same variable will invalidate the reference that is
+held by the object.
+
+.. ipython:: python
+
+    df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
+    df = df.reset_index()
+    df.iloc[0, 0] = 100
+
+No copy is necessary in this example.
+Creating multiple references in the same method keeps unnecessary references alive
+and thus will hurt performance with Copy-on-Write.
+
 .. _copy_on_write.optimizations:
 
 Copy-on-Write optimizations
@@ -161,59 +230,8 @@ Copy-on-Write optimizations
 
 A new lazy copy mechanism that defers the copy until the object in question is modified
 and only if this object shares data with another object. This mechanism was added to
-following methods:
-
-  - :meth:`DataFrame.reset_index` / :meth:`Series.reset_index`
-  - :meth:`DataFrame.set_index`
-  - :meth:`DataFrame.set_axis` / :meth:`Series.set_axis`
-  - :meth:`DataFrame.set_flags` / :meth:`Series.set_flags`
-  - :meth:`DataFrame.rename_axis` / :meth:`Series.rename_axis`
-  - :meth:`DataFrame.reindex` / :meth:`Series.reindex`
-  - :meth:`DataFrame.reindex_like` / :meth:`Series.reindex_like`
-  - :meth:`DataFrame.assign`
-  - :meth:`DataFrame.drop`
-  - :meth:`DataFrame.dropna` / :meth:`Series.dropna`
-  - :meth:`DataFrame.select_dtypes`
-  - :meth:`DataFrame.align` / :meth:`Series.align`
-  - :meth:`Series.to_frame`
-  - :meth:`DataFrame.rename` / :meth:`Series.rename`
-  - :meth:`DataFrame.add_prefix` / :meth:`Series.add_prefix`
-  - :meth:`DataFrame.add_suffix` / :meth:`Series.add_suffix`
-  - :meth:`DataFrame.drop_duplicates` / :meth:`Series.drop_duplicates`
-  - :meth:`DataFrame.droplevel` / :meth:`Series.droplevel`
-  - :meth:`DataFrame.reorder_levels` / :meth:`Series.reorder_levels`
-  - :meth:`DataFrame.between_time` / :meth:`Series.between_time`
-  - :meth:`DataFrame.filter` / :meth:`Series.filter`
-  - :meth:`DataFrame.head` / :meth:`Series.head`
-  - :meth:`DataFrame.tail` / :meth:`Series.tail`
-  - :meth:`DataFrame.isetitem`
-  - :meth:`DataFrame.pipe` / :meth:`Series.pipe`
-  - :meth:`DataFrame.pop` / :meth:`Series.pop`
-  - :meth:`DataFrame.replace` / :meth:`Series.replace`
-  - :meth:`DataFrame.shift` / :meth:`Series.shift`
-  - :meth:`DataFrame.sort_index` / :meth:`Series.sort_index`
-  - :meth:`DataFrame.sort_values` / :meth:`Series.sort_values`
-  - :meth:`DataFrame.squeeze` / :meth:`Series.squeeze`
-  - :meth:`DataFrame.swapaxes`
-  - :meth:`DataFrame.swaplevel` / :meth:`Series.swaplevel`
-  - :meth:`DataFrame.take` / :meth:`Series.take`
-  - :meth:`DataFrame.to_timestamp` / :meth:`Series.to_timestamp`
-  - :meth:`DataFrame.to_period` / :meth:`Series.to_period`
-  - :meth:`DataFrame.truncate`
-  - :meth:`DataFrame.iterrows`
-  - :meth:`DataFrame.tz_convert` / :meth:`Series.tz_localize`
-  - :meth:`DataFrame.fillna` / :meth:`Series.fillna`
-  - :meth:`DataFrame.interpolate` / :meth:`Series.interpolate`
-  - :meth:`DataFrame.ffill` / :meth:`Series.ffill`
-  - :meth:`DataFrame.bfill` / :meth:`Series.bfill`
-  - :meth:`DataFrame.where` / :meth:`Series.where`
-  - :meth:`DataFrame.infer_objects` / :meth:`Series.infer_objects`
-  - :meth:`DataFrame.astype` / :meth:`Series.astype`
-  - :meth:`DataFrame.convert_dtypes` / :meth:`Series.convert_dtypes`
-  - :meth:`DataFrame.join`
-  - :meth:`DataFrame.eval`
-  - :func:`concat`
-  - :func:`merge`
+methods that don't require a copy of the underlying data. Popular examples are :meth:`DataFrame.drop` for ``axis=1``
+and :meth:`DataFrame.rename`.
 
 These methods return views when Copy-on-Write is enabled, which provides a significant
 performance improvement compared to the regular execution.

From 70c8846d286419be887201da94de86f18cd8ab50 Mon Sep 17 00:00:00 2001
From: Patrick Hoefler <61934744+phofl@users.noreply.github.com>
Date: Tue, 3 Oct 2023 11:49:28 +0200
Subject: [PATCH 2/5] Update doc/source/user_guide/copy_on_write.rst

Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
---
 doc/source/user_guide/copy_on_write.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/user_guide/copy_on_write.rst b/doc/source/user_guide/copy_on_write.rst
index 88acd0bd43b49..4f5bac15db8f8 100644
--- a/doc/source/user_guide/copy_on_write.rst
+++ b/doc/source/user_guide/copy_on_write.rst
@@ -199,7 +199,7 @@ There are two potential solution to this:
 Patterns to avoid
 -----------------
 
-No defensiv copy will be performed if two objects share the same data while
+No defensive copy will be performed if two objects share the same data while
 you are modifying one object inplace.
 
 .. ipython:: python

From 1b5396dc06fb98488750830828c076333573122a Mon Sep 17 00:00:00 2001
From: Patrick Hoefler <61934744+phofl@users.noreply.github.com>
Date: Tue, 3 Oct 2023 11:49:41 +0200
Subject: [PATCH 3/5] Update doc/source/user_guide/copy_on_write.rst

Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
---
 doc/source/user_guide/copy_on_write.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/user_guide/copy_on_write.rst b/doc/source/user_guide/copy_on_write.rst
index 4f5bac15db8f8..1bf4e35c89cfd 100644
--- a/doc/source/user_guide/copy_on_write.rst
+++ b/doc/source/user_guide/copy_on_write.rst
@@ -209,7 +209,7 @@ you are modifying one object inplace.
     df2.iloc[0, 0] = 100
 
 This creates two objects that share data and thus the setitem operation will trigger a
-copy. This is not necessary if you the initial object ``df`` isn't needed anymore.
+copy. This is not necessary if the initial object ``df`` isn't needed anymore.
 Simply reassigning to the same variable will invalidate the reference that is
 held by the object.
 

From 521a4989c5362481db3a7156b02986fd9bbf7145 Mon Sep 17 00:00:00 2001
From: Patrick Hoefler <61934744+phofl@users.noreply.github.com>
Date: Tue, 3 Oct 2023 11:50:13 +0200
Subject: [PATCH 4/5] Update copy_on_write.rst

---
 doc/source/user_guide/copy_on_write.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/user_guide/copy_on_write.rst b/doc/source/user_guide/copy_on_write.rst
index 1bf4e35c89cfd..94545015b9492 100644
--- a/doc/source/user_guide/copy_on_write.rst
+++ b/doc/source/user_guide/copy_on_write.rst
@@ -220,7 +220,7 @@ held by the object.
     df.iloc[0, 0] = 100
 
 No copy is necessary in this example.
-Creating multiple references in the same method keeps unnecessary references alive
+Creating multiple references keeps unnecessary references alive
 and thus will hurt performance with Copy-on-Write.
 
 .. _copy_on_write.optimizations:

From cf9307329a6b40c320bed622c268738f3bc63ad6 Mon Sep 17 00:00:00 2001
From: Patrick Hoefler <61934744+phofl@users.noreply.github.com>
Date: Tue, 3 Oct 2023 11:51:22 +0200
Subject: [PATCH 5/5] Update copy_on_write.rst

---
 doc/source/user_guide/copy_on_write.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/doc/source/user_guide/copy_on_write.rst b/doc/source/user_guide/copy_on_write.rst
index 94545015b9492..d0c57b56585db 100644
--- a/doc/source/user_guide/copy_on_write.rst
+++ b/doc/source/user_guide/copy_on_write.rst
@@ -183,6 +183,8 @@ This array is read-only, which means that it can't be modified inplace:
     arr = df.to_numpy()
     arr[0, 0] = 100
 
+The same holds true for a Series, since a Series always consists of a single array.
+
 There are two potential solution to this:
 
 - Trigger a copy manually if you want to avoid updating DataFrames that share memory with your array.