Commit 98ce484

author
datajanko
committed
populates dsintro and frame.py with examples and warning
- adds example to frame.py
- reworked warning in dsintro
- reworked Notes in frame.py
- additional fixups
1 parent 08ebc18 commit 98ce484

3 files changed

+76
-32
lines changed

doc/source/dsintro.rst

Lines changed: 34 additions & 2 deletions
@@ -505,9 +505,41 @@ of one argument to be called on the ``DataFrame``. A *copy* of the original
505505
DataFrame is returned, with the new values inserted.
506506

507507
.. warning::
508+
Starting with Python 3.6, ``**kwargs`` preserves the order of keyword arguments,
509+
and ``assign`` respects that order. Later arguments may therefore refer to earlier ones:
508510

509-
Since the function signature of ``assign`` is ``**kwargs``, a dictionary,
510-
the order of the new columns in the resulting DataFrame cannot be guaranteed
511+
.. ipython::
512+
:verbatim:
513+
514+
In [1]: # Allowed for Python 3.6 and later
515+
df.assign(C = lambda x: x['A'] + x['B'],
516+
D = lambda x: x['A'] + x['C'])
517+
518+
This may subtly change the behavior of your code when you're
519+
using ``.assign()`` to update an existing column. Prior to Python 3.6,
520+
callables referring to other columns being updated would get the "old" values.
521+
522+
Previous Behaviour:
523+
524+
.. code-block:: ipython
525+
526+
In [2]: df = pd.DataFrame({"A": [1, 2, 3]})
527+
528+
In [3]: df.assign(A=lambda df: df.A + 1, C=lambda df: df.A * -1)
529+
Out[3]:
530+
A C
531+
0 2 -1
532+
1 3 -2
533+
2 4 -3
534+
535+
New Behaviour:
536+
537+
.. ipython:: python
538+
539+
df.assign(A=df.A+1, C= lambda df: df.A* -1)
540+
541+
For Python 3.5 and earlier, the function signature of ``assign`` is ``**kwargs``,
542+
a dictionary, so the order of the new columns in the resulting DataFrame cannot be guaranteed
511543
to match the order you pass in. To make things predictable, items are inserted
512544
alphabetically (by key) at the end of the DataFrame.
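
The behaviour described in the reworked warning can be checked with a short script. This is a minimal sketch, assuming Python 3.6 or later and pandas 0.23.0 or later; the ``B`` column and its values are made up for illustration and do not come from the commit.

```python
import pandas as pd

# Illustrative frame; column B is an assumption, not taken from the diff.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# D's callable reads C, which the same call created one argument earlier.
chained = df.assign(
    C=lambda x: x["A"] + x["B"],
    D=lambda x: x["A"] + x["C"],
)

# Updating an existing column: C's callable sees the already-updated A.
updated = df.assign(A=lambda d: d["A"] + 1, C=lambda d: d["A"] * -1)

print(chained)
print(updated)
```

With ordered keyword arguments, ``updated["C"]`` is computed from the incremented ``A``, so it should hold -2, -3, -4 rather than the -1, -2, -3 shown under "Previous Behaviour". ``df`` itself is left unchanged because ``assign`` works on a copy.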
513545

doc/source/whatsnew/v0.23.0.txt

Lines changed: 18 additions & 18 deletions
@@ -125,39 +125,39 @@ Current Behavior
125125
``.assign()`` accepts dependent arguments
126126
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
127127

128-
The :func:`DataFrame.assign()` now accepts dependent kwargs for python version later than 3.6 (see also `PEP 468
129-
<https://www.python.org/dev/peps/pep-0468/>`_). Now the keyword-value pairs passed to `.assign()` may depend on their predecessors if the values are callables. (:issue:`14207`)
128+
:func:`DataFrame.assign` now accepts dependent keyword arguments for Python 3.6 and later (see also `PEP 468
129+
<https://www.python.org/dev/peps/pep-0468/>`_). Later keyword arguments may now refer to earlier ones if the argument is a callable. (:issue:`14207`)
130130

131131
.. ipython:: python
132132

133133
df = pd.DataFrame({'A': [1, 2, 3]})
134134
df
135-
df.assign(B=df.A, C=lambda x:x['A']+ x['B'])
135+
df.assign(B=df.A, C=lambda x:x['A']+ x['B'])
136136

137137
.. warning::
138138

139-
This may subtly change the behavior of your code when you're
140-
using ``.assign()`` to update an existing column. Previously, callables
141-
referring to other variables being updated would get the "old" values
139+
This may subtly change the behavior of your code when you're
140+
using ``.assign()`` to update an existing column. Previously, callables
141+
referring to other variables being updated would get the "old" values
142142

143-
Previous Behaviour:
143+
Previous Behaviour:
144144

145-
.. code-block:: ipython
145+
.. code-block:: ipython
146146

147-
In [2]: df = pd.DataFrame({"A": [1, 2, 3]})
147+
In [2]: df = pd.DataFrame({"A": [1, 2, 3]})
148148

149-
In [3]: df.assign(A=lambda df: df.A + 1, C=lambda df: df.A * -1)
150-
Out[3]:
151-
A C
152-
0 2 -1
153-
1 3 -2
154-
2 4 -3
149+
In [3]: df.assign(A=lambda df: df.A + 1, C=lambda df: df.A * -1)
150+
Out[3]:
151+
A C
152+
0 2 -1
153+
1 3 -2
154+
2 4 -3
155155

156-
New Behaviour:
156+
New Behaviour:
157157

158-
.. ipython:: python
158+
.. ipython:: python
159159

160-
df.assign(A=df.A+1, C= lambda df: df.A* -1)
160+
df.assign(A=df.A+1, C= lambda df: df.A* -1)
161161

162162

163163
.. _whatsnew_0230.enhancements.other:
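
The whatsnew entry relies on the guarantee from PEP 468 that ``**kwargs`` keeps the call-site order. A minimal standard-library sketch of that guarantee follows; the helper name ``keyword_order`` is hypothetical and is not part of pandas.

```python
import sys


def keyword_order(**kwargs):
    """Return the keyword names in the order the function received them."""
    return list(kwargs)


# On Python 3.6+ the names come back in call order, not alphabetical order.
order = keyword_order(B=1, A=2, C=3)
print(sys.version_info[:2], order)
```

On Python 3.5 and earlier the returned order is not guaranteed, which is why ``assign`` falls back to sorting by key there.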

pandas/core/frame.py

Lines changed: 24 additions & 12 deletions
@@ -2655,15 +2655,17 @@ def assign(self, **kwargs):
26552655
26562656
Notes
26572657
-----
2658-
For python 3.6 and above, the columns are inserted in the order of
2659-
\*\*kwargs. For python 3.5 and earlier, since \*\*kwargs is unordered,
2660-
the columns are inserted in alphabetical order at the end of your
2661-
DataFrame. Assigning multiple columns within the same ``assign``
2662-
is possible, but for python 3.5 and earlier, you cannot reference
2663-
other columns created within the same ``assign`` call.
2664-
For python 3.6 and above it is possible to reference columns created
2665-
in an assignment. To this end you have to respect the order of kwargs
2666-
and use callables referencing the assigned columns.
2658+
Assigning multiple columns within the same ``assign`` is possible.
2659+
For Python 3.6 and above, later items in '\*\*kwargs' may refer to
2660+
newly created or modified columns in 'df'; items are computed and
2661+
assigned into 'df' in order. For Python 3.5 and below, the order of
2662+
keyword arguments is not specified, so you cannot refer to newly created
2663+
or modified columns. All items are computed first, and then assigned
2664+
in alphabetical order.
2665+
2666+
.. versionchanged:: 0.23.0
2667+
2668+
Keyword argument order is maintained for Python 3.6 and later.
26672669
26682670
Examples
26692671
--------
@@ -2699,20 +2701,30 @@ def assign(self, **kwargs):
26992701
7 8 -1.495604 2.079442
27002702
8 9 0.549296 2.197225
27012703
9 10 -0.758542 2.302585
2704+
2705+
Where the keyword arguments depend on each other
2706+
2707+
>>> df = pd.DataFrame({'A': [1, 2, 3]})
2708+
2709+
>>> df.assign(B=df.A, C=lambda x:x['A']+ x['B'])
2710+
A B C
2711+
0 1 1 2
2712+
1 2 2 4
2713+
2 3 3 6
27022714
"""
27032715
data = self.copy()
27042716

2705-
# for 3.6 preserve order of kwargs
2717+
# >= 3.6 preserve order of kwargs
27062718
if PY36:
27072719
for k, v in kwargs.items():
27082720
data[k] = com._apply_if_callable(v, data)
27092721
else:
2710-
# for 3.5 or earlier: do all calculations first...
2722+
# <= 3.5: do all calculations first...
27112723
results = OrderedDict()
27122724
for k, v in kwargs.items():
27132725
results[k] = com._apply_if_callable(v, data)
27142726

2715-
# sort by key for 3.5 and earlier
2727+
# <= 3.5: sort by key
27162728
results = sorted(results.items())
27172729
# ... and then assign
27182730
for k, v in results:
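
The two branches in this hunk differ only in when values are computed relative to when they are assigned. The sketch below mimics that difference on a plain dict of columns so it runs without pandas. ``assign_sketch``, ``apply_if_callable`` and the ``ordered`` flag are hypothetical stand-ins for illustration, not the pandas implementation.

```python
from collections import OrderedDict


def apply_if_callable(value, data):
    """Call value with data if it is callable, else return it unchanged."""
    return value(data) if callable(value) else value


def assign_sketch(data, ordered, **kwargs):
    """Hypothetical stand-in for DataFrame.assign on a dict of columns.

    ordered=True mimics the Python >= 3.6 branch; ordered=False mimics
    the Python <= 3.5 branch.
    """
    result = OrderedDict(data)  # shallow copy, so the input is not mutated
    if ordered:
        # Compute and assign one item at a time, so later callables
        # see columns written by earlier keyword arguments.
        for key, value in kwargs.items():
            result[key] = apply_if_callable(value, result)
    else:
        # Compute everything against the unmodified copy first ...
        computed = {k: apply_if_callable(v, result) for k, v in kwargs.items()}
        # ... then assign in alphabetical key order.
        for key, value in sorted(computed.items()):
            result[key] = value
    return result


cols = {"A": [1, 2, 3]}
updates = dict(
    A=lambda d: [a + 1 for a in d["A"]],
    C=lambda d: [-a for a in d["A"]],
)
new = assign_sketch(cols, True, **updates)
old = assign_sketch(cols, False, **updates)
print(dict(new))  # {'A': [2, 3, 4], 'C': [-2, -3, -4]}
print(dict(old))  # {'A': [2, 3, 4], 'C': [-1, -2, -3]}
```

The ``old`` result matches the "Previous Behaviour" output in the docs changes above: ``C`` is derived from the original ``A`` because nothing is assigned until every value has been computed.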
