@@ -78,7 +78,7 @@ some configurable handling of "what to do with the other axes":
 ::

     pd.concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False,
-           keys=None, levels=None, names=None, verify_integrity=False)
+              keys=None, levels=None, names=None, verify_integrity=False)

 - ``objs``: a sequence or mapping of Series, DataFrame, or Panel objects. If a
   dict is passed, the sorted keys will be used as the `keys` argument, unless
@@ -510,48 +510,45 @@ standard database join operations between DataFrame objects:

 ::

-    merge(left, right, how='inner', on=None, left_on=None, right_on=None,
-          left_index=False, right_index=False, sort=True,
-          suffixes=('_x', '_y'), copy=True, indicator=False)
-
-Here's a description of what each argument is for:
-
-  - ``left``: A DataFrame object
-  - ``right``: Another DataFrame object
-  - ``on``: Columns (names) to join on. Must be found in both the left and
-    right DataFrame objects. If not passed and ``left_index`` and
-    ``right_index`` are ``False``, the intersection of the columns in the
-    DataFrames will be inferred to be the join keys
-  - ``left_on``: Columns from the left DataFrame to use as keys. Can either be
-    column names or arrays with length equal to the length of the DataFrame
-  - ``right_on``: Columns from the right DataFrame to use as keys. Can either be
-    column names or arrays with length equal to the length of the DataFrame
-  - ``left_index``: If ``True``, use the index (row labels) from the left
-    DataFrame as its join key(s). In the case of a DataFrame with a MultiIndex
-    (hierarchical), the number of levels must match the number of join keys
-    from the right DataFrame
-  - ``right_index``: Same usage as ``left_index`` for the right DataFrame
-  - ``how``: One of ``'left'``, ``'right'``, ``'outer'``, ``'inner'``. Defaults
-    to ``inner``. See below for more detailed description of each method
-  - ``sort``: Sort the result DataFrame by the join keys in lexicographical
-    order. Defaults to ``True``, setting to ``False`` will improve performance
-    substantially in many cases
-  - ``suffixes``: A tuple of string suffixes to apply to overlapping
-    columns. Defaults to ``('_x', '_y')``.
-  - ``copy``: Always copy data (default ``True``) from the passed DataFrame
-    objects, even when reindexing is not necessary. Cannot be avoided in many
-    cases but may improve performance / memory usage. The cases where copying
-    can be avoided are somewhat pathological but this option is provided
-    nonetheless.
-  - ``indicator``: Add a column to the output DataFrame called ``_merge``
-    with information on the source of each row. ``_merge`` is Categorical-type
-    and takes on a value of ``left_only`` for observations whose merge key
-    only appears in ``'left'`` DataFrame, ``right_only`` for observations whose
-    merge key only appears in ``'right'`` DataFrame, and ``both`` if the
-    observation's merge key is found in both.
-
-    .. versionadded:: 0.17.0
-
+    pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None,
+             left_index=False, right_index=False, sort=True,
+             suffixes=('_x', '_y'), copy=True, indicator=False)
+
+- ``left``: A DataFrame object
+- ``right``: Another DataFrame object
+- ``on``: Columns (names) to join on. Must be found in both the left and
+  right DataFrame objects. If not passed and ``left_index`` and
+  ``right_index`` are ``False``, the intersection of the columns in the
+  DataFrames will be inferred to be the join keys
+- ``left_on``: Columns from the left DataFrame to use as keys. Can either be
+  column names or arrays with length equal to the length of the DataFrame
+- ``right_on``: Columns from the right DataFrame to use as keys. Can either be
+  column names or arrays with length equal to the length of the DataFrame
+- ``left_index``: If ``True``, use the index (row labels) from the left
+  DataFrame as its join key(s). In the case of a DataFrame with a MultiIndex
+  (hierarchical), the number of levels must match the number of join keys
+  from the right DataFrame
+- ``right_index``: Same usage as ``left_index`` for the right DataFrame
+- ``how``: One of ``'left'``, ``'right'``, ``'outer'``, ``'inner'``. Defaults
+  to ``inner``. See below for more detailed description of each method
+- ``sort``: Sort the result DataFrame by the join keys in lexicographical
+  order. Defaults to ``True``, setting to ``False`` will improve performance
+  substantially in many cases
+- ``suffixes``: A tuple of string suffixes to apply to overlapping
+  columns. Defaults to ``('_x', '_y')``.
+- ``copy``: Always copy data (default ``True``) from the passed DataFrame
+  objects, even when reindexing is not necessary. Cannot be avoided in many
+  cases but may improve performance / memory usage. The cases where copying
+  can be avoided are somewhat pathological but this option is provided
+  nonetheless.
+- ``indicator``: Add a column to the output DataFrame called ``_merge``
+  with information on the source of each row. ``_merge`` is Categorical-type
+  and takes on a value of ``left_only`` for observations whose merge key
+  only appears in ``'left'`` DataFrame, ``right_only`` for observations whose
+  merge key only appears in ``'right'`` DataFrame, and ``both`` if the
+  observation's merge key is found in both.
+
+  .. versionadded:: 0.17.0

 The return type will be the same as ``left``. If ``left`` is a ``DataFrame``
 and ``right`` is a subclass of DataFrame, the return type will still be
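
As a rough illustration of how several of the arguments documented in this hunk interact, the following sketch merges two small frames. The frames and their column names (``key``, ``value``) are hypothetical and chosen only for the example; they do not come from the documentation being changed.

```python
import pandas as pd

# Hypothetical frames for illustration; the column names are assumptions.
left = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
                     'value': [1, 2, 3]})
right = pd.DataFrame({'key': ['K1', 'K2', 'K3'],
                      'value': [20, 30, 40]})

# ``on`` names the shared key column; ``how='outer'`` keeps keys from both sides.
# ``suffixes`` disambiguates the overlapping ``value`` column, and
# ``indicator=True`` adds a ``_merge`` column recording each row's source.
result = pd.merge(left, right, on='key', how='outer',
                  suffixes=('_left', '_right'), indicator=True)
print(result)
```

Keys found only in ``left`` or only in ``right`` get missing values in the other side's columns, and ``_merge`` marks them as ``left_only`` or ``right_only``.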
@@ -573,11 +570,11 @@ terminology used to describe join operations between two SQL-table like
 structures (DataFrame objects). There are several cases to consider which are
 very important to understand:

-  - **one-to-one** joins: for example when joining two DataFrame objects on
-    their indexes (which must contain unique values)
-  - **many-to-one** joins: for example when joining an index (unique) to one or
-    more columns in a DataFrame
-  - **many-to-many** joins: joining columns on columns.
+- **one-to-one** joins: for example when joining two DataFrame objects on
+  their indexes (which must contain unique values)
+- **many-to-one** joins: for example when joining an index (unique) to one or
+  more columns in a DataFrame
+- **many-to-many** joins: joining columns on columns.

 .. note::

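
The three join cases listed in this hunk can be sketched with small frames. This is a minimal illustration under assumed data: every frame and column name below is hypothetical, and the row counts simply follow from how the keys repeat.

```python
import pandas as pd

# One-to-one: both indexes are unique, so each key matches at most one row.
a = pd.DataFrame({'x': [1, 2]}, index=['K0', 'K1'])
b = pd.DataFrame({'y': [10, 20]}, index=['K0', 'K1'])
one_to_one = pd.merge(a, b, left_index=True, right_index=True)

# Many-to-one: a repeating column on the left joined to a unique index on the right.
orders = pd.DataFrame({'key': ['K0', 'K0', 'K1'], 'qty': [1, 2, 3]})
lookup = pd.DataFrame({'name': ['zero', 'one']}, index=['K0', 'K1'])
many_to_one = pd.merge(orders, lookup, left_on='key', right_index=True)

# Many-to-many: duplicate keys on both sides pair up every left row
# with every right row that shares the key.
lhs = pd.DataFrame({'key': ['K0', 'K0'], 'lv': [1, 2]})
rhs = pd.DataFrame({'key': ['K0', 'K0', 'K0'], 'rv': [4, 5, 6]})
many_to_many = pd.merge(lhs, rhs, on='key')

print(len(one_to_one), len(many_to_one), len(many_to_many))
```

In the many-to-many case the two left rows and three right rows for ``K0`` combine into six result rows, which is why duplicated keys on both sides can make a result grow quickly.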
@@ -714,15 +711,15 @@ The merge indicator

 .. ipython:: python

-   df1 = DataFrame({'col1':[0,1], 'col_left':['a','b']})
-   df2 = DataFrame({'col1':[1,2,2],'col_right':[2,2,2]})
-   merge(df1, df2, on='col1', how='outer', indicator=True)
+   df1 = pd.DataFrame({'col1': [0, 1], 'col_left':['a', 'b']})
+   df2 = pd.DataFrame({'col1': [1, 2, 2],'col_right':[2, 2, 2]})
+   pd.merge(df1, df2, on='col1', how='outer', indicator=True)

 The ``indicator`` argument will also accept string arguments, in which case the indicator function will use the value of the passed string as the name for the indicator column.

 .. ipython:: python

-   merge(df1, df2, on='col1', how='outer', indicator='indicator_column')
+   pd.merge(df1, df2, on='col1', how='outer', indicator='indicator_column')


 .. _merging.join.index:
@@ -924,7 +921,7 @@ a level name of the multi-indexed frame.

    left = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                         'B': ['B0', 'B1', 'B2']},
-                       index=Index(['K0', 'K1', 'K2'], name='key'))
+                       index=pd.Index(['K0', 'K1', 'K2'], name='key'))

    index = pd.MultiIndex.from_tuples([('K0', 'Y0'), ('K1', 'Y1'),
                                       ('K2', 'Y2'), ('K2', 'Y3')],
@@ -1116,28 +1113,20 @@ Timeseries friendly merging
 Merging Ordered Data
 ~~~~~~~~~~~~~~~~~~~~

-The ``pd.merge_ordered()`` function allows combining time series and other
+A :func:`pd.merge_ordered` function allows combining time series and other
 ordered data. In particular it has an optional ``fill_method`` keyword to
 fill/interpolate missing data:

 .. ipython:: python

-   left = DataFrame({'k': ['K0', 'K1', 'K1', 'K2'],
-                     'lv': [1, 2, 3, 4],
-                     's': ['a', 'b', 'c', 'd']})
-
-   right = DataFrame({'k': ['K1', 'K2', 'K4'],
-                      'rv': [1, 2, 3]})
+   left = pd.DataFrame({'k': ['K0', 'K1', 'K1', 'K2'],
+                        'lv': [1, 2, 3, 4],
+                        's': ['a', 'b', 'c', 'd']})

-   result = pd.merge_ordered(left, right, fill_method='ffill', left_by='s')
-
-.. ipython:: python
-   :suppress:
+   right = pd.DataFrame({'k': ['K1', 'K2', 'K4'],
+                         'rv': [1, 2, 3]})

-   @savefig merging_ordered_merge.png
-   p.plot([left, right], result,
-          labels=['left', 'right'], vertical=True);
-   plt.close('all');
+   pd.merge_ordered(left, right, fill_method='ffill', left_by='s')

 .. _merging.merge_asof:

@@ -1146,12 +1135,7 @@ Merging AsOf

 .. versionadded:: 0.18.2

-An ``pd.merge_asof()`` this is similar to an ordered left-join except that we
-match on nearest key rather than equal keys.
-
-For each row in the ``left`` DataFrame, we select the last row in the ``right``
-DataFrame whose ``on`` key is less than the left's key. Both DataFrames must
-be sorted by the key.
+A :func:`pd.merge_asof` is similar to an ordered left-join except that we match on nearest key rather than equal keys. For each row in the ``left`` DataFrame, we select the last row in the ``right`` DataFrame whose ``on`` key is less than the left's key. Both DataFrames must be sorted by the key.

 Optionally an asof merge can perform a group-wise merge. This matches the ``by`` key equally,
 in addition to the nearest match on the ``on`` key.
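
A small sketch of the group-wise asof merge described in this hunk. The ``trades`` and ``quotes`` frames and their columns are hypothetical, made up for the example. One caveat to the "less than" wording: in current pandas releases an equal key also counts as a match by default, and passing ``allow_exact_matches=False`` gives the strictly-less-than behaviour.

```python
import pandas as pd

# Hypothetical data; both frames are already sorted by the ``on`` key (``time``).
trades = pd.DataFrame({'time': [1, 5, 7],
                       'ticker': ['A', 'B', 'A'],
                       'qty': [100, 200, 300]})
quotes = pd.DataFrame({'time': [1, 2, 3, 7],
                       'ticker': ['B', 'A', 'B', 'A'],
                       'bid': [9.0, 1.0, 9.5, 1.5]})

# ``by='ticker'`` requires an equal ticker; within each ticker the most recent
# quote at or before the trade time is selected.
result = pd.merge_asof(trades, quotes, on='time', by='ticker')

# Excluding exact matches makes the last trade fall back to the earlier quote.
strict = pd.merge_asof(trades, quotes, on='time', by='ticker',
                       allow_exact_matches=False)

print(result)
print(strict)
```

The first trade has no earlier quote for ticker ``A``, so its ``bid`` is missing in both results. The last trade picks up the quote at ``time == 7`` only when exact matches are allowed.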