Commit 6fb3760

Merge remote-tracking branch 'upstream/master'

Author: AntonioAndraues
2 parents e74e5cb + 46e89b0, commit 6fb3760

37 files changed: +428 -214 lines changed

ci/setup_env.sh

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@ echo
 echo "update conda"
 conda config --set ssl_verify false
 conda config --set quiet true --set always_yes true --set changeps1 false
+conda install pip  # create conda to create a historical artifact for pip & setuptools
 conda update -n base conda

 echo "conda info -a"

doc/source/user_guide/advanced.rst

Lines changed: 25 additions & 11 deletions
@@ -783,27 +783,41 @@ values **not** in the categories, similarly to how you can reindex **any** panda

 .. ipython:: python

-   df2.reindex(['a', 'e'])
-   df2.reindex(['a', 'e']).index
-   df2.reindex(pd.Categorical(['a', 'e'], categories=list('abcde')))
-   df2.reindex(pd.Categorical(['a', 'e'], categories=list('abcde'))).index
+   df3 = pd.DataFrame({'A': np.arange(3),
+                       'B': pd.Series(list('abc')).astype('category')})
+   df3 = df3.set_index('B')
+   df3
+
+.. ipython:: python
+
+   df3.reindex(['a', 'e'])
+   df3.reindex(['a', 'e']).index
+   df3.reindex(pd.Categorical(['a', 'e'], categories=list('abe')))
+   df3.reindex(pd.Categorical(['a', 'e'], categories=list('abe'))).index

 .. warning::

    Reshaping and Comparison operations on a ``CategoricalIndex`` must have the same categories
    or a ``TypeError`` will be raised.

-   .. code-block:: ipython
+   .. ipython:: python

-      In [9]: df3 = pd.DataFrame({'A': np.arange(6), 'B': pd.Series(list('aabbca')).astype('category')})
+      df4 = pd.DataFrame({'A': np.arange(2),
+                          'B': list('ba')})
+      df4['B'] = df4['B'].astype(CategoricalDtype(list('ab')))
+      df4 = df4.set_index('B')
+      df4.index

-      In [11]: df3 = df3.set_index('B')
+      df5 = pd.DataFrame({'A': np.arange(2),
+                          'B': list('bc')})
+      df5['B'] = df5['B'].astype(CategoricalDtype(list('bc')))
+      df5 = df5.set_index('B')
+      df5.index

-      In [11]: df3.index
-      Out[11]: CategoricalIndex(['a', 'a', 'b', 'b', 'c', 'a'], categories=['a', 'b', 'c'], ordered=False, name='B', dtype='category')
+   .. code-block:: ipython

-      In [12]: pd.concat([df2, df3])
-      TypeError: categories must match existing categories when appending
+      In [1]: pd.concat([df4, df5])
+      TypeError: categories must match existing categories when appending

 .. _indexing.rangeindex:

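Note on the ``advanced.rst`` change: the new ``df3`` example relies on ``reindex`` returning an index whose type follows the type of the target. The following is a minimal standalone sketch of that behaviour, assuming a pandas version of 1.0 or later; the variable names ``by_list`` and ``by_categorical`` are illustrative and do not appear in the commit.

```python
import numpy as np
import pandas as pd

# Same frame as df3 in the hunk above: a CategoricalIndex named 'B'.
df3 = pd.DataFrame({'A': np.arange(3),
                    'B': pd.Series(list('abc')).astype('category')})
df3 = df3.set_index('B')

# A plain list target gives back a plain Index. 'e' is not one of the
# existing categories, so its row is filled with NaN.
by_list = df3.reindex(['a', 'e'])

# A Categorical target gives back a CategoricalIndex that carries the
# categories of the target, not those of df3.
target = pd.Categorical(['a', 'e'], categories=list('abe'))
by_categorical = df3.reindex(target)

print(type(by_list.index).__name__)
print(type(by_categorical.index).__name__,
      list(by_categorical.index.categories))
```

The sketch deliberately leaves out the ``pd.concat([df4, df5])`` call from the warning, since whether it raises may depend on the pandas version in use.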
doc/source/user_guide/io.rst

Lines changed: 33 additions & 1 deletion
@@ -3811,6 +3811,8 @@ storing/selecting from homogeneous index ``DataFrames``.
    # the levels are automatically included as data columns
    store.select('df_mi', 'foo=bar')

+.. note::
+   The ``index`` keyword is reserved and cannot be used as a level name.

 .. _io.hdf5-query:

@@ -3829,6 +3831,7 @@ A query is specified using the ``Term`` class under the hood, as a boolean expre

 * ``index`` and ``columns`` are supported indexers of ``DataFrames``.
 * if ``data_columns`` are specified, these can be used as additional indexers.
+* level names of a ``MultiIndex``, which default to ``level_0``, ``level_1``, … if not provided.

 Valid comparison operators are:

@@ -3947,7 +3950,7 @@ space. These are in terms of the total number of rows in a table.

 .. _io.hdf5-timedelta:

-Using timedelta64[ns]
+Query timedelta64[ns]
 +++++++++++++++++++++

 You can store and query using the ``timedelta64[ns]`` type. Terms can be
@@ -3966,6 +3969,35 @@ specified in the format: ``<float>(<unit>)``, where float may be signed (and fra
    store.append('dftd', dftd, data_columns=True)
    store.select('dftd', "C<'-3.5D'")

+Query MultiIndex
+++++++++++++++++
+
+Selecting from a ``MultiIndex`` can be achieved by using the name of the level.
+
+.. ipython:: python
+
+   df_mi.index.names
+   store.select('df_mi', "foo=baz and bar=two")
+
+If the ``MultiIndex`` level names are ``None``, the levels are automatically made available via
+the ``level_n`` keyword, where ``n`` is the level of the ``MultiIndex`` you want to select from.
+
+.. ipython:: python
+
+   index = pd.MultiIndex(
+       levels=[["foo", "bar", "baz", "qux"], ["one", "two", "three"]],
+       codes=[[0, 0, 0, 1, 1, 2, 2, 3, 3, 3], [0, 1, 2, 0, 1, 1, 2, 0, 1, 2]],
+   )
+   df_mi_2 = pd.DataFrame(np.random.randn(10, 3),
+                          index=index, columns=["A", "B", "C"])
+   df_mi_2
+
+   store.append("df_mi_2", df_mi_2)
+
+   # the levels are automatically included as data columns with keyword level_n
+   store.select("df_mi_2", "level_0=foo and level_1=two")
+
+
 Indexing
 ++++++++

doc/source/whatsnew/v1.0.0.rst

Lines changed: 38 additions & 1 deletion
@@ -109,6 +109,7 @@ Other enhancements
   (:issue:`28368`)
 - :meth:`DataFrame.to_json` now accepts an ``indent`` integer argument to enable pretty printing of JSON output (:issue:`12004`)
 - :meth:`read_stata` can read Stata 119 dta files. (:issue:`28250`)
+- Added ``encoding`` argument to :func:`DataFrame.to_html` for non-ascii text (:issue:`28663`)

 Build Changes
 ^^^^^^^^^^^^^
@@ -123,7 +124,37 @@ source, you should no longer need to install Cython into your build environment
 Backwards incompatible API changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-- :class:`pandas.core.groupby.GroupBy.transform` now raises on invalid operation names (:issue:`27489`).
+.. _whatsnew_1000.api_breaking.MultiIndex._names:
+
+``MultiIndex.levels`` do not hold level names any longer
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- A :class:`MultiIndex` previously stored the level names as attributes of each of its
+  :attr:`MultiIndex.levels`. From Pandas 1.0, the names are only accessed through
+  :attr:`MultiIndex.names` (which was also possible previously). This is done in order to
+  make :attr:`MultiIndex.levels` more similar to :attr:`CategoricalIndex.categories` (:issue:`27242`).
+
+  *pandas 0.25.x*
+
+  .. code-block:: ipython
+
+     In [1]: mi = pd.MultiIndex.from_product([[1, 2], ['a', 'b']], names=['x', 'y'])
+     In [2]: mi
+     Out[2]: MultiIndex([(1, 'a'),
+                         (1, 'b'),
+                         (2, 'a'),
+                         (2, 'b')],
+                        names=['x', 'y'])
+     In [3]: mi.levels[0].name
+     Out[3]: 'x'
+
+  *pandas 1.0.0*
+
+  .. ipython:: python
+
+     mi = pd.MultiIndex.from_product([[1, 2], ['a', 'b']], names=['x', 'y'])
+     mi.levels[0].name
+
 - :class:`pandas.core.arrays.IntervalArray` adopts a new ``__repr__`` in accordance with other array classes (:issue:`25022`)

 *pandas 0.25.x*
@@ -149,6 +180,7 @@ Backwards incompatible API changes
 Other API changes
 ^^^^^^^^^^^^^^^^^

+- :class:`pandas.core.groupby.GroupBy.transform` now raises on invalid operation names (:issue:`27489`)
 - :meth:`pandas.api.types.infer_dtype` will now return "integer-na" for integer and ``np.nan`` mix (:issue:`27283`)
 - :meth:`MultiIndex.from_arrays` will no longer infer names from arrays if ``names=None`` is explicitly provided (:issue:`27292`)
 - In order to improve tab-completion, Pandas does not include most deprecated attributes when introspecting a pandas object using ``dir`` (e.g. ``dir(df)``).
@@ -162,6 +194,7 @@ Documentation Improvements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^

 - Added new section on :ref:`scale` (:issue:`28315`).
+- Added sub-section Query MultiIndex in IO tools user guide (:issue:`28791`)

 .. _whatsnew_1000.deprecations:

@@ -221,6 +254,7 @@ Categorical

 - Added test to assert the :func:`fillna` raises the correct ValueError message when the value isn't a value from categories (:issue:`13628`)
 - Bug in :meth:`Categorical.astype` where ``NaN`` values were handled incorrectly when casting to int (:issue:`28406`)
+- :meth:`DataFrame.reindex` with a :class:`CategoricalIndex` would fail when the targets contained duplicates, and wouldn't fail if the source contained duplicates (:issue:`28107`)
 - Bug in :meth:`Categorical.astype` not allowing for casting to extension dtypes (:issue:`28668`)
 - Bug where :func:`merge` was unable to join on categorical and extension dtype columns (:issue:`28668`)
 - :meth:`Categorical.searchsorted` and :meth:`CategoricalIndex.searchsorted` now work on unordered categoricals also (:issue:`21667`)
@@ -290,6 +324,9 @@ Indexing
 - Bug in reindexing a :meth:`PeriodIndex` with another type of index that contained a `Period` (:issue:`28323`) (:issue:`28337`)
 - Fix assignment of column via `.loc` with numpy non-ns datetime type (:issue:`27395`)
 - Bug in :meth:`Float64Index.astype` where ``np.inf`` was not handled properly when casting to an integer dtype (:issue:`28475`)
+- :meth:`Index.union` could fail when the left contained duplicates (:issue:`28257`)
+- :meth:`Index.get_indexer_non_unique` could fail with `TypeError` in some cases, such as when searching for ints in a string index (:issue:`28257`)
+-

 Missing
 ^^^^^^^

pandas/_libs/algos_take_helper.pxi.in

Lines changed: 0 additions & 1 deletion
@@ -276,7 +276,6 @@ cdef _take_2d(ndarray[take_t, ndim=2] values, object idx):
         Py_ssize_t i, j, N, K
         ndarray[Py_ssize_t, ndim=2, cast=True] indexer = idx
         ndarray[take_t, ndim=2] result
-        object val

     N, K = (<object>values).shape
