From b34da222f774d64922f046979c32fb45aae50235 Mon Sep 17 00:00:00 2001 From: tp Date: Sun, 25 Jul 2021 09:51:36 +0100 Subject: [PATCH 1/4] DOC: NumericIndex --- doc/source/reference/indexing.rst | 1 + doc/source/user_guide/advanced.rst | 46 ++++++++++++++++++++++++++++ doc/source/whatsnew/v1.4.0.rst | 49 ++++++++++++++++++++++++++++-- 3 files changed, 93 insertions(+), 3 deletions(-) diff --git a/doc/source/reference/indexing.rst b/doc/source/reference/indexing.rst index 1a8c21a2c1a74..6e58f487d5f4a 100644 --- a/doc/source/reference/indexing.rst +++ b/doc/source/reference/indexing.rst @@ -170,6 +170,7 @@ Numeric Index :toctree: api/ :template: autosummary/class_without_autosummary.rst + NumericIndex RangeIndex Int64Index UInt64Index diff --git a/doc/source/user_guide/advanced.rst b/doc/source/user_guide/advanced.rst index 3b33ebe701037..bdc55bf673573 100644 --- a/doc/source/user_guide/advanced.rst +++ b/doc/source/user_guide/advanced.rst @@ -851,6 +851,13 @@ values **not** in the categories, similarly to how you can reindex **any** panda Int64Index and RangeIndex ~~~~~~~~~~~~~~~~~~~~~~~~~ +.. note:: + + In pandas 2.0, :class:`NumericIndex` will become the default index type for numeric types + instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types + will be removed. See :ref:`here ` for more. + ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. + :class:`Int64Index` is a fundamental basic index in pandas. This is an immutable array implementing an ordered, sliceable set. @@ -862,6 +869,13 @@ implementing an ordered, sliceable set. Float64Index ~~~~~~~~~~~~ +.. note:: + + In pandas 2.0, :class:`NumericIndex` will become the default index type for numeric types + instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types + will be removed. See :ref:`here ` for more. 
+ ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. + By default a :class:`Float64Index` will be automatically created when passing floating, or mixed-integer-floating values in index creation. This enables a pure label-based slicing paradigm that makes ``[],ix,loc`` for scalar indexing and slicing work exactly the same. @@ -956,6 +970,38 @@ If you need integer based selection, you should use ``iloc``: dfir.iloc[0:5] + +.. _indexing.numericindex: + +NumericIndex +~~~~~~~~~~~~ + +.. versionadded:: 1.4.0 + +.. note:: + + In pandas 2.0, :class:`NumericIndex` will become the default index type for numeric types + instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types + will be removed. See :ref:`here ` for more. + ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. + +:class:`NumericIndex` is an index type that can hold data of any numpy int/uint/float dtype. For example: + +.. ipython:: python + + index = pd.NumericIndex([1, 2, 4, 5], dtype="int8") + index + ser = pd.Series(range(5), index=index) + ser + +``NumericIndex`` works the same way as the existing ``Int64Index``, ``Float64Index`` and +``UInt64Index`` except that it can hold any numpy int, uint or float dtype. + +Until pandas 2.0, you will have to call ``NumericIndex`` explicitly in order to use it, as in the example above. +In pandas 2.0, ``NumericIndex`` will become the default pandas numeric index type and will automatically be used where appropriate. + +Please note that ``NumericIndex`` *cannot* hold pandas numeric dtypes (:class:`Int64Dtype`, :class:`Int32Dtype` etc.). + .. _advanced.intervalindex: IntervalIndex diff --git a/doc/source/whatsnew/v1.4.0.rst b/doc/source/whatsnew/v1.4.0.rst index fa9c424351b00..c17981db71c06 100644 --- a/doc/source/whatsnew/v1.4.0.rst +++ b/doc/source/whatsnew/v1.4.0.rst @@ -15,10 +15,53 @@ including other versions of pandas.
Enhancements ~~~~~~~~~~~~ -.. _whatsnew_140.enhancements.enhancement1: +.. _whatsnew_140.enhancements.numeric_index: -enhancement1 -^^^^^^^^^^^^ +More flexible numeric dtypes for indexes +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Until now, it has only been possible to create numeric indexes with int64/float64/uint64 dtypes, +but not with lower bit sizes (int32, int8, uint32, uint8 etc). It is now possible to create +an index of any numpy dtype using the new :class:`NumericIndex` (:issue:`41153`): + +.. ipython:: python + + pd.NumericIndex([1, 2, 3], dtype="int8") + pd.NumericIndex([1, 2, 3], dtype="uint32") + pd.NumericIndex([1, 2, 3], dtype="float32") + +In order to maintain backwards compatibility, calls to the base :class:`Index` will in +pandas 1.x. return :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index`. +For example, notice that the code below returns an ``Int64Index`` with dtype ``int64``: + +.. code-block:: ipython + + In [1]: pd.Index([1, 2, 3], dtype="int8") + Int64Index([1, 2, 3], dtype='int64') + +For the duration of Pandas 1.x, in order to maintain backwards compatibility, all +operations that until now have returned :class:`Int64Index`, :class:`UInt64Index` and +:class:`Float64Index` will continue to so. This means, that in order to use +``NumericIndex``, you will have to call it explicitly. For example: + +.. code-block:: ipython + + In [2]: ser = pd.Series([1, 2, 3], index=[1, 2, 3]) + In [3]: ser.index + Int64Index([1, 2, 3], dtype='int64') + +Instead if you want to use a ``NumericIndex``, you should do: + +.. code-block:: ipython + + In [2]: ser = pd.Series([1, 2, 3], index=pd.NumericIndex([1, 2, 3], dtype="int8")) + In [3]: ser.index + NumericIndex([1, 2, 3], dtype='int8') + +In Pandas 2.0, :class:`NumericIndex` will become the default numeric index type and +``Int64Index``, ``UInt64Index`` and ``Float64Index`` will be removed. + +See :ref:`here ` for more. .. 
_whatsnew_140.enhancements.enhancement2: From fc8a75e5bf75ec1427421d74094c1639db322a67 Mon Sep 17 00:00:00 2001 From: tp Date: Fri, 6 Aug 2021 14:20:14 +0100 Subject: [PATCH 2/4] Make NumericIndex public --- pandas/__init__.py | 1 + pandas/_testing/__init__.py | 2 +- pandas/core/indexes/category.py | 2 +- pandas/tests/api/test_api.py | 1 + pandas/tests/base/test_unique.py | 2 +- pandas/tests/indexes/common.py | 2 +- pandas/tests/indexes/numeric/test_numeric.py | 2 +- pandas/tests/indexes/test_common.py | 2 +- pandas/tests/indexes/test_numpy_compat.py | 2 +- 9 files changed, 9 insertions(+), 7 deletions(-) diff --git a/pandas/__init__.py b/pandas/__init__.py index 43f05617584cc..d8df7a42911ab 100644 --- a/pandas/__init__.py +++ b/pandas/__init__.py @@ -75,6 +75,7 @@ UInt64Index, RangeIndex, Float64Index, + NumericIndex, MultiIndex, IntervalIndex, TimedeltaIndex, diff --git a/pandas/_testing/__init__.py b/pandas/_testing/__init__.py index 97e07a76b9149..793afc1532be4 100644 --- a/pandas/_testing/__init__.py +++ b/pandas/_testing/__init__.py @@ -50,6 +50,7 @@ Int64Index, IntervalIndex, MultiIndex, + NumericIndex, RangeIndex, Series, UInt64Index, @@ -105,7 +106,6 @@ use_numexpr, with_csv_dialect, ) -from pandas.core.api import NumericIndex from pandas.core.arrays import ( DatetimeArray, PandasArray, diff --git a/pandas/core/indexes/category.py b/pandas/core/indexes/category.py index 2faf2cab75117..fcdbbb4c6e48e 100644 --- a/pandas/core/indexes/category.py +++ b/pandas/core/indexes/category.py @@ -283,7 +283,7 @@ def _is_dtype_compat(self, other) -> Categorical: @doc(Index.astype) def astype(self, dtype: Dtype, copy: bool = True) -> Index: - from pandas.core.api import NumericIndex + from pandas import NumericIndex dtype = pandas_dtype(dtype) diff --git a/pandas/tests/api/test_api.py b/pandas/tests/api/test_api.py index 95dc1d82cb286..7173a43d4c5e6 100644 --- a/pandas/tests/api/test_api.py +++ b/pandas/tests/api/test_api.py @@ -68,6 +68,7 @@ class TestPDApi(Base): 
"Index", "Int64Index", "MultiIndex", + "NumericIndex", "Period", "PeriodIndex", "RangeIndex", diff --git a/pandas/tests/base/test_unique.py b/pandas/tests/base/test_unique.py index 6ca5f2f76861e..9124e3d546123 100644 --- a/pandas/tests/base/test_unique.py +++ b/pandas/tests/base/test_unique.py @@ -9,8 +9,8 @@ ) import pandas as pd +from pandas import NumericIndex import pandas._testing as tm -from pandas.core.api import NumericIndex from pandas.tests.base.common import allow_na_ops diff --git a/pandas/tests/indexes/common.py b/pandas/tests/indexes/common.py index 2c4067c347a35..614b565ae1500 100644 --- a/pandas/tests/indexes/common.py +++ b/pandas/tests/indexes/common.py @@ -26,6 +26,7 @@ Int64Index, IntervalIndex, MultiIndex, + NumericIndex, PeriodIndex, RangeIndex, Series, @@ -34,7 +35,6 @@ ) from pandas import UInt64Index # noqa:F401 import pandas._testing as tm -from pandas.core.api import NumericIndex from pandas.core.indexes.datetimelike import DatetimeIndexOpsMixin diff --git a/pandas/tests/indexes/numeric/test_numeric.py b/pandas/tests/indexes/numeric/test_numeric.py index e7dd547b3e73e..395ccb6d306ae 100644 --- a/pandas/tests/indexes/numeric/test_numeric.py +++ b/pandas/tests/indexes/numeric/test_numeric.py @@ -8,11 +8,11 @@ Float64Index, Index, Int64Index, + NumericIndex, Series, UInt64Index, ) import pandas._testing as tm -from pandas.core.api import NumericIndex from pandas.tests.indexes.common import NumericBase diff --git a/pandas/tests/indexes/test_common.py b/pandas/tests/indexes/test_common.py index 8facaf279f2cf..33aa8bbb942d5 100644 --- a/pandas/tests/indexes/test_common.py +++ b/pandas/tests/indexes/test_common.py @@ -21,12 +21,12 @@ CategoricalIndex, DatetimeIndex, MultiIndex, + NumericIndex, PeriodIndex, RangeIndex, TimedeltaIndex, ) import pandas._testing as tm -from pandas.core.api import NumericIndex class TestCommon: diff --git a/pandas/tests/indexes/test_numpy_compat.py b/pandas/tests/indexes/test_numpy_compat.py index 
80ba0c53fb9c4..3e88dbafdb7f5 100644 --- a/pandas/tests/indexes/test_numpy_compat.py +++ b/pandas/tests/indexes/test_numpy_compat.py @@ -5,11 +5,11 @@ DatetimeIndex, Float64Index, Index, + NumericIndex, PeriodIndex, TimedeltaIndex, ) import pandas._testing as tm -from pandas.core.api import NumericIndex from pandas.core.indexes.datetimelike import DatetimeIndexOpsMixin From 15870f545f08e17e8be3a5335dfc731476a7aff1 Mon Sep 17 00:00:00 2001 From: tp Date: Fri, 6 Aug 2021 14:55:57 +0100 Subject: [PATCH 3/4] fix doc example --- doc/source/user_guide/advanced.rst | 8 ++++---- doc/source/whatsnew/v1.4.0.rst | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/doc/source/user_guide/advanced.rst b/doc/source/user_guide/advanced.rst index bdc55bf673573..7cad4d719e933 100644 --- a/doc/source/user_guide/advanced.rst +++ b/doc/source/user_guide/advanced.rst @@ -855,7 +855,7 @@ Int64Index and RangeIndex In pandas 2.0, :class:`NumericIndex` will become the default index type for numeric types instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types - will be removed. See :ref:`here ` for more. + will be removed. See :ref:`here ` for more. ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. :class:`Int64Index` is a fundamental basic index in pandas. This is an immutable array @@ -873,7 +873,7 @@ Float64Index In pandas 2.0, :class:`NumericIndex` will become the default index type for numeric types instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types - will be removed. See :ref:`here ` for more. + will be removed. See :ref:`here ` for more. ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. By default a :class:`Float64Index` will be automatically created when passing floating, or mixed-integer-floating values in index creation. 
@@ -982,7 +982,7 @@ NumericIndex In pandas 2.0, :class:`NumericIndex` will become the default index type for numeric types instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types - will be removed. See :ref:`here ` for more. + will be removed. See :ref:`here ` for more. ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. :class:`NumericIndex` is an index type that can hold data of any numpy int/uint/float dtype. For example: @@ -991,7 +991,7 @@ NumericIndex index = pd.NumericIndex([1, 2, 4, 5], dtype="int8") index - ser = pd.Series(range(5), index=index) + ser = pd.Series(range(4), index=index) ser ``NumericIndex`` works the same way as the existing ``Int64Index``, ``Float64Index`` and diff --git a/doc/source/whatsnew/v1.4.0.rst b/doc/source/whatsnew/v1.4.0.rst index c17981db71c06..cc92bfceef563 100644 --- a/doc/source/whatsnew/v1.4.0.rst +++ b/doc/source/whatsnew/v1.4.0.rst @@ -61,7 +61,7 @@ Instead if you want to use a ``NumericIndex``, you should do: In Pandas 2.0, :class:`NumericIndex` will become the default numeric index type and ``Int64Index``, ``UInt64Index`` and ``Float64Index`` will be removed. -See :ref:`here ` for more. +See :ref:`here ` for more. .. 
_whatsnew_140.enhancements.enhancement2: From a9965b0fae9d2e13ee37c3b8325a38ae7fab56c7 Mon Sep 17 00:00:00 2001 From: tp Date: Fri, 6 Aug 2021 23:01:19 +0100 Subject: [PATCH 4/4] clean-ups --- doc/source/user_guide/advanced.rst | 24 ++++++++++++------------ doc/source/user_guide/categorical.rst | 2 +- doc/source/whatsnew/v0.13.0.rst | 2 +- doc/source/whatsnew/v0.16.1.rst | 2 +- doc/source/whatsnew/v1.4.0.rst | 20 ++++++++++---------- 5 files changed, 25 insertions(+), 25 deletions(-) diff --git a/doc/source/user_guide/advanced.rst b/doc/source/user_guide/advanced.rst index 7cad4d719e933..535b503e4372c 100644 --- a/doc/source/user_guide/advanced.rst +++ b/doc/source/user_guide/advanced.rst @@ -7,7 +7,7 @@ MultiIndex / advanced indexing ****************************** This section covers :ref:`indexing with a MultiIndex ` -and :ref:`other advanced indexing features `. +and :ref:`other advanced indexing features `. See the :ref:`Indexing and Selecting Data ` for general indexing documentation. @@ -738,7 +738,7 @@ faster than fancy indexing. %timeit ser.iloc[indexer] %timeit ser.take(indexer) -.. _indexing.index_types: +.. _advanced.index_types: Index types ----------- @@ -749,7 +749,7 @@ and documentation about ``TimedeltaIndex`` is found :ref:`here ` for more. + will be removed. See :ref:`here ` for more. ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. :class:`Int64Index` is a fundamental basic index in pandas. This is an immutable array @@ -864,7 +864,7 @@ implementing an ordered, sliceable set. :class:`RangeIndex` is a sub-class of ``Int64Index`` that provides the default index for all ``NDFrame`` objects. ``RangeIndex`` is an optimized version of ``Int64Index`` that can represent a monotonic ordered set. These are analogous to Python `range types `__. -.. _indexing.float64index: +.. 
_advanced.float64index: Float64Index ~~~~~~~~~~~~ @@ -873,7 +873,7 @@ Float64Index In pandas 2.0, :class:`NumericIndex` will become the default index type for numeric types instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types - will be removed. See :ref:`here ` for more. + will be removed. See :ref:`here ` for more. ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. By default a :class:`Float64Index` will be automatically created when passing floating, or mixed-integer-floating values in index creation. @@ -971,7 +971,7 @@ If you need integer based selection, you should use ``iloc``: dfir.iloc[0:5] -.. _indexing.numericindex: +.. _advanced.numericindex: NumericIndex ~~~~~~~~~~~~ @@ -982,16 +982,16 @@ NumericIndex In pandas 2.0, :class:`NumericIndex` will become the default index type for numeric types instead of ``Int64Index``, ``Float64Index`` and ``UInt64Index`` and those index types - will be removed. See :ref:`here ` for more. + will be removed. ``RangeIndex`` however, will not be removed, as it represents an optimized version of an integer index. :class:`NumericIndex` is an index type that can hold data of any numpy int/uint/float dtype. For example: .. ipython:: python - index = pd.NumericIndex([1, 2, 4, 5], dtype="int8") - index - ser = pd.Series(range(4), index=index) + idx = pd.NumericIndex([1, 2, 4, 5], dtype="int8") + idx + ser = pd.Series(range(4), index=idx) ser ``NumericIndex`` works the same way as the existing ``Int64Index``, ``Float64Index`` and diff --git a/doc/source/user_guide/categorical.rst b/doc/source/user_guide/categorical.rst index 6f9d8eb3474c2..0105cf99193dd 100644 --- a/doc/source/user_guide/categorical.rst +++ b/doc/source/user_guide/categorical.rst @@ -1141,7 +1141,7 @@ Categorical index ``CategoricalIndex`` is a type of index that is useful for supporting indexing with duplicates. 
This is a container around a ``Categorical`` and allows efficient indexing and storage of an index with a large number of duplicated elements. -See the :ref:`advanced indexing docs ` for a more detailed +See the :ref:`advanced indexing docs ` for a more detailed explanation. Setting the index will create a ``CategoricalIndex``: diff --git a/doc/source/whatsnew/v0.13.0.rst b/doc/source/whatsnew/v0.13.0.rst index 3c6b70fb21383..b2596358d0c9d 100644 --- a/doc/source/whatsnew/v0.13.0.rst +++ b/doc/source/whatsnew/v0.13.0.rst @@ -310,7 +310,7 @@ Float64Index API change - Added a new index type, ``Float64Index``. This will be automatically created when passing floating values in index creation. This enables a pure label-based slicing paradigm that makes ``[],ix,loc`` for scalar indexing and slicing work exactly the - same. See :ref:`the docs`, (:issue:`263`) + same. See :ref:`the docs`, (:issue:`263`) Construction is by default for floating type values. diff --git a/doc/source/whatsnew/v0.16.1.rst b/doc/source/whatsnew/v0.16.1.rst index 269854111373f..cbf5b7703bd79 100644 --- a/doc/source/whatsnew/v0.16.1.rst +++ b/doc/source/whatsnew/v0.16.1.rst @@ -168,7 +168,7 @@ values NOT in the categories, similarly to how you can reindex ANY pandas index. ordered=False, name='B', dtype='category') -See the :ref:`documentation ` for more. (:issue:`7629`, :issue:`10038`, :issue:`10039`) +See the :ref:`documentation ` for more. (:issue:`7629`, :issue:`10038`, :issue:`10039`) .. _whatsnew_0161.enhancements.sample: diff --git a/doc/source/whatsnew/v1.4.0.rst b/doc/source/whatsnew/v1.4.0.rst index cc92bfceef563..9076a36ebbc50 100644 --- a/doc/source/whatsnew/v1.4.0.rst +++ b/doc/source/whatsnew/v1.4.0.rst @@ -20,9 +20,8 @@ Enhancements More flexible numeric dtypes for indexes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Until now, it has only been possible to create numeric indexes with int64/float64/uint64 dtypes, -but not with lower bit sizes (int32, int8, uint32, uint8 etc). 
It is now possible to create -an index of any numpy dtype using the new :class:`NumericIndex` (:issue:`41153`): +Until now, it has only been possible to create numeric indexes with int64/float64/uint64 dtypes. +It is now possible to create an index of any numpy int/uint/float dtype using the new :class:`NumericIndex` index type (:issue:`41153`): .. ipython:: python @@ -32,7 +31,7 @@ an index of any numpy dtype using the new :class:`NumericIndex` (:issue:`41153`) In order to maintain backwards compatibility, calls to the base :class:`Index` will in pandas 1.x. return :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index`. -For example, notice that the code below returns an ``Int64Index`` with dtype ``int64``: +For example, the code below returns an ``Int64Index`` with dtype ``int64``: .. code-block:: ipython @@ -42,7 +41,8 @@ For example, notice that the code below returns an ``Int64Index`` with dtype ``i For the duration of Pandas 1.x, in order to maintain backwards compatibility, all operations that until now have returned :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` will continue to so. This means, that in order to use -``NumericIndex``, you will have to call it explicitly. For example: +``NumericIndex``, you will have to call ``NumericIndex`` explicitly. For example, the series below +will have an ``Int64Index``: .. code-block:: ipython @@ -52,16 +52,16 @@ operations that until now have returned :class:`Int64Index`, :class:`UInt64Index Instead if you want to use a ``NumericIndex``, you should do: -.. code-block:: ipython +..
ipython:: python - In [2]: ser = pd.Series([1, 2, 3], index=pd.NumericIndex([1, 2, 3], dtype="int8")) - In [3]: ser.index - NumericIndex([1, 2, 3], dtype='int8') + idx = pd.NumericIndex([1, 2, 3], dtype="int8") + ser = pd.Series([1, 2, 3], index=idx) + ser.index In Pandas 2.0, :class:`NumericIndex` will become the default numeric index type and ``Int64Index``, ``UInt64Index`` and ``Float64Index`` will be removed. -See :ref:`here ` for more. +See :ref:`here ` for more. .. _whatsnew_140.enhancements.enhancement2: