From b7ba2416548cef5a32ec26fe72694859a0977e29 Mon Sep 17 00:00:00 2001 From: Joris Van den Bossche Date: Wed, 29 Jan 2020 21:28:10 +0100 Subject: [PATCH 1/2] DOC: separate section with experimental features in 1.0.0 whatsnew --- doc/source/whatsnew/v1.0.0.rst | 159 +++++++++++++++++---------------- 1 file changed, 82 insertions(+), 77 deletions(-) diff --git a/doc/source/whatsnew/v1.0.0.rst b/doc/source/whatsnew/v1.0.0.rst index 2abe85f042af1..8bbaeae32b3ee 100755 --- a/doc/source/whatsnew/v1.0.0.rst +++ b/doc/source/whatsnew/v1.0.0.rst @@ -37,6 +37,79 @@ See :ref:`policies.version` for more. Enhancements ~~~~~~~~~~~~ +.. _whatsnew_100.numba_rolling_apply: + +Using Numba in ``rolling.apply`` and ``expanding.apply`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +We've added an ``engine`` keyword to :meth:`~core.window.rolling.Rolling.apply` and :meth:`~core.window.expanding.Expanding.apply` +that allows the user to execute the routine using `Numba `__ instead of Cython. +Using the Numba engine can yield significant performance gains if the apply function can operate on numpy arrays and +the data set is larger (1 million rows or greater). For more details, see +:ref:`rolling apply documentation ` (:issue:`28987`, :issue:`30936`) + +.. _whatsnew_100.custom_window: + +Defining custom windows for rolling operations +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +We've added a :func:`pandas.api.indexers.BaseIndexer` class that allows users to define how +window bounds are created during ``rolling`` operations. Users can define their own ``get_window_bounds`` +method on a :func:`pandas.api.indexers.BaseIndexer` subclass that will generate the start and end +indices used for each window during the rolling aggregation. For more details and example usage, see +the :ref:`custom window rolling documentation ` + +.. 
_whatsnew_100.to_markdown: + +Converting to Markdown +^^^^^^^^^^^^^^^^^^^^^^ + +We've added :meth:`~DataFrame.to_markdown` for creating a markdown table (:issue:`11052`) + +.. ipython:: python + + df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=['a', 'a', 'b']) + print(df.to_markdown()) + +.. _whatsnew_100.enhancements.other: + +Other enhancements +^^^^^^^^^^^^^^^^^^ + +- :meth:`DataFrame.to_string` added the ``max_colwidth`` parameter to control when wide columns are truncated (:issue:`9784`) +- Added the ``na_value`` argument to :meth:`Series.to_numpy`, :meth:`Index.to_numpy` and :meth:`DataFrame.to_numpy` to control the value used for missing data (:issue:`30322`) +- :meth:`MultiIndex.from_product` infers level names from inputs if not explicitly provided (:issue:`27292`) +- :meth:`DataFrame.to_latex` now accepts ``caption`` and ``label`` arguments (:issue:`25436`) +- DataFrames with :ref:`nullable integer `, the :ref:`new string dtype ` + and period data type can now be converted to ``pyarrow`` (>=0.15.0), which means that it is + supported in writing to the Parquet file format when using the ``pyarrow`` engine (:issue:`28368`). + Full roundtrip to parquet (writing and reading back in with :meth:`~DataFrame.to_parquet` / :func:`read_parquet`) + is supported starting with pyarrow >= 0.16 (:issue:`20612`). +- :func:`to_parquet` now appropriately handles the ``schema`` argument for user defined schemas in the pyarrow engine. (:issue:`30270`) +- :meth:`DataFrame.to_json` now accepts an ``indent`` integer argument to enable pretty printing of JSON output (:issue:`12004`) +- :meth:`read_stata` can read Stata 119 dta files. 
(:issue:`28250`) +- Implemented :meth:`pandas.core.window.Window.var` and :meth:`pandas.core.window.Window.std` functions (:issue:`26597`) +- Added ``encoding`` argument to :meth:`DataFrame.to_string` for non-ascii text (:issue:`28766`) +- Added ``encoding`` argument to :func:`DataFrame.to_html` for non-ascii text (:issue:`28663`) +- :meth:`Styler.background_gradient` now accepts ``vmin`` and ``vmax`` arguments (:issue:`12145`) +- :meth:`Styler.format` added the ``na_rep`` parameter to help format the missing values (:issue:`21527`, :issue:`28358`) +- :func:`read_excel` now can read binary Excel (``.xlsb``) files by passing ``engine='pyxlsb'``. For more details and example usage, see the :ref:`Binary Excel files documentation `. Closes :issue:`8540`. +- The ``partition_cols`` argument in :meth:`DataFrame.to_parquet` now accepts a string (:issue:`27117`) +- :func:`pandas.read_json` now parses ``NaN``, ``Infinity`` and ``-Infinity`` (:issue:`12213`) +- DataFrame constructor preserve `ExtensionArray` dtype with `ExtensionArray` (:issue:`11363`) +- :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` have gained ``ignore_index`` keyword to be able to reset index after sorting (:issue:`30114`) +- :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` have gained ``ignore_index`` keyword to reset index (:issue:`30114`) +- :meth:`DataFrame.drop_duplicates` has gained ``ignore_index`` keyword to reset index (:issue:`30114`) +- Added new writer for exporting Stata dta files in versions 118 and 119, ``StataWriterUTF8``. These files formats support exporting strings containing Unicode characters. 
Format 119 supports data sets with more than 32,767 variables (:issue:`23573`, :issue:`30959`) +- :meth:`Series.map` now accepts ``collections.abc.Mapping`` subclasses as a mapper (:issue:`29733`) +- Added an experimental :attr:`~DataFrame.attrs` for storing global metadata about a dataset (:issue:`29062`) +- :meth:`Timestamp.fromisocalendar` is now compatible with python 3.8 and above (:issue:`28115`) +- :meth:`DataFrame.to_pickle` and :func:`read_pickle` now accept URL (:issue:`30163`) + + +Experimental new features +~~~~~~~~~~~~~~~~~~~~~~~~~ + .. _whatsnew_100.NA: Experimental ``NA`` scalar to denote missing values @@ -187,83 +260,6 @@ This is especially useful after reading in data using readers such as :func:`rea and :func:`read_excel`. See :ref:`here ` for a description. -.. _whatsnew_100.numba_rolling_apply: - -Using Numba in ``rolling.apply`` and ``expanding.apply`` -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -We've added an ``engine`` keyword to :meth:`~core.window.rolling.Rolling.apply` and :meth:`~core.window.expanding.Expanding.apply` -that allows the user to execute the routine using `Numba `__ instead of Cython. -Using the Numba engine can yield significant performance gains if the apply function can operate on numpy arrays and -the data set is larger (1 million rows or greater). For more details, see -:ref:`rolling apply documentation ` (:issue:`28987`, :issue:`30936`) - -.. _whatsnew_100.custom_window: - -Defining custom windows for rolling operations -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -We've added a :func:`pandas.api.indexers.BaseIndexer` class that allows users to define how -window bounds are created during ``rolling`` operations. Users can define their own ``get_window_bounds`` -method on a :func:`pandas.api.indexers.BaseIndexer` subclass that will generate the start and end -indices used for each window during the rolling aggregation. 
For more details and example usage, see -the :ref:`custom window rolling documentation ` - -.. _whatsnew_100.to_markdown: - -Converting to Markdown -^^^^^^^^^^^^^^^^^^^^^^ - -We've added :meth:`~DataFrame.to_markdown` for creating a markdown table (:issue:`11052`) - -.. ipython:: python - - df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=['a', 'a', 'b']) - print(df.to_markdown()) - -.. _whatsnew_100.enhancements.other: - -Other enhancements -^^^^^^^^^^^^^^^^^^ - -- :meth:`DataFrame.to_string` added the ``max_colwidth`` parameter to control when wide columns are truncated (:issue:`9784`) -- Added the ``na_value`` argument to :meth:`Series.to_numpy`, :meth:`Index.to_numpy` and :meth:`DataFrame.to_numpy` to control the value used for missing data (:issue:`30322`) -- :meth:`MultiIndex.from_product` infers level names from inputs if not explicitly provided (:issue:`27292`) -- :meth:`DataFrame.to_latex` now accepts ``caption`` and ``label`` arguments (:issue:`25436`) -- DataFrames with :ref:`nullable integer `, the :ref:`new string dtype ` - and period data type can now be converted to ``pyarrow`` (>=0.15.0), which means that it is - supported in writing to the Parquet file format when using the ``pyarrow`` engine (:issue:`28368`). - Full roundtrip to parquet (writing and reading back in with :meth:`~DataFrame.to_parquet` / :func:`read_parquet`) - is supported starting with pyarrow >= 0.16 (:issue:`20612`). -- :func:`to_parquet` now appropriately handles the ``schema`` argument for user defined schemas in the pyarrow engine. (:issue:`30270`) -- :meth:`DataFrame.to_json` now accepts an ``indent`` integer argument to enable pretty printing of JSON output (:issue:`12004`) -- :meth:`read_stata` can read Stata 119 dta files. 
(:issue:`28250`) -- Implemented :meth:`pandas.core.window.Window.var` and :meth:`pandas.core.window.Window.std` functions (:issue:`26597`) -- Added ``encoding`` argument to :meth:`DataFrame.to_string` for non-ascii text (:issue:`28766`) -- Added ``encoding`` argument to :func:`DataFrame.to_html` for non-ascii text (:issue:`28663`) -- :meth:`Styler.background_gradient` now accepts ``vmin`` and ``vmax`` arguments (:issue:`12145`) -- :meth:`Styler.format` added the ``na_rep`` parameter to help format the missing values (:issue:`21527`, :issue:`28358`) -- :func:`read_excel` now can read binary Excel (``.xlsb``) files by passing ``engine='pyxlsb'``. For more details and example usage, see the :ref:`Binary Excel files documentation `. Closes :issue:`8540`. -- The ``partition_cols`` argument in :meth:`DataFrame.to_parquet` now accepts a string (:issue:`27117`) -- :func:`pandas.read_json` now parses ``NaN``, ``Infinity`` and ``-Infinity`` (:issue:`12213`) -- DataFrame constructor preserve `ExtensionArray` dtype with `ExtensionArray` (:issue:`11363`) -- :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` have gained ``ignore_index`` keyword to be able to reset index after sorting (:issue:`30114`) -- :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` have gained ``ignore_index`` keyword to reset index (:issue:`30114`) -- :meth:`DataFrame.drop_duplicates` has gained ``ignore_index`` keyword to reset index (:issue:`30114`) -- Added new writer for exporting Stata dta files in versions 118 and 119, ``StataWriterUTF8``. These files formats support exporting strings containing Unicode characters. 
Format 119 supports data sets with more than 32,767 variables (:issue:`23573`, :issue:`30959`) -- :meth:`Series.map` now accepts ``collections.abc.Mapping`` subclasses as a mapper (:issue:`29733`) -- Added an experimental :attr:`~DataFrame.attrs` for storing global metadata about a dataset (:issue:`29062`) -- :meth:`Timestamp.fromisocalendar` is now compatible with python 3.8 and above (:issue:`28115`) -- :meth:`DataFrame.to_pickle` and :func:`read_pickle` now accept URL (:issue:`30163`) - - -Build Changes -^^^^^^^^^^^^^ - -Pandas has added a `pyproject.toml `_ file and will no longer include -cythonized files in the source distribution uploaded to PyPI (:issue:`28341`, :issue:`20775`). If you're installing -a built distribution (wheel) or via conda, this shouldn't have any effect on you. If you're building pandas from -source, you should no longer need to install Cython into your build environment before calling ``pip install pandas``. .. --------------------------------------------------------------------------- @@ -701,6 +697,15 @@ Optional libraries below the lowest tested version may still work, but are not c See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more. +Build Changes +^^^^^^^^^^^^^ + +Pandas has added a `pyproject.toml `_ file and will no longer include +cythonized files in the source distribution uploaded to PyPI (:issue:`28341`, :issue:`20775`). If you're installing +a built distribution (wheel) or via conda, this shouldn't have any effect on you. If you're building pandas from +source, you should no longer need to install Cython into your build environment before calling ``pip install pandas``. + + .. 
_whatsnew_100.api.other: Other API changes From be11b0c2ce30c91039702dceb0551f64224c2dcb Mon Sep 17 00:00:00 2001 From: Joris Van den Bossche Date: Wed, 29 Jan 2020 21:34:25 +0100 Subject: [PATCH 2/2] move other down --- doc/source/whatsnew/v1.0.0.rst | 73 +++++++++++++++++----------------- 1 file changed, 37 insertions(+), 36 deletions(-) diff --git a/doc/source/whatsnew/v1.0.0.rst b/doc/source/whatsnew/v1.0.0.rst index 8bbaeae32b3ee..6586c19ee3426 100755 --- a/doc/source/whatsnew/v1.0.0.rst +++ b/doc/source/whatsnew/v1.0.0.rst @@ -71,42 +71,6 @@ We've added :meth:`~DataFrame.to_markdown` for creating a markdown table (:issue df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=['a', 'a', 'b']) print(df.to_markdown()) -.. _whatsnew_100.enhancements.other: - -Other enhancements -^^^^^^^^^^^^^^^^^^ - -- :meth:`DataFrame.to_string` added the ``max_colwidth`` parameter to control when wide columns are truncated (:issue:`9784`) -- Added the ``na_value`` argument to :meth:`Series.to_numpy`, :meth:`Index.to_numpy` and :meth:`DataFrame.to_numpy` to control the value used for missing data (:issue:`30322`) -- :meth:`MultiIndex.from_product` infers level names from inputs if not explicitly provided (:issue:`27292`) -- :meth:`DataFrame.to_latex` now accepts ``caption`` and ``label`` arguments (:issue:`25436`) -- DataFrames with :ref:`nullable integer `, the :ref:`new string dtype ` - and period data type can now be converted to ``pyarrow`` (>=0.15.0), which means that it is - supported in writing to the Parquet file format when using the ``pyarrow`` engine (:issue:`28368`). - Full roundtrip to parquet (writing and reading back in with :meth:`~DataFrame.to_parquet` / :func:`read_parquet`) - is supported starting with pyarrow >= 0.16 (:issue:`20612`). -- :func:`to_parquet` now appropriately handles the ``schema`` argument for user defined schemas in the pyarrow engine. 
(:issue:`30270`) -- :meth:`DataFrame.to_json` now accepts an ``indent`` integer argument to enable pretty printing of JSON output (:issue:`12004`) -- :meth:`read_stata` can read Stata 119 dta files. (:issue:`28250`) -- Implemented :meth:`pandas.core.window.Window.var` and :meth:`pandas.core.window.Window.std` functions (:issue:`26597`) -- Added ``encoding`` argument to :meth:`DataFrame.to_string` for non-ascii text (:issue:`28766`) -- Added ``encoding`` argument to :func:`DataFrame.to_html` for non-ascii text (:issue:`28663`) -- :meth:`Styler.background_gradient` now accepts ``vmin`` and ``vmax`` arguments (:issue:`12145`) -- :meth:`Styler.format` added the ``na_rep`` parameter to help format the missing values (:issue:`21527`, :issue:`28358`) -- :func:`read_excel` now can read binary Excel (``.xlsb``) files by passing ``engine='pyxlsb'``. For more details and example usage, see the :ref:`Binary Excel files documentation `. Closes :issue:`8540`. -- The ``partition_cols`` argument in :meth:`DataFrame.to_parquet` now accepts a string (:issue:`27117`) -- :func:`pandas.read_json` now parses ``NaN``, ``Infinity`` and ``-Infinity`` (:issue:`12213`) -- DataFrame constructor preserve `ExtensionArray` dtype with `ExtensionArray` (:issue:`11363`) -- :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` have gained ``ignore_index`` keyword to be able to reset index after sorting (:issue:`30114`) -- :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` have gained ``ignore_index`` keyword to reset index (:issue:`30114`) -- :meth:`DataFrame.drop_duplicates` has gained ``ignore_index`` keyword to reset index (:issue:`30114`) -- Added new writer for exporting Stata dta files in versions 118 and 119, ``StataWriterUTF8``. These files formats support exporting strings containing Unicode characters. 
Format 119 supports data sets with more than 32,767 variables (:issue:`23573`, :issue:`30959`) -- :meth:`Series.map` now accepts ``collections.abc.Mapping`` subclasses as a mapper (:issue:`29733`) -- Added an experimental :attr:`~DataFrame.attrs` for storing global metadata about a dataset (:issue:`29062`) -- :meth:`Timestamp.fromisocalendar` is now compatible with python 3.8 and above (:issue:`28115`) -- :meth:`DataFrame.to_pickle` and :func:`read_pickle` now accept URL (:issue:`30163`) - - Experimental new features ~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -261,6 +225,43 @@ and :func:`read_excel`. See :ref:`here ` for a description. +.. _whatsnew_100.enhancements.other: + +Other enhancements +~~~~~~~~~~~~~~~~~~ + +- :meth:`DataFrame.to_string` added the ``max_colwidth`` parameter to control when wide columns are truncated (:issue:`9784`) +- Added the ``na_value`` argument to :meth:`Series.to_numpy`, :meth:`Index.to_numpy` and :meth:`DataFrame.to_numpy` to control the value used for missing data (:issue:`30322`) +- :meth:`MultiIndex.from_product` infers level names from inputs if not explicitly provided (:issue:`27292`) +- :meth:`DataFrame.to_latex` now accepts ``caption`` and ``label`` arguments (:issue:`25436`) +- DataFrames with :ref:`nullable integer `, the :ref:`new string dtype ` + and period data type can now be converted to ``pyarrow`` (>=0.15.0), which means that it is + supported in writing to the Parquet file format when using the ``pyarrow`` engine (:issue:`28368`). + Full roundtrip to parquet (writing and reading back in with :meth:`~DataFrame.to_parquet` / :func:`read_parquet`) + is supported starting with pyarrow >= 0.16 (:issue:`20612`). +- :func:`to_parquet` now appropriately handles the ``schema`` argument for user defined schemas in the pyarrow engine. (:issue:`30270`) +- :meth:`DataFrame.to_json` now accepts an ``indent`` integer argument to enable pretty printing of JSON output (:issue:`12004`) +- :meth:`read_stata` can read Stata 119 dta files. 
(:issue:`28250`) +- Implemented :meth:`pandas.core.window.Window.var` and :meth:`pandas.core.window.Window.std` functions (:issue:`26597`) +- Added ``encoding`` argument to :meth:`DataFrame.to_string` for non-ascii text (:issue:`28766`) +- Added ``encoding`` argument to :func:`DataFrame.to_html` for non-ascii text (:issue:`28663`) +- :meth:`Styler.background_gradient` now accepts ``vmin`` and ``vmax`` arguments (:issue:`12145`) +- :meth:`Styler.format` added the ``na_rep`` parameter to help format the missing values (:issue:`21527`, :issue:`28358`) +- :func:`read_excel` now can read binary Excel (``.xlsb``) files by passing ``engine='pyxlsb'``. For more details and example usage, see the :ref:`Binary Excel files documentation `. Closes :issue:`8540`. +- The ``partition_cols`` argument in :meth:`DataFrame.to_parquet` now accepts a string (:issue:`27117`) +- :func:`pandas.read_json` now parses ``NaN``, ``Infinity`` and ``-Infinity`` (:issue:`12213`) +- DataFrame constructor preserve `ExtensionArray` dtype with `ExtensionArray` (:issue:`11363`) +- :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` have gained ``ignore_index`` keyword to be able to reset index after sorting (:issue:`30114`) +- :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` have gained ``ignore_index`` keyword to reset index (:issue:`30114`) +- :meth:`DataFrame.drop_duplicates` has gained ``ignore_index`` keyword to reset index (:issue:`30114`) +- Added new writer for exporting Stata dta files in versions 118 and 119, ``StataWriterUTF8``. These files formats support exporting strings containing Unicode characters. 
Format 119 supports data sets with more than 32,767 variables (:issue:`23573`, :issue:`30959`) +- :meth:`Series.map` now accepts ``collections.abc.Mapping`` subclasses as a mapper (:issue:`29733`) +- Added an experimental :attr:`~DataFrame.attrs` for storing global metadata about a dataset (:issue:`29062`) +- :meth:`Timestamp.fromisocalendar` is now compatible with python 3.8 and above (:issue:`28115`) +- :meth:`DataFrame.to_pickle` and :func:`read_pickle` now accept URL (:issue:`30163`) + + + .. --------------------------------------------------------------------------- .. _whatsnew_100.api_breaking:
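Note on the "Defining custom windows for rolling operations" entry moved by this series: the entry says a ``get_window_bounds`` method generates the start and end indices used for each window. The sketch below shows only that idea in plain Python and is not the pandas implementation. The helper names ``forward_window_bounds`` and ``windowed_sum`` are hypothetical, and the forward-looking window rule is an assumption chosen for illustration.

```python
from typing import List, Sequence, Tuple


def forward_window_bounds(num_values: int, window_size: int) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: start/end indices for forward-looking windows.

    Window i covers values[start[i]:end[i]], i.e. the current row and the
    next ``window_size - 1`` rows, clipped at the end of the data.
    """
    start = list(range(num_values))
    end = [min(i + window_size, num_values) for i in range(num_values)]
    return start, end


def windowed_sum(values: Sequence[float], window_size: int) -> List[float]:
    """Aggregate each window defined by the bounds above with ``sum``."""
    start, end = forward_window_bounds(len(values), window_size)
    return [sum(values[s:e]) for s, e in zip(start, end)]


# Windows over [1, 2, 3, 4] with size 2: [1, 2], [2, 3], [3, 4], [4]
print(windowed_sum([1, 2, 3, 4], window_size=2))
```

In pandas itself, the equivalent logic would live in the ``get_window_bounds`` method of a ``BaseIndexer`` subclass, as the entry describes. The exact method signature is not shown in this patch, so it is not reproduced here.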