diff --git a/doc/source/development/extending.rst b/doc/source/development/extending.rst index e341dcb8318bc..41ef258dce7e6 100644 --- a/doc/source/development/extending.rst +++ b/doc/source/development/extending.rst @@ -59,7 +59,7 @@ Now users can access your methods using the ``geo`` namespace: This can be a convenient way to extend pandas objects without subclassing them. If you write a custom accessor, make a pull request adding it to our -:ref:`ecosystem` page. +`Ecosystem `__ page. We highly recommend validating the data in your accessor's `__init__`. In our ``GeoAccessor``, we validate that the data contains the expected columns, @@ -91,7 +91,7 @@ objects). Many methods like :func:`pandas.isna` will dispatch to the extension type's implementation. If you're building a library that implements the interface, please publicize it -on :ref:`ecosystem.extensions`. +on `Extension data types `__. The interface consists of two classes. diff --git a/doc/source/development/index.rst b/doc/source/development/index.rst index 1228f00667f3a..7af47799ef466 100644 --- a/doc/source/development/index.rst +++ b/doc/source/development/index.rst @@ -17,4 +17,3 @@ Development extending developer policies - roadmap diff --git a/doc/source/development/roadmap.rst b/doc/source/development/roadmap.rst deleted file mode 100644 index 00598830e2fe9..0000000000000 --- a/doc/source/development/roadmap.rst +++ /dev/null @@ -1,193 +0,0 @@ -.. _roadmap: - -======= -Roadmap -======= - -This page provides an overview of the major themes in pandas' development. Each of -these items requires a relatively large amount of effort to implement. These may -be achieved more quickly with dedicated funding or interest from contributors. - -An item being on the roadmap does not mean that it will *necessarily* happen, even -with unlimited funding. During the implementation period we may discover issues -preventing the adoption of the feature. 
- -Additionally, an item *not* being on the roadmap does not exclude it from inclusion -in pandas. The roadmap is intended for larger, fundamental changes to the project that -are likely to take months or years of developer time. Smaller-scoped items will continue -to be tracked on our `issue tracker `__. - -See :ref:`roadmap.evolution` for proposing changes to this document. - -Extensibility -------------- - -Pandas :ref:`extending.extension-types` allow for extending NumPy types with custom -data types and array storage. Pandas uses extension types internally, and provides -an interface for 3rd-party libraries to define their own custom data types. - -Many parts of pandas still unintentionally convert data to a NumPy array. -These problems are especially pronounced for nested data. - -We'd like to improve the handling of extension arrays throughout the library, -making their behavior more consistent with the handling of NumPy arrays. We'll do this -by cleaning up pandas' internals and adding new methods to the extension array interface. - -String data type ----------------- - -Currently, pandas stores text data in an ``object`` -dtype NumPy array. -The current implementation has two primary drawbacks: First, ``object`` -dtype -is not specific to strings: any Python object can be stored in an ``object`` -dtype -array, not just strings. Second: this is not efficient. The NumPy memory model -isn't especially well-suited to variable width text data. - -To solve the first issue, we propose a new extension type for string data. This -will initially be opt-in, with users explicitly requesting ``dtype="string"``. -The array backing this string dtype may initially be the current implementation: -an ``object`` -dtype NumPy array of Python strings. - -To solve the second issue (performance), we'll explore alternative in-memory -array libraries (for example, Apache Arrow). 
As part of the work, we may -need to implement certain operations expected by pandas users (for example -the algorithm used in, ``Series.str.upper``). That work may be done outside of -pandas. - -Apache Arrow interoperability ------------------------------ - -`Apache Arrow `__ is a cross-language development -platform for in-memory data. The Arrow logical types are closely aligned with -typical pandas use cases. - -We'd like to provide better-integrated support for Arrow memory and data types -within pandas. This will let us take advantage of its I/O capabilities and -provide for better interoperability with other languages and libraries -using Arrow. - -Block manager rewrite ---------------------- - -We'd like to replace pandas current internal data structures (a collection of -1 or 2-D arrays) with a simpler collection of 1-D arrays. - -Pandas internal data model is quite complex. A DataFrame is made up of -one or more 2-dimensional "blocks", with one or more blocks per dtype. This -collection of 2-D arrays is managed by the BlockManager. - -The primary benefit of the BlockManager is improved performance on certain -operations (construction from a 2D array, binary operations, reductions across the columns), -especially for wide DataFrames. However, the BlockManager substantially increases the -complexity and maintenance burden of pandas. - -By replacing the BlockManager we hope to achieve - -* Substantially simpler code -* Easier extensibility with new logical types -* Better user control over memory use and layout -* Improved micro-performance -* Option to provide a C / Cython API to pandas' internals - -See `these design documents `__ -for more. - -Decoupling of indexing and internals ------------------------------------- - -The code for getting and setting values in pandas' data structures needs refactoring. 
-In particular, we must clearly separate code that converts keys (e.g., the argument -to ``DataFrame.loc``) to positions from code that uses these positions to get -or set values. This is related to the proposed BlockManager rewrite. Currently, the -BlockManager sometimes uses label-based, rather than position-based, indexing. -We propose that it should only work with positional indexing, and the translation of keys -to positions should be entirely done at a higher level. - -Indexing is a complicated API with many subtleties. This refactor will require care -and attention. More details are discussed at -https://github.com/pandas-dev/pandas/wiki/(Tentative)-rules-for-restructuring-indexing-code - -Numba-accelerated operations ----------------------------- - -`Numba `__ is a JIT compiler for Python code. We'd like to provide -ways for users to apply their own Numba-jitted functions where pandas accepts user-defined functions -(for example, :meth:`Series.apply`, :meth:`DataFrame.apply`, :meth:`DataFrame.applymap`, -and in groupby and window contexts). This will improve the performance of -user-defined-functions in these operations by staying within compiled code. - - -Documentation improvements --------------------------- - -We'd like to improve the content, structure, and presentation of the pandas documentation. -Some specific goals include - -* Overhaul the HTML theme with a modern, responsive design (:issue:`15556`) -* Improve the "Getting Started" documentation, designing and writing learning paths - for users different backgrounds (e.g. brand new to programming, familiar with - other languages like R, already familiar with Python). -* Improve the overall organization of the documentation and specific subsections - of the documentation to make navigation and finding content easier. 
- -Package docstring validation ----------------------------- - -To improve the quality and consistency of pandas docstrings, we've developed -tooling to check docstrings in a variety of ways. -https://github.com/pandas-dev/pandas/blob/master/scripts/validate_docstrings.py -contains the checks. - -Like many other projects, pandas uses the -`numpydoc `__ style for writing -docstrings. With the collaboration of the numpydoc maintainers, we'd like to -move the checks to a package other than pandas so that other projects can easily -use them as well. - -Performance monitoring ----------------------- - -Pandas uses `airspeed velocity `__ to -monitor for performance regressions. ASV itself is a fabulous tool, but requires -some additional work to be integrated into an open source project's workflow. - -The `asv-runner `__ organization, currently made up -of pandas maintainers, provides tools built on top of ASV. We have a physical -machine for running a number of project's benchmarks, and tools managing the -benchmark runs and reporting on results. - -We'd like to fund improvements and maintenance of these tools to - -* Be more stable. Currently, they're maintained on the nights and weekends when - a maintainer has free time. -* Tune the system for benchmarks to improve stability, following - https://pyperf.readthedocs.io/en/latest/system.html -* Build a GitHub bot to request ASV runs *before* a PR is merged. Currently, the - benchmarks are only run nightly. - -.. _roadmap.evolution: - -Roadmap Evolution ------------------ - -Pandas continues to evolve. The direction is primarily determined by community -interest. Everyone is welcome to review existing items on the roadmap and -to propose a new item. - -Each item on the roadmap should be a short summary of a larger design proposal. -The proposal should include - -1. Short summary of the changes, which would be appropriate for inclusion in - the roadmap if accepted. -2. Motivation for the changes. -3. 
An explanation of why the change is in scope for pandas. -4. Detailed design: Preferably with example-usage (even if not implemented yet) - and API documentation -5. API Change: Any API changes that may result from the proposal. - -That proposal may then be submitted as a GitHub issue, where the pandas maintainers -can review and comment on the design. The `pandas mailing list `__ -should be notified of the proposal. - -When there's agreement that an implementation -would be welcome, the roadmap should be updated to include the summary and a -link to the discussion issue. diff --git a/doc/source/ecosystem.rst b/doc/source/ecosystem.rst deleted file mode 100644 index 48c722bc16a86..0000000000000 --- a/doc/source/ecosystem.rst +++ /dev/null @@ -1,383 +0,0 @@ -:orphan: - -.. _ecosystem: - -{{ header }} - -**************** -Pandas ecosystem -**************** - -Increasingly, packages are being built on top of pandas to address specific needs -in data preparation, analysis and visualization. -This is encouraging because it means pandas is not only helping users to handle -their data tasks but also that it provides a better starting point for developers to -build powerful and more focused data tools. -The creation of libraries that complement pandas' functionality also allows pandas -development to remain focused around it's original requirements. - -This is an inexhaustive list of projects that build on pandas in order to provide -tools in the PyData space. For a list of projects that depend on pandas, -see the -`libraries.io usage page for pandas `_ -or `search pypi for pandas `_. - -We'd like to make it easier for users to find these projects, if you know of other -substantial projects that you feel should be on this list, please let us know. - -.. 
_ecosystem.data_cleaning_and_validation: - -Data cleaning and validation ----------------------------- - -`pyjanitor `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Pyjanitor provides a clean API for cleaning data, using method chaining. - -`Engarde `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Engarde is a lightweight library used to explicitly state assumptions about your datasets -and check that they're *actually* true. - -.. _ecosystem.stats: - -Statistics and machine learning -------------------------------- - -`Statsmodels `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Statsmodels is the prominent Python "statistics and econometrics library" and it has -a long-standing special relationship with pandas. Statsmodels provides powerful statistics, -econometrics, analysis and modeling functionality that is out of pandas' scope. -Statsmodels leverages pandas objects as the underlying data container for computation. - -`sklearn-pandas `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Use pandas DataFrames in your `scikit-learn `__ -ML pipeline. - -`Featuretools `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Featuretools is a Python library for automated feature engineering built on top of pandas. It excels at transforming temporal and relational datasets into feature matrices for machine learning using reusable feature engineering "primitives". Users can contribute their own primitives in Python and share them with the rest of the community. - -.. _ecosystem.visualization: - -Visualization -------------- - -`Altair `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Altair is a declarative statistical visualization library for Python. -With Altair, you can spend more time understanding your data and its -meaning. Altair's API is simple, friendly and consistent and built on -top of the powerful Vega-Lite JSON specification. 
This elegant -simplicity produces beautiful and effective visualizations with a -minimal amount of code. Altair works with Pandas DataFrames. - - -`Bokeh `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Bokeh is a Python interactive visualization library for large datasets that natively uses -the latest web technologies. Its goal is to provide elegant, concise construction of novel -graphics in the style of Protovis/D3, while delivering high-performance interactivity over -large data to thin clients. - -`Pandas-Bokeh `__ provides a high level API -for Bokeh that can be loaded as a native Pandas plotting backend via - -.. code:: python - - pd.set_option("plotting.backend", "pandas_bokeh") - -It is very similar to the matplotlib plotting backend, but provides interactive -web-based charts and maps. - - -`seaborn `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Seaborn is a Python visualization library based on -`matplotlib `__. It provides a high-level, dataset-oriented -interface for creating attractive statistical graphics. The plotting functions -in seaborn understand pandas objects and leverage pandas grouping operations -internally to support concise specification of complex visualizations. Seaborn -also goes beyond matplotlib and pandas with the option to perform statistical -estimation while plotting, aggregating across observations and visualizing the -fit of statistical models to emphasize patterns in a dataset. - -`yhat/ggpy `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Hadley Wickham's `ggplot2 `__ is a foundational exploratory visualization package for the R language. -Based on `"The Grammar of Graphics" `__ it -provides a powerful, declarative and extremely general way to generate bespoke plots of any kind of data. -It's really quite incredible. Various implementations to other languages are available, -but a faithful implementation for Python users has long been missing. 
Although still young -(as of Jan-2014), the `yhat/ggpy `__ project has been -progressing quickly in that direction. - -`IPython Vega `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -`IPython Vega `__ leverages `Vega -`__ to create plots within Jupyter Notebook. - -`Plotly `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -`Plotly’s `__ `Python API `__ enables interactive figures and web shareability. Maps, 2D, 3D, and live-streaming graphs are rendered with WebGL and `D3.js `__. The library supports plotting directly from a pandas DataFrame and cloud-based collaboration. Users of `matplotlib, ggplot for Python, and Seaborn `__ can convert figures into interactive web-based plots. Plots can be drawn in `IPython Notebooks `__ , edited with R or MATLAB, modified in a GUI, or embedded in apps and dashboards. Plotly is free for unlimited sharing, and has `cloud `__, `offline `__, or `on-premise `__ accounts for private use. - -`QtPandas `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Spun off from the main pandas library, the `qtpandas `__ -library enables DataFrame visualization and manipulation in PyQt4 and PySide applications. - - -.. _ecosystem.ide: - -IDE ------- - -`IPython `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -IPython is an interactive command shell and distributed computing -environment. IPython tab completion works with Pandas methods and also -attributes like DataFrame columns. - -`Jupyter Notebook / Jupyter Lab `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Jupyter Notebook is a web application for creating Jupyter notebooks. -A Jupyter notebook is a JSON document containing an ordered list -of input/output cells which can contain code, text, mathematics, plots -and rich media. 
-Jupyter notebooks can be converted to a number of open standard output formats -(HTML, HTML presentation slides, LaTeX, PDF, ReStructuredText, Markdown, -Python) through 'Download As' in the web interface and ``jupyter convert`` -in a shell. - -Pandas DataFrames implement ``_repr_html_``and ``_repr_latex`` methods -which are utilized by Jupyter Notebook for displaying -(abbreviated) HTML or LaTeX tables. LaTeX output is properly escaped. -(Note: HTML tables may or may not be -compatible with non-HTML Jupyter output formats.) - -See :ref:`Options and Settings ` and -:ref:`Available Options ` -for pandas ``display.`` settings. - -`quantopian/qgrid `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -qgrid is "an interactive grid for sorting and filtering -DataFrames in IPython Notebook" built with SlickGrid. - -`Spyder `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Spyder is a cross-platform PyQt-based IDE combining the editing, analysis, -debugging and profiling functionality of a software development tool with the -data exploration, interactive execution, deep inspection and rich visualization -capabilities of a scientific environment like MATLAB or Rstudio. - -Its `Variable Explorer `__ -allows users to view, manipulate and edit pandas ``Index``, ``Series``, -and ``DataFrame`` objects like a "spreadsheet", including copying and modifying -values, sorting, displaying a "heatmap", converting data types and more. -Pandas objects can also be renamed, duplicated, new columns added, -copyed/pasted to/from the clipboard (as TSV), and saved/loaded to/from a file. -Spyder can also import data from a variety of plain text and binary files -or the clipboard into a new pandas DataFrame via a sophisticated import wizard. 
- -Most pandas classes, methods and data attributes can be autocompleted in -Spyder's `Editor `__ and -`IPython Console `__, -and Spyder's `Help pane `__ can retrieve -and render Numpydoc documentation on pandas objects in rich text with Sphinx -both automatically and on-demand. - - -.. _ecosystem.api: - -API ---- - -`pandas-datareader `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -``pandas-datareader`` is a remote data access library for pandas (PyPI:``pandas-datareader``). -It is based on functionality that was located in ``pandas.io.data`` and ``pandas.io.wb`` but was -split off in v0.19. -See more in the `pandas-datareader docs `_: - -The following data feeds are available: - - * Google Finance - * Tiingo - * Morningstar - * IEX - * Robinhood - * Enigma - * Quandl - * FRED - * Fama/French - * World Bank - * OECD - * Eurostat - * TSP Fund Data - * Nasdaq Trader Symbol Definitions - * Stooq Index Data - * MOEX Data - -`quandl/Python `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Quandl API for Python wraps the Quandl REST API to return -Pandas DataFrames with timeseries indexes. - -`pydatastream `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -PyDatastream is a Python interface to the -`Thomson Dataworks Enterprise (DWE/Datastream) `__ -SOAP API to return indexed Pandas DataFrames with financial data. -This package requires valid credentials for this API (non free). - -`pandaSDMX `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -pandaSDMX is a library to retrieve and acquire statistical data -and metadata disseminated in -`SDMX `_ 2.1, an ISO-standard -widely used by institutions such as statistics offices, central banks, -and international organisations. pandaSDMX can expose datasets and related -structural metadata including data flows, code-lists, -and data structure definitions as pandas Series -or MultiIndexed DataFrames. 
- -`fredapi `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -fredapi is a Python interface to the `Federal Reserve Economic Data (FRED) `__ -provided by the Federal Reserve Bank of St. Louis. It works with both the FRED database and ALFRED database that -contains point-in-time data (i.e. historic data revisions). fredapi provides a wrapper in Python to the FRED -HTTP API, and also provides several convenient methods for parsing and analyzing point-in-time data from ALFRED. -fredapi makes use of pandas and returns data in a Series or DataFrame. This module requires a FRED API key that -you can obtain for free on the FRED website. - - -.. _ecosystem.domain: - -Domain specific ---------------- - -`Geopandas `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Geopandas extends pandas data objects to include geographic information which support -geometric operations. If your work entails maps and geographical coordinates, and -you love pandas, you should take a close look at Geopandas. - -`xarray `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -xarray brings the labeled data power of pandas to the physical sciences by -providing N-dimensional variants of the core pandas data structures. It aims to -provide a pandas-like and pandas-compatible toolkit for analytics on multi- -dimensional arrays, rather than the tabular data for which pandas excels. - - -.. _ecosystem.out-of-core: - -Out-of-core -------------- - -`Blaze `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Blaze provides a standard API for doing computations with various -in-memory and on-disk backends: NumPy, Pandas, SQLAlchemy, MongoDB, PyTables, -PySpark. - -`Dask `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Dask is a flexible parallel computing library for analytics. Dask -provides a familiar ``DataFrame`` interface for out-of-core, parallel and distributed computing. 
- -`Dask-ML `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Dask-ML enables parallel and distributed machine learning using Dask alongside existing machine learning libraries like Scikit-Learn, XGBoost, and TensorFlow. - -`Koalas `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Koalas provides a familiar pandas DataFrame interface on top of Apache Spark. It enables users to leverage multi-cores on one machine or a cluster of machines to speed up or scale their DataFrame code. - -`Odo `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Odo provides a uniform API for moving data between different formats. It uses -pandas own ``read_csv`` for CSV IO and leverages many existing packages such as -PyTables, h5py, and pymongo to move data between non pandas formats. Its graph -based approach is also extensible by end users for custom formats that may be -too specific for the core of odo. - -`Ray `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Pandas on Ray is an early stage DataFrame library that wraps Pandas and transparently distributes the data and computation. The user does not need to know how many cores their system has, nor do they need to specify how to distribute the data. In fact, users can continue using their previous Pandas notebooks while experiencing a considerable speedup from Pandas on Ray, even on a single machine. Only a modification of the import statement is needed, as we demonstrate below. Once you’ve changed your import statement, you’re ready to use Pandas on Ray just like you would Pandas. - -.. code:: python - - # import pandas as pd - import ray.dataframe as pd - - -`Vaex `__ -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Increasingly, packages are being built on top of pandas to address specific needs in data preparation, analysis and visualization. Vaex is a python library for Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets. 
It can calculate statistics such as mean, sum, count, standard deviation etc, on an N-dimensional grid up to a billion (10\ :sup:`9`) objects/rows per second. Visualization is done using histograms, density plots and 3d volume rendering, allowing interactive exploration of big data. Vaex uses memory mapping, zero memory copy policy and lazy computations for best performance (no memory wasted). - - * vaex.from_pandas - * vaex.to_pandas_df - -.. _ecosystem.extensions: - -Extension data types --------------------- - -Pandas provides an interface for defining -:ref:`extension types ` to extend NumPy's type -system. The following libraries implement that interface to provide types not -found in NumPy or pandas, which work well with pandas' data containers. - -`cyberpandas`_ -~~~~~~~~~~~~~~ - -Cyberpandas provides an extension type for storing arrays of IP Addresses. These -arrays can be stored inside pandas' Series and DataFrame. - -.. _ecosystem.accessors: - -Accessors ---------- - -A directory of projects providing -:ref:`extension accessors `. This is for users to -discover new accessors and for library authors to coordinate on the namespace. - -============== ========== ========================= -Library Accessor Classes -============== ========== ========================= -`cyberpandas`_ ``ip`` ``Series`` -`pdvega`_ ``vgplot`` ``Series``, ``DataFrame`` -============== ========== ========================= - -.. _cyberpandas: https://cyberpandas.readthedocs.io/en/latest -.. _pdvega: https://altair-viz.github.io/pdvega/ - diff --git a/doc/source/getting_started/basics.rst b/doc/source/getting_started/basics.rst index 125990f7cadcd..b55a707b67d14 100644 --- a/doc/source/getting_started/basics.rst +++ b/doc/source/getting_started/basics.rst @@ -1934,8 +1934,8 @@ does not support timezone-aware datetimes). Pandas and third-party libraries *extend* NumPy's type system in a few places. This section describes the extensions pandas has made internally. 
See :ref:`extending.extension-types` for how to write your own extension that -works with pandas. See :ref:`ecosystem.extensions` for a list of third-party -libraries that have implemented an extension. +works with pandas. See `Extension data types `__ +for a list of third-party libraries that have implemented an extension. The following table lists all of pandas extension types. See the respective documentation sections for more on each type. diff --git a/doc/source/index.rst.template b/doc/source/index.rst.template index 9ec330c956ff1..85b12620383d0 100644 --- a/doc/source/index.rst.template +++ b/doc/source/index.rst.template @@ -86,7 +86,6 @@ See the :ref:`overview` for more detail about what's in the library. * :doc:`user_guide/gotchas` * :doc:`user_guide/cookbook` -* :doc:`ecosystem` * :doc:`reference/index` * :doc:`reference/io` diff --git a/doc/source/user_guide/scale.rst b/doc/source/user_guide/scale.rst index 7b590a3a1fcc8..be8e2073fb1ef 100644 --- a/doc/source/user_guide/scale.rst +++ b/doc/source/user_guide/scale.rst @@ -234,7 +234,7 @@ Use other libraries Pandas is just one library offering a DataFrame API. Because of its popularity, pandas' API has become something of a standard that other libraries implement. The pandas documentation maintains a list of libraries implementing a DataFrame API -in :ref:`our ecosystem page `. +in `our ecosystem page `__. For example, `Dask`_, a parallel computing library, has `dask.dataframe`_, a pandas-like API for working with larger than memory datasets in parallel. Dask diff --git a/doc/source/user_guide/visualization.rst b/doc/source/user_guide/visualization.rst index 39051440e9d9a..485f872660f7f 100644 --- a/doc/source/user_guide/visualization.rst +++ b/doc/source/user_guide/visualization.rst @@ -14,7 +14,7 @@ We use the standard convention for referencing the matplotlib API: plt.close('all') We provide the basics in pandas to easily create decent looking plots. 
-See the :ref:`ecosystem ` section for visualization
+See the `ecosystem `__ section for visualization
 libraries that go beyond the basics documented here.

 .. note::
diff --git a/doc/source/whatsnew/v0.13.1.rst b/doc/source/whatsnew/v0.13.1.rst
index 6242c40d44bf8..5af99de998875 100644
--- a/doc/source/whatsnew/v0.13.1.rst
+++ b/doc/source/whatsnew/v0.13.1.rst
@@ -17,7 +17,7 @@ Highlights include:
 - Will intelligently limit display precision for datetime/timedelta formats.
 - Enhanced Panel :meth:`~pandas.Panel.apply` method.
 - Suggested tutorials in new :ref:`Tutorials` section.
-- Our pandas ecosystem is growing, We now feature related projects in a new :ref:`Pandas Ecosystem` section.
+- Our pandas ecosystem is growing. We now feature related projects in a new `Pandas Ecosystem `__ section.
 - Much work has been taking place on improving the docs, and a new :ref:`Contributing` section has been added.
 - Even though it may only be of interest to devs, we <3 our new CI status page: `ScatterCI `__.
diff --git a/doc/source/whatsnew/v0.23.0.rst b/doc/source/whatsnew/v0.23.0.rst
index f4c283ea742f7..291ef1a45d89f 100644
--- a/doc/source/whatsnew/v0.23.0.rst
+++ b/doc/source/whatsnew/v0.23.0.rst
@@ -237,7 +237,7 @@ array are respected:
 For more, see the :ref:`extension types ` documentation.
 If you build an extension array, publicize it on our
-:ref:`ecosystem page `.
+`ecosystem page `__.

 .. _cyberpandas: https://cyberpandas.readthedocs.io/en/latest/
diff --git a/doc/source/whatsnew/v0.24.0.rst b/doc/source/whatsnew/v0.24.0.rst
index 42579becd4237..c406cadf6add3 100644
--- a/doc/source/whatsnew/v0.24.0.rst
+++ b/doc/source/whatsnew/v0.24.0.rst
@@ -161,7 +161,7 @@ See :ref:`Dtypes ` and :ref:`Attributes and Underlying Data `, including
-extension arrays registered by :ref:`3rd party libraries `.
+extension arrays registered by `3rd party libraries `__.
 See the :ref:`dtypes docs ` for more on extension arrays.

 .. ipython:: python
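For reference, the custom-accessor pattern that the ``extending.rst`` hunk above discusses (the ``geo`` namespace, with validation in ``__init__``) can be sketched as follows. The ``register_dataframe_accessor`` decorator is the real pandas API; the ``center`` property and the ``latitude``/``longitude`` column names are illustrative assumptions, not part of the diff.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("geo")
class GeoAccessor:
    def __init__(self, pandas_obj):
        # Validate in __init__, as the extending.rst hunk recommends,
        # so users get an early error rather than a late surprise.
        self._validate(pandas_obj)
        self._obj = pandas_obj

    @staticmethod
    def _validate(obj):
        if "latitude" not in obj.columns or "longitude" not in obj.columns:
            raise AttributeError("Must have 'latitude' and 'longitude' columns.")

    @property
    def center(self):
        # Illustrative method: the (lat, lon) midpoint of all points.
        return (float(self._obj.latitude.mean()),
                float(self._obj.longitude.mean()))


df = pd.DataFrame({"longitude": [-93.0, -94.0], "latitude": [42.0, 44.0]})
print(df.geo.center)  # -> (43.0, -93.5)
```

A DataFrame lacking the expected columns raises ``AttributeError`` as soon as ``.geo`` is accessed, which is the behavior the hunk's validation advice is after.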