From 9f9475d04b85b821e16a8aa115fbb9630e04ca02 Mon Sep 17 00:00:00 2001
From: y-p
Date: Fri, 3 Jan 2014 04:38:29 +0200
Subject: [PATCH] DOC: add 'pandas ecosystem' section to docs

---
 doc/source/ecosystem.rst | 60 ++++++++++++++++++++++++++++++++++++++++
 doc/source/index.rst     |  2 +-
 doc/source/related.rst   | 57 --------------------------------------
 3 files changed, 61 insertions(+), 58 deletions(-)
 create mode 100644 doc/source/ecosystem.rst
 delete mode 100644 doc/source/related.rst

diff --git a/doc/source/ecosystem.rst b/doc/source/ecosystem.rst
new file mode 100644
index 0000000000000..4c9e4ca4ef7ee
--- /dev/null
+++ b/doc/source/ecosystem.rst
@@ -0,0 +1,60 @@
+****************
+Pandas Ecosystem
+****************
+
+Increasingly, packages are being built on top of pandas to address specific needs
+in data preparation, analysis and visualization.
+This is encouraging because it means pandas is not only helping users to handle
+their data tasks but also that it provides a better starting point for developers to
+build powerful and more focused data tools.
+The creation of libraries that complement pandas' functionality also allows pandas
+development to remain focused around its original requirements.
+
+This is a non-exhaustive list of projects that build on pandas in order to provide
+tools in the PyData space.
+
+We'd like to make it easier for users to find these projects. If you know of other
+substantial projects that you feel should be on this list, please let us know.
+
+`Statsmodels `__
+-----------
+
+Statsmodels is the prominent Python "statistics and econometrics library" and it has
+a long-standing special relationship with pandas. Statsmodels provides powerful statistics,
+econometrics, analysis and modeling functionality that is out of pandas' scope.
+Statsmodels leverages pandas objects as the underlying data container for computation.
+
+`Vincent `__
+-------
+
+The `Vincent `__ project leverages `Vega `__ (which in turn leverages `d3 `__) to create
+plots. It has great support for pandas data objects.
+
+`yhat/ggplot `__
+-----------
+
+Hadley Wickham's `ggplot2 `__ is a foundational exploratory visualization package for the R language.
+Based on `"The Grammar of Graphics" `__ it
+provides a powerful, declarative and extremely general way to generate plots of arbitrary data.
+It's really quite incredible. Various implementations in other languages are available,
+but a faithful implementation for Python users has long been missing. Although still young
+(as of Jan-2014), the `yhat/ggplot ` project has been
+progressing quickly in that direction.
+
+
+`Seaborn `__
+-------
+
+Although pandas has quite a bit of "just plot it" functionality built-in, visualization and
+in particular statistical graphics is a vast field with a long tradition and lots of ground
+to cover. `The Seaborn project `__ builds on top of pandas
+and `matplotlib `__ to provide easy plotting of data which extends to
+more advanced types of plots than those offered by pandas.
+
+
+`Geopandas `__
+---------
+
+Geopandas extends pandas data objects to include geographic information and to support
+geometric operations. If your work entails maps and geographical coordinates, and
+you love pandas, you should take a close look at Geopandas.
diff --git a/doc/source/index.rst b/doc/source/index.rst
index c406c4f2cfa27..a416e4af4e486 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -130,7 +130,7 @@ See the package overview for more detail about what's in the library.
     sparse
     gotchas
     r_interface
-    related
+    ecosystem
     comparison_with_r
     comparison_with_sql
     api
diff --git a/doc/source/related.rst b/doc/source/related.rst
deleted file mode 100644
index 33dad8115e5b1..0000000000000
--- a/doc/source/related.rst
+++ /dev/null
@@ -1,57 +0,0 @@
-************************
-Related Python libraries
-************************
-
-la (larry)
-----------
-
-Keith Goodman's excellent `labeled array package
-`__ is very similar to pandas in many regards,
-though with some key differences. The main philosophical design difference is
-to be a wrapper around a single NumPy ``ndarray`` object while adding axis
-labeling and label-based operations and indexing. Because of this, creating a
-size-mutable object with heterogeneous columns (e.g. DataFrame) is not possible
-with the ``la`` package.
-
-  - Provide a single n-dimensional object with labeled axes with functionally
-    analogous data alignment semantics to pandas objects
-  - Advanced / label-based indexing similar to that provided in pandas but
-    setting is not supported
-  - Stays much closer to NumPy arrays than pandas-- ``larry`` objects must be
-    homogeneously typed
-  - GroupBy support is relatively limited, but a few functions are available:
-    ``group_mean``, ``group_median``, and ``group_ranking``
-  - It has a collection of analytical functions suited to quantitative
-    portfolio construction for financial applications
-  - It has a collection of moving window statistics implemented in
-    `Bottleneck `__
-
-statsmodels
------------
-
-The main `statistics and econometrics library
-`__ for Python. pandas has become a
-dependency of this library.
-
-scikits.timeseries
-------------------
-
-`scikits.timeseries `__ provides a data
-structure for fixed frequency time series data based on the numpy.MaskedArray
-class. For time series data, it provides some of the same functionality to the
-pandas Series class. It has many more functions for time series-specific
-manipulation. Also, it has support for many more frequencies, though less
-customizable by the user (so 5-minutely data is easier to do with pandas for
-example).
-
-We are aiming to merge these libraries together in the near future.
-
-Progress:
-
-  - It has a collection of moving window statistics implemented in
-    `Bottleneck `__
-  - `Outstanding issues `__
-
-Summarising, Pandas offers superior functionality due to its combination with the :py:class:`pandas.DataFrame`.
-
-An introduction for former users of :mod:`scikits.timeseries` is provided in the :ref:`migration guide `.
\ No newline at end of file