Commit a16e560

DOC: added recommended dependencies section in install.rst, linking from v0.11.0.rst, and basics.rst
DOC: added timing table for numexpr
1 parent 13f54e5 commit a16e560

6 files changed: +54 -8 lines changed

README.rst

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ Highly Recommended Dependencies
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 * `numexpr <http://code.google.com/p/numexpr/>`__: to accelerate some expression evaluation operations
 also required by `PyTables`
-* `bottleneck <http://berkeleyanalytics.com/>`__: to accelerate certain numerical operations
+* `bottleneck <http://berkeleyanalytics.com/bottleneck>`__: to accelerate certain numerical operations

 Optional dependencies
 ~~~~~~~~~~~~~~~~~~~~~

RELEASE.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,7 @@ pandas 0.11.0
 - Added ``.at`` attribute, to support fast scalar access via labels (replaces ``get_value/set_value``)
 - Moved functionaility from ``irow,icol,iget_value/iset_value`` to ``.iloc`` indexer
 (via ``_ixs`` methods in each object)
+- Added support for expression evaluation using the ``numexpr`` library

 **Improvements to existing features**

doc/source/basics.rst

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -86,6 +86,33 @@ unlike the axis labels, cannot be assigned to.
 strings are involved, the result will be of object dtype. If there are only
 floats and integers, the resulting array will be of float dtype.

+.. _basics.accelerate:
+
+Accelerated operations
+----------------------
+
+Pandas has support for accelerating certain types of binary numerical and boolean operations using
+the ``numexpr`` library (starting in 0.11.0) and the ``bottleneck`` libraries.
+
+These libraries are especially useful when dealing with large data sets, and provide large
+speedups. ``numexpr`` uses smart chunking, caching, and multiple cores. ``bottleneck`` is
+a set of specialized cython routines that are especially fast when dealing with arrays that have
+``nans``.
+
+Here is a sample (using 100 column x 100,000 row ``DataFrames``):
+
+.. csv-table::
+:header: "Operation", "0.11.0 (ms)", "Prior Vern (ms)", "Ratio to Prior"
+:widths: 30, 30, 30, 30
+:delim: ;
+
+``df1 > df2``; 13.32; 125.35; 0.1063
+``df1 * df2``; 21.71; 36.63; 0.5928
+``df1 + df2``; 22.04; 36.50; 0.6039
+
+You are highly encouraged to install both libraries. See the section
+:ref:`Recommended Dependencies <install.recommended_dependencies>` for more installation info.
+
 .. _basics.binop:

 Flexible binary operations
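
Reviewer note on the timing table added in this hunk: the "Ratio to Prior" column appears to be the 0.11.0 timing divided by the prior-version timing. The sketch below recomputes it under that assumption, using only the figures shown in the table. The helper name ``ratio_to_prior`` is hypothetical and is not part of pandas. Because the table lists timings rounded to two decimals, the recomputed ratios can differ from the listed ones in the last digit.

```python
# Timings in milliseconds, copied from the csv-table in the basics.rst hunk:
# operation -> (0.11.0 timing, prior-version timing, listed "Ratio to Prior")
timings = {
    "df1 > df2": (13.32, 125.35, 0.1063),
    "df1 * df2": (21.71, 36.63, 0.5928),
    "df1 + df2": (22.04, 36.50, 0.6039),
}


def ratio_to_prior(new_ms, prior_ms):
    """Assumed definition of the column: new timing divided by prior timing."""
    return new_ms / prior_ms


# Recompute the ratio for every row of the table.
ratios = {
    op: ratio_to_prior(new_ms, prior_ms)
    for op, (new_ms, prior_ms, _listed) in timings.items()
}

for op, (new_ms, prior_ms, listed) in timings.items():
    speedup = prior_ms / new_ms  # how many times faster 0.11.0 is
    print(f"{op}: computed {ratios[op]:.4f}, listed {listed:.4f}, speedup {speedup:.1f}x")
```

Read this way, a ratio below 1 means 0.11.0 is faster: the comparison is roughly nine times faster, while the two arithmetic operations are somewhat under twice as fast.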

doc/source/faq.rst

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,8 +23,8 @@ Frequently Asked Questions (FAQ)

 .. _ref-monkey-patching:

-
------------------------------------------------------
+Adding Features to your Pandas Installation
+-------------------------------------------

 Pandas is a powerful tool and already has a plethora of data manipulation
 operations implemented, most of them are very fast as well.

doc/source/install.rst

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -70,7 +70,23 @@ Dependencies
 * `pytz <http://pytz.sourceforge.net/>`__
 * Needed for time zone support

-Optional dependencies
+.. _install.recommended_dependencies:
+
+Recommended Dependencies
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+* `numexpr <http://code.google.com/p/numexpr/>`__: for accelerating certain numerical operations.
+``numexpr`` uses multiple cores as well as smart chunking and caching to achieve large speedups.
+* `bottleneck <http://berkeleyanalytics.com/bottleneck>`__: for accelerating certain types of ``nan``
+evaluations. ``bottleneck`` uses specialized cython routines to achieve large speedups.
+
+.. note::
+
+You are highly encouraged to install these libraries, as they provide large speedups, especially
+if working with large data sets.
+
+
+Optional Dependencies
 ~~~~~~~~~~~~~~~~~~~~~

 * `Cython <http://www.cython.org>`__: Only necessary to build development
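
Reviewer note on the new "Recommended Dependencies" section: since both libraries are optional at import time, a reader may want to confirm whether they are present in the current environment. The following is a minimal sketch using only the standard library. The helper name ``missing_recommended`` is hypothetical and is not part of pandas; the sketch assumes the import names are ``numexpr`` and ``bottleneck``, matching the names used in the section.

```python
import importlib.util

# Import names of the two recommended dependencies named in install.rst.
RECOMMENDED = ("numexpr", "bottleneck")


def missing_recommended(names=RECOMMENDED):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_recommended()
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All recommended dependencies are importable.")
```

``importlib.util.find_spec`` only locates a module without importing it, so the check is cheap and has no side effects from the libraries themselves.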

doc/source/v0.11.0.txt

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,8 @@ pay close attention to.
 There is a new section in the documentation, :ref:`10 Minutes to Pandas <10min>`,
 primarily geared to new users.

+There are several libraries that are now :ref:`Recommended Dependencies <install.recommended_dependencies>`
+
 Selection Choices
 ~~~~~~~~~~~~~~~~~

@@ -224,11 +226,11 @@ API changes
 Enhancements
 ~~~~~~~~~~~~

-- Numexpr is now a 'highly recommended dependency', to accelerate certain
-types of expression evaluation
+- Numexpr is now a :ref:`Recommended Dependencies <install.recommended_dependencies>`, to accelerate certain
+types of numerical and boolean operations

-- Bottleneck is now a 'highly recommended dependency', to accelerate certain
-types of numerical evaluations
+- Bottleneck is now a :ref:`Recommended Dependencies <install.recommended_dependencies>`, to accelerate certain
+types of ``nan`` operations

 - In ``HDFStore``, provide dotted attribute access to ``get`` from stores
 (e.g. ``store.df == store['df']``)
