|
1 | 1 | .. currentmodule:: pandas
|
2 | 2 | .. _compare_with_r:
|
3 | 3 |
|
4 |
| -******************************* |
5 | 4 | Comparison with R / R libraries
|
6 | 5 | *******************************
|
7 | 6 |
|
8 |
| -Since pandas aims to provide a lot of the data manipulation and analysis |
9 |
| -functionality that people use R for, this page was started to provide a more |
10 |
| -detailed look at the R language and it's many 3rd party libraries as they |
11 |
| -relate to pandas. In offering comparisons with R and CRAN libraries, we care |
12 |
| -about the following things: |
| 7 | +Since ``pandas`` aims to provide a lot of the data manipulation and analysis |
| 8 | +functionality that people use `R <http://www.r-project.org/>`__ for, this page |
| 9 | +was started to provide a more detailed look at the `R language |
| 10 | +<http://en.wikipedia.org/wiki/R_(programming_language)>`__ and its many |
| 11 | +third-party libraries as they relate to ``pandas``. In comparisons with R and |
| 12 | +CRAN libraries, we care about the following things: |
13 | 13 |
|
14 |
| - - **Functionality / flexibility**: what can / cannot be done with each tool |
15 |
| - - **Performance**: how fast are operations. Hard numbers / benchmarks are |
| 14 | + - **Functionality / flexibility**: what can/cannot be done with each tool |
| 15 | + - **Performance**: how fast are operations. Hard numbers/benchmarks are |
16 | 16 | preferable
|
17 |
| - - **Ease-of-use**: is one tool easier or harder to use (you may have to be |
18 |
| - the judge of this given side-by-side code comparisons) |
| 17 | + - **Ease-of-use**: is one tool easier/harder to use (you may have to be |
| 18 | + the judge of this, given side-by-side code comparisons) |
| 19 | + |
| 20 | +This page is also here to offer a bit of a translation guide for users of these |
| 21 | +R packages. |
| 22 | + |
| 23 | +Base R |
| 24 | +------ |
| 25 | + |
| 26 | +|subset|_ |
| 27 | +~~~~~~~~~~ |
| 28 | + |
| 29 | +.. versionadded:: 0.13 |
| 30 | + |
| 31 | +The :meth:`~pandas.DataFrame.query` method is similar to the base R ``subset`` |
| 32 | +function. In R you might want to get the rows of a ``data.frame`` where one |
| 33 | +column's values are less than or equal to another column's values: |
| 34 | + |
| 35 | + .. code-block:: r |
| 36 | +
|
| 37 | + df <- data.frame(a=rnorm(10), b=rnorm(10)) |
| 38 | + subset(df, a <= b) |
| 39 | + df[df$a <= df$b,] # note the comma |
| 40 | +
|
| 41 | +In ``pandas``, there are a few ways to perform subsetting. You can use |
| 42 | +:meth:`~pandas.DataFrame.query` or pass an expression as if it were an |
| 43 | +index/slice as well as standard boolean indexing: |
| 44 | + |
| 45 | + .. ipython:: python |
| 46 | +
|
| 47 | + from pandas import DataFrame |
| 48 | + from numpy.random import randn |
| 49 | +
|
| 50 | + df = DataFrame({'a': randn(10), 'b': randn(10)}) |
| 51 | + df.query('a <= b') |
| 52 | + df['a <= b'] |
| 53 | + df[df.a <= df.b] |
| 54 | + df.loc[df.a <= df.b] |
19 | 55 |
|
20 |
| -As I do not have an encyclopedic knowledge of R packages, feel free to suggest |
21 |
| -additional CRAN packages to add to this list. This is also here to offer a big |
22 |
| -of a translation guide for users of these R packages. |
| 56 | +For more details and examples see :ref:`the query documentation |
| 57 | +<indexing.query>`. |
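As a minimal sketch of the equivalence described above (assuming a recent pandas and NumPy; the seeded generator and the variable names ``by_query``, ``by_mask`` and ``by_loc`` are illustrative choices, not part of the original page), the query form and the two boolean-indexing forms should select the same rows:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption of this sketch).
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.standard_normal(10), "b": rng.standard_normal(10)})

# Three ways to keep the rows where column "a" is less than or equal to "b".
by_query = df.query("a <= b")          # expression evaluated by DataFrame.query
by_mask = df[df["a"] <= df["b"]]       # plain boolean indexing
by_loc = df.loc[df["a"] <= df["b"]]    # label-based indexer with a boolean mask
```

Bracket access (``df["a"]``) is used here instead of attribute access because it keeps working when a column name collides with a ``DataFrame`` method or attribute.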
23 | 58 |
|
24 |
| -data.frame |
25 |
| ----------- |
| 59 | + |
| 60 | +|with|_ |
| 61 | +~~~~~~~~ |
| 62 | + |
| 63 | +.. versionadded:: 0.13 |
| 64 | + |
| 65 | +An expression using a ``data.frame`` called ``df`` in R with the columns ``a`` |
| 66 | +and ``b`` would be evaluated using ``with`` like so: |
| 67 | + |
| 68 | + .. code-block:: r |
| 69 | +
|
| 70 | + df <- data.frame(a=rnorm(10), b=rnorm(10)) |
| 71 | + with(df, a + b) |
| 72 | + df$a + df$b # same as the previous expression |
| 73 | +
|
| 74 | +In ``pandas`` the equivalent expression, using the |
| 75 | +:meth:`~pandas.DataFrame.eval` method, would be: |
| 76 | + |
| 77 | + .. ipython:: python |
| 78 | +
|
| 79 | + df = DataFrame({'a': randn(10), 'b': randn(10)}) |
| 80 | + df.eval('a + b') |
| 81 | + df.a + df.b # same as the previous expression |
| 82 | +
|
| 83 | +In certain cases (typically with large frames) :meth:`~pandas.DataFrame.eval` |
| 84 | +will be much faster than evaluation in pure Python. For more details and |
| 85 | +examples see :ref:`the eval documentation <enhancingperf.eval>`. |
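The following is a small sketch, under the assumption of a recent pandas and NumPy, that checks the ``eval`` form against ordinary column arithmetic. The names ``via_eval`` and ``direct`` are hypothetical, and the comparison uses a tolerance because the expression engine behind ``eval`` may differ between installations:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption of this sketch).
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.standard_normal(10), "b": rng.standard_normal(10)})

via_eval = df.eval("a + b")    # expression string evaluated against the columns
direct = df["a"] + df["b"]     # the same sum written as ordinary pandas code

# Compare with a tolerance rather than exact equality.
same = np.allclose(via_eval, direct)
```

On a frame this small any speed difference is unlikely to be visible; the sketch only illustrates that the two spellings compute the same values.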
26 | 86 |
|
27 | 87 | zoo
|
28 | 88 | ---
|
|
36 | 96 | reshape / reshape2
|
37 | 97 | ------------------
|
38 | 98 |
|
| 99 | + |
| 100 | +.. |with| replace:: ``with`` |
| 101 | +.. _with: http://finzi.psych.upenn.edu/R/library/base/html/with.html |
| 102 | + |
| 103 | +.. |subset| replace:: ``subset`` |
| 104 | +.. _subset: http://finzi.psych.upenn.edu/R/library/base/html/subset.html |