DOC: Improve 10min/EnhancingPerf.rst #45390

Merged
merged 4 commits into from
Jan 17, 2022
82 changes: 40 additions & 42 deletions doc/source/user_guide/10min.rst
@@ -29,7 +29,7 @@ a default integer index:
s = pd.Series([1, 3, 5, np.nan, 6, 8])
s

Creating a :class:`DataFrame` by passing a NumPy array, with a datetime index
Creating a :class:`DataFrame` by passing a NumPy array, with a datetime index using :func:`date_range`
and labeled columns:

.. ipython:: python
@@ -93,14 +93,15 @@ Viewing data

See the :ref:`Basics section <basics>`.

Here is how to view the top and bottom rows of the frame:
Use :meth:`DataFrame.head` and :meth:`DataFrame.tail` to view the top and bottom rows of the frame
respectively:

.. ipython:: python

df.head()
df.tail(3)
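
As a minimal, self-contained sketch of the two calls above: ``demo`` is a hypothetical frame built only for this illustration (it is not the ``df`` from the guide). With no argument, ``head`` returns five rows.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration only; not the guide's ``df``.
demo = pd.DataFrame({"A": np.arange(10), "B": np.arange(10) * 2})

first_rows = demo.head()   # no argument: the first 5 rows
last_rows = demo.tail(3)   # explicit count: the last 3 rows

print(first_rows)
print(last_rows)
```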

Display the index, columns:
Display the :attr:`DataFrame.index` or :attr:`DataFrame.columns`:

.. ipython:: python

@@ -116,7 +117,7 @@ while pandas DataFrames have one dtype per column**. When you call
of the dtypes in the DataFrame. This may end up being ``object``, which requires
casting every value to a Python object.

For ``df``, our :class:`DataFrame` of all floating-point values,
For ``df``, our :class:`DataFrame` of all floating-point values, and
:meth:`DataFrame.to_numpy` is fast and doesn't require copying data:

.. ipython:: python
@@ -147,13 +148,13 @@ Transposing your data:

df.T

Sorting by an axis:
:meth:`DataFrame.sort_index` sorts by an axis:

.. ipython:: python

df.sort_index(axis=1, ascending=False)
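
The ``axis`` argument decides which labels are sorted. The sketch below uses a small hypothetical frame (``demo`` is an assumption for illustration) to contrast sorting column labels with sorting row labels.

```python
import pandas as pd

# Hypothetical frame for illustration only.
demo = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])

# axis=1 sorts the column labels; ascending=False reverses them.
cols_desc = demo.sort_index(axis=1, ascending=False)

# axis=0 (the default) sorts the row labels instead.
rows_desc = demo.sort_index(axis=0, ascending=False)

print(cols_desc)
print(rows_desc)
```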

Sorting by values:
:meth:`DataFrame.sort_values` sorts by values:

.. ipython:: python

@@ -166,8 +167,8 @@ Selection

While standard Python / NumPy expressions for selecting and setting are
intuitive and come in handy for interactive work, for production code, we
recommend the optimized pandas data access methods, ``.at``, ``.iat``,
``.loc`` and ``.iloc``.
recommend the optimized pandas data access methods, :meth:`DataFrame.at`, :meth:`DataFrame.iat`,
:meth:`DataFrame.loc` and :meth:`DataFrame.iloc`.

See the indexing documentation :ref:`Indexing and Selecting Data <indexing>` and :ref:`MultiIndex / Advanced Indexing <advanced>`.
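
A short sketch of how the four accessors differ, using a hypothetical frame (``demo`` and its labels are assumptions for illustration). ``.loc`` and ``.at`` take labels, while ``.iloc`` and ``.iat`` take integer positions; ``.at`` and ``.iat`` return a single scalar.

```python
import pandas as pd

# Hypothetical frame for illustration only.
demo = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]},
    index=["x", "y", "z"],
)

row_by_label = demo.loc["y"]         # label-based: the row labelled "y" as a Series
scalar_by_label = demo.at["y", "B"]  # label-based scalar access
block_by_pos = demo.iloc[0:2, 0:1]   # position-based: first two rows, first column
scalar_by_pos = demo.iat[2, 0]       # position-based scalar access

print(row_by_label)
print(scalar_by_label, scalar_by_pos)
print(block_by_pos)
```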

@@ -181,7 +182,7 @@ equivalent to ``df.A``:

df["A"]

Selecting via ``[]``, which slices the rows:
Selecting via ``[]`` (``__getitem__``), which slices the rows:

.. ipython:: python

@@ -191,7 +192,7 @@ Selecting via ``[]``, which slices the rows:
Selection by label
~~~~~~~~~~~~~~~~~~

See more in :ref:`Selection by Label <indexing.label>`.
See more in :ref:`Selection by Label <indexing.label>` using :meth:`DataFrame.loc` or :meth:`DataFrame.at`.

For getting a cross section using a label:

@@ -232,7 +233,7 @@ For getting fast access to a scalar (equivalent to the prior method):
Selection by position
~~~~~~~~~~~~~~~~~~~~~

See more in :ref:`Selection by Position <indexing.integer>`.
See more in :ref:`Selection by Position <indexing.integer>` using :meth:`DataFrame.iloc` or :meth:`DataFrame.iat`.

Select via the position of the passed integers:

@@ -361,19 +362,19 @@ returns a copy of the data:
df1.loc[dates[0] : dates[1], "E"] = 1
df1

To drop any rows that have missing data:
:meth:`DataFrame.dropna` drops any rows that have missing data:

.. ipython:: python

df1.dropna(how="any")

Filling missing data:
:meth:`DataFrame.fillna` fills missing data:

.. ipython:: python

df1.fillna(value=5)
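
Both calls return new objects and leave the original untouched. The sketch below shows this on a hypothetical frame (``demo`` is an assumption for illustration, not the guide's ``df1``).

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in each of two rows.
demo = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

dropped = demo.dropna(how="any")  # drops every row that contains a NaN
filled = demo.fillna(value=5)     # replaces each NaN with 5

print(dropped)
print(filled)
print(demo)  # the original still contains its NaN values
```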

To get the boolean mask where values are ``nan``:
:func:`isna` gets the boolean mask where values are ``nan``:

.. ipython:: python

@@ -415,7 +416,7 @@ In addition, pandas automatically broadcasts along the specified dimension:
Apply
~~~~~

Applying functions to the data:
:meth:`DataFrame.apply` applies a user defined function to the data:

.. ipython:: python

@@ -461,7 +462,7 @@ operations.

See the :ref:`Merging section <merging>`.

Concatenating pandas objects together with :func:`concat`:
Concatenating pandas objects together along an axis with :func:`concat`:

.. ipython:: python

@@ -482,7 +483,7 @@ Concatenating pandas objects together with :func:`concat`:
Join
~~~~

SQL style merges. See the :ref:`Database style joining <merging.join>` section.
:func:`merge` enables SQL style join types along specific columns. See the :ref:`Database style joining <merging.join>` section.

.. ipython:: python

@@ -572,7 +573,7 @@ columns:
stacked = df2.stack()
stacked

With a "stacked" DataFrame or Series (having a ``MultiIndex`` as the
With a "stacked" DataFrame or Series (having a :class:`MultiIndex` as the
``index``), the inverse operation of :meth:`~DataFrame.stack` is
:meth:`~DataFrame.unstack`, which by default unstacks the **last level**:
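
A minimal sketch of that inverse relationship, assuming a small hypothetical frame (``demo`` is not the guide's ``df2``). Passing a level number to ``unstack`` selects a level other than the last one. Some pandas releases may emit a ``FutureWarning`` about the ``stack`` implementation; the result for this input should be unaffected.

```python
import pandas as pd

# Hypothetical frame for illustration only.
demo = pd.DataFrame({"A": [1, 3], "B": [2, 4]}, index=["x", "y"])

stacked = demo.stack()            # Series indexed by (row label, column label)
restored = stacked.unstack()      # default: the last level moves back to the columns
transposed = stacked.unstack(0)   # level 0 (the row labels) moves to the columns

print(stacked)
print(restored)
print(transposed)
```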

@@ -599,7 +600,7 @@ See the section on :ref:`Pivot Tables <reshaping.pivot>`.
)
df

We can produce pivot tables from this data very easily:
:func:`pivot_table` pivots a :class:`DataFrame` specifying the ``values``, ``index`` and ``columns``:

.. ipython:: python

@@ -620,7 +621,7 @@ financial applications. See the :ref:`Time Series section <timeseries>`.
ts = pd.Series(np.random.randint(0, 500, len(rng)), index=rng)
ts.resample("5Min").sum()
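
Because the guide's series holds random values, the effect of resampling is easier to verify with deterministic data. The sketch below assumes a hypothetical series of ones at one-second spacing, so each one-minute bin simply counts its 60 observations.

```python
import pandas as pd

# Hypothetical data: one observation per second for two minutes.
rng = pd.date_range("2012-01-01", periods=120, freq="s")
ts = pd.Series(1, index=rng)

# Downsample secondly data into one-minute bins and sum each bin.
per_minute = ts.resample("1min").sum()
print(per_minute)
```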

Time zone representation:
:meth:`Series.tz_localize` localizes a time series to a time zone:

.. ipython:: python

@@ -630,7 +631,7 @@ Time zone representation:
ts_utc = ts.tz_localize("UTC")
ts_utc
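
A self-contained sketch of localizing and then converting, assuming a hypothetical three-day series. Localizing attaches a time zone to naive timestamps without shifting them; converting keeps the same instants and re-expresses them in another zone.

```python
import pandas as pd

# Hypothetical naive (time zone unaware) daily series.
rng = pd.date_range("2012-03-06", periods=3, freq="D")
ts = pd.Series([1, 2, 3], index=rng)

ts_utc = ts.tz_localize("UTC")                # attach UTC to the naive timestamps
ts_eastern = ts_utc.tz_convert("US/Eastern")  # same instants, shown in US/Eastern

print(ts_utc)
print(ts_eastern)
```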

Converting to another time zone:
:meth:`Series.tz_convert` converts a timezone-aware time series to another time zone:

.. ipython:: python

@@ -722,7 +723,7 @@ We use the standard convention for referencing the matplotlib API:

plt.close("all")

The :meth:`~plt.close` method is used to `close <https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.close.html>`__ a figure window:
The ``plt.close`` method is used to `close <https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.close.html>`__ a figure window:

.. ipython:: python

@@ -732,7 +733,7 @@ The :meth:`~plt.close` method is used to `close <https://matplotlib.org/3.1.1/ap
@savefig series_plot_basic.png
ts.plot();

If running under Jupyter Notebook, the plot will appear on :meth:`~ts.plot`. Otherwise use
If running under Jupyter Notebook, the plot will appear on :meth:`~Series.plot`. Otherwise use
`matplotlib.pyplot.show <https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.show.html>`__ to show it or
`matplotlib.pyplot.savefig <https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.savefig.html>`__ to write it to a file.

@@ -756,19 +757,19 @@ of the columns with labels:
@savefig frame_plot_basic.png
plt.legend(loc='best');

Getting data in/out
-------------------
Importing and exporting data
----------------------------

CSV
~~~

:ref:`Writing to a csv file: <io.store_in_csv>`
:ref:`Writing to a csv file: <io.store_in_csv>` using :meth:`DataFrame.to_csv`

.. ipython:: python

df.to_csv("foo.csv")
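
To try a CSV round trip without touching the file system, an in-memory text buffer can stand in for the file. This is only a sketch: ``demo`` and ``buffer`` are hypothetical names, and ``index_col=0`` is passed when reading because ``to_csv`` writes the index as the first column by default.

```python
import io

import pandas as pd

# Hypothetical frame for illustration only.
demo = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

buffer = io.StringIO()
demo.to_csv(buffer)  # the index is written as the first, unnamed column
buffer.seek(0)       # rewind before reading the text back

restored = pd.read_csv(buffer, index_col=0)
print(restored)
```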

:ref:`Reading from a csv file: <io.read_csv_table>`
:ref:`Reading from a csv file: <io.read_csv_table>` using :func:`read_csv`

.. ipython:: python

@@ -786,13 +787,13 @@ HDF5

Reading and writing to :ref:`HDFStores <io.hdf5>`.

Writing to a HDF5 Store:
Writing to a HDF5 Store using :meth:`DataFrame.to_hdf`:

.. ipython:: python

df.to_hdf("foo.h5", "df")

Reading from a HDF5 Store:
Reading from a HDF5 Store using :func:`read_hdf`:

.. ipython:: python

@@ -806,15 +807,15 @@ Reading from a HDF5 Store:
Excel
~~~~~

Reading and writing to :ref:`MS Excel <io.excel>`.
Reading and writing to :ref:`Excel <io.excel>`.

Writing to an excel file:
Writing to an excel file using :meth:`DataFrame.to_excel`:

.. ipython:: python

df.to_excel("foo.xlsx", sheet_name="Sheet1")

Reading from an excel file:
Reading from an excel file using :func:`read_excel`:

.. ipython:: python

@@ -828,16 +829,13 @@ Reading from an excel file:
Gotchas
-------

If you are attempting to perform an operation you might see an exception like:
If you are attempting to perform a boolean operation on a :class:`Series` or :class:`DataFrame`
you might see an exception like:

.. code-block:: python

>>> if pd.Series([False, True, False]):
... print("I was true")
Traceback
...
ValueError: The truth value of an array is ambiguous. Use a.empty, a.any() or a.all().
.. ipython:: python
:okexcept:

See :ref:`Comparisons<basics.compare>` for an explanation and what to do.
if pd.Series([False, True, False]):
print("I was true")

See :ref:`Gotchas<gotchas>` as well.
See :ref:`Comparisons<basics.compare>` and :ref:`Gotchas<gotchas>` for an explanation and what to do.
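
A minimal sketch of the failure and the usual ways around it: the exception is caught here only so the script keeps running, and the reductions ``any()``, ``all()`` and the ``empty`` attribute state explicitly which question is being asked of the :class:`Series`.

```python
import pandas as pd

s = pd.Series([False, True, False])

try:
    if s:  # ambiguous: a Series has no single truth value
        print("I was true")
except ValueError as err:
    print(f"ValueError: {err}")

# Reduce to a single boolean explicitly instead.
any_true = s.any()  # is at least one element True?
all_true = s.all()  # are all elements True?
is_empty = s.empty  # does the Series have no elements?

if any_true:
    print("at least one element was true")
```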