From a2f714ffcc2e4d53c21c5c0afbc09e8bca4c7743 Mon Sep 17 00:00:00 2001
From: yanglinlee
Date: Thu, 25 Jul 2019 16:05:42 -0400
Subject: [PATCH 1/2] DOC: add documentation for read_spss (#27476)

---
 doc/source/reference/io.rst  |  7 +++++++
 doc/source/user_guide/io.rst | 37 ++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/doc/source/reference/io.rst b/doc/source/reference/io.rst
index 666220d390cdc..91f4942d03b0d 100644
--- a/doc/source/reference/io.rst
+++ b/doc/source/reference/io.rst
@@ -105,6 +105,13 @@ SAS
 
    read_sas
 
+SPSS
+~~~~
+.. autosummary::
+   :toctree: api/
+
+   read_spss
+
 SQL
 ~~~
 .. autosummary::
diff --git a/doc/source/user_guide/io.rst b/doc/source/user_guide/io.rst
index ae288ba5bde16..292236965288c 100644
--- a/doc/source/user_guide/io.rst
+++ b/doc/source/user_guide/io.rst
@@ -39,6 +39,7 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
     binary;`Msgpack `__;:ref:`read_msgpack`;:ref:`to_msgpack`
     binary;`Stata `__;:ref:`read_stata`;:ref:`to_stata`
     binary;`SAS `__;:ref:`read_sas`;
+    binary;`SPSS `__;:ref:`read_spss`;
     binary;`Python Pickle Format `__;:ref:`read_pickle`;:ref:`to_pickle`
     SQL;`SQL `__;:ref:`read_sql`;:ref:`to_sql`
     SQL;`Google Big Query `__;:ref:`read_gbq`;:ref:`to_gbq`
@@ -5477,6 +5478,42 @@ web site.
 
 No official documentation is available for the SAS7BDAT format.
 
+.. _io.spss:
+
+.. _io.spss_reader:
+
+SPSS formats
+------------
+
+The top-level function :func:`read_spss` can read (but not write) SPSS
+`sav` (.sav) and `zsav` (.zsav) format files(since *v0.25.0*).
+
+SPSS files contain column names. By default the
+whole file is read, categorical columns are converted into ``pd.Categorical``
+and a ``DataFrame`` with all columns is returned.
+
+Specify a ``usecols`` to obtain a subset of columns. Specify ``convert_categoricals=False``
+to avoid converting categorical columns into ``pd.Categorical``.
+
+Read a spss file:
+
+.. code-block:: python
+
+    df = pd.read_spss('spss_data.zsav')
+
+Extract a subset of columns ``usecols`` from SPSS file and
+avoid converting categorical columns into ``pd.Categorical``:
+
+.. code-block:: python
+
+    df = pd.read_spss('spss_data.zsav', usecols=['foo', 'bar'],
+                      convert_categoricals=False)
+
+More info_ about the sav and zsav file format is available from the IBM
+web site.
+
+.. _info: https://www.ibm.com/support/knowledgecenter/en/SSLVMB_22.0.0/com.ibm.spss.statistics.help/spss/base/savedatatypes.htm
+
 .. _io.other:
 
 Other file formats

From e600b0483a9e5529c1b53dbb2c3407a66c936156 Mon Sep 17 00:00:00 2001
From: yanglinlee
Date: Fri, 26 Jul 2019 14:50:04 -0400
Subject: [PATCH 2/2] DOC: add documentation for read_spss (#27476)

---
 doc/source/user_guide/io.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/doc/source/user_guide/io.rst b/doc/source/user_guide/io.rst
index 292236965288c..8e5352c337072 100644
--- a/doc/source/user_guide/io.rst
+++ b/doc/source/user_guide/io.rst
@@ -5485,8 +5485,10 @@ No official documentation is available for the SAS7BDAT format.
 SPSS formats
 ------------
 
+.. versionadded:: 0.25.0
+
 The top-level function :func:`read_spss` can read (but not write) SPSS
-`sav` (.sav) and `zsav` (.zsav) format files(since *v0.25.0*).
+`sav` (.sav) and `zsav` (.zsav) format files.
 
 SPSS files contain column names. By default the
 whole file is read, categorical columns are converted into ``pd.Categorical``