Commit bae62ea

DOC: minor edits in missing_data.rst and io/stata edits
1 parent 064445f commit bae62ea

3 files changed: 43 additions & 22 deletions

doc/source/io.rst

Lines changed: 29 additions & 18 deletions
@@ -1833,8 +1833,27 @@ There are a few other available functions:
 **SQLite**. Moreover, the **index** will currently be **dropped**.
 
 
+STATA Format
+------------
+
+Writing to STATA format
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. _io.StataWriter:
+
+The function :func:'~pandas.io.StataWriter.write_file' will write a DataFrame
+into a .dta file. The format version of this file is always the latest one,
+115.
+
+.. ipython:: python
+
+   from pandas.io.stata import StataWriter
+   df = DataFrame(randn(10,2),columns=list('AB'))
+   writer = StataWriter('stata.dta',df)
+   writer.write_file()
+
 Reading from STATA format
-~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~
 
 .. _io.StataReader:
 
@@ -1845,30 +1864,22 @@ initialization. Its function :func:'~pandas.io.StataReader.data' will
 read the observations, converting them to a DataFrame which is returned:
 
 .. ipython:: python
-   reader = StataReader(dta_filepath)
-   dataframe = reader.data()
+
+   from pandas.io.stata import StataReader
+   reader = StataReader('stata.dta')
+   reader.data()
 
 The parameter convert_categoricals indicates whether value labels should be
 read and used to create a Categorical variable from them. Value labels can
 also be retrieved by the function variable_labels, which requires data to be
 called before.
+
 The StataReader supports .dta Formats 104, 105, 108, 113-115.
 
-Alternatively, the function :func:'~pandas.io.read_stata' can be used:
+Alternatively, the function :func:'~pandas.io.read_stata' can be used
 
 .. ipython:: python
-   dataframe = read_stata(dta_filepath)
-
-
-Writing to STATA format
-~~~~~~~~~~~~~~~~~~~~~~
-
-.. _io.StataWriter:
-
-The function :func:'~pandas.io.StataWriter.write_file' will write a DataFrame
-into a .dta file. The format version of this file is always the latest one,
-115.
+   :suppress:
 
-.. ipython:: python
-   writer = StataWriter(filename, dataframe)
-   writer.write_file()
+   import os
+   os.remove('stata.dta')
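
For readers who want to try the write/read round trip described in this file outside the docs build, here is a minimal sketch. It is not part of the commit. It assumes a recent pandas release and uses ``DataFrame.to_stata`` and ``pandas.read_stata`` rather than the ``StataWriter``/``StataReader`` calls shown in the diff; the ``grade`` column and the temporary file path are illustrative choices.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Two float columns as in the docs example, plus an illustrative
# categorical column so that value labels get written to the file.
df = pd.DataFrame({
    "A": np.random.randn(4),
    "B": np.random.randn(4),
    "grade": pd.Categorical(["low", "high", "low", "high"]),
})

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "stata.dta")

    # Write the frame to a .dta file without the index column.
    df.to_stata(path, write_index=False)

    # Default behaviour: value labels are turned into a Categorical.
    labeled = pd.read_stata(path)

    # With convert_categoricals=False the raw codes are returned instead.
    coded = pd.read_stata(path, convert_categoricals=False)
```

Using a temporary directory plays the same role as the suppressed ``os.remove('stata.dta')`` cleanup in the diff: no file is left behind after the example runs.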

doc/source/missing_data.rst

Lines changed: 13 additions & 3 deletions
@@ -334,8 +334,10 @@ missing and interpolate over them:
 
    ser.replace([1, 2, 3], method='pad')
 
+.. _missing_data.replace_expression:
+
 String/Regular Expression Replacement
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 .. note::
 
@@ -350,10 +352,14 @@ String/Regular Expression Replacement
 Replace the '.' with ``nan`` (str -> str)
 
 .. ipython:: python
+   :suppress:
 
    from numpy.random import rand, randn
-   from numpy import nan
+   nan = np.nan
    from pandas import DataFrame
+
+.. ipython:: python
+
    d = {'a': range(4), 'b': list('ab..'), 'c': ['a', 'b', nan, 'd']}
    df = DataFrame(d)
    df.replace('.', nan)
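
The str -> str replacement in this hunk can be reproduced standalone with the sketch below, which is not part of the commit. It assumes a recent pandas and uses explicit ``np``/``pd`` imports instead of the suppressed setup block; the regular-expression variant and its pattern are illustrative additions.

```python
import numpy as np
import pandas as pd

d = {"a": range(4), "b": list("ab.."), "c": ["a", "b", np.nan, "d"]}
df = pd.DataFrame(d)

# Literal match: only cells exactly equal to '.' become NaN.
literal = df.replace(".", np.nan)

# Regex match (illustrative pattern): a '.' optionally padded by whitespace.
by_regex = df.replace(r"^\s*\.\s*$", np.nan, regex=True)
```

Both calls leave the integer column ``a`` untouched and return a new frame, so ``df`` itself still contains the ``'.'`` placeholders.
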
@@ -434,16 +440,20 @@ want to use a regular expression.
 a compiled regular expression is valid as well.
 
 Numeric Replacement
-^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~
 
 Similar to ``DataFrame.fillna``
 
 .. ipython:: python
+   :suppress:
 
    from numpy.random import rand, randn
    from numpy import nan
    from pandas import DataFrame
    from pandas.util.testing import assert_frame_equal
+
+.. ipython:: python
+
    df = DataFrame(randn(10, 2))
    df[rand(df.shape[0]) > 0.5] = 1.5
    df.replace(1.5, nan)
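
The relationship to ``DataFrame.fillna`` mentioned in this hunk can be illustrated with the following sketch, which is not part of the commit. It assumes a recent pandas, and it marks every other row with the sentinel value instead of using the random mask from the docs so that the result is deterministic.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 2)))

# Deterministic stand-in for the random row mask used in the docs.
df.iloc[::2] = 1.5

# replace() maps the sentinel value to NaN ...
replaced = df.replace(1.5, np.nan)

# ... and fillna() performs the inverse mapping.
refilled = replaced.fillna(1.5)
```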

doc/source/v0.11.1.txt

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ Enhancements
 - Added module for reading and writing Stata files: pandas.io.stata (GH1512_)
 - ``DataFrame.replace()`` now allows regular expressions on contained
   ``Series`` with object dtype. See the examples section in the regular docs
-  and the generated documentation for the method for more details.
+  :ref:`Replacing via String Expression <missing_data.replace_expression>`
 
 See the `full release notes
 <https://github.com/pydata/pandas/blob/master/RELEASE.rst>`__ or issue tracker
