Commit 743b83b

improve docs
1 parent 67b7e3e commit 743b83b

1 file changed

+16
-32
lines changed

doc/source/user_guide/io.rst

Lines changed: 16 additions & 32 deletions
@@ -1295,57 +1295,41 @@ too many fields will raise an error by default:

 You can elect to skip bad lines:

-.. code-block:: ipython
-
-    In [29]: pd.read_csv(StringIO(data), on_bad_lines="warn")
-    Skipping line 3: expected 3 fields, saw 4
+.. ipython:: ipython

-    Out[29]:
-       a  b   c
-    0  1  2   3
-    1  8  9  10
+   pd.read_csv(StringIO(data), on_bad_lines="warn")

 Or pass a callable function to handle the bad line if ``engine="python"``.
 The bad line will be a list of strings that was split by the ``sep``:

-.. code-block:: ipython
+.. versionadded:: 1.4.0
+
+.. ipython:: ipython
+
+   external_list = []

-    In [30]: pd.read_csv(StringIO(data), on_bad_lines=lambda x: x[-3:], engine="python")
-    Out[30]:
-       a  b   c
-    0  1  2   3
-    1  5  6   7
-    2  8  9  10
+   def func(line):
+       external_list.append(line)
+       return line[-3:]

-.. versionadded:: 1.4.0
+   pd.read_csv(StringIO(data), on_bad_lines=func, engine="python")

+   external_list

 You can also use the ``usecols`` parameter to eliminate extraneous column
 data that appear in some lines but not others:

-.. code-block:: ipython
-
-    In [31]: pd.read_csv(StringIO(data), usecols=[0, 1, 2])
+.. ipython:: ipython

-    Out[31]:
-       a  b   c
-    0  1  2   3
-    1  4  5   6
-    2  8  9  10
+   pd.read_csv(StringIO(data), usecols=[0, 1, 2])

 In case you want to keep all data including the lines with too many fields, you can
 specify a sufficient number of ``names``. This ensures that lines with not enough
 fields are filled with ``NaN``.

-.. code-block:: ipython
-
-    In [32]: pd.read_csv(StringIO(data), names=['a', 'b', 'c', 'd'])
+.. ipython:: ipython

-    Out[32]:
-       a  b   c    d
-    0  1  2   3  NaN
-    1  4  5   6    7
-    2  8  9  10  NaN
+   pd.read_csv(StringIO(data), names=['a', 'b', 'c', 'd'])

 .. _io.dialect:

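The changed documentation relies on a ``data`` variable defined outside this hunk. As a minimal, self-contained sketch of the two ``on_bad_lines`` behaviours it describes, the snippet below assumes a small CSV string whose third line has one field too many; that string and the names ``bad_lines`` and ``keep_last_three`` are illustrative choices, not taken from the commit. It also assumes pandas 1.4.0 or later, since the callable form was added in that release.

```python
from io import StringIO

import pandas as pd

# Assumed sample input (not shown in the hunk): the header declares three
# fields, but the third line carries four.
data = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10"

# on_bad_lines="warn" reports the offending line and leaves it out of the result.
skipped = pd.read_csv(StringIO(data), on_bad_lines="warn")

# Collect every bad line that the parser hands to the callback.
bad_lines = []


def keep_last_three(line):
    """Record the bad line, then return only its last three fields."""
    bad_lines.append(line)
    return line[-3:]


# A callable is only accepted together with engine="python"; it receives the
# bad line as a list of strings already split on the separator.
repaired = pd.read_csv(
    StringIO(data), on_bad_lines=keep_last_three, engine="python"
)

print(skipped)
print(repaired)
print(bad_lines)
```

Recording the bad lines in a list, as the new documentation example does with ``external_list``, makes it possible to inspect afterwards what the callback was actually given.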
