@@ -64,27 +64,11 @@ API changes
Enhancements
~~~~~~~~~~~~
+
- ``pd.read_html()`` can now parse HTML strings, files or urls and return
- DataFrames
- courtesy of @cpcloud. (GH3477_, GH3605_, GH3606_)
- - ``read_html()`` (GH3616_)
- - now works with only a *single* parser backend, that is:
- - BeautifulSoup4 + html5lib
- - does *not* and will never support using the html parsing library
- included with Python as a parser backend
- - is a bit smarter about the parent table elements of matched text: if
- multiple matches are found then only the *unique* parents of the result
- are returned (uniqueness is determined using ``set``).
- - no longer tries to guess about what you want to do with empty table cells
- - argument ``infer_types`` now defaults to ``False``.
- - now returns DataFrames whose default column index is the elements of
- ``<thead>`` elements in the HTML soup, if any exist.
- - considers all ``<th>`` and ``<td>`` elements inside of ``<thead>``
- elements.
- - tests are now correctly skipped if the proper libraries are not
- installed.
- - tests now include a ground-truth csv file from the FDIC failed bank list
- data set.
+ DataFrames, courtesy of @cpcloud. (GH3477_, GH3605_, GH3606_, GH3616_).
+ It works with a *single* parser backend: BeautifulSoup4 + html5lib
+
- ``HDFStore``
- will retain index attributes (freq,tz,name) on recreation (GH3499_)