Closed
Description
Current Situation
If an HTML string has ...
- Additional whitespace at the tail of the HTML document
- A null (non-terminated) HTML tag such as <hr> within the document
... then html_to_vdom will silently fail without raising an exception.
See my Django IDOM PR for how I caused the script tag issue to occur at this line of code.
Proposed Actions
The whitespace issue can be fixed with a simple regex and/or str.strip().
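As a rough sketch of the whitespace fix, the input could be trimmed before it is handed to html_to_vdom. The helper names normalize_html and normalize_html_regex are hypothetical and not part of IDOM; they only illustrate the two options mentioned above.

```python
import re


def normalize_html(source: str) -> str:
    """Hypothetical helper: trim leading and trailing whitespace with str.strip()."""
    return source.strip()


def normalize_html_regex(source: str) -> str:
    """Hypothetical helper: remove only the trailing whitespace using a regex."""
    # \s+ matches a run of whitespace, \Z anchors it to the very end of the string.
    return re.sub(r"\s+\Z", "", source)


if __name__ == "__main__":
    raw = "<div>hello</div>\n   \n"
    print(repr(normalize_html(raw)))
    print(repr(normalize_html_regex(raw)))
```

str.strip() also removes leading whitespace, whereas the regex variant only touches the tail of the document, which is the case described in this issue.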
The script tag issue needs further investigation. This might open up a can of worms, since it may require testing html_to_vdom with all idom.html types.
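To help with that investigation, a small diagnostic like the one below could flag non-terminated tags in a test input before it reaches html_to_vdom. This is only a sketch built on the standard library's html.parser, not on IDOM itself; the VoidTagFinder class and find_void_tags function are hypothetical names, and the tag set is the list of void elements from the HTML standard.

```python
from html.parser import HTMLParser

# Elements that the HTML standard defines as void (they never take a closing tag).
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class VoidTagFinder(HTMLParser):
    """Hypothetical helper: records every void element seen in the input."""

    def __init__(self) -> None:
        super().__init__()
        self.void_seen: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag in VOID_ELEMENTS:
            self.void_seen.append(tag)


def find_void_tags(source: str) -> list[str]:
    """Return the void tags found in an HTML string, in document order."""
    finder = VoidTagFinder()
    finder.feed(source)
    finder.close()
    return finder.void_seen


if __name__ == "__main__":
    print(find_void_tags("<div><p>text</p><hr><img src='x.png'></div>"))
```

Inputs for which find_void_tags returns a non-empty list would be natural candidates for the html_to_vdom test cases.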