Why Is Lxml Closing This "ol" Tag When Parsing?
Here is some HTML:
- item
Solution 1:
I think neither HTML 4 nor HTML5 allows a ul element as a direct child of an ol element. Only li elements can be direct children.
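The rule above can be checked mechanically. The following is only a rough sketch using Python's standard html.parser: it reports any non-li element that appears directly inside an ol or ul. The class name, the function name and the sample markup are made up for illustration (the markup from the question is not reproduced here), and the check is deliberately simplified: it ignores the script-supporting elements HTML5 also permits and does not model implied end tags.

```python
from html.parser import HTMLParser

LIST_TAGS = {"ol", "ul"}
# Void elements never get an end tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ListChildChecker(HTMLParser):
    """Hypothetical helper: records (parent, child) pairs where a list
    element has a direct child that is not an li."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open elements, as written in the source
        self.problems = []   # (list_tag, offending_child_tag) pairs

    def handle_starttag(self, tag, attrs):
        if self.stack and self.stack[-1] in LIST_TAGS and tag != "li":
            self.problems.append((self.stack[-1], tag))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_bad_list_children(markup):
    checker = ListChildChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


# Assumed sample markup, not the markup from the question.
print(find_bad_list_children("<ol><ul><li>item</li></ul></ol>"))
print(find_bad_list_children("<ol><li><ul><li>item</li></ul></li></ol>"))
```

The second call shows the usual fix: wrapping the nested ul in an li keeps the list structure valid, so a parser has no reason to rearrange it.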
That might be why an HTML parser builds a tree that does not reflect the nesting in your input markup. Whether a "traditional" HTML 4 parser, which is probably what the HTML parser in lxml/libxml2 implements, restructures the markup in the same way is something I don't remember, and I am not sure where to test it.
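One way to see what lxml actually does is to parse a small sample and print the resulting tree. This is a minimal sketch, assuming lxml is installed; the helper name and the sample markup are invented for illustration, since the markup from the question is not reproduced here. No expected output is shown because the result depends on the libxml2 version in use.

```python
from lxml import etree


def show_parsed_tree(markup):
    # etree.HTML runs the markup through libxml2's HTML parser and
    # returns the root <html> element of the repaired tree.
    root = etree.HTML(markup)
    return etree.tostring(root, pretty_print=True, encoding="unicode")


# Assumed sample: a ul placed directly inside an ol.
print(show_parsed_tree("<ol><ul><li>item</li></ul></ol>"))
```

If the printed tree shows the ul as a sibling of the ol rather than a child, the parser has closed the ol early; if the ul is still nested, the parser kept the structure as written.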
While two HTML5 validators flag your ul as a disallowed child of ol, current browsers seem to preserve that nesting.