registry team mailing list archive

[Bug 661890] Re: After parsing certain rogue html file, all further parses of any data raise an exception

 

Thanks for the report and the test case. It turns out that this is a
problem in libxml2, which fails to reset a fatal error flag (disableSAX)
in the parser context before the next parser run. It's easy to work
around in lxml.etree, so I've committed a quick fix.

diff -r 9ce32c6e84f4 src/lxml/parser.pxi
--- a/src/lxml/parser.pxi       Wed Oct 20 20:01:33 2010 +0200
+++ b/src/lxml/parser.pxi       Thu Oct 21 19:41:31 2010 +0200
@@ -504,6 +504,7 @@
         if self._c_ctxt is not NULL:
             if self._c_ctxt.html:
                 htmlparser.htmlCtxtReset(self._c_ctxt)
+                self._c_ctxt.disableSAX = 0 # work around bug in libxml2
             elif self._c_ctxt.spaceTab is not NULL or \
                     _LIBXML_VERSION_INT >= 20629: # work around bug in libxml2
                 xmlparser.xmlClearParserCtxt(self._c_ctxt)
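
To illustrate why the one-line change above helps, here is a minimal
sketch in plain Python. It is a toy model only, not libxml2 or lxml code:
ToyParserContext, toy_reset, toy_parse and parse_with_reused_context are
hypothetical names, and the real failure shows up as an exception rather
than an empty result. The assumption being modelled is simply that a
fatal error sets a flag that the reset routine does not clear.

```python
class ToyParserContext:
    """Hypothetical stand-in for a reusable parser context."""

    def __init__(self):
        self.disableSAX = 0   # set to 1 after a fatal error
        self.events = []      # events delivered for the current document


def toy_reset(ctxt):
    # Models a reset that clears per-document state but
    # forgets to clear the fatal error flag.
    ctxt.events = []


def toy_parse(ctxt, tokens):
    # "FATAL" models unrecoverable input; once the flag is set,
    # no further events are delivered.
    for tok in tokens:
        if tok == "FATAL":
            ctxt.disableSAX = 1
        if not ctxt.disableSAX:
            ctxt.events.append(tok)
    return list(ctxt.events)


def parse_with_reused_context(ctxt, tokens, apply_workaround):
    toy_reset(ctxt)
    if apply_workaround:
        ctxt.disableSAX = 0   # same idea as the added line in the diff
    return toy_parse(ctxt, tokens)


ctxt = ToyParserContext()
first = parse_with_reused_context(ctxt, ["a", "FATAL", "b"], False)
stuck = parse_with_reused_context(ctxt, ["x", "y"], False)
fixed = parse_with_reused_context(ctxt, ["x", "y"], True)
print(first, stuck, fixed)
```

Without the explicit reset, the second run on perfectly good input
delivers nothing because the flag from the first run is still set; with
it, the reused context behaves like a fresh one.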


** Changed in: lxml
   Importance: Undecided => Medium

** Changed in: lxml
       Status: New => Fix Committed

** Changed in: lxml
     Assignee: (unassigned) => Stefan Behnel (scoder)

** Bug watch added: GNOME Bug Tracker #632811
   https://bugzilla.gnome.org/show_bug.cgi?id=632811

** Also affects: libxml2 via
   https://bugzilla.gnome.org/show_bug.cgi?id=632811
   Importance: Unknown
       Status: Unknown

-- 
After parsing certain rogue html file, all further parses of any data raise an exception
https://bugs.launchpad.net/bugs/661890
You received this bug notification because you are a member of Registry
Administrators, which is the registrant for libxml2.