
mahara-contributors team mailing list archive

[Bug 1997291] Re: DOMDocument::loadHTML() expecting ';'

 

** Changed in: mahara
    Milestone: None => 23.04.0

** Changed in: mahara
    Milestone: 23.04.0 => 22.10.1

** Also affects: mahara/23.04
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Mahara
Contributors, which is subscribed to Mahara.
Matching subscriptions: mahara-contributors
https://bugs.launchpad.net/bugs/1997291

Title:
  DOMDocument::loadHTML() expecting ';'

Status in Mahara:
  New
Status in Mahara 23.04 series:
  New

Bug description:
  While running /lib/cron.php I noticed a lot of PHP Warnings on the
  next page load. These are also in the error log.

  My current suspicion is that these are triggered when trying to send
  e-mail about forum activity.

  The actual error:

  DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity.

  Because this is only a PHP Warning, it isn't causing crashes.  However,
  it is filling up the error logs and may be causing unexpected behaviour
  in other places.

  This warning occurs whenever html2text() is called. That function
  calls HtmltoText, which in turn calls DOMDocument, and this is where
  the warning is raised.  DOMDocument::loadHTML() emits it once for every
  non-encoded ampersand found in the document, i.e. & rather than &amp;

  Showing the error in an interactive shell:

  php > # Example 1:
  php > $s = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head><p>Forum topic</p><p><img width="1024" height="" style="" alt="body_fire.jpg" src="https://dev.mahara.local/artefact/file/download.php?file=193&embedded=1&group=1&topic=1&post=1"></p>';
  php > 
  php > # Example 2 is to demonstrate a working version of the string:
  php > $t = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head><p>Forum topic</p><p><img width="1024" height="" style="" alt="body_fire.jpg" src="https://dev.mahara.local/artefact/file/download.php?file=193&amp;embedded=1&amp;group=1&amp;topic=1&amp;post=1"></p>';
  php > 
  php > $doc = new DOMDocument;
  php > $doc->loadHTML($s);
  PHP Warning:  DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 2 in php shell code on line 1
  PHP Warning:  DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 2 in php shell code on line 1
  PHP Warning:  DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 2 in php shell code on line 1
  PHP Warning:  DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 2 in php shell code on line 1
  php > $doc->loadHTML($t);
  php > 
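
  To count these warnings from a script without flooding the error log,
  libxml's internal error buffer can be used.  This is only a sketch:
  collect_html_parse_errors() is a hypothetical helper, not a Mahara
  function, and the shortened sample strings are illustrative.  Which
  messages appear, and how many, depends on the libxml2 version PHP is
  built against.

```php
<?php
// Sketch only: collect libxml parse errors instead of emitting PHP warnings.
// collect_html_parse_errors() is a hypothetical helper, not part of Mahara.
function collect_html_parse_errors(string $html): array {
    // Buffer libxml errors internally and remember the previous setting.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors(); // start from an empty buffer

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError; the message usually ends with a newline.
        $messages[] = trim($error->message);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting
    return $messages;
}

// Shortened, illustrative versions of the two strings from the shell session.
$bare    = '<p><img alt="x" src="download.php?file=193&embedded=1&group=1"></p>';
$encoded = '<p><img alt="x" src="download.php?file=193&amp;embedded=1&amp;group=1"></p>';

// The exact messages depend on the libxml2 version in use.
print_r(collect_html_parse_errors($bare));
print_r(collect_html_parse_errors($encoded));
```

  This only hides the symptom; the bare ampersands are still present in
  the stored post.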

  The examples I've found so far are all interaction_forum_post entries
  with images in them.

  The specific code that causes this looks to be in
  prepare_post_body() in htdocs/interaction/forum/lib.php.  When a post
  is saved, it explicitly replaces &amp; with a bare & character in any
  tag that has a call to download.php.
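
  One possible mitigation, sketched here under the assumption that the
  HTML is cleaned up just before it reaches DOMDocument rather than in
  prepare_post_body() itself, is to re-encode any ampersand that does
  not already start a character reference.  encode_bare_ampersands() is
  a hypothetical helper, not an existing Mahara function, and this is not
  necessarily how the bug should be fixed.

```php
<?php
// Sketch only: encode every "&" that does not already begin a named
// (&amp;), decimal (&#38;) or hexadecimal (&#x26;) character reference.
// encode_bare_ampersands() is a hypothetical name, not a Mahara function.
function encode_bare_ampersands(string $html): string {
    return preg_replace(
        '/&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/',
        '&amp;',
        $html
    );
}

// Illustrative input modelled on the forum post from the report.
$src = 'download.php?file=193&embedded=1&amp;group=1';

echo encode_bare_ampersands($src), "\n";
// prints: download.php?file=193&amp;embedded=1&amp;group=1
```

  Already-encoded references are left alone, so running the helper twice
  gives the same result.  A regular expression over the whole document
  would also touch ampersands inside script or style content, so this
  is only suitable for simple post bodies.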

To manage notifications about this bug go to:
https://bugs.launchpad.net/mahara/+bug/1997291/+subscriptions


