Peter Nemeth .malomsok. - 4 months ago
HTML Question

php html tags converted to string

I am trying to process an HTML file with PHP as a DOM document. Processing works, but when I save the HTML document with $html->saveHTMLFile("file_out.html"); all link tags are converted from:

Click here: <a title="editable" href="http://somewhere.net">somewhere.net</a>


to

Click here: &lt;a title="editable" href="http://somewhere.net"&gt; somewhere.net &lt;/a&gt;


I process the links as PHP scripts; maybe this makes a difference?
I cannot convert the &lt; back to < with html_entity_decode() or similar. Is there any other conversion or encoding I can use?

The php script looks like the following:

<?php
$text = $_POST["textareaX"];
$id = $_GET["id"];
$ref = $_GET["ref"];
$html = new DOMDocument();
$html->preserveWhiteSpace = true;
$html->formatOutput = false;
$html->substituteEntities = false;
$html->loadHTMLFile($ref.".html");
$elem = $html->getElementById($id);
$elem->nodeValue = $innerHTML;

if ($text == "") {
  $text = "--- No details. ---";
}
$newtext = "";
$words = explode(" ",$text);
foreach ($words as $word) {
  if (strpos($word, "http://") !== false) {
    $newtext .= "<a alt=\"editable\" href=\"".$word."\">".$word."</a>";
  }
  else {
    $newtext .= $word." ";
  }
}

$text = $newtext;

function setInnerHTML($DOM, $element, $innerHTML) {
  $node = $DOM->createTextNode($innerHTML);
  $children = $element->childNodes;
  foreach ($children as $child) {
    $element->removeChild($child);
  }
  $element->appendChild($node);
}

setInnerHTML($html, $elem, $text);
$html->saveHTMLFile($ref.".html");
header('Location: '."tracking.php?ref=$ref&user=unLock");
?>


The script takes the file reference from the "id" and "ref" query parameters and the input data from the "textareaX" POST field. It then opens the file, finds the HTML element by id, and replaces its content (a link) with the input from the textarea. I provide only the URL in the textarea and the script builds the hyperlink from it. Finally, the result is put back into the document and the input file is overwritten.

When I write the new file, though, the link <a href= ...> </a> is converted to &lt;a href=...&gt; &lt;/a&gt;, which is a problem.

Jim
Answer

Here is part of your code with the issue identified:

<?php

function setInnerHTML($DOM, $element, $innerHTML) {
  /*********************************
      Well, there's your problem:
  **********************************/
  $node = $DOM->createTextNode($innerHTML);
  $children = $element->childNodes;
  foreach ($children as $child) {
    $element->removeChild($child);
  }
  $element->appendChild($node);
}

?>

What you are doing is passing your new anchor (a) tag as a string and then creating a text node out of it (text is just that: text, not HTML). A text node holds character data rather than markup, so when the document is saved the characters < and > are written as &lt; and &gt;. The tag therefore shows up as visible text when viewed in a browser (this is what lets you present HTML as visible code on your page if you choose to).
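
To make the effect concrete, here is a minimal sketch that is not part of the original script: the id "target" and the variable names are made up for the demonstration. It suggests that the string stays unchanged in memory and is only escaped when the node is serialized, which may be why decoding the string beforehand had no effect.

```php
<?php
// Minimal sketch with a hypothetical one-paragraph document.
$doc = new DOMDocument();
$doc->loadHTML('<p id="target"></p>');
$p = $doc->getElementById('target');

$link = '<a href="http://somewhere.net">somewhere.net</a>';
$p->appendChild($doc->createTextNode($link));

// In memory the text node still holds the raw string...
$stored = $p->textContent;
// ...but the paragraph has no <a> element child, only character data.
$childElements = $p->getElementsByTagName('a')->length;
// The escaping happens here, when the node is written out.
$saved = $doc->saveHTML($p);

echo $stored, "\n", $childElements, "\n", $saved, "\n";
```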

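What you need to do is parse the string as markup (not as a text node) and then append it. Note that appendXML() expects well-formed XML, so a bare & in a URL has to be written as &amp;. For example: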

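<?php

function setInnerHTML($DOM, $element, $innerHTML) {
  // Remove the existing children; firstChild is re-read on each pass,
  // so no node is skipped while the child list shrinks.
  while ($element->firstChild) {
    $element->removeChild($element->firstChild);
  }

  // Parse the string as markup instead of wrapping it in a text node.
  // appendXML() returns false if the string is not well-formed XML.
  $f = $DOM->createDocumentFragment();
  if ($f->appendXML($innerHTML)) {
    $element->appendChild($f);
  }
}

?>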
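
If the pasted URLs may contain characters such as &, an alternative is to skip the string building and create the anchor through the DOM, so nothing has to be parsed as XML. The following is only a sketch under assumptions: setLinkedText is a hypothetical helper (not from the code above), the id "note" and the sample URL are invented, and the title="editable" attribute is taken from the example output in the question.

```php
<?php
// Hypothetical helper: replaces the element's content with $text and turns
// every word containing "http://" into a real <a> element.
function setLinkedText(DOMDocument $dom, DOMElement $element, string $text): void
{
    // Remove the existing children one by one.
    while ($element->firstChild) {
        $element->removeChild($element->firstChild);
    }

    foreach (explode(' ', $text) as $word) {
        if (strpos($word, 'http://') !== false) {
            // Build the link as an element; the DOM escapes attribute
            // values and text when the document is saved.
            $a = $dom->createElement('a');
            $a->setAttribute('title', 'editable');
            $a->setAttribute('href', $word);
            $a->appendChild($dom->createTextNode($word));
            $element->appendChild($a);
            $element->appendChild($dom->createTextNode(' '));
        } else {
            $element->appendChild($dom->createTextNode($word . ' '));
        }
    }
}

// Usage with a made-up document and URL.
$dom = new DOMDocument();
$dom->loadHTML('<div id="note">old</div>');
$note = $dom->getElementById('note');
setLinkedText($dom, $note, 'Click here: http://somewhere.net/?a=1&b=2');

$out = $dom->saveHTML($note);
echo $out, "\n";
```

The trade-off is that this variant only produces links and plain text, whereas the fragment approach accepts any well-formed markup.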