Nathan Harper - 3 years ago
PHP Question

Extending PHP's SimpleXMLElement

I'm sending XML documents to outside vendors, and one of them is having trouble parsing our XML because the content contains single and double quotes. I know that, per the XML specification, quotes only need to be escaped inside attribute values, but I figured it wouldn't be too difficult to extend PHP's SimpleXMLElement so that it escapes them everywhere. That turned out not to be the case. My first attempt was this:

<?php
class BetterXMLElement extends SimpleXMLElement
{
    // Intended to intercept assignments such as $xml->COST = '...'.
    public function __set($name, $value)
    {
        echo "called __set with $name and $value";
        $this->addChild($name, $value);
    }

    // Escape quotes and ampersands before handing the value to the parent.
    public function addChild($name, $value = null, $ns = null)
    {
        $new_value = strtr((string) $value, [
            '&' => '&amp;',
            '"' => '&quot;',
            "'" => '&apos;',
        ]);
        echo "New Value: $new_value\n";
        // Return the new child, as the parent method does.
        return parent::addChild($name, $new_value, $ns);
    }
}

$xml = new BetterXMLElement('<?xml version="1.0" encoding="UTF-8"?><TRANSACTION></TRANSACTION>');
$xml->COST = "apos: ', amp: &, quot: \"";
$xml->addChild('PRODUCT', "apos: ', amp: &, quot: \"");
echo $xml->asXML();


The above code outputs:

New Value: apos: &apos;, amp: &amp;, quot: &quot;
<?xml version="1.0" encoding="UTF-8"?>
<TRANSACTION><COST>apos: ', amp: &amp;, quot: "</COST><PRODUCT>apos: ', amp: &amp;, quot: "</PRODUCT></TRANSACTION>


What this indicates to me is:


  1. The echo in __set is never called, although I would expect it to be when I set COST. Why isn't this working?

  2. My override of addChild is called when I set PRODUCT, but the entities for the quotes are turned back into literal quotes in the asXML output. Why does it work like this? Is there a way to disable it?


Answer Source

As far as I can tell, there's no clean built-in way to do this with either SimpleXMLElement or DOMDocument, because of how libxml2 is designed: text is stored unescaped, and on output the serializer escapes only what the spec requires, which in element content does not include quotes. I ended up replacing the necessary characters with placeholders before XML processing (e.g. a quote gets replaced with "{quot}") and then swapping those placeholders for the appropriate XML entity in the XML output.
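The placeholder workaround described above could look roughly like the following. This is only a minimal sketch: the helper names maskQuotes and unmaskQuotes and the "{apos}" token are made up for illustration, and it assumes the placeholder strings never occur in real data.

```php
<?php
// Hypothetical helper: swap quotes for placeholder tokens before the
// value is handed to SimpleXMLElement (and therefore libxml2).
function maskQuotes(string $value): string
{
    return strtr($value, [
        '"' => '{quot}',
        "'" => '{apos}',
    ]);
}

// Hypothetical helper: turn the placeholders into XML entities in the
// already serialized document.
function unmaskQuotes(string $xml): string
{
    return strtr($xml, [
        '{quot}' => '&quot;',
        '{apos}' => '&apos;',
    ]);
}

$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><TRANSACTION></TRANSACTION>');

// Property assignment escapes "&" on its own, so only quotes are masked.
$xml->COST = maskQuotes("apos: ', amp: &, quot: \"");

$output = unmaskQuotes($xml->asXML());
echo $output;
```

Because the replacement runs over the whole serialized string, a placeholder that legitimately appears in the data would also be converted, so the tokens should be chosen with that in mind.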

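Also, as axiac pointed out in a previous comment, SimpleXMLElement is not a normal PHP class: property access is handled internally by the extension, so a __set defined in a subclass is bypassed. That is why my override attempts were failing.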
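One way around this, if a hook on assignment is still wanted, is to wrap a SimpleXMLElement rather than extend it. The sketch below is an assumption-laden illustration, not something from the original post: the class name TransactionXml and its $calls log are hypothetical, and the return types assume PHP 7.1 or later.

```php
<?php
// Hypothetical wrapper: it holds a SimpleXMLElement instead of extending
// it, so PHP's ordinary magic-method dispatch applies to the wrapper.
final class TransactionXml
{
    /** @var SimpleXMLElement */
    private $root;

    /** @var string[] names passed to __set, kept only to show that it runs */
    public $calls = [];

    public function __construct(string $xml)
    {
        $this->root = new SimpleXMLElement($xml);
    }

    public function __set(string $name, $value): void
    {
        $this->calls[] = $name;
        // Any pre-processing of $value (e.g. masking quotes) could go here.
        $this->root->{$name} = (string) $value;
    }

    public function asXML(): string
    {
        return (string) $this->root->asXML();
    }
}

$doc = new TransactionXml('<TRANSACTION></TRANSACTION>');
$doc->COST = 'amp: &';
print_r($doc->calls);
echo $doc->asXML();
```

The trade-off is that the wrapper exposes only what it explicitly forwards, so features such as nested element access or XPath would need their own pass-through methods.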
