Rocky - 4 months ago
PHP Question

Using PHP DOM to change title in bold format and other string modifications

<?php
$data = 'THE CORRECT ANSWER IS C.
<p>Choice A Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s</p>
<p></p>
<p>Choice B Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s</p>
<p>Choice D Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s</p>
<p></p>
<p>Choice E simply dummy text of the printing and typesetting industry.</p>
<p></p>
<p><br>THIS IS MY MAIN TITLE IN CAPS<br>This my sub title.</p>
<p><br>TEST ABC: Lorem Ipsum is simply dummy text of the printing and typesetting industry.</p>
<p>1) If not at Goal Blood Pressure <140/90 mmHg OR <130/80 mmHg for patients with diabetes or chronic kidney disease start medication:
<br><br>2) 1. Lifestyle Modifications </p>
<p><br>TEST XYZ: Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.</p>
<p><br>TES T TEST: It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.</p>
<p><br>TESTXXX: It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>';

$dom = new DOMDocument();
$dom->loadHTML($data, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//text()') as $node) {
    $txt = trim($node->nodeValue);
    $p = $node->parentNode;
    if (preg_match("/^\s*(TEST ABC:|TEST XYZ:|TES T TEST:|TESTXXX:)(.*)$/s", $node->nodeValue, $matches)) {
        // Put the predefined label in bold:
        $p->insertBefore($dom->createElement('strong', $matches[1]), $node);
        $node->nodeValue = " " . trim($matches[2]);
    } else if (strtoupper($txt) === $txt && $txt !== '') {
        // Put the all-caps header in bold
        $p->insertBefore($dom->createElement('strong', $txt), $node);
        $node->nodeValue = "";
    }
}
$data = $dom->saveHTML();
echo $data;


This is what I have tried. Points 1 and 2 work; only the 3rd issue remains:


  1. Title in bold: "THIS IS MY MAIN TITLE IN CAPS" (the title is not always the same)

  2. Words in bold: TEST ABC:, TEST XYZ:, TES T TEST:, TESTXXX: (these words are always the same)

  3. Some text is missing from the output when you run this code: a line is skipped when the string contains less-than or greater-than signs (for example: <140/90 mmHg OR <130/80 mmHg).


Answer

Regular expressions could indeed be used to deal with this, but in general it is advisable to perform HTML manipulation through a DOM. PHP's DOMDocument provides this.

You could then use this code, which walks through all text nodes and checks whether either of two conditions is met:

  • The text starts with words in a predefined list
  • The text is entirely in upper case

In both cases a new strong node is created with that content, and the original node is adapted accordingly.

$dom = new DOMDocument();
$dom->loadHTML($data, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach($xpath->query('//text()') as $node) {
    $txt = trim($node->nodeValue);
    $p = $node->parentNode;
    if (preg_match("/^\s*(TEST ABC:|TEST XYZ:|TES T TEST:|TESTXXX:)(.*)$/s", $node->nodeValue, $matches)) {
        // Put the predefined label in bold:
        $p->insertBefore($dom->createElement('strong', $matches[1]), $node);
        $node->nodeValue = " " . trim($matches[2]);
    } else if (strtoupper($txt) === $txt && $txt !== '') {
        // Put the all-caps header in bold
        $p->insertBefore($dom->createElement('strong', $txt), $node);
        $node->nodeValue = "";
    }
}
$data = $dom->saveHTML();

See it run on ideone.com
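
For the third point in the question (text such as "<140/90 mmHg" going missing), a likely cause is that the HTML parser treats a bare "<" as the start of a tag; how much text is lost depends on the libxml version PHP is built against. A minimal sketch of one possible workaround follows, assuming it is acceptable to pre-process the string before calling loadHTML. The helper name escapeStrayLessThan is hypothetical and not part of PHP or of the code above. It only escapes a "<" that is not followed by a letter, "/", "!" or "?", so text like "a <b" would still be read as a tag.

```php
<?php
// Hypothetical helper (not a PHP built-in): turn every "<" that cannot
// start a tag, closing tag, comment or processing instruction into "&lt;".
function escapeStrayLessThan(string $html): string
{
    return preg_replace('/<(?![a-zA-Z\/!?])/', '&lt;', $html);
}

// Sample input modelled on the paragraph from the question.
$html = '<p>1) If not at Goal Blood Pressure <140/90 mmHg OR <130/80 mmHg:'
      . '<br><br>2) Lifestyle Modifications</p>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML(escapeStrayLessThan($html));
libxml_clear_errors();

// The comparison signs are now ordinary text inside the paragraph node.
$p = $dom->getElementsByTagName('p')->item(0);
$text = $p->textContent;

echo $dom->saveHTML($p), "\n";
```

In the full script this would mean passing escapeStrayLessThan($data) to loadHTML instead of $data; the loop over the text nodes stays unchanged. When the document is saved again, the comparison signs should be written out as &lt; entities, which browsers display as "<".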