Masteryogurt, 4 years ago
PHP Question

PHP cURL/XPath: selecting data based on <p> text content?

I know how to use XPath to scrape and echo text from another website when the element has an identifying attribute such as a div id or class, using the code below. What I don't know is how to do it under more precise conditions, for example when the bit of text I want has no unique identifier such as a div id.
The code below prints the scraped data.


$doc = new DOMDocument;

// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;

// Most HTML Developers are chimps and produce invalid markup...
$doc->strictErrorChecking = false;
$doc->recover = true;

$doc->loadHTMLFile('http://www.nbcnews.com/business');

$xpath = new DOMXPath($doc);

$query = "//div[@class='market']";

$entries = $xpath->query($query);
foreach ($entries as $entry) {
    echo trim($entry->textContent); // use `trim` to eliminate spaces
}


In the example source below, I want to pull the value "21,271.97". But there is no unique tag for it and no div id. Is it possible to pull this data by identifying a keyword in the <p> that never changes, for example "DJIA all time"?

<p>DJIA All Time, Record-High Close: <font color="#0000FF">June 9,
2017</font>
(<font color="#FF0000"><b bgcolor="#FFFFCC"><font face="Verdana, Arial,
Helvetica, sans-serif" size="2">21,271.97</font></b></font>)</p>


I'm wondering whether I could replace $query = "//div[@class='market']"; with something along the lines of:
$query = "//p['DJIA all time']";

Could this be possible?

I also wonder whether a loop with something like $query = "//p[='DJIA']"; could work, though I don't know exactly how to use that.
Thanks!!

Answer Source

Try the XPath expression below:

//p[contains(text(), "DJIA All Time")]//b/font
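For context, a predicate such as //p['DJIA all time'] matches every <p>, because a non-empty string literal always evaluates to true in XPath 1.0, and contains() is case-sensitive, so the text has to be written as "DJIA All Time". Here is a minimal sketch of the expression above run against the snippet from the question. It assumes the markup is loaded from a string with loadHTML rather than fetched from a live page, and the variable names ($html, $nodes, $value) are only illustrative.

```php
<?php
// Sample markup copied from the question; in practice it would come from the fetched page.
$html = <<<'HTML'
<p>DJIA All Time, Record-High Close: <font color="#0000FF">June 9,
2017</font>
(<font color="#FF0000"><b bgcolor="#FFFFCC"><font face="Verdana, Arial,
Helvetica, sans-serif" size="2">21,271.97</font></b></font>)</p>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument;
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Find the <p> whose first text node contains the fixed label,
// then descend to the <font> nested inside <b>, which holds the number.
$nodes = $xpath->query('//p[contains(text(), "DJIA All Time")]//b/font');

// Guard against the label not being present on the page.
$value = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
echo $value, PHP_EOL;
```

With the snippet from the question this should print 21,271.97. Checking $nodes->length first avoids calling a method on null if the page layout changes.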

For the linked page (http://www.nbcnews.com/business), you can get the required text with:

//span[text()="DJIA"]/following-sibling::span[@class="market_item market_price"]
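The sibling-based expression can be tried the same way. The page's real markup is not shown in the question, so the HTML below is a hypothetical stand-in: the wrapper div, the class on the label span, and the price value are all made up for illustration. Only the "market_item market_price" class and the "DJIA" label come from the expression itself.

```php
<?php
// Hypothetical markup: an assumed structure that the expression would match.
// The label span's class and the price value are invented for this sketch.
$html = <<<'HTML'
<div class="market">
  <span class="market_item market_name">DJIA</span>
  <span class="market_item market_price">12,345.67</span>
</div>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument;
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Locate the span whose text is exactly "DJIA", then take the following
// sibling span whose class attribute is exactly "market_item market_price".
$query = '//span[text()="DJIA"]/following-sibling::span[@class="market_item market_price"]';
$entries = $xpath->query($query);

$prices = [];
foreach ($entries as $entry) {
    $prices[] = trim($entry->textContent);
}
echo implode(PHP_EOL, $prices), PHP_EOL;
```

Note that @class="..." compares the whole attribute value, so the match would fail if the live page adds or reorders class names. In that case contains(@class, "market_price") is a looser alternative.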