Latestarter - 6 months ago
HTML Question

Google Sheets importxml for New York Times

I've been stuck on this issue for months.

Details are in the forum thread linked at the end of this post.

[Goal]


Using importxml, I want to pull the story data and its a href link into
Google Sheets, but I can't get the link to come through. Please let me
know how to get the href link into B2.


[Problem]


The a element that carries the href doesn't have a class of its own.


Please visit the link below for the details; I'm new to Stack Overflow and not yet allowed to upload an image. Sorry about that.

[The formula below should work, but it doesn't produce any result]
I think there's an XPath syntax error.

Namely,

A2: http://www.nytimes.com/pages/world/index.html

B2: =importxml(A2, "//div[@class='story']|//div[@class='thumbnail']//a/@href")

B2 doesn't show any result.

https://productforums.google.com/forum/#!searchin/docs/importxml$20new$20york$20times/docs/4-IJJ6_h5Pw/prNdITEsQ4AJ

Answer

It is still not very clear what you would like to find, but sometimes it's easier to comment on an attempted solution than to try to explain.

The formula you have shown does return results. If I add the following in my own Sheets file:

=IMPORTXML("http://www.nytimes.com/pages/world/index.html", "//div[@class='story']|//div[@class='thumbnail']//a/@href")

the function returns several href attributes.

Is this what you expected or would you like to change the result?
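
In case it helps to see what that XPath actually selects: in XPath the | operator binds more loosely than the path steps, so the trailing //a/@href applies only to the thumbnail branch, and the story branch returns the div contents as text. Below is a rough Python sketch of the two branches. The markup is a made-up, simplified stand-in (not the real NYT page), and since the standard library's ElementTree supports neither | nor a trailing /@href, each branch is emulated separately.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page structure is an assumption.
SAMPLE = """
<html><body>
  <div class="story">
    <h3><a href="http://example.com/story-1">Story one</a></h3>
  </div>
  <div class="thumbnail">
    <a href="http://example.com/thumb-1"><img src="t1.jpg"/></a>
  </div>
  <div class="story">
    <h3><a href="http://example.com/story-2">Story two</a></h3>
  </div>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)

# Branch 1 of the union: //div[@class='story']
# No @href step here, so a spreadsheet would show the text of each div.
story_text = [
    " ".join("".join(div.itertext()).split())
    for div in root.findall(".//div[@class='story']")
]

# Branch 2 of the union: //div[@class='thumbnail']//a/@href
# Only this branch ends in @href, so only thumbnail links come back.
thumb_hrefs = [a.get("href") for a in root.findall(".//div[@class='thumbnail']//a")]

# What the question seems to want instead: //div[@class='story']//a/@href
story_hrefs = [a.get("href") for a in root.findall(".//div[@class='story']//a")]

print(story_text)
print(thumb_hrefs)
print(story_hrefs)
```

If the story links are what you are after, an XPath along the lines of //div[@class='story']//a/@href may be closer, assuming the a elements sit inside the story div on the real page.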