Jasper Jasper - 1 year ago 96
HTML Question

Xpath text() returning no text

I'm trying to get restaurant names from Tripadvisor with Python 3 and lxml. The text I'm trying to retrieve is in the following element; in this case it is "Al Fresco's".

<a target="_blank" href="/Restaurant_Review-g293925-d8327527-Reviews-Al_Fresco_s-Ho_Chi_Minh_City.html" class="property_title"
   onclick="ta.restaurant_list_tracking.clickDetailTitle('/Restaurant_Review-g293925-d8327527-Reviews-Al_Fresco_s-Ho_Chi_Minh_City.html','tags_category_tag_restaurants','8327527','1','0');">
Al Fresco's
</a>


The XPath for this element:

//*[@id="eatery_8327527"]/div[2]/div[1]/div[1]/a


I use the following simple code to retrieve the text in this element:

from lxml import html
import requests

page = requests.get('https://www.tripadvisor.nl/Restaurants-g293925-Ho_Chi_Minh_City.html')
tree = html.fromstring(page.content)

# This will create a list of names:
Name = tree.xpath('//*[@id="eatery_8327527"]/div[2]/div[1]/div[1]/a/text()')
print('Name:', Name)


This prints an empty list: Name: []
How do I get the text I want?

Answer Source

Without having looked at the actual page, my guess is that your XPath is too strict. Try something like this:

//a[contains(@href,"Restaurant_Review")]/text()

If that yields too many results, try adding the parent element in front.
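To illustrate what "adding the parent in front" could look like, here is a minimal sketch. The markup is a hypothetical sample loosely modelled on the snippet in the question (the "photo_link" anchor and the "title" div are assumptions, not taken from the real page), so the exact parent to use on Tripadvisor may differ.

```python
from lxml import html

# Hypothetical markup: two links point at the same review page,
# but only the one inside div.title carries the restaurant name.
SAMPLE = """
<div id="eatery_8327527">
  <a class="photo_link" href="/Restaurant_Review-g293925-d8327527-Reviews-Al_Fresco_s-Ho_Chi_Minh_City.html">Photo</a>
  <div class="title">
    <a class="property_title" href="/Restaurant_Review-g293925-d8327527-Reviews-Al_Fresco_s-Ho_Chi_Minh_City.html">
Al Fresco's
    </a>
  </div>
</div>
"""

tree = html.fromstring(SAMPLE)

# Broad query: every link whose href mentions "Restaurant_Review".
broad = tree.xpath('//a[contains(@href,"Restaurant_Review")]/text()')

# Narrowed query: the same test, but only for links directly under div.title.
narrow = tree.xpath('//div[@class="title"]/a[contains(@href,"Restaurant_Review")]/text()')

# text() keeps the surrounding whitespace, so strip each result.
broad_names = [t.strip() for t in broad]
narrow_names = [t.strip() for t in narrow]

print(broad_names)
print(narrow_names)
```

The broad query picks up both links, while the version with the parent in front keeps only the title link.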

Hope that helps.

UPDATE:

After having a look at the actual page, this is probably what you are looking for:

//a[contains(@class,"property_title")]/text()
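As a rough sketch of how this expression might be used from lxml: the sample below parses an inline string instead of fetching the live page, and the second entry ("Example Bistro", id "eatery_0000001") is made up for illustration. Note that text() returns the text including the line breaks around the name, so the results are stripped.

```python
from lxml import html

# Hypothetical listing markup; only the first <a> mirrors the question.
SAMPLE = """
<div id="listing">
  <div id="eatery_8327527">
    <a target="_blank" class="property_title" href="/Restaurant_Review-g293925-d8327527-Reviews-Al_Fresco_s-Ho_Chi_Minh_City.html">
Al Fresco's
</a>
  </div>
  <div id="eatery_0000001">
    <a target="_blank" class="property_title" href="/Restaurant_Review-example.html">
Example Bistro
</a>
  </div>
</div>
"""

tree = html.fromstring(SAMPLE)

# Select by class instead of by position, so layout changes matter less.
raw = tree.xpath('//a[contains(@class,"property_title")]/text()')

# Drop whitespace-only nodes and trim the rest.
names = [text.strip() for text in raw if text.strip()]

print('Names:', names)
```

With the real page, the same two lines should work on the tree built from `html.fromstring(page.content)`, assuming the server returns the restaurant links in the initial HTML rather than loading them with JavaScript.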