I am running the code below to find an element whose text contains Arabic (Unicode) characters. The code works fine if I replace XXX with English letters; however, if I replace it with Arabic letters, it doesn't.
I checked the HTML page and it has "<meta charset="utf-8">", so I set the character set on the first line of my Python script just to make sure the letters are interpreted as expected, but it still does not work.
Any clue is much appreciated.
from selenium import webdriver
# create a new Firefox session
driver = webdriver.Firefox()
print driver.find_element_by_xpath(u"//*[contains(text(), 'XXX')]").text
I think you are not using the correct Unicode characters in the XPath;
check the demo in
First I selected one node to get the Unicode code points for that Arabic word, then used those code points in the XPath as follows. This was the output:
In : response.xpath('//li[@class="lensItem"]/a/text()').extract()
Out: [u'\u0639\u062f\u0633\u06cc']

In : response.xpath(u'//a[contains(text(), "\u0639\u062f\u0633\u06cc")]/text()').extract()
Out: [u'\u0639\u062f\u0633\u06cc', u'\u0639\u062f\u0633\u06cc', u'\u0645\u0634\u062e\u0635\u0627\u062a \u0639\u062f\u0633\u06cc \u0622\u0641\u062a\u0627\u0628\u06cc']

In : a = response.xpath(u'//a[contains(text(), "\u0639\u062f\u0633\u06cc")]/text()').extract()

In : for i in a:
...:     print i
...:
عدسی
عدسی
مشخصات عدسی آفتابی
I tested the XPath using Scrapy, but it also works with Selenium:
In : driver.find_element_by_xpath(u'//a[contains(text(), "\u0639\u062f\u0633\u06cc")]').text
Out: u'\u0639\u062f\u0633\u06cc'
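If it helps, here is a minimal sketch of how the escaped form of a word can be obtained and dropped into the XPath. The helper names `to_escapes` and `contains_text_xpath` are hypothetical (they are not part of Selenium or Scrapy), and the sketch assumes the search word contains no double quote.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def to_escapes(word):
    """Return the backslash-u escape form of a Unicode string (hypothetical helper)."""
    return word.encode("unicode_escape").decode("ascii")


def contains_text_xpath(tag, word):
    """Build an XPath matching <tag> elements whose text contains word (hypothetical helper)."""
    # Assumption: word has no double quote, since XPath 1.0 cannot escape one.
    return u'//{0}[contains(text(), "{1}")]'.format(tag, word)


# The word taken from the session above, written with escapes so the
# script does not depend on how the source file is encoded.
word = u"\u0639\u062f\u0633\u06cc"

# Show the code points that make up the word.
print(to_escapes(word))

# Build the same XPath that was used in the session above.
xpath = contains_text_xpath(u"a", word)
print(xpath == u'//a[contains(text(), "\u0639\u062f\u0633\u06cc")]')
```

The resulting `xpath` string can then be passed to `driver.find_element_by_xpath(xpath)` as in the session above. Comparing the escapes of the word typed in the script against those of the text extracted from the page is a quick way to see whether the two really contain the same characters.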
I hope this will help you to solve your issues.