
Python - Selenium AttributeError: list object has no attribute find_element_by_xpath

I am attempting to scrape some nutritional data from a website, and everything seemed to be going swimmingly until I ran into pages that are formatted slightly differently.

Using Selenium, a line like this returns an empty list:

values = browser.find_elements_by_class_name('size-12-fl-oz' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value')


Printing the result gives this:

[]
[]
[]
[]
[]


But if I spell out the element position, then it works fine:

kcal = data.find_elements_by_xpath("(.//div[@class='size-12-fl-oz nutrition-value' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value'])[position()=1]").text


The issue I have run into is that the elements are not the same from page to page as I iterate. So if a div does not exist at position 9, an error is thrown.

Now when I go back and try to edit my code to add a try/except, I am getting:


AttributeError: 'list' object has no attribute 'find_element_by_xpath'


or


AttributeError: 'list' object has no attribute 'find_elements_by_xpath'


Here is the code, with the areas I commented out while testing back and forth.

import requests, bs4, urllib2, csv
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import NoSuchElementException

browser = webdriver.Firefox()
...

#Loop on URLs to get Nutritional Information from each one.
with open('products.txt') as f:
    for line in f:
        url = line
        # url = 'http://www.tapintoyourbeer.com/index.cfm?id=3'
        browser.get(url)
        with open("output.csv", "a") as o:
            writeFile = csv.writer(o)
            browser.implicitly_wait(3)
            product_name = browser.find_element_by_tag_name('h1').text.title() #Get product name
            size = browser.find_element_by_xpath("(//div[@class='dotted-tab'])").text #Get product size
            data = browser.find_elements_by_xpath("//table[@class='beer-data-table']")
            # values=[]
            # values = browser.find_elements_by_class_name('size-12-fl-oz' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value')
            try:
                # values = data.find_elements_by_xpath("(.//div[@class='size-12-fl-oz nutrition-value' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value'])")
                kcal = data.find_element_by_xpath("(.//div[@class='size-12-fl-oz nutrition-value' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value'])[position()=1]").text
                kj = data.find_element_by_xpath("(.//div[@class='size-12-fl-oz nutrition-value' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value'])[position()=3]").text
                fat = data.find_element_by_xpath("(.//div[@class='size-12-fl-oz nutrition-value' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value'])[position()=5]").text
                carbs = data.find_element_by_xpath("(.//div[@class='size-12-fl-oz nutrition-value' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value'])[position()=7]").text
                protein = data.find_element_by_xpath("(.//div[@class='size-12-fl-oz nutrition-value' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value'])[position()=9]").text
                values = [kcal, kj, fat, carbs, protein]
                print values
                writeFile.writerow([product_name] + [size] + values)
            except NoSuchElementException:
                print("No Protein listed")
browser.quit()


I had it working earlier, producing a list and writing it out to a CSV, but on occasion the position count would come out wrong.

[u'Budweiser', u'12 FL OZ', u'145.00', u'', u'', u'', u'']
[u"Beck'S", u'12 FL OZ', u'146.00', u'610.86', u'0.00', u'10.40', u'1.80']
[u'Bud Light', u'12 FL OZ', u'110.00', u'460.24', u'0.00', u'6.60', u'0.90']
[u'Michelob Ultra', u'12 FL OZ', u'95.00', u'397.48', u'0.00', u'2.60', u'0.60']
[u'Stella Artois', u'100 ML', u'43.30', u'KCAL/100 ML', u'181.17', u'KJ/100 ML', u'0.00']


The problems started when position 9 didn't exist on a particular page.

Are there any suggestions on how to fix this headache? Do I need to have cases set up for the different pages & sizes?

I appreciate the help.

Answer

find_elements() returns either a list of WebElements or an empty list. You are storing this result in a list variable named data.

AttributeError: 'list' object has no attribute 'find_element_by_xpath'

AttributeError: 'list' object has no attribute 'find_elements_by_xpath'

This occurs because you are trying to find a nested WebElement on the data list, calling data.find_element_by_xpath() or data.find_elements_by_xpath(), which is wrong: a plain list has no such methods.

find_element() and find_elements() search for elements in the context of the page (the driver) or of a single WebElement, not of a list.
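For example, here is a minimal sketch of the difference, reusing the browser driver and the table XPath from the question:

tables = browser.find_elements_by_xpath("//table[@class='beer-data-table']")
print(type(tables))  # a plain Python list, which has no find_element_* methods

if len(tables) > 0:
    table = tables[0]  # a single WebElement taken out of the list
    # searching now happens in the context of that one element
    cell = table.find_element_by_xpath(".//div")
    print(cell.text)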

So you should first take an individual WebElement out of the data list, and then find further nested elements in that element's context, as below:

if len(data) > 0:
    # pick the desired element from the list by index
    individual_element = data[0]

    # now you can find a further nested single element using find_element(),
    # or a list of elements using find_elements(), in the context of individual_element
    kcal = individual_element.find_element_by_xpath("(.//div[@class='size-12-fl-oz nutrition-value' or 'size-330-ml hide nutrition-value' or 'size-8-fl-oz nutrition-value'])[position()=1]").text

    # ...and likewise for kj, fat, carbs and protein
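As a follow-up, here is a rough sketch of how this could plug into the loop from the question, with a per-position try/except so that a missing div (such as protein at position 9) just leaves a blank instead of aborting the row. The XPath, class names and variable names (data, writeFile, product_name, size) are taken from the question; the empty-string fallback is only one possible choice:

value_xpath = ("(.//div[@class='size-12-fl-oz nutrition-value' or "
               "'size-330-ml hide nutrition-value' or "
               "'size-8-fl-oz nutrition-value'])[position()=%d]")

values = []
if len(data) > 0:
    table = data[0]  # single WebElement from the list returned by find_elements
    for position in [1, 3, 5, 7, 9]:  # kcal, kj, fat, carbs, protein
        try:
            values.append(table.find_element_by_xpath(value_xpath % position).text)
        except NoSuchElementException:
            values.append('')  # e.g. no protein div on this page
writeFile.writerow([product_name] + [size] + values)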