Ibraham Ibraham - 3 months ago 15
Python Question

Scrape URLs and text from a website using lxml in Python

I don't know much about lxml or XPath, and I want to learn how to scrape data from a website. When I run the code below I don't get any results, and I don't know why. Please help me fix it.

code here

from lxml import html
import requests
pageLen=str(100)
page = requests.get('http://www.yellowpages.com/search?search_terms=lawyer&geo_location_terms=usa&page=2')
print(page)
tree = html.fromstring(page.content)
#phoneNumber = tree.xpath('//span[@class="c411Phone"]/text()')
Link=tree.xpath('//div[@class="info"]/a/@href')
Bname=tree.xpath('//a[@class="business-name"]/text()')
print(Link)
print(Bname)


HTML CODE (image not available)

Answer

Quick and dirty: select the business-name links themselves, then read each one's href attribute and text.

from lxml import html
import requests

url = 'http://www.yellowpages.com/search?search_terms=lawyer&geo_location_terms=usa&page=2'
page = requests.get(url)
tree = html.fromstring(page.text)
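# Rewrite relative hrefs as absolute URLs, resolved against the page URL
tree.make_links_absolute(url)
# Each matching <a> carries both the link (href) and the business name (text)
for business in tree.xpath('//a[@class="business-name"]'):
    print(business.attrib['href'], business.text)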
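
The same approach can be tried offline, without hitting the site. The sketch below is only an illustration: the SAMPLE markup, BASE_URL, and the extract_businesses helper are made up for this example, and the real page may be structured differently. It assumes the link sits inside a heading rather than directly under div.info, which is one possible reason an XPath like //div[@class="info"]/a/@href would return an empty list (a single / only matches direct children). It also uses text_content() instead of .text, since .text is None when the name is wrapped in a child element.

```python
from lxml import html

# Made-up markup for illustration only; the real page's structure may differ.
SAMPLE = """
<div class="results">
  <div class="info">
    <h3 class="n">
      <a class="business-name" href="/new-york-ny/mip/example-law-1"><span>Example Law Firm</span></a>
    </h3>
  </div>
  <div class="info">
    <h3 class="n">
      <a class="business-name" href="/austin-tx/mip/sample-legal-2">Sample Legal</a>
    </h3>
  </div>
</div>
"""

# Hypothetical base URL used to resolve the relative links above.
BASE_URL = "http://www.yellowpages.com/search"


def extract_businesses(markup, base_url):
    """Return (absolute_url, name) pairs for every a.business-name link."""
    tree = html.fromstring(markup)
    tree.make_links_absolute(base_url)
    results = []
    for link in tree.xpath('//a[@class="business-name"]'):
        # text_content() also collects text nested in child elements,
        # whereas link.text would be None for the first sample entry.
        results.append((link.get("href"), link.text_content().strip()))
    return results


tree = html.fromstring(SAMPLE)
# "/a" matches only direct children of div.info; here the <a> is inside <h3>.
direct_children = tree.xpath('//div[@class="info"]/a/@href')
# "//a" matches the link at any depth below div.info.
descendants = tree.xpath('//div[@class="info"]//a/@href')

businesses = extract_businesses(SAMPLE, BASE_URL)
for href, name in businesses:
    print(href, name)
```

If the first XPath comes back empty on the real page while the second one does not, the nesting is the likely culprit; inspecting the page source (or printing page.text) is the way to confirm which elements and class names are actually present.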