mcoze mcoze - 2 months ago

I want to scrape sales information for aeroplanes on sale in Australia using BeautifulSoup.

One such site:
http://www.planesales.com.au/search/?dosearch=1&category_id=0&aircraft_type_id=0&model_id=%C2%A0&price_min=0&price_max=0&year_min=0&year_max=0&search=Search

The updated Python script below collects the results from each search page into two lists and then writes them out to a CSV file.

I think a similar method could be used to pull more detailed information from the linked pages. Once a list of valid hrefs has been collected, a second scrape of those linked pages can start, and its results can then be merged into the table.

Python

BeautifulSoup - scraping help [updated]

#! python
import urllib.request
import csv

from bs4 import BeautifulSoup

def make_soup(url):
	# Download the page and parse it with Python's built-in HTML parser.
	with urllib.request.urlopen(url) as thepage:
		return BeautifulSoup(thepage, "html.parser")

var1 = []  # aircraft names
var2 = []  # asking prices

# Walk result pages 1 to 12 in order (a set would be iterated in arbitrary order).
for number in range(1, 13):
	soup = make_soup("http://www.planesales.com.au/search/?dosearch=1&category_id=0&aircraft_type_id=0&model_id=0&price_min=0&price_max=0&sort_by=0&search=Search&page=" + str(number) + "&total=101")
	# find_all is the current bs4 name for findAll.
	for name in soup.find_all('div',{"class":"details"}):
		title = name.find('strong').text
		print(title)
		var1.append(title)
	for price in soup.find_all('div',{"class":"summary"}):
		amount = price.find('h4').text
		print(amount)
		var2.append(amount)
print(var1)
print(var2)
print("number of names:", len(var1))
print("number of prices:", len(var2))

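# zip pairs names with prices by position and stops at the shorter list.
# newline='' stops the csv module from writing blank rows on Windows.
with open('some2.csv', 'w', newline='') as f:
	writer = csv.writer(f)
	writer.writerows(zip(var1,var2))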
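
For the second scrape of the linked pages, here is a minimal sketch. It has not been run against the live site. It assumes that each `div` with class `details` contains an `<a href=...>` pointing to the listing's own page, and that the detail page shows its data as two-cell table rows (label, value). Both are guesses about the markup. The names `listing_links`, `detail_fields` and `scrape_details` are made up for this example.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup

BASE_URL = "http://www.planesales.com.au/"  # site root taken from the search URL


def make_soup(url):
    """Download a page and parse it, like the helper in the script above."""
    with urlopen(url) as page:
        return BeautifulSoup(page, "html.parser")


def listing_links(soup, base_url=BASE_URL):
    """Return absolute URLs for the first link inside each listing block.

    ASSUMPTION: every <div class="details"> wraps a link to the detail page.
    """
    links = []
    for block in soup.find_all("div", {"class": "details"}):
        anchor = block.find("a", href=True)
        if anchor is None:
            continue  # listing without a link: nothing to follow
        url = urljoin(base_url, anchor["href"])  # resolve relative hrefs
        if url not in links:
            links.append(url)
    return links


def detail_fields(soup):
    """Collect label/value pairs from rows that have exactly two cells.

    ASSUMPTION: hypothetical detail-page layout; adjust to the real markup.
    """
    fields = {}
    for row in soup.find_all("tr"):
        cells = row.find_all(["th", "td"], recursive=False)
        if len(cells) == 2:
            label = cells[0].get_text(strip=True).rstrip(":")
            fields[label] = cells[1].get_text(strip=True)
    return fields


def scrape_details(urls, fetch=make_soup):
    """Visit each detail URL and return a list of (url, fields) pairs."""
    return [(url, detail_fields(fetch(url))) for url in urls]
```

Inside the page loop, `listing_links(soup)` would give the hrefs for that page, and `scrape_details(...)` on the combined list would return one dictionary per aircraft. Those dictionaries could then be written as extra columns next to the name and price. The `fetch` parameter is only there so the function can be tried on saved HTML without hitting the site.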