plzhelpmi - 1 year ago
Python Question

Looping through HTML tags using BeautifulSoup

As mentioned in my previous questions, I am using Beautiful Soup with Python to retrieve weather data from a website.

Here's what the website looks like:

<title>2 Hour Forecast</title>
<source>Meteorological Services Singapore</source>
<description>2 Hour Forecast</description>
<title>Nowcast Table</title>
<category>Singapore Weather Conditions</category>
<forecastIssue date="18-07-2016" time="03:30 PM"/>
<validTime>3.30 pm to 5.30 pm</validTime>
<area forecast="TL" lat="1.37500000" lon="103.83900000" name="Ang Mo Kio"/>
<area forecast="SH" lat="1.32100000" lon="103.92400000" name="Bedok"/>
<area forecast="TL" lat="1.35077200" lon="103.83900000" name="Bishan"/>
<area forecast="CL" lat="1.30400000" lon="103.70100000" name="Boon Lay"/>
<area forecast="CL" lat="1.35300000" lon="103.75400000" name="Bukit Batok"/>
<area forecast="CL" lat="1.27700000" lon="103.81900000" name="Bukit Merah"/>
<area forecast="PC" lat="1.41800000" lon="103.83900000" name="Yishun"/>

I managed to retrieve the information I need using this code:

import requests
from bs4 import BeautifulSoup
import urllib3
import csv
import sys
import json

#getting the Validtime

area_attrs_li = []

r = requests.get('
soup = BeautifulSoup(r.content, "xml")
time = soup.find('validTime').string
print "validTime: " + time

#getting the date

for currentdate in soup.find_all('item'):
    element = currentdate.find('forecastIssue')
    print "date: " + element['date']

#getting the time

for currentdate in soup.find_all('item'):
    element = currentdate.find('forecastIssue')
    print "time: " + element['time']

#print area

for area in soup.find_all('area'):
    print area

#print area name

areas = soup.find_all('area')
for data in areas:
    name = (data.get('name'))
    print name

f = open("C:\\scripts\\testing\\testingnea.csv" , 'wt')

for area in area_attrs_li:
    #print str(area) + "\n"
    writer = csv.writer(f)
    writer.writerow( (time, element['date'], element['time'], area, name))


print open("C:/scripts/testing/testingnea.csv", 'rt').read()

I managed to write the data to a CSV file. However, when I run this part of the code:

#print area name

areas = soup.find_all('area')
for data in areas:
    name = (data.get('name'))
    print name

This is the result:

[Image: This is what I got]

Apparently, my loop is not working as it keeps printing the last area of the last record over and over again.

EDIT: I tried looping through the areas in the list:

for area in area_attrs_li:
    name = (area.get('name'))
    print name

However, it's still not looping.

I'm not sure where the code went wrong :/

Answer Source

This is because, when you write each row, name still holds the value from the last iteration of the earlier loop. Read the name from the current area instead:

writer.writerow( (time, element['date'], element['time'], area, area['name']))
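
To make the difference concrete, here is a minimal Python 3 sketch (the code in the question is Python 2). It assumes plain dicts as stand-ins for the area tags, so it runs without the website or BeautifulSoup; a Tag returned by soup.find_all('area') supports the same area['name'] lookup. The names write_rows and use_stale_name are hypothetical, and the fourth column holds only the forecast code rather than the whole tag, just to keep the output short.

```python
import csv
import io

# Plain dicts stand in for the <area> tags (an assumption for this sketch);
# a BeautifulSoup Tag supports the same area['name'] style of lookup.
areas = [
    {"forecast": "TL", "name": "Ang Mo Kio"},
    {"forecast": "SH", "name": "Bedok"},
    {"forecast": "PC", "name": "Yishun"},
]
valid_time = "3.30 pm to 5.30 pm"
issue = {"date": "18-07-2016", "time": "03:30 PM"}

# Like the "print area name" loop in the question: once this loop finishes,
# `name` stays bound to the last area only.
for area in areas:
    name = area["name"]


def write_rows(out, use_stale_name):
    """Write one CSV row per area to the file-like object `out`."""
    writer = csv.writer(out)  # create the writer once, outside the loop
    for area in areas:
        # Stale: reuse `name` left over from the finished loop.
        # Fixed: read the name from the area being written.
        row_name = name if use_stale_name else area["name"]
        writer.writerow((valid_time, issue["date"], issue["time"],
                         area["forecast"], row_name))


buggy, fixed = io.StringIO(), io.StringIO()
write_rows(buggy, use_stale_name=True)
write_rows(fixed, use_stale_name=False)

# Read the last column back to compare the two variants.
buggy_names = [row[-1] for row in csv.reader(io.StringIO(buggy.getvalue()))]
fixed_names = [row[-1] for row in csv.reader(io.StringIO(fixed.getvalue()))]
print(buggy_names)  # ['Yishun', 'Yishun', 'Yishun']
print(fixed_names)  # ['Ang Mo Kio', 'Bedok', 'Yishun']
```

Creating csv.writer once before the loop, rather than on every iteration as in the question, is a little tidier, though it is not what causes the repeated name.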