emil rowland - 6 months ago
Python Question

Python3 XML get text between tags

I have the following code in Python 3. I am using the

import xml.etree.ElementTree as ET
for XML parsing. The web scraper grabs the text from a website. On that website there is text between the
<link></link>
tags, but the program returns None. I can see that the program finds all the tags, but where the link text should be printed it only says None.

result = webScrapper.scrappPart("http://www.dn.se/rss/senaste-nytt/", "body")
root = ET.fromstring(result)
for items in root.findall('.//item'):
    link = items.find('link')
    print(link.text)


Does anyone know how to fix this?

Answer

Since your URL is actually an RSS feed, you'd be much better off using an RSS feed parser on it instead of trying to roll your own. Fortunately, this is why feedparser exists (it is a third-party package, installable with pip install feedparser). Check this out:

import feedparser as fp

feed = fp.parse("http://www.dn.se/rss/senaste-nytt/")
for entry in feed["entries"]:
    print(entry["link"])

This prints:

http://www.dn.se/sport/fotboll/cavani-het-i-svalt-psg/
http://www.dn.se/sport/fotbolls-em/kompany-missar-em/
http://www.dn.se/nyheter/sverige/livvaktens-slakting-fick-praktik-hos-sahlin-trots-myndighetens-avslag/
http://www.dn.se/sport/st-louis-andraperiod-avgjorde/
http://www.dn.se/nyheter/varlden/syrien-spanska-journalister-fria/
http://www.dn.se/sport/dansk-dynamit-ska-stoppa-tre-kronor/
http://www.dn.se/nyheter/sverige/mordmisstankt-slappt-ur-haktet-1/
http://www.dn.se/nyheter/varlden/ekonomiprofessor-loste-ekvation-togs-for-terrorist/
http://www.dn.se/sport/fotboll/leicester-firade-med-storseger/
http://www.dn.se/ekonomi/protester-mot-ny-granskontroll-urartade/
http://www.dn.se/sport/ishockey-vm/jimmie-ericsson-jag-ar-beredd-gora-allt-for-att-vinna/
http://www.dn.se/sport/ishockey-vm/schweiz-straffat-av-kazakstan/
http://www.dn.se/nyheter/varlden/natosoldater-dodade-i-afghanistan-2/
http://www.dn.se/sport/forsta-matchen-till-eslov/
http://www.dn.se/nyheter/sverige/drunknad-man-hittad-av-dykare/
http://www.dn.se/ekonomi/tagstopp-efter-olycka/
http://www.dn.se/sport/kristianstad-till-sm-final/
http://www.dn.se/sthlm/en-person-attackerad-med-kniv-i-centrala-stockholm/
http://www.dn.se/nyheter/sverige/inga-spar-efter-forsvunnen-22-arig-student/
http://www.dn.se/sport/fotboll/forlust-for-rydstrom-i-tranardebuten/
http://www.dn.se/nyheter/sverige/manga-grasbrander-runt-om-i-landet/
http://www.dn.se/nyheter/sverige/tre-gripna-efter-skottlossning-i-malmo/
http://www.dn.se/sport/fotboll/elfsborg-ar-med-i-toppen-igen/
http://www.dn.se/sport/em-silver-till-rissveds/

which I assume is what you're looking for.
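
If you would rather stay with ElementTree, here is a minimal sketch of what may be going on. It is only a guess at the cause, since the scraper code is not shown: in HTML, <link> is a void element, so an HTML parser closes it immediately and the URL ends up after the tag instead of inside it. ElementTree then reports the URL as the element's tail, and text is None. The two sample documents and the helper names below are made up for illustration; they are not the real dn.se feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written RSS sample (not the real feed).
RAW_RSS = """<rss version="2.0"><channel>
<item><title>First</title><link>http://example.com/a</link></item>
<item><title>Second</title><link>http://example.com/b</link></item>
</channel></rss>"""

# Assumed shape of the same feed after passing through an HTML parser:
# <link> is closed immediately and the URL follows the tag.
HTML_MANGLED = """<rss version="2.0"><channel>
<item><title>First</title><link/>http://example.com/a</item>
<item><title>Second</title><link/>http://example.com/b</item>
</channel></rss>"""


def link_texts(xml_string):
    """Return the .text of the <link> child of every <item>."""
    root = ET.fromstring(xml_string)
    return [item.find("link").text for item in root.findall(".//item")]


def link_tails(xml_string):
    """Return the .tail (text after the closing tag) of every <link>."""
    root = ET.fromstring(xml_string)
    return [item.find("link").tail for item in root.findall(".//item")]


if __name__ == "__main__":
    print(link_texts(RAW_RSS))       # URLs are inside the tags
    print(link_texts(HTML_MANGLED))  # text is None for every link
    print(link_tails(HTML_MANGLED))  # the URLs show up as tails
```

If this is indeed the cause, the simplest fix is to hand ElementTree the raw feed instead of HTML-parsed output; ET.fromstring also accepts the bytes returned by urllib.request.urlopen(url).read().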