Niche.P - 11 days ago
Python Question

python XML get text inside <p>...</p> tag

Hi guys, I have an XML structure which looks somewhat like this:

<abstract>
<p id = "p-0001" num = "0000">
blah blah blah
</p>
</abstract>

I would like to extract the text of the <p> tag inside the <abstract> tag only.

I tried:

import xml.etree.ElementTree as ET

xroot = ET.parse('100/A/US07640598-20100105.XML').getroot()

for row in xroot.iter('p'):
    print(row.text)

This gets every <p> tag in my XML, which is not what I want.

Is there any way I can extract only the text inside the <p> tag that sits within <abstract>?

My desired output would be "blah blah blah".


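You can use an XPath expression to search for p elements specifically inside the abstract. Note that xpath() is an lxml method; with the standard-library xml.etree.ElementTree used in the question, findall() accepts the same path:

for p in xroot.findall(".//abstract//p"):
    print(p.text)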

Or, if using iter(), you can nest the loops:

for abstract in xroot.iter('abstract'):
    for p in abstract.iter('p'):
        print(p.text)
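
To make the idea concrete, here is a minimal self-contained sketch using only the standard library. The sample document, the <doc> and <description> wrapper elements, and the helper name abstract_paragraphs are assumptions made for illustration; the real file's layout may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the structure of the real XML file is assumed.
SAMPLE = """<doc>
  <abstract>
    <p id="p-0001" num="0000">
blah blah blah
    </p>
  </abstract>
  <description>
    <p id="p-0002" num="0001">not wanted</p>
  </description>
</doc>"""


def abstract_paragraphs(root):
    """Return the stripped text of every <p> nested inside an <abstract>."""
    # itertext() also picks up text that follows inline child elements,
    # which p.text alone would miss.
    return ["".join(p.itertext()).strip()
            for p in root.findall(".//abstract//p")]


root = ET.fromstring(SAMPLE)
texts = abstract_paragraphs(root)
print(texts)  # ['blah blah blah']
```

The strip() call removes the newlines and indentation that surround the text in the source file, so only "blah blah blah" is left.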