Michael Michael -4 years ago 91
Python Question

Issue finding text with lxml

I'm trying to find text in an element tree, but there are two situations where I cannot get the text and it shows 'None' instead.

First Situation: First tag has a link

from lxml import etree

node = etree.fromstring("<a xml='www.www.com'><c>bum</c></a>")

print(node.findtext('c', default='what happened?'))


Second Situation: The tag holding the text is nested inside a parent tag that has no text of its own

from lxml import etree

node = etree.fromstring('<a><b><c>bum</c></b></a>')

print(node.findtext('c', default='what happened?'))


Successful Code: No link and no intermediate tag

from lxml import etree

node = etree.fromstring('<a><c>bum</c></a>')

print(node.findtext('c'))


I want to know how I can get the text 'bum' in these two situations.

Thanks

Answer Source

Use .iter to find the correct tag(s), and then .text:

from lxml import etree

node1 = etree.fromstring("<a xml = 'www.www.com'><c>bum</c></a>")
node2 = etree.fromstring('<a><b><c>bum</c></b></a>')

for c_node in node1.iter(tag='c'):
    print(c_node.text)
    # bum

for c_node in node2.iter(tag='c'):
    print(c_node.text)
    # bum

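Note that in the first case print(node1.find('c').text) works too, but in the second case print(node2.find('c').text) does not: find('c') only looks at direct children of the root, so it returns None and accessing .text on it raises an AttributeError.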
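
If you would rather keep using findtext than loop over .iter, a path of the form './/c' should also reach the nested tag, because a bare tag name such as 'c' is only matched against direct children. The following is a minimal sketch, not part of the original answer; the variable names direct and nested are just illustrative, and the fallback to the standard library's xml.etree.ElementTree is an assumption added only so the snippet runs where lxml is not installed (both accept the same path syntax here).

```python
try:
    from lxml import etree
except ImportError:  # assumption: fall back to the stdlib API if lxml is absent
    import xml.etree.ElementTree as etree

node2 = etree.fromstring('<a><b><c>bum</c></b></a>')

# A bare tag name only matches direct children of <a>, so <c> is not found
# and the default value is returned.
direct = node2.findtext('c', default='what happened?')
print(direct)  # what happened?

# './/c' matches <c> at any depth below <a>.
nested = node2.findtext('.//c', default='what happened?')
print(nested)  # bum
```

Like find, findtext returns only the first match, so .iter is still the better fit when several matching tags are expected.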
