The Bndr - 1 month ago
Python Question

Test if a child tag exists in BeautifulSoup

I have XML files with a defined structure but a varying number of tags, like

file1.xml:

<document>
<subDoc>
<id>1</id>
<myId>1</myId>
</subDoc>
</document>


file2.xml:

<document>
<subDoc>
<id>2</id>
</subDoc>
</document>


Now I would like to check whether the tag <myId> exists. So I did the following:

data = open("file1.xml",'r').read()
xml = BeautifulSoup(data)

hasAttrBs = xml.document.subdoc.has_attr('myID')
hasAttrPy = hasattr(xml.document.subdoc,'myID')
hasType = type(xml.document.subdoc.myid)


The result for file1.xml:

hasAttrBs -> False
hasAttrPy -> True
hasType -> <class 'bs4.element.Tag'>


file2.xml:

hasAttrBs -> False
hasAttrPy -> True
hasType -> <type 'NoneType'>


Okay, <myId> is not an attribute of <subdoc>.

But how can I test whether a sub-tag exists?

//Edit: By the way, I don't really want to iterate through the whole subdoc, because that would be very slow. I hope to find a way to address that element directly.

Answer

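You can check this by accessing the tag name as an attribute, like xml.document.subdoc.myid. This returns the first matching tag, or None if no such tag exists, so you don't have to loop over the children yourself. The whole thing would go something like this: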

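from bs4 import BeautifulSoup

with open("file1.xml", 'r') as data, open("file2.xml", 'r') as data2:
    # An HTML parser lowercases tag names, hence subdoc and myid below
    xml = BeautifulSoup(data.read(), "html.parser")
    xml2 = BeautifulSoup(data2.read(), "html.parser")

    # Attribute access returns the first matching Tag, or None if absent
    hasAttrBs = xml.document.subdoc.myid
    hasAttrBs2 = xml2.document.subdoc.myid
    print(hasAttrBs)
    print(hasAttrBs2)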

Prints

<myid>1</myid>
None
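
If you need a plain True/False rather than the tag itself, one possible sketch is to compare the result of find() against None. The names has_child, check, FILE1 and FILE2 are made up for this illustration, and the inline strings stand in for file1.xml and file2.xml from the question. It assumes the "html.parser" backend, which lowercases tag names. It also shows why the two attempts in the question could not tell the files apart.

```python
from bs4 import BeautifulSoup

# Inline stand-ins for file1.xml and file2.xml (assumption for this sketch)
FILE1 = "<document><subDoc><id>1</id><myId>1</myId></subDoc></document>"
FILE2 = "<document><subDoc><id>2</id></subDoc></document>"


def has_child(parent, name):
    """Return True if `parent` has a direct child tag called `name`."""
    # recursive=False restricts the search to direct children only
    return parent.find(name, recursive=False) is not None


def check(markup):
    # html.parser lowercases tag names, so look up "subdoc" and "myid"
    soup = BeautifulSoup(markup, "html.parser")
    subdoc = soup.document.subdoc
    return {
        "child": has_child(subdoc, "myid"),
        # has_attr looks at attributes such as <subDoc myid="...">, not child tags
        "has_attr": subdoc.has_attr("myid"),
        # hasattr is True either way: unknown names are passed on to find(),
        # which returns None instead of raising AttributeError
        "hasattr": hasattr(subdoc, "myid"),
    }


result1 = check(FILE1)
result2 = check(FILE2)
print(result1)
print(result2)
```

Only the "child" entry differs between the two inputs: it is True for the first document and False for the second, while "has_attr" and "hasattr" are the same for both.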