amphibient - 19 days ago
Python Question

How to replace/remove XML tag with BeautifulSoup?

I have XML in a local file that is a template for a final message that gets POSTed to a REST service. The script preprocesses the template data before it gets posted.

So the template looks something like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
<singleElement>
<subElementX>XYZ</subElementX>
</singleElement>
<repeatingElement id="11" name="Joe"/>
<repeatingElement id="12" name="Mary"/>
</root>


The message XML should look the same except that the repeatingElement tags need to be replaced with something else (XML generated by the script based on the attributes in the existing tag).

Here is my script so far:

from bs4 import BeautifulSoup

# Read the XML template from disk
with open('conf//test1.xml', 'r') as xmlFile:
    xmlData = xmlFile.read()

xmlSoup = BeautifulSoup(xmlData, 'html.parser')

# html.parser lowercases tag names, hence the lowercase search term
repElemList = xmlSoup.find_all('repeatingelement')

for repElem in repElemList:
    print("Processing repElem...")
    repElemID = repElem.get('id')
    repElemName = repElem.get('name')

    # now I do something with repElemID and repElemName
    # and no longer need it. I would like to replace it with <somenewtag/>
    # and dump what is in the soup object back into a string.
    # is it possible with BeautifulSoup?


Can I replace the repeating elements with something else and then dump the soup object into a new string that I can post to my REST API?

NOTE: I am using html.parser because I can't get the xml parser to work. It works all right for this, since HTML parsing is more permissive than XML parsing.

Answer

You can use the .replace_with() and .new_tag() methods:

for repElem in repElemList:
    print("Processing repElem...")
    repElemID = repElem.get('id')
    repElemName = repElem.get('name')

    repElem.replace_with(xmlSoup.new_tag("somenewtag"))

Then you can dump the soup back into a string using str(xmlSoup) or xmlSoup.prettify().
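
Putting the pieces together, here is a minimal end-to-end sketch using the template from the question. The function name replace_repeating and the "ref" attribute on the new tag are made up for illustration; substitute whatever your script actually generates from the id and name attributes. Note that html.parser lowercases tag names, so singleElement comes back as singleelement; if the REST service is case-sensitive, that may matter.

```python
from bs4 import BeautifulSoup

# Template from the question, inlined so the sketch is self-contained
TEMPLATE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
<singleElement>
<subElementX>XYZ</subElementX>
</singleElement>
<repeatingElement id="11" name="Joe"/>
<repeatingElement id="12" name="Mary"/>
</root>
"""


def replace_repeating(xml_text):
    """Swap every repeatingElement for a new tag and return the result as a string."""
    soup = BeautifulSoup(xml_text, "html.parser")

    # html.parser lowercases tag names, so search for the lowercase form
    for rep_elem in soup.find_all("repeatingelement"):
        new_tag = soup.new_tag("somenewtag")
        # "ref" is a hypothetical attribute, used only to show that the
        # old element's attributes are still available at this point
        new_tag["ref"] = rep_elem.get("id")
        rep_elem.replace_with(new_tag)

    # str() serializes the modified tree back into markup
    return str(soup)


result = replace_repeating(TEMPLATE)
print(result)
```

The returned string can then be passed as the body of the POST request. Use soup.prettify() instead of str(soup) if you want indented output.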