quietgrit - 1 month ago
Python Question

How to get the XML element text with ElementTree in Python

I have some XML like this:

<?xml version="1.0" encoding="UTF-8"?>
<robot generated="20161031 09:20:35.196" generator="Robot 3.0 (Python 2.7.10 on darwin)">
<suite source="/path/to/source" id="s1" name="App">
<kw type="setup" name="Login to App" library="SettingsAndLibraries">
<arguments>
<arg>${VALID USER}</arg>
<arg>${VALID PASSWORD}</arg>
</arguments>
<kw name="Open Browser" library="Selenium2Library">
<doc>Opens a new browser instance to given URL.</doc>
<arguments>
<arg>${LOGIN URL}</arg>
<arg>${BROWSER}</arg>
</arguments>
<assign>
<var>${BROWSER ID}</var>
</assign>
<msg timestamp="20161031 09:20:35.920" level="INFO">Opening browser 'Chrome' to base url 'https://app.com/'</msg>
<msg timestamp="20161031 09:20:40.668" level="INFO">${BROWSER ID} = 1</msg>
<status status="PASS" endtime="20161031 09:20:40.669" starttime="20161031 09:20:35.920" />
</kw>
<kw name="Maximize Browser Window" library="Selenium2Library">
<doc>Maximizes current browser window.</doc>
<status status="PASS" endtime="20161031 09:20:40.940" starttime="20161031 09:20:40.669" />
</kw>
<kw name="Set Selenium Speed" library="Selenium2Library">
<doc>Sets the delay in seconds that is waited after each Selenium command.</doc>
<arguments>
<arg>${DELAY}</arg>
</arguments>
<status status="PASS" endtime="20161031 09:20:40.941" starttime="20161031 09:20:40.940" />
</kw>
<kw name="Title Should Be" library="Selenium2Library">
<doc>Verifies that current page title equals `title`.</doc>
<arguments>
<arg>Sign In - App</arg>
</arguments>
<msg timestamp="20161031 09:20:40.947" level="INFO">Page title is 'Sign In'.</msg>
<status status="PASS" endtime="20161031 09:20:40.948" starttime="20161031 09:20:40.941" />
</kw>
<kw name="Input Text" library="Selenium2Library">
<doc>Types the given `text` into text field identified by `locator`.</doc>
<arguments>
<arg>${LOGIN.txtUser}</arg>
<arg>${username}</arg>
</arguments>
<msg timestamp="20161031 09:20:40.948" level="INFO">Typing text 'test' into text field 'ctl00_content_userNameText'</msg>
<status status="PASS" endtime="20161031 09:20:41.038" starttime="20161031 09:20:40.948" />
</kw>
<kw name="Input Text" library="Selenium2Library">
<doc>Types the given `text` into text field identified by `locator`.</doc>
<arguments>
<arg>${LOGIN.txtPassword}</arg>
<arg>${password}</arg>
</arguments>
<msg timestamp="20161031 09:20:41.039" level="INFO">Typing text 'pw' into text field 'ctl00_content_passwordText'</msg>
<status status="PASS" endtime="20161031 09:20:41.129" starttime="20161031 09:20:41.038" />
</kw>
<kw name="Input Text" library="Selenium2Library">
<doc>Types the given `text` into text field identified by `locator`.</doc>
<msg timestamp="20161031 09:20:41.129" level="INFO">Typing text 'text' into text field 'ctl00_content_Text'</msg>
<status status="PASS" endtime="20161031 09:20:41.209" starttime="20161031 09:20:41.129" />
</kw>
<kw name="Click Button" library="Selenium2Library">
<doc>Clicks a button identified by `locator`.</doc>
<arguments>
<arg>${LOGIN.btnLogin}</arg>
</arguments>
<msg timestamp="20161031 09:20:41.210" level="INFO">Clicking button 'ctl00_content_btnLogin'.</msg>
<status status="PASS" endtime="20161031 09:20:50.839" starttime="20161031 09:20:41.209" />
</kw>
<kw name="Location Should Be" library="Selenium2Library">
<doc>Verifies that current URL is exactly `url`.</doc>
<arguments>
<arg>${WELCOME URL}</arg>
</arguments>
<msg timestamp="20161031 09:20:50.846" level="INFO">Current location is 'https://app.com/home.aspx'.</msg>
<status status="PASS" endtime="20161031 09:20:50.846" starttime="20161031 09:20:50.839" />
</kw>
<kw name="Title Should Be" library="Selenium2Library">
<doc>Verifies that current page title equals `title`.</doc>
<arguments>
<arg>Welcome to Home</arg>
</arguments>
<msg timestamp="20161031 09:20:50.851" level="INFO">Page title is Welcome to Home'.</msg>
<status status="PASS" endtime="20161031 09:20:50.851" starttime="20161031 09:20:50.847" />
</kw>
<status status="PASS" endtime="20161031 09:20:50.851" starttime="20161031 09:20:35.919" />
</kw>
<suite source="/path/to/source" id="s1-s1" name="Admin">
<suite source="/path/to/source" id="s1-s1-s1" name="Church Setup">
<suite source="/path/to/source" id="s1-s1-s1-s1" name="Buildings">
<test id="s1-s1-s1-s1-t1" name="Verify Buildings Page">
<kw name="Click Link" library="Selenium2Library">
<doc>Clicks a link identified by locator.</doc>
<arguments>
<arg>${GLOBAL.lnkAdmin}</arg>
</arguments>
<msg timestamp="20161031 09:20:50.873" level="INFO">Clicking link 'link=Admin'.</msg>
<status status="PASS" endtime="20161031 09:20:50.968" starttime="20161031 09:20:50.873" />
</kw>
<kw name="Wait Until Element Is Visible" library="Selenium2Library">
<doc>Waits until element specified with `locator` is visible.</doc>
<arguments>
<arg>nav_sub_6</arg>
</arguments>
<status status="PASS" endtime="20161031 09:20:51.432" starttime="20161031 09:20:50.968" />
</kw>
<kw name="Click Link" library="Selenium2Library">
<doc>Clicks a link identified by locator.</doc>
<arguments>
<arg>${ADMIN.lnkBuildings}</arg>
</arguments>
<msg timestamp="20161031 09:20:51.432" level="INFO">Clicking link 'link=Buildings'.</msg>
<status status="PASS" endtime="20161031 09:20:52.224" starttime="20161031 09:20:51.432" />
</kw>
<kw name="Page Should Contain Element" library="Selenium2Library">
<doc>Verifies element identified by `locator` is found on the current page.</doc>
<arguments>
<arg>ctl00_ctl00_MainContent_content_txtBuildingName</arg>
</arguments>
<msg timestamp="20161031 09:20:52.248" level="INFO">Current page contains element 'ctl00_ctl00_MainContent_content_txtBuildingName'.</msg>
<status status="PASS" endtime="20161031 09:20:52.249" starttime="20161031 09:20:52.224" />
</kw>
<kw name="Page Should Contain Button" library="Selenium2Library">
<doc>Verifies button identified by `locator` is found from current page.</doc>
<arguments>
<arg>ctl00_ctl00_MainContent_content_btnSave</arg>
</arguments>
<msg timestamp="20161031 09:20:52.268" level="INFO">Current page contains input 'ctl00_ctl00_MainContent_content_btnSave'.</msg>
<status status="PASS" endtime="20161031 09:20:52.268" starttime="20161031 09:20:52.249" />
</kw>
<kw name="Page Should Contain Link" library="Selenium2Library">
<doc>Verifies link identified by `locator` is found from current page.</doc>
<arguments>
<arg>link=Edit</arg>
</arguments>
<msg timestamp="20161031 09:20:52.401" level="INFO">Current page contains link 'link=Edit'.</msg>
<status status="PASS" endtime="20161031 09:20:52.401" starttime="20161031 09:20:52.269" />
</kw>
<tags>
<tag>Admin</tag>
<tag>Smoke</tag>
<tag>testrailid=627</tag>
</tags>
<status status="PASS" endtime="20161031 09:20:52.401" critical="yes" starttime="20161031 09:20:50.872" />
</test>
<doc>Verifies the Admin, Setup, Buildings Page</doc>
<status status="PASS" endtime="20161031 09:20:52.404" starttime="20161031 09:20:50.856" />


I would like to end up with a dictionary that maps each test name to its doc text. For example:

`{'Verify Buildings Page': 'Verifies the Admin, Setup, Buildings Page'}`


The XML is much larger than this; I cut it short for this example. Here's what I have so far. How do I get the doc element text?

import xml.etree.ElementTree as ET

tree = ET.parse('../App/Logs/output.xml')

root = tree.getroot()

testName = []
testDoc = []

for test in root.iter('test'):
    testName.append(test.get('name'))
    # somehow get the test doc text here

print(testName)

Answer

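In this XML the doc element is not a child of the test element; it is a sibling that follows it inside the same suite. If you can use lxml.etree, you can reach it with the XPath following-sibling axis: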

import lxml.etree as ET

tree = ET.parse("input.xml")
root = tree.getroot()

result = {
    test.attrib['name']: test.xpath("following-sibling::doc")[0].text
    for test in root.iter('test')
}
print(result)

This would print:

{'Verify Buildings Page': 'Verifies the Admin, Setup, Buildings Page'}
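
If you need to stay with the standard library, note that xml.etree.ElementTree supports only a limited XPath subset and has no following-sibling axis. One possible workaround, sketched below, is to walk the suite elements and pair each direct child test with that suite's own direct child doc. The SAMPLE string and the collect_test_docs helper are made up for illustration (a trimmed stand-in for output.xml), and the sketch assumes each suite carries at most one doc of its own, so it picks the suite's first doc child rather than strictly "the doc after the test".

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for output.xml with the same shape
# as the XML in the question: <doc> is a sibling of <test>, not a child.
SAMPLE = """<robot>
  <suite name="App">
    <suite name="Buildings">
      <test name="Verify Buildings Page">
        <kw name="Click Link">
          <doc>Clicks a link identified by locator.</doc>
        </kw>
      </test>
      <doc>Verifies the Admin, Setup, Buildings Page</doc>
    </suite>
  </suite>
</robot>"""


def collect_test_docs(root):
    """Map each test name to the <doc> text of the suite that directly contains it."""
    result = {}
    for suite in root.iter('suite'):
        # A bare tag name matches direct children only, so the keyword-level
        # <doc> elements nested inside <test> are not picked up here.
        doc = suite.findtext('doc')  # None if the suite has no <doc> child
        for test in suite.findall('test'):
            result[test.get('name')] = doc
    return result


root = ET.fromstring(SAMPLE)
print(collect_test_docs(root))
```

For a real file, replace ET.fromstring(SAMPLE) with ET.parse(path).getroot(). Unlike the lxml version, a suite without a doc yields None for its tests instead of raising an IndexError.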