Jonathan - 2 months ago
Python Question

Extract specific information from XML (Google Distance Matrix API)

Here is my XML output:

<DistanceMatrixResponse>
<status>OK</status>
<origin_address>
868-978 Middle Tennessee Blvd, Murfreesboro, TN 37130, USA
</origin_address>
<destination_address>
980-1060 Middle Tennessee Blvd, Murfreesboro, TN 37130, USA
</destination_address>
<row>
<element>
<status>OK</status>
<duration>
<value>19</value>
<text>1 min</text>
</duration>
<distance>
<value>154</value>
<text>0.1 mi</text>
</distance>
</element>
</row>
</DistanceMatrixResponse>


I am trying to use Python to save this XML from the web locally (this part is complete). After the file is saved, I want to extract the 'duration' value (19 in this case) and the 'distance' value (154 in this case).

I just can't figure out how to read and extract the necessary information from this XML. I have tried working with ElementTree and implementing other people's solutions from Stack Overflow, with no luck. I'm about 3 hours into what should be a quick process.

Here is my code as it sits now:

import urllib2
import xml.etree.ElementTree as ET

## import XML and save it out
url = "https://maps.googleapis.com/maps/api/distancematrix/xml?units=imperial&origins=35.827581,-86.394077&destinations=35.827398,-86.392381&key=mygooglemapsAPIkey"
s = urllib2.urlopen(url)
contents = s.read()
file = open("export.xml", 'w')
file.write(contents)
file.close()
## finish saving the XML


element_tree = ET.parse("export.xml")
root = element_tree.getroot()
agreement = root.find("duration").text
print agreement


## open XML and save out travel time in seconds
xmlfile = 'export.xml'
element_tree = ET.parse(xmlfile)
root = element_tree.getroot()
agreement = root.findall("duration").text
print agreement


Current error message is: AttributeError: 'NoneType' object has no attribute 'text'

I know the code is incomplete to grab both the duration and distance, but I am just trying to get something working at this point!

Answer

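Just use XPath queries. root.find("duration") only searches the direct children of the root element, and <duration> is nested inside <row>/<element>, so it returns None, which is where the AttributeError comes from. (findall() would not help either: it returns a list, and a list has no .text attribute.) The .// prefix searches all descendants instead. Here tree is the parsed document (element_tree in your code):

duration = tree.find('.//duration/value').text
distance = tree.find('.//distance/value').text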

Here's a nice XPath tutorial: http://zvon.org/comp/r/tut-XPath_1.html.
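
For completeness, here is a minimal, self-contained sketch of the same idea. It parses the XML from the question out of a string rather than from export.xml, so it runs without an API key. The names SAMPLE_XML, extract_first_values and extract_all_values are made up for illustration, and treating the two values as seconds and meters is my reading of the sample response, not something the snippet checks.

```python
import xml.etree.ElementTree as ET

# The response from the question, embedded so the example needs no network.
SAMPLE_XML = """<DistanceMatrixResponse>
<status>OK</status>
<origin_address>
868-978 Middle Tennessee Blvd, Murfreesboro, TN 37130, USA
</origin_address>
<destination_address>
980-1060 Middle Tennessee Blvd, Murfreesboro, TN 37130, USA
</destination_address>
<row>
<element>
<status>OK</status>
<duration>
<value>19</value>
<text>1 min</text>
</duration>
<distance>
<value>154</value>
<text>0.1 mi</text>
</distance>
</element>
</row>
</DistanceMatrixResponse>"""


def extract_first_values(xml_text):
    """Return (duration, distance) of the first <element> as integers."""
    root = ET.fromstring(xml_text)
    # './/' searches all descendants, not just direct children of the root.
    duration = root.find(".//duration/value")
    distance = root.find(".//distance/value")
    if duration is None or distance is None:
        raise ValueError("no duration/distance values found in response")
    return int(duration.text), int(distance.text)


def extract_all_values(xml_text):
    """Return one (duration, distance) pair per <element> in the response."""
    root = ET.fromstring(xml_text)
    pairs = []
    # findall() returns a list of elements, so loop over it
    # instead of calling .text on the list itself.
    for element in root.findall(".//element"):
        duration = element.find("duration/value")
        distance = element.find("distance/value")
        if duration is not None and distance is not None:
            pairs.append((int(duration.text), int(distance.text)))
    return pairs


seconds, meters = extract_first_values(SAMPLE_XML)
print(seconds)
print(meters)
```

To use it on the saved file, read export.xml and pass its contents to either function, or swap ET.fromstring(xml_text) for ET.parse("export.xml").getroot(). The second function is only needed if you request several origins or destinations at once, in which case the response contains more than one <element>.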