MrCastro - 1 year ago
Python Question

XPath to namespaced XML in Python?

I am using lxml with xpath to parse an epub3, xhtml content file.

I want to select all the li nodes with the attribute epub:type="footnote", as for example

<li epub:type="footnote" id="fn14"> ... </li>

I cannot find the right xpath expression for it.

An expression that tests for the id attribute does select all the li nodes with attribute id, but when I try the same with the epub:type attribute I get the error

lxml.etree.XPathEvalError: Undefined namespace prefix

The XML is

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<meta charset="utf-8" />
<link rel="stylesheet" href="stylesheet.css" />
<section class="footnotes">
<hr />
<li id="fn1" epub:type="footnote">
<p>See foo</p>

Any suggestions on how to write the correct expression?

Answer Source

Have you declared the namespace prefix epub to lxml?

>>> tree.getroot().xpath(
...     "//li[@epub:type = 'footnote']",
...     namespaces={'epub': 'http://www.idpf.org/2007/ops'}
...     )
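To see the prefix problem in isolation, here is a minimal sketch. It assumes a made-up fragment that declares only the epub prefix and no default namespace (the fragment and variable names are hypothetical, not taken from the question's file). Without a namespaces mapping the lookup raises XPathEvalError; with it, the same expression evaluates.

```python
from lxml import etree

# Standard EPUB "ops" namespace URI.
EPUB_NS = "http://www.idpf.org/2007/ops"

# Hypothetical sample: only the epub prefix is declared, so <li> itself
# is in no namespace.
doc = etree.fromstring(
    '<ol xmlns:epub="%s">'
    '<li id="fn1" epub:type="footnote">See foo</li>'
    '<li id="x1">Not a footnote</li>'
    '</ol>' % EPUB_NS
)

# lxml does not pick up prefixes from the document for XPath evaluation,
# so "epub" is unknown here and the call raises.
try:
    doc.xpath("//li[@epub:type = 'footnote']")
    error = None
except etree.XPathEvalError as exc:
    error = str(exc)

# Passing the prefix-to-URI mapping lets the same expression evaluate.
footnotes = doc.xpath(
    "//li[@epub:type = 'footnote']",
    namespaces={"epub": EPUB_NS},
)
footnote_ids = [li.get("id") for li in footnotes]
```

The prefix used in the mapping only has to match the one in the XPath expression; it is the URI that must match the document.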

Update per question update

The XHTML namespace is also tripping you up. Try:

>>> tree.getroot().xpath(
...     "//xhtml:li[@epub:type = 'footnote']",
...     namespaces={'epub': 'http://www.idpf.org/2007/ops',
...                 'xhtml': 'http://www.w3.org/1999/xhtml'}
...     )
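As a self-contained check, the sketch below runs both expressions against a small document modeled on the excerpt in the question. The body, ol and extra li elements are assumptions added to make it well-formed, and the NSMAP name is just for illustration. It shows that the unprefixed li matches nothing once the XHTML default namespace is in play.

```python
from lxml import etree

# Prefixes are arbitrary labels for the XPath expression; the URIs are the
# standard XHTML and EPUB ops namespaces.
NSMAP = {
    "xhtml": "http://www.w3.org/1999/xhtml",
    "epub": "http://www.idpf.org/2007/ops",
}

# Bytes, not str: lxml rejects str input that carries an encoding declaration.
content = b"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<body>
<section class="footnotes">
<ol>
<li id="fn1" epub:type="footnote"><p>See foo</p></li>
<li id="fn2" epub:type="footnote"><p>See bar</p></li>
<li id="other"><p>Not a footnote</p></li>
</ol>
</section>
</body>
</html>"""

root = etree.fromstring(content)

# Unprefixed "li" means "li in no namespace", which does not exist here.
unprefixed = root.xpath("//li[@epub:type = 'footnote']", namespaces=NSMAP)

# Prefixed "xhtml:li" matches the elements in the XHTML default namespace.
prefixed = root.xpath("//xhtml:li[@epub:type = 'footnote']", namespaces=NSMAP)
ids = [li.get("id") for li in prefixed]
```

Here unprefixed comes back empty, while ids holds the two footnote ids.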