I'm trying to convert some HTML files to XML on Ubuntu, and the output should conform to a specific XML schema or DTD. I think Tidy can do this, but I don't understand its syntax. If there are other tools, I'd be glad to try them.
For instance: Convert
Tidy can convert HTML to XHTML (the same structure of elements and attributes, but meeting the rules for XML well-formedness). It can't restructure the document to meet the requirements of some arbitrary DTD.
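Since the question was about Tidy's syntax, here is a minimal sketch of the HTML-to-XHTML step, wrapped in Python so it can be run over many files. The function names and the file names are assumptions made for illustration; the flags are standard Tidy command-line options (`-asxhtml` for XHTML output, `-numeric` for numeric rather than named entities, `-output` for the destination file).

```python
import shutil
import subprocess


def build_tidy_command(src: str, dst: str) -> list[str]:
    """Return the tidy command line that writes `src` as XHTML to `dst`."""
    return [
        "tidy",
        "-quiet",      # suppress the non-essential banner output
        "-asxhtml",    # emit well-formed XHTML
        "-numeric",    # numeric entities, so generic XML parsers accept them
        "-utf8",       # read and write UTF-8
        "-output", dst,
        src,
    ]


def html_to_xhtml(src: str, dst: str) -> None:
    """Run tidy on one file (hypothetical helper, not part of Tidy itself)."""
    if shutil.which("tidy") is None:
        raise FileNotFoundError("tidy is not installed or not on PATH")
    result = subprocess.run(
        build_tidy_command(src, dst), capture_output=True, text=True
    )
    # tidy exits with 0 on success, 1 if there were warnings, 2 on errors.
    if result.returncode >= 2:
        raise RuntimeError(f"tidy failed on {src}:\n{result.stderr}")


if __name__ == "__main__":
    # "page.html" and "page.xhtml" are placeholder file names.
    print(" ".join(build_tidy_command("page.html", "page.xhtml")))
```

The equivalent shell command is the printed line itself. The result is still XHTML, so it is only the first half of the job.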
To get from XHTML to your target DTD, you'll need to write an explicit mapping between the two formats. XSLT is a popular language for that.
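To give an idea of what such a mapping looks like, here is a rough sketch. It uses Python's standard `xml.etree.ElementTree` rather than XSLT, purely so it runs without extra packages; the same rules could be written as XSLT templates. The target vocabulary (`article`, `title`, `para`) is invented for the example, since the question doesn't show the real DTD.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so lookups must be qualified.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def xhtml_to_article(xhtml_text: str) -> ET.Element:
    """Map an XHTML document onto a hypothetical <article> format."""
    root = ET.fromstring(xhtml_text)
    article = ET.Element("article")

    # Rule 1: html/head/title becomes article/title.
    title = root.find("x:head/x:title", XHTML_NS)
    title_text = title.text if title is not None and title.text else ""
    ET.SubElement(article, "title").text = title_text.strip()

    # Rule 2: every <p> anywhere in the body becomes a flat <para>,
    # with inline markup such as <b> reduced to its text.
    for p in root.iterfind("x:body//x:p", XHTML_NS):
        ET.SubElement(article, "para").text = "".join(p.itertext()).strip()

    return article


SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example page</title></head>
  <body><p>First <b>bold</b> paragraph.</p><div><p>Second.</p></div></body>
</html>"""

if __name__ == "__main__":
    print(ET.tostring(xhtml_to_article(SAMPLE), encoding="unicode"))
```

This only builds the new tree; it doesn't check it against the DTD. You'd still want to run the output through a validating parser (xmllint is one option) to confirm it conforms.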