JavaBeast - 7 months ago
C++ Question

Parse XML in Qt and get the tag tree structure

I need to parse an XML file in C++ (not C++11) with Qt into a vector holding each value and its parent XML tag structure.

I'm new to Qt and I know there are some good options in its library. However, much of what I have found is aimed at people who know the tag names ahead of time. I need something more generic: the tag names (and values) are irrelevant for my purpose and could be anything; my focus is on the tag structure holding each value. What is the best approach for this? QDomDocument?

*Note: The actual XML files will have a much deeper and more complex tree structure.

Example input

Test.xml

<MainTag>
<description>Test Description</description>
<type>3</type>
<source>
<description>Source test Description1</description>
<type>4</type>
</source>
<source>
<description>Source test Description2</description>
<type>5</type>
<name>
<element>1</element>
</name>
</source>

</MainTag>


Example Output

(string rows contained in a C++ vector):

description=Test Description
type=3
source.description=Source test Description1
source.type=4
source.description=Source test Description2
source.type=5
source.name.element=1

Answer

When parsing XML files, I find navigating the DOM more flexible than using a stream parser, because the code depends less on the order of the elements and focuses more on the structure and content.

For DOM navigation you can use the QDomDocument and related classes:

Example code for parsing an unknown XML

This code parses the XML and extracts each tag name together with its text. It does not extract attributes or empty nodes, and an element's text is picked up only when it is the element's first child.

Note: I've corrected closing tag from given example <MainTag> to </MainTag>.

// Needs the Qt XML module (QT += xml in a qmake .pro file)
#include <QtXml>
#include <QtCore>
#include <vector>
#include <iostream>

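// Recursive function to parse the XML.
// Appends a "parent.child=text" row to v for every element that starts with non-empty text.
void parseXML(const QDomElement& root, const QString& baseName, std::vector<QString>& v)
{
  // Extract node value, if any
  if (!baseName.isEmpty() && !root.firstChild().nodeValue().isEmpty()) { // the first child is the node text
    v.push_back(baseName + "=" + root.firstChild().nodeValue());
  }

  // Parse children elements (explicit type instead of auto, so this also builds without C++11)
  for (QDomElement element = root.firstChildElement(); !element.isNull(); element = element.nextSiblingElement()) {
    // Add the "." separator only below the top level, so rows do not start with a dot
    QString name = element.tagName();
    if (!baseName.isEmpty()) {
      name = baseName + "." + name;
    }
    parseXML(element, name, v);
  }
}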

int main(int argc, char* argv[])
{
  const QString content = "<MainTag>"
                          "<description>Test Description</description>"
                          "<type>3</type>"
                          "<source>"
                          "    <description>Source test Description1</description>"
                          "    <type>4</type>"
                          "</source>"
                          "<source>"
                          "    <description>Source test Description2</description>"
                          "    <type>5</type>"
                          "    <name>"
                          "        <element>1</element>"
                          "    </name>"
                          "</source>"
                          "</MainTag>";
  std::vector<QString> v;

  QDomDocument xml("xml");
  xml.setContent(content);
  parseXML(xml.documentElement(), "", v); // root has no base name, as indicated in expected output

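  // Explicit iterator type instead of auto, so this also builds without C++11
  for (std::vector<QString>::const_iterator it = v.begin(); it != v.end(); ++it) {
    std::cout << it->toStdString() << std::endl;
  }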

  return 0;
}

DOM from file

To populate the DOM from a file, replace the setContent line with something like the code below (error checking omitted for simplicity):

QFile file(filePath);
file.open(QFile::ReadOnly);
xml.setContent(file.readAll());