JuddGledhill - 15 days ago
C# Question

Efficient way to read large XML into different node types in C#

I am new to C#. I have a relatively large XML file (28MB) and am trying to parse its subtrees into several different types based on their content. Essentially, I have 6900+ Content nodes that all have to be interrogated to figure out what type they are.

<Collections>
<Content>..</Content>
<Content>..</Content>
<Content>..</Content>
...
</Collections>


For each Content node, the variety of nodes below it can have 1 of 3 different patterns. I have to look into the node to decide which pattern/type of object I am looking at.

So imagine a Content node with about 100 subnodes. In one case, the 14th node contains a URL, which indicates a "type 1" that should have fields 1, 2, 3,...17, 28, 47 and 58 written to the DB.

Another type has an indicative pair of elements (let's say elements 3 and 58), which indicates a "type 2" that should have a different set of elements written to the DB.

And so on...

From there, I map the objects into a CMS/DB and connect various bits of data to fields in that other system and write data from the pertinent elements over to the DB.

Since the source file is large, I would love to efficiently pull subtrees out of the larger file, zip up and down them (to decide on their types) and then write the important data (map them) over to the DB.

Do I have to store the values along the way somehow and decide after I have stored them, what sort of object this is?

I am struggling to choose between the forward-only approach of XmlReader and the ease of using a DOM-based approach.

Thanks for the advice.

===edit====
Thank you commenters. The structure inside of the Content nodes would have 1 of 3 patterns in it. There are about 100 nodes in each type, so I did not bother pasting them in for readability's sake. I did try and clarify above though.

Answer

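With large files, XmlReader is the safer choice because it streams the document instead of loading all of it into memory. I prefer combining XmlReader with LINQ to XML: the reader moves forward through the file, and each Content subtree is loaded as an XElement that you can navigate freely. Try the following: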

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
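        static void Main(string[] args)
        {
            // Stream the file; only one Content subtree is held in memory at a time.
            using (XmlReader reader = XmlReader.Create(FILENAME))
            {
                while (!reader.EOF)
                {
                    // Skip ahead unless the reader is already on a Content element.
                    if (reader.Name != "Content")
                    {
                        reader.ReadToFollowing("Content");
                    }
                    if (!reader.EOF)
                    {
                        // Load this subtree as an XElement; ReadFrom leaves the reader on the node after it.
                        XElement content = (XElement)XElement.ReadFrom(reader);
                        // Inspect content here to decide its type, then map it to the DB.
                    }
                }
            }
        }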
    }
}
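
Once each Content subtree is an XElement, you no longer need to store values along the way: you can look anywhere inside it to decide its type and then pull out only the fields you need. Below is a rough sketch of that step, not a definitive implementation. The real element names are not shown in the question, so Url, Field3 and Field58 are placeholders, and ContentKind, ContentSketch, StreamContent, Classify and ExtractFields are names I made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical names throughout; replace them with your real element names and rules.
public enum ContentKind { Type1, Type2, Unknown }

public static class ContentSketch
{
    // Lazily yields each <Content> subtree from the reader, one at a time.
    // The reader must stay open while the sequence is being enumerated.
    public static IEnumerable<XElement> StreamContent(XmlReader reader)
    {
        while (!reader.EOF)
        {
            if (reader.Name != "Content")
            {
                reader.ReadToFollowing("Content");
            }
            if (!reader.EOF)
            {
                // ReadFrom leaves the reader on the node after the subtree.
                yield return (XElement)XNode.ReadFrom(reader);
            }
        }
    }

    // Decides the kind by inspecting the fully loaded subtree.
    public static ContentKind Classify(XElement content)
    {
        // Assumed rule for "type 1": a non-empty Url child element.
        if (!string.IsNullOrWhiteSpace((string)content.Element("Url")))
        {
            return ContentKind.Type1;
        }

        // Assumed rule for "type 2": both indicator elements are present.
        if (content.Element("Field3") != null && content.Element("Field58") != null)
        {
            return ContentKind.Type2;
        }

        return ContentKind.Unknown;
    }

    // Collects the values of the named child elements that exist, keyed by element name.
    // The names are assumed to be distinct.
    public static Dictionary<string, string> ExtractFields(XElement content, IEnumerable<string> names)
    {
        return names
            .Select(n => content.Element(n))
            .Where(e => e != null)
            .ToDictionary(e => e.Name.LocalName, e => e.Value);
    }
}
```

In the loop above you would call Classify(content) right after ReadFrom, switch on the result, and pass the list of element names for that type to ExtractFields before writing to the DB. Only one subtree of roughly 100 nodes is in memory at any time, so you keep the random access of a DOM within each Content node without loading the whole 28MB file.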