Charles Morrison - 3 months ago
C# Question

Convert indented text to XML

I have a text-based file in which the indentation of each line represents the nesting of the tags in an XML file.

How could I convert this text into a sample XML document in C#?

I am a bit lost. I have to count the spaces and look back in the list to determine when tags should close.

sampleroot
    rootHeader
        miscInformation
            Creation
            DocumentIdentification
                Identifier
                Message_Type
                Notes
            StandardDocumentationIdentification
                Standard
                Version
        Receiver
            Name
            lok
            Location
        Sender
            Name
            lok2
        msgref
            DocumentIdentifier
            HoldInformation
                Name
                Date
            ReleaseInformation
                Date
        HoldDocumentReference
            AlternativeIdentifier
                Authority
                Identifier
            Notes
        ReleaseDocumentReference
            AlternativeIdentifier
                Authority
                Identifier
            Notes

Answer

The following code works with an input document that uses four spaces per indentation level (please look at the input document below carefully). It is just an example: you can of course add support for input documents indented with tabs.

// Requires: using System.IO; using System.Linq; using System.Xml;
private static void ConvertToXml(Stream inputStream, Stream outputStream)
{
    const int oneIndentLength = 4; // One level indentation - four spaces.
    var xmlWriterSettings = new XmlWriterSettings
        {
            Indent = true
        };

    using (var streamReader = new StreamReader(inputStream))
    using (var xmlWriter = XmlWriter.Create(outputStream, xmlWriterSettings))
    {
        int previousIndent = -1; // There is no previous indent.
        string line;
        while ((line = streamReader.ReadLine()) != null)
        {
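            if (string.IsNullOrWhiteSpace(line))
            {
                // Skip blank lines: an empty string is not a valid element name.
                continue;
            }

            // Count leading spaces and convert them to a nesting level.
            var indent = line.TakeWhile(ch => ch == ' ').Count();
            indent /= oneIndentLength;

            var elementName = line.Trim();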

            if (indent <= previousIndent)
            {
                // The indent is the same as or less than the previous one: close the previous element.
                xmlWriter.WriteEndElement();

                var indentDelta = previousIndent - indent;
                for (int i = 0; i < indentDelta; i++)
                {
                    // Close one enclosing element for each level we moved back.
                    xmlWriter.WriteEndElement();
                }
            }

            xmlWriter.WriteStartElement(elementName);

            // Remember this line's indent for the next iteration.
            previousIndent = indent;
        }
    }
}
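
As for the tab support mentioned above, one possible sketch is a small helper that turns leading whitespace into a nesting level. The class and method names (IndentHelper, GetIndentLevel) are hypothetical and not part of the code above. It assumes that one tab counts as one full level and that leftover spaces that do not fill a whole level are ignored, which matches the integer division in ConvertToXml.

```csharp
using System;

public static class IndentHelper
{
    // Hypothetical helper: returns the nesting level of a line.
    // Assumption: one tab is one level; spacesPerLevel spaces are one level.
    public static int GetIndentLevel(string line, int spacesPerLevel = 4)
    {
        int level = 0;
        int spaces = 0;

        foreach (char ch in line)
        {
            if (ch == '\t')
            {
                level++;
                spaces = 0; // Assumption: a partial run of spaces before a tab is discarded.
            }
            else if (ch == ' ')
            {
                spaces++;
                if (spaces == spacesPerLevel)
                {
                    level++;
                    spaces = 0;
                }
            }
            else
            {
                break; // First non-whitespace character: the indentation ends here.
            }
        }

        return level;
    }
}
```

With this helper, the two lines that compute indent in ConvertToXml could be replaced by a single call such as var indent = IndentHelper.GetIndentLevel(line, oneIndentLength);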

Client code:

using (var inputStream = File.OpenRead(@"Input file path"))
using (var outputStream = File.Create(@"Output file path"))
{
    ConvertToXml(inputStream, outputStream);
}

Input document:

sampleroot
    rootHeader
        miscInformation
            Creation
            DocumentIdentification
                Identifier
                Message_Type
                Notes
            StandardDocumentationIdentification
                Standard
                Version
        Receiver
            Name
            lok
            Location
        Sender
            Name
            lok2
        msgref
            DocumentIdentifier
            HoldInformation
                Name
                Date
            ReleaseInformation
                Date
        HoldDocumentReference
            AlternativeIdentifier
                Authority
                Identifier
            Notes
        ReleaseDocumentReference
            AlternativeIdentifier
                Authority
                Identifier
            Notes

Output document:

<?xml version="1.0" encoding="utf-8"?>
<sampleroot>
  <rootHeader>
    <miscInformation>
      <Creation />
      <DocumentIdentification>
        <Identifier />
        <Message_Type />
        <Notes />
      </DocumentIdentification>
      <StandardDocumentationIdentification>
        <Standard />
        <Version />
      </StandardDocumentationIdentification>
    </miscInformation>
    <Receiver>
      <Name />
      <lok />
      <Location />
    </Receiver>
    <Sender>
      <Name />
      <lok2 />
    </Sender>
    <msgref>
      <DocumentIdentifier />
      <HoldInformation>
        <Name />
        <Date />
      </HoldInformation>
      <ReleaseInformation>
        <Date />
      </ReleaseInformation>
    </msgref>
    <HoldDocumentReference>
      <AlternativeIdentifier>
        <Authority />
        <Identifier />
      </AlternativeIdentifier>
      <Notes />
    </HoldDocumentReference>
    <ReleaseDocumentReference>
      <AlternativeIdentifier>
        <Authority />
        <Identifier />
      </AlternativeIdentifier>
      <Notes />
    </ReleaseDocumentReference>
  </rootHeader>
</sampleroot>
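
ConvertToXml relies on XmlWriter to remember which elements are still open. The "look back in the list" idea from the question can also be sketched with an explicit list of open elements and LINQ to XML. This is only an alternative sketch: the names IndentedTextToXml and Parse are hypothetical, and it assumes space-only indentation, a single root line, and lines that are valid XML element names. Unlike the loop above, it rejects input whose indentation jumps more than one level at a time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

public static class IndentedTextToXml
{
    // Hypothetical alternative to ConvertToXml: builds the tree in memory.
    public static XElement Parse(string text, int spacesPerLevel = 4)
    {
        // open[i] is the most recently seen element at nesting level i.
        var open = new List<XElement>();

        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (string.IsNullOrWhiteSpace(line))
                {
                    continue; // Ignore blank lines.
                }

                int spaces = line.Length - line.TrimStart(' ').Length;
                int level = spaces / spacesPerLevel;

                if (level > open.Count)
                {
                    throw new FormatException("Indentation skips a level at: " + line.Trim());
                }
                if (level == 0 && open.Count > 0)
                {
                    throw new FormatException("More than one root element: " + line.Trim());
                }

                var element = new XElement(line.Trim());

                // Forgetting the deeper entries corresponds to closing their tags.
                open.RemoveRange(level, open.Count - level);

                if (level > 0)
                {
                    open[level - 1].Add(element); // Attach to the parent one level up.
                }
                open.Add(element);
            }
        }

        if (open.Count == 0)
        {
            throw new FormatException("Input contains no elements.");
        }
        return open[0];
    }
}
```

The returned XElement can be written out with its Save method or inspected with ToString(). Building the whole tree in memory is convenient for small files, while the streaming XmlWriter approach above is the better fit for large inputs.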