The Muffin Boy The Muffin Boy - 24 days ago 7
SQL Question

XML parsing in C# (Fixing XML document within C#)

I have an XML document fill of SQL queries and I want to load them into my program and execute them.

public List<SQLQuery> Load_ReportQueries()
{
XmlDocument xmlDoc = new XmlDocument();
using (FileStream fs = new FileStream("Resources\\XReport.xml", FileMode.Open, FileAccess.Read))
{
Dictionary<String, object> parameters = new Dictionary<string, object>();
List<SQLQuery> queries = new List<SQLQuery>();

xmlDoc.Load(fs);

foreach (XmlNode node in xmlDoc.ChildNodes)
{
foreach (XmlNode subNode in node.ChildNodes)
{
if (subNode.Name == "query")
{
string name = HttpUtility.HtmlDecode(subNode.Attributes["name"].InnerXml);
string query = HttpUtility.HtmlDecode(subNode.Attributes["val"].InnerXml);
foreach (XmlNode subNode2 in subNode.ChildNodes)
{
if (subNode2.Name == "varname")
{
try
{
if (subNode2.Attributes["val"].InnerXml == "null")
{
parameters.Add(subNode2.Attributes["name"].InnerXml, null);
}
else
{
parameters.Add(subNode2.Attributes["name"].InnerXml, subNode2.Attributes["val"].InnerXml);
}
}
catch (NullReferenceException e)
{
frmMainForm.showMsg("Error in XML document\n" + e.ToString());
}
}
}
queries.Add(new SQLQuery(name, query, parameters));
parameters.Clear();
}
}
}
return queries;
}


This works until the program encounters special characters inside the Query tag. (its always either " or < )

If I replace the symbols with &#34 or &lt I can fix the issue, but I wanted to know if there was an easy way to fix the document from within the program without relying on the XML to be correct.

Currently it crashes at the xmlDoc.Load line whenever it encounters this issue

Here's an example of a XML entry:

<reports>
<query val ="SELECT * from items where price < :price" name ="ALL Items LESS THAN">
<varname name="price" val="null"/>
</query>
</reports>


Changing the query to:

<query val ="SELECT * from items where price &lt; :price" name ="ALL Items


Will work, but I want to build something that does the replacement automatically.

Answer

Your problem starts here: I have an XML document

No, this is not XML, even if it looks like XML!

Many people think of XML as text surrounded by some extra characters, but this is not true... Your XML is not valid, due to the usage of forbidden characters within the /text().

If I understand this correctly, the XML is manually written... You will not be able to parse this in any XML-way. Especially the usage of <, > and & are very problematic. With the string given you cannot work with Replace() either... This replace command would not be able to distinguish between needed occurances of < and > and those you want to replace with the corresponding entities.

Question: Why is the query text stored within XML tags originally?

Why not store the query just a text and create the XML in the last step in code? Your C# code - if I read this correctly - adds the parameters with values. At this point you could easily create proper XML if the query's text was just simple text...

Approaches

There are onyl two ways in my eyes:

ugly string parsing

You could write some C#-code to find your (hopefully not changing) tag names and separate them on string base.

Change the creation of this XML

Much better was to change the way the XML is generated.

  • Store the query text without XML wrapper and use string methods to find the parameters dynamically

  • Use a tool like Excel to fill in the content as is and use some kind of (VBA-) code to create the XML with all needed replacements.

  • As the last (but maybe easiest) suggestion you might wrap your content in a <![CDATA[Your query here]]>-section (see comment from Damian_The_Unbeliever). Within the CDATA-section you are allowed to use forbidden characters...

Comments