Johan - 4 months ago
ASP.NET (C#) Question

Upload XML -> read Unicode stream and convert it

I have a FileUpload control where I can upload XML documents.

The XML files will be encoded in Unicode format. I want to convert them to UTF-8 so they render as proper XML files.

I'm saving the uploaded file in a hidden field as a hex string and sending it to a generic handler. What I want is a result that I can create an XML document from. At the moment my string looks like this:

"??<\0?\0x\0m\0l\0 \0v\0e\0r\0s\0i\0o\0n\0=\0\"\01\0.\00\0\"\0 \0e\0n\0c\0o\0d\0i\0n\0g\0=\0\"\0I\0S\0O\0-

Instead of

<?xml version="1.0".. etc


// Upload code: store the uploaded bytes in the hidden field as a hex string
if (fileUpload.PostedFile.ContentType == "text/xml")
{
    Stream inputstream = fileUpload.PostedFile.InputStream;

    byte[] streamAsBytes = ConvertStreamToByteArray(inputstream);

    // BitConverter.ToString yields hyphen-separated hex pairs, e.g. "FF-FE-3C-00"
    string stringToSend = BitConverter.ToString(streamAsBytes);

    xmlstream.Value = stringToSend;

    sendXML.Visible = true;
    infoLabel.Text = "<b>Selected XML: </b>" + fileUpload.PostedFile.FileName;
}


// Generic handler: rebuild the bytes from the posted hex string
if (HttpContext.Current.Request.Form["xmldata"] != null)
{
    HttpContext.Current.Response.ContentType = "text/xml";
    HttpContext.Current.Response.ContentEncoding = Encoding.UTF8;

    string xmlstring = HttpContext.Current.Request.Form["xmldata"];

    // Split on the hyphens and parse each hex pair back into a byte
    byte[] data = xmlstring.Split('-').Select(b => Convert.ToByte(b, 16)).ToArray();

    string complete = System.Text.ASCIIEncoding.ASCII.GetString(data);

    XmlDocument doc = new XmlDocument();
}




It's not at all clear that you really should do this. XML files can declare their own encoding, and it looks like yours is declaring an encoding starting with "ISO" (that's where the data you've given us stops). That's probably not UTF-8.

Basically, I don't think you should be treating the data as text in handler.ashx. Just get XmlDocument to parse it from a stream. It's not really clear exactly how your upload code is sending the data, but you should try to mess with it as little as possible.

It's possible that your current code would actually work fine if you just changed this:

string complete = System.Text.ASCIIEncoding.ASCII.GetString(data);
XmlDocument doc = new XmlDocument();

to this:

XmlDocument doc = new XmlDocument();
doc.Load(new MemoryStream(data));
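
To illustrate why loading from a stream can help, here is a minimal, self-contained sketch. It assumes the bytes are UTF-16 with a byte order mark and that the XML declaration agrees with them; it does not show what happens with a file whose declaration names a different encoding, as yours appears to. The class and method names (XmlStreamDemo, BuildUtf16Sample, LoadFromBytes) are made up for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlStreamDemo
{
    // Hypothetical sample data: a small UTF-16 LE document with a BOM,
    // standing in for the uploaded file.
    public static byte[] BuildUtf16Sample()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"utf-16\"?><root><item>hello</item></root>";
        byte[] bom = Encoding.Unicode.GetPreamble();   // FF FE for UTF-16 LE
        byte[] body = Encoding.Unicode.GetBytes(xml);
        byte[] data = new byte[bom.Length + body.Length];
        Buffer.BlockCopy(bom, 0, data, 0, bom.Length);
        Buffer.BlockCopy(body, 0, data, bom.Length, body.Length);
        return data;
    }

    // Hand the raw bytes to the parser instead of decoding them to a string first,
    // so the XML reader works out the encoding itself.
    public static XmlDocument LoadFromBytes(byte[] data)
    {
        var doc = new XmlDocument();
        using (var stream = new MemoryStream(data))
        {
            doc.Load(stream);
        }
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = LoadFromBytes(BuildUtf16Sample());
        Console.WriteLine(doc.DocumentElement.InnerText);
    }
}
```

Decoding the same bytes with ASCII is what produces the "\0" after every character in the string shown in the question, because each UTF-16 code unit for these characters carries a zero byte.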

However, the hex part is pretty ugly. If you really need to represent the binary data as text, I'd strongly recommend using Base64 instead of hex:

// Upload side: bytes to text
string text = Convert.ToBase64String(binary);
// Handler side: text back to bytes
byte[] binary = Convert.FromBase64String(text);

... there's no need to convert each byte separately and split the string on hyphens etc.
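
As a rough sketch of how the two halves could fit together, assuming the hidden field simply carries the Base64 text from the upload page to the handler. The names Base64XmlDemo, Encode and Decode are hypothetical, and the sample document is invented for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class Base64XmlDemo
{
    // Upload side (assumed): turn the raw file bytes into text for the hidden field.
    public static string Encode(byte[] fileBytes)
    {
        return Convert.ToBase64String(fileBytes);
    }

    // Handler side (assumed): recover the original bytes and parse them as a stream.
    public static XmlDocument Decode(string fieldValue)
    {
        byte[] data = Convert.FromBase64String(fieldValue);
        var doc = new XmlDocument();
        using (var stream = new MemoryStream(data))
        {
            doc.Load(stream);
        }
        return doc;
    }

    public static void Main()
    {
        // Stand-in for the uploaded file's bytes.
        byte[] original = Encoding.UTF8.GetBytes(
            "<?xml version=\"1.0\" encoding=\"utf-8\"?><root>ok</root>");

        string fieldValue = Encode(original);
        XmlDocument doc = Decode(fieldValue);

        Console.WriteLine(doc.DocumentElement.Name);
    }
}
```

Base64 is also more compact: BitConverter.ToString needs roughly three characters per byte (two hex digits plus a hyphen), while Base64 needs about four characters for every three bytes.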