Bardicer Bardicer - 1 year ago 67
ASP.NET (C#) Question

SQL Server UDF SQLCLR Call Converts Characters Into Question Marks

I've found nothing on Google or SO that quites lines up with my issue.

In SQL Server, I have a scalar function (we'll call it

What this function does is call a utility written in C# that calls an ASP.Net view and returns the HTML as a SqlString.

The function definition in SQL Server is:

AS EXTERNAL NAME [Utils.UserDefinedFunctions].[MySqlStringFunction]

The C# code simplified is:

var request = (HttpWebRequest)WebRequest.Create("");

using (var response = (HttpWebResponse)request.GetResponse())
using (var stream = response.GetResponseStream())
using (var streamReader = new StreamReader(stream, Encoding.UTF8)
return new SqlString(streamReader.ReadToEnd());

When I put the C# code into a console app and run it, I get everything exactly as it should be.

When I access the URL directly in my browser, it displays exactly as it should be.

When I do
SELECT MySqlStringFunction()
however, characters such as ™, §, ¤ display as 2 or 3 question marks each.

It appears that it is somewhere between the
return new SqlString(..)
and the sql function returning the value that something is going wonky. But I'm at a loss as to what it could be.

Answer Source

It seems that the issue was the location of the return. The current code (shown in the Question), is returning in the middle of 3 using blocks, one of which is the UTF-8 stream being read. This probably confused things as SQLCLR is isolated memory from the main SQL Server memory, and usually you can't return via a stream. It is best to close the open stream first and let the using blocks call Dispose(). Hence:

  1. Create a string above the first using (i.e. string _TempReturn = String.Empty;)
  2. Inside the inner-most using, replace return with: _TempReturn = streamReader.ReadToEnd();
  3. Below the last using closing bracket, add: return new SqlString(_TempReturn);

(old answer, will remove in the near future)

The problem is with the encoding difference between the web page and SQL Server. You are using Encoding.UTF8 for the web page (which is quite likely correct given that UTF-8 is the most common encoding for the interwebs), but SQL Server (along with .NET and Windows in general) is UTF-16 Little Endian. This is why you are getting 2 or 3 ?s for each character above Code Point 127: UTF-8 is a multi-byte encoding that uses 1, 2, or 3 bytes per character, whereas UTF-16 is always 2-bytes (well, supplementary characters are 4 bytes, but that is due to being a pair of double-byte values).

You need to convert the encoding to UTF-16 Little Endian before, or as, you pass back the stream. And, UTF-16 Little Endian is the Unicode encoding in .NET, while Big Endian Unicode refers to "UTF-16 Big Endian". So you want to convert to the Unicode encoding.

OR, it could be the reverse: that the web page is NOT UTF-8, in which case you have declared it incorrectly in the StreamReader. If this is true, then you need to specify the correct encoding in the StreamReader constructor.