Dave - 1 month ago
JSON Question

Losing a non-printable ASCII character (group separator) when deserializing with JsonConvert.DeserializeObject (Newtonsoft.Json)

I have a frustrating problem that I'm unable to solve. I am using Newtonsoft.Json to deserialize some .json data. I can plainly see in the raw JSON string that there are character sequences:

{ '\\', 'u', '0', '0', '1', 'd' }


This represents the "group separator" character (ASCII 29, or 0x1D). However, when I deserialize using JsonConvert.DeserializeObject<>, these character sequences do not end up in the objects as written. What is put in instead is the single character 0x1D. That is, if you look at the character array you see:

29 '\u001d'


The 6 separate characters have been replaced by 1.

Does anyone know why this might happen? I have included a test program below. Unfortunately, it behaves exactly how I would expect: the 6 characters show up in the object's Description field. Clearly, I've missed something and have not captured the actual problem. I realize one is supposed to try as hard as possible to come up with a small program that duplicates the problem seen in the larger body of code, but I'm coming up blank. So I'm asking for advice on what to look for and how this could possibly happen. What would do this replacement during deserialization? The actual objects are more complicated than my Tanks example, but they share the idea of an IEnumerable property, as in Tanks.

Many thanks,
Dave

using System;
using System.Collections.Generic;
using Newtonsoft.Json;

class Program
{
    // Writes a label, then every character of the text separated by spaces.
    static void PrintChars(string label, string text)
    {
        Console.WriteLine(label);
        foreach (char c in text)
        {
            Console.Write(c + " ");
        }
        Console.WriteLine();
    }

    static void Main(string[] args)
    {
        JsonSerializerSettings settings = new JsonSerializerSettings { ReferenceLoopHandling = ReferenceLoopHandling.Ignore };

        // Six literal characters (\ u 0 0 1 d) between "12" and "34",
        // not the single character U+001D.
        char[] array1 = new char[] { '1', '2', '\\', 'u', '0', '0', '1', 'd', '3', '4' };
        Tank tank1 = new Tank();
        tank1.Description = new string(array1);
        tank1.Id = 1;
        PrintChars("Original string (showing hidden characters)", tank1.Description);

        // Round-trip a single object.
        string conversion1 = JsonConvert.SerializeObject(tank1);
        Tank deserializedTank1 = JsonConvert.DeserializeObject<Tank>(conversion1, settings);
        PrintChars("Deserialized string (showing hidden characters)", deserializedTank1.Description);

        Tank tank2 = new Tank() { Id = 2 };
        tank2.Description = new string(array1);
        Tank tank3 = new Tank() { Id = 3 };
        tank3.Description = new string(array1);

        // Round-trip a container that exposes the objects as an IEnumerable.
        Tanks tanks = new Tanks();
        tanks.Group = new[] { tank1, tank2, tank3 };
        string tanksSerializedString = JsonConvert.SerializeObject(tanks, Formatting.Indented, settings);
        Tanks deserializedTanks = JsonConvert.DeserializeObject<Tanks>(tanksSerializedString, settings);

        Console.WriteLine("Deserialized Tanks");
        foreach (Tank tank in deserializedTanks.Group)
        {
            PrintChars("Deserialized string (showing hidden characters)", tank.Description);
        }
    }
}

interface ITank
{
    int Id { get; set; }
    string Description { get; set; }
}

public class Tank : ITank
{
    public Tank() { }
    public string Description { get; set; }
    public int Id { get; set; }
}

public class Tanks
{
    public Tanks() { }
    public IEnumerable<Tank> Group { get; set; }
}

dbc
Answer

The serializer is behaving as expected. According to the JSON standard, a sequence of characters in the pattern \u followed by four hex digits represents a single Unicode character (one UTF-16 code unit). For \u001d, that is the group separator character:

JSON standard for strings

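If you don't want that, the \ character itself has to be escaped in the JSON string: "\\u001d". That is also why the test program does not reproduce the problem: JsonConvert.SerializeObject escapes the backslash in Description, so the JSON it produces contains \\u001d, which deserializes back to the six separate characters.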
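
To see the difference concretely, here is a minimal sketch that deserializes two JSON documents, one containing \u001d and one containing \\u001d. The Tank class mirrors the one in the question; the EscapeDemo class and the variable names are made up for illustration, and the sketch assumes the Newtonsoft.Json package is referenced.

```csharp
using System;
using Newtonsoft.Json;

public class Tank
{
    public string Description { get; set; }
    public int Id { get; set; }
}

public static class EscapeDemo
{
    public static void Main()
    {
        // Verbatim strings (@"...") leave backslashes untouched, so these are
        // the exact characters a JSON file on disk would contain.
        string rawEscape = @"{""Description"":""12\u001d34"",""Id"":1}";
        string escapedBackslash = @"{""Description"":""12\\u001d34"",""Id"":1}";

        Tank a = JsonConvert.DeserializeObject<Tank>(rawEscape);
        Tank b = JsonConvert.DeserializeObject<Tank>(escapedBackslash);

        // \u001d in JSON is an escape sequence: it becomes one character.
        Console.WriteLine(a.Description.Length);   // 5
        Console.WriteLine((int)a.Description[2]);  // 29 (group separator)

        // \\u001d in JSON is an escaped backslash followed by "u001d":
        // it becomes the six literal characters \ u 0 0 1 d.
        Console.WriteLine(b.Description.Length);   // 10
        Console.WriteLine(b.Description[2]);       // \

        // Serializing the ten-character string escapes the backslash again,
        // which is why a serialize/deserialize round trip preserves it.
        Console.WriteLine(JsonConvert.SerializeObject(b));
        // {"Description":"12\\u001d34","Id":1}
    }
}
```

So if the deserialized object holds the single character 29, the JSON that was actually parsed contained \u001d with one backslash. If the six characters must survive, whatever produces that JSON needs to write the backslash as \\.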