Squidward Squidward - 4 months ago 18
C# Question

Serializing heavily linked data in .NET (customizing JSON.NET references)

I want to avoid reinventing the wheel when serializing data. I know some ways to serialize objects that are linked to each other, but they range from writing some code to writing a lot of serialization code, and I'd like to avoid that. There must be some generic solution.

Let's say I have a structure like this:

Person
    bro = new Person { name = "bro", pos = new Pos { x = 1, y = 5 } },
    sis = new Person { name = "sis", pos = new Pos { x = 2, y = 6 } },
    mom = new Person { name = "mom", pos = new Pos { x = 3, y = 7 },
        children = new List<Person> { bro, sis }
    },
    dad = new Person { name = "dad", pos = new Pos { x = 4, y = 8 },
        children = new List<Person> { bro, sis }, mate = mom
    };
mom.mate = dad;
Family family = new Family { persons = new List<Person> { mom, dad, bro, sis } };


I want to serialize data to something like this:

family: {
    persons: [
        { name: "bro", pos: { x: 1, y: 5 } },
        { name: "sis", pos: { x: 2, y: 6 } },
        { name: "mom", pos: { x: 3, y: 7 }, mate: "dad", children: [ "bro", "sis" ] },
        { name: "dad", pos: { x: 4, y: 8 }, mate: "mom", children: [ "bro", "sis" ] },
    ]
}


Here, links are serialized as just names, with the assumption that names are unique. Links can also be "family.persons.0" or generated unique IDs or whatever.

Requirements:


  1. Format must be human-readable and preferably human-writable too. So, in order of preference: JSON, YAML*, XML, custom. No binary formats.

  2. Serialization must support all the good stuff .NET offers. Generics are a must, including types like IEnumerable<>, IDictionary<> etc. Dynamic types / untyped objects are desirable.

  3. Format must not be executable. No Lua, Python etc. scripts and things like that.

  4. If unique IDs are generated, they must be stable (persist through serialization-deserialization), as files will be put into a version control system.



* Heard about YAML, but sadly, it seems to be pretty much dead.

Answer

I solved the problem using JSON.NET (fantastic library!). Objects are now serialized and referenced exactly where I want them to be, and the output is free of the numerous "$id" and "$ref" fields. In my solution, the first property of an object is used as its identifier.
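For comparison, here is a minimal sketch of the built-in reference tracking that produces those "$id" and "$ref" fields. The Node class and its field names are made up for illustration; PreserveReferencesHandling is the JSON.NET setting that switches the behavior on.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical two-field type, used only to form a small reference cycle.
class Node
{
    public string Name;
    public Node Mate;
}

static class Program
{
    static void Main()
    {
        var mom = new Node { Name = "mom" };
        var dad = new Node { Name = "dad", Mate = mom };
        mom.Mate = dad; // mom -> dad -> mom

        var settings = new JsonSerializerSettings
        {
            // Ask JSON.NET to tag objects with "$id" and emit "$ref" for repeats.
            PreserveReferencesHandling = PreserveReferencesHandling.Objects,
        };

        string json = JsonConvert.SerializeObject(mom, Formatting.Indented, settings);
        Console.WriteLine(json);

        // Deserialize with the same settings so "$ref" entries are resolved.
        var copy = JsonConvert.DeserializeObject<Node>(json, settings);
        Console.WriteLine(ReferenceEquals(copy, copy.Mate.Mate));
    }
}
```

The IDs in this mode are generated by the serializer and placed wherever an object is first encountered, which is what the custom converters below are meant to avoid.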

I've created two JsonConverters (one for references to objects and one for the referenced objects themselves):

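using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.Serialization;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

interface IJsonLinkable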
{
    string Id { get; }
}

class JsonRefConverter : JsonConverter
{
    public override void WriteJson (JsonWriter writer, object value, JsonSerializer serializer)
    {
        writer.WriteValue(((IJsonLinkable)value).Id);
    }

    public override object ReadJson (JsonReader reader, Type type, object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType != JsonToken.String)
            throw new Exception("Ref value must be a string.");
        return JsonLinkedContext.GetLinkedValue(serializer, type, reader.Value.ToString());
    }

    public override bool CanConvert (Type type)
    {
        // True for any type that implements IJsonLinkable.
        return typeof(IJsonLinkable).IsAssignableFrom(type);
    }
}

class JsonRefedConverter : JsonConverter
{
    public override void WriteJson (JsonWriter writer, object value, JsonSerializer serializer)
    {
        serializer.Serialize(writer, value);
    }

    public override object ReadJson (JsonReader reader, Type type, object existingValue, JsonSerializer serializer)
    {
        var jo = JObject.Load(reader);
        var value = JsonLinkedContext.GetLinkedValue(serializer, type, (string)jo.PropertyValues().First());
        serializer.Populate(jo.CreateReader(), value);
        return value;
    }

    public override bool CanConvert (Type type)
    {
        // True for any type that implements IJsonLinkable.
        return typeof(IJsonLinkable).IsAssignableFrom(type);
    }
}

and a context to hold the reference data (with one dictionary per type, so IDs need to be unique only among objects of the same type):

class JsonLinkedContext
{
    private readonly IDictionary<Type, IDictionary<string, object>> links = new Dictionary<Type, IDictionary<string, object>>();

    public static object GetLinkedValue (JsonSerializer serializer, Type type, string reference)
    {
        var context = (JsonLinkedContext)serializer.Context.Context;
        IDictionary<string, object> links;
        if (!context.links.TryGetValue(type, out links))
            context.links[type] = links = new Dictionary<string, object>();
        object value;
        if (!links.TryGetValue(reference, out value))
            links[reference] = value = FormatterServices.GetUninitializedObject(type);
        return value;
    }
}
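The get-or-create step is what lets a reference appear before the object it points to has been read. Below is a stripped-down sketch of that idea without JSON.NET. The Person class is a two-field stand-in, and GetOrCreate is a hypothetical helper name, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

// Stand-in type for the sketch; only the fields needed to show a forward reference.
class Person
{
    public string name;
    public Person mate;
}

static class Program
{
    // One table of id -> instance (the context above keeps one such table per type).
    static readonly Dictionary<string, object> links = new Dictionary<string, object>();

    // Returns the instance registered under id, creating an empty placeholder
    // (no constructor is run) the first time the id is seen.
    static object GetOrCreate(Type type, string id)
    {
        object value;
        if (!links.TryGetValue(id, out value))
            links[id] = value = FormatterServices.GetUninitializedObject(type);
        return value;
    }

    static void Main()
    {
        // "mom" is read first and refers to "dad", whose data has not been read yet.
        var mom = (Person)GetOrCreate(typeof(Person), "mom");
        mom.name = "mom";
        mom.mate = (Person)GetOrCreate(typeof(Person), "dad"); // placeholder

        // Later, "dad" itself is read: the same placeholder is returned and filled in.
        var dad = (Person)GetOrCreate(typeof(Person), "dad");
        dad.name = "dad";

        Console.WriteLine(ReferenceEquals(mom.mate, dad));
        Console.WriteLine(mom.mate.name);
    }
}
```

Because the placeholder is created without running a constructor, field initializers on the real type do not run either. Any field that is absent from the JSON stays at its default value.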

Some attributes on the properties are necessary:

[JsonObject(MemberSerialization.OptIn)]
class Family
{
    [JsonProperty(ItemConverterType = typeof(JsonRefedConverter))]
    public List<Person> persons;
}

[JsonObject(MemberSerialization.OptIn)]
class Person : IJsonLinkable
{
    [JsonProperty]
    public string name;
    [JsonProperty]
    public Pos pos;
    [JsonProperty, JsonConverter(typeof(JsonRefConverter))]
    public Person mate;
    [JsonProperty(ItemConverterType = typeof(JsonRefConverter))]
    public List<Person> children;

    string IJsonLinkable.Id { get { return name; } }
}

[JsonObject(MemberSerialization.OptIn)]
class Pos
{
    [JsonProperty]
    public int x;
    [JsonProperty]
    public int y;
}

So, when I serialize and deserialize using this code:

JsonConvert.SerializeObject(family, Formatting.Indented, new JsonSerializerSettings {
    NullValueHandling = NullValueHandling.Ignore,
    Context = new StreamingContext(StreamingContextStates.All, new JsonLinkedContext()),
});

JsonConvert.DeserializeObject<Family>(File.ReadAllText(@"..\..\Data\Family.json"), new JsonSerializerSettings {
    Context = new StreamingContext(StreamingContextStates.All, new JsonLinkedContext()),
});

I get this neat JSON:

{
  "persons": [
    {
      "name": "mom",
      "pos": {
        "x": 3,
        "y": 7
      },
      "mate": "dad",
      "children": [
        "bro",
        "sis"
      ]
    },
    {
      "name": "dad",
      "pos": {
        "x": 4,
        "y": 8
      },
      "mate": "mom",
      "children": [
        "bro",
        "sis"
      ]
    },
    {
      "name": "bro",
      "pos": {
        "x": 1,
        "y": 5
      }
    },
    {
      "name": "sis",
      "pos": {
        "x": 2,
        "y": 6
      }
    }
  ]
}

What I don't like about my solution is that I have to use JObject, even though technically it's unnecessary. It probably creates quite a few intermediate objects, so loading will be slower. But this looks like the most widely used approach for customizing converters of objects, and the methods that could be used to avoid it are private anyway.