Andre Borges - 4 months ago
JSON Question

Convert key=value string to JSON

I have a string of keys and values in the following format:

KEY1=someValue, KEY2="Hello, World!", SOME.OTHER.KEY=Hello World!, KEY4="Hello, ""World""!"


How can I convert it into a JSON string using C#? This can probably be done using a Regex, but I can't come up with the right pattern. Nor was I able to figure out how to do it using libraries like Newtonsoft.Json.

The JSON I want to produce is the following:

{
"KEY1":"someValue",
"KEY2":"Hello, World!",
"SOME.OTHER.KEY":"Hello World!",
"KEY4":"Hello, \"World\"!"
}

Answer

Well, with those nested quotes from the updated question, things get much trickier. I can't see any viable way of extracting the values with an arbitrary level of nested quotes. (This is true for the regex approach; it's still possible to scan the string manually and count the number of consecutive quotes with respect to the nesting level.)
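For comparison, here is a minimal sketch of the manual scan. It does not track nesting levels as described above; it swaps in the simpler CSV-style convention, which is an assumption about the input: inside a quoted value a doubled quote ("") stands for one literal quote, and a single quote closes the value. The class and method names (KeyValueScanner, Parse) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class KeyValueScanner
{
    // Hypothetical helper: splits KEY=value pairs separated by commas.
    // Assumption: "" inside a quoted value is an escaped quote (CSV-style).
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        int i = 0;
        while (i < input.Length)
        {
            // Skip separators and whitespace in front of the key.
            while (i < input.Length && (input[i] == ',' || char.IsWhiteSpace(input[i]))) i++;
            if (i >= input.Length) break;

            // The key runs up to the next '='.
            int eq = input.IndexOf('=', i);
            if (eq < 0) throw new FormatException("Expected '=' after position " + i);
            string key = input.Substring(i, eq - i).Trim();
            i = eq + 1;

            var value = new StringBuilder();
            if (i < input.Length && input[i] == '"')
            {
                i++; // consume the opening quote
                bool closed = false;
                while (i < input.Length)
                {
                    if (input[i] == '"')
                    {
                        if (i + 1 < input.Length && input[i + 1] == '"')
                        {
                            value.Append('"'); // doubled quote -> one literal quote
                            i += 2;
                            continue;
                        }
                        i++; // a single quote closes the value
                        closed = true;
                        break;
                    }
                    value.Append(input[i]);
                    i++;
                }
                if (!closed) throw new FormatException("Unterminated quoted value for key " + key);
            }
            else
            {
                // Unquoted value: everything up to the next comma (or end of input).
                int comma = input.IndexOf(',', i);
                if (comma < 0) comma = input.Length;
                value.Append(input.Substring(i, comma - i).Trim());
                i = comma;
            }
            pairs.Add(new KeyValuePair<string, string>(key, value.ToString()));
        }
        return pairs;
    }
}
```

Calling KeyValueScanner.Parse on the string from the question should give four pairs, with KEY4 holding Hello, "World"!. Text that follows a closing quote before the next comma is not handled; a stricter parser would reject it.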

Assuming we limit ourselves to only one level of nested quoted strings, the regex would be:

(?<key>[^=,\s]+)=(?:"(?<value>(?:[^"]|""[^"]*"")*?)"|(?<value>[^,]*))(?:,|$)

Then you can find all matches and reformat the pairs according to JSON rules:

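// Requires the Newtonsoft.Json package for JsonConvert.
using System.Linq;
using System.Text.RegularExpressions;
using Newtonsoft.Json;

// Sample input covering plain, quoted, and doubled-quote values.
var input = @"KEY1=someValue, KEY2=""Hello, World!"",
    SOME.OTHER.KEY=Hello ""World""!,
    KEY4=""Hello, """"World""""!"",
    KEY5=""Hello, """"World""""!"",
    KEY6=""""""Hello"""", """"World""""!""";
// Extract each key/value pair, un-double the quotes, and JSON-escape both parts.
var pairs = Regex.Matches(input, @"(?<key>[^=,\s]+)=(?:""(?<value>(?:[^""]|""""[^""]*"""")*?)""|(?<value>[^,]*))(?:,|$)")
    .Cast<Match>()
    .Select(m =>
        string.Format("  {0}: {1}",
            JsonConvert.ToString(m.Groups["key"].Value),
            JsonConvert.ToString(m.Groups["value"].Value.Replace("\"\"", "\""))));
// Join the formatted pairs into a single JSON object.
var json = "{\n" + string.Join(",\n", pairs) + "\n}";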
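If you would rather not assemble the JSON text by hand, one possible alternative is to collect the extracted pairs into a JObject and let Newtonsoft.Json do the escaping and formatting. This is only a sketch: the class and method names (PairsToJson, Serialize) are hypothetical, and it assumes the pairs have already been extracted and un-doubled.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class PairsToJson
{
    // Hypothetical helper: builds a JSON object from already-extracted pairs.
    public static string Serialize(IEnumerable<KeyValuePair<string, string>> pairs)
    {
        var obj = new JObject();
        foreach (var pair in pairs)
        {
            // JObject keeps properties in insertion order;
            // assigning a repeated key overwrites its earlier value.
            obj[pair.Key] = pair.Value;
        }
        // The serializer escapes embedded quotes and other special characters.
        return obj.ToString(Formatting.Indented);
    }
}
```

The trade-off is that a repeated key silently keeps only its last value, whereas the string-formatting version above emits every pair it finds.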

Regex explanation:

  • (?<key> - start a capture group named key
    • [^=,\s]+ - match any non-empty sequence of characters excluding equals signs, commas, and whitespace
    • ) - end the key group
  • = - match the equals sign literally
  • (?: - start an unnamed group used to group alternatives
    • the first alternative - quoted value:
    • " - the literal opening quote
    • (?<value> - start a capture group named value
      • (?:[^"]|""[^"]*"")* - match any sequence of non-quote characters or quoted strings (please note the quotes are doubled)
      • ? - make the previous match non-greedy
      • ) - end the value group
    • " - the literal closing quote
    • | - the alternatives delimiter
    • the second alternative - unquoted value:
    • (?<value> - start another value capture group - the .NET regex flavour maintains a stack of captures for each named group, so you can access whichever alternative matched simply by name
      • [^,]* - match any sequence not containing commas
      • ) - end the second value group
    • ) - end the unnamed group
  • (?:,|$) - match either comma or end of string (both are expected to finish the value)
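
To see the breakdown above in action, here is a small sketch that applies the same pattern to an input string and reads both groups by name. The class and method names (RegexPairs, Extract) are hypothetical; only System.Text.RegularExpressions is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexPairs
{
    // The pattern from the answer, written as a verbatim string
    // (every literal quote is doubled once more for C#).
    private static readonly Regex PairPattern = new Regex(
        @"(?<key>[^=,\s]+)=(?:""(?<value>(?:[^""]|""""[^""]*"""")*?)""|(?<value>[^,]*))(?:,|$)");

    public static List<KeyValuePair<string, string>> Extract(string input)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match m in PairPattern.Matches(input))
        {
            // Both alternatives capture into the group named "value",
            // so one lookup covers quoted and unquoted values alike.
            string value = m.Groups["value"].Value.Replace("\"\"", "\"");
            result.Add(new KeyValuePair<string, string>(m.Groups["key"].Value, value));
        }
        return result;
    }
}
```

For the string from the question, Extract should return four pairs, with the doubled quotes in KEY4 collapsed to single ones.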