Programmer - 3 months ago
C# Question

Split string in square brackets from Google translator

I am receiving data from a web service and need help splitting it.

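void translateText(string text, string fromLanguage, string toLanguage)
{
    // EscapeDataString also escapes reserved characters such as '&' and '?',
    // which EscapeUriString would leave in the query value unescaped.
    string url = "https://translate.googleapis.com/translate_a/single?client=gtx&sl=" + fromLanguage + "&tl=" + toLanguage + "&dt=t&q=" + Uri.EscapeDataString(text);
    StartCoroutine(startTranslator(url));
}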


I then call it with
translateText("Hello, This is a test!", "en", "fr");
which translates the English sentence to French.

The received data looks like this:

[[["Bonjour, Ceci est un test!","Hello, This is a test!",,,0]],,"en"]


I want to split it like this:


  • Bonjour, Ceci est un test!

  • Hello, This is a test!

  • 0

  • en



and put them into an array.

I currently use this:

char[] delims = { '[', '\"', ']', ',' };
string[] arr = result.Split(delims, StringSplitOptions.RemoveEmptyEntries);


This works as long as the received text contains no comma. If it does, the text is split at that comma too and the resulting values are wrong. What is the best way to split this?
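To make the failure concrete, here is a minimal sketch that runs the split above on the sample response. The class name SplitDemo and the helper NaiveSplit are made up for illustration; only the delimiters and the Split call come from the snippet above.

```csharp
using System;

public static class SplitDemo
{
    // The delimiter-based split from the question, wrapped in a helper
    // (hypothetical name, used only for this illustration).
    public static string[] NaiveSplit(string result)
    {
        char[] delims = { '[', '"', ']', ',' };
        return result.Split(delims, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string result = "[[[\"Bonjour, Ceci est un test!\",\"Hello, This is a test!\",,,0]],,\"en\"]";
        string[] arr = NaiveSplit(result);

        // Six pieces instead of the expected four:
        // each sentence is cut in two at its comma.
        Console.WriteLine(arr.Length);
        foreach (string part in arr)
        {
            Console.WriteLine("<" + part + ">");
        }
    }
}
```

The comma inside each sentence is treated like the commas between fields, so "Bonjour" and " Ceci est un test!" come back as separate entries.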

Answer

You could code up a simple parser yourself. Here's one I threw together (could use some cleaning up, but demonstrates the idea):

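// Requires: using System.Collections.Generic; using System.Text;
private static IEnumerable<string> Parse(string input) {
    const string seps = ",[]";
    bool inString = false;  // currently inside a double-quoted string
    bool escaped = false;   // previous character was a backslash inside a string
    var current = new StringBuilder();
    foreach (var chr in input) {
        if (inString) {
            if (escaped) {
                // Keep the escaped character itself (so \" becomes ").
                // Sequences such as \n or \uXXXX are not decoded.
                current.Append(chr);
                escaped = false;
            } else if (chr == '\\') {
                escaped = true;
            } else if (chr == '"') {
                // Closing quote: emit the string, even if it is empty.
                yield return current.ToString();
                current.Clear();
                inString = false;
            } else {
                current.Append(chr);
            }
        } else if (chr == '"') {
            // Opening quote: discard any partial bare token and start a string.
            current.Clear();
            inString = true;
        } else if (seps.IndexOf(chr) >= 0) {
            // A separator ends a bare token such as 0.
            if (current.Length > 0) {
                yield return current.ToString();
                current.Clear();
            }
        } else {
            // Part of a bare (unquoted) token.
            current.Append(chr);
        }
    }
    // Emit a bare token that runs to the end of the input.
    if (!inString && current.Length > 0) {
        yield return current.ToString();
    }
}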

Here's a jsfiddle demo.
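
As a rough alternative to the hand-written parser, and not what the code above does, a regular expression can pull out the same tokens. This is only a sketch: the class name RegexTokens and the method Extract are made up for illustration, and it assumes the response always has the shape shown in the question (quoted strings and bare values separated by commas and square brackets).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexTokens
{
    // Either a double-quoted string (group 1, backslash escapes allowed)
    // or a run of characters that are not separators or quotes (group 2).
    private static readonly Regex Token =
        new Regex(@"""((?:[^""\\]|\\.)*)""|([^,\[\]""]+)");

    // Hypothetical helper: returns every token found in the input, in order.
    public static string[] Extract(string input)
    {
        var items = new List<string>();
        foreach (Match m in Token.Matches(input))
        {
            // Quoted strings are returned without their quotes; escape
            // sequences inside them are left exactly as they appear.
            items.Add(m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value);
        }
        return items.ToArray();
    }

    public static void Main()
    {
        string result = "[[[\"Bonjour, Ceci est un test!\",\"Hello, This is a test!\",,,0]],,\"en\"]";
        foreach (string item in Extract(result))
        {
            Console.WriteLine(item);
        }
    }
}
```

For the sample response this prints the four values from the question, one per line: the French sentence, the English sentence, 0 and en. Because a quoted string is matched as a whole, commas inside it no longer split the text.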