smeeb smeeb - 1 month ago 13
Groovy Question

Recursively removing whitespace from JSON field names in Groovy

I have a Groovy process that is receiving troublesome JSON that has attribute/field names containing whitespaces:

{
"leg bone" : false,
"connected to the" : {
"arm bones " : [
{
" fizz" : "buzz",
"well hello" : "there"
}
]
}
}


Above, fields such as
"leg bone"
and
"well hello"
are causing issues during processing (even though they are technically legal JSON fields). So I want to scan each field (recursively, or in nested fashion) in the incoming JSON and string replace any whitespace with an underscore ("_"). Hence, the above JSON would be converted into:

{
"leg_bone" : false,
"connected_to__the" : {
"arm_bones_" : [
{
"_fizz" : "buzz",
"well_hello" : "there"
}
]
}
}


Typically I use a
JsonSlurper
for parsing JSON strings into maps, but I can't seem to figure out how to get the recursion correct. Here's my best attempt so far:

// In reality 'incomingJson' isn't hardcoded as a string literal, but this helps make my actual use case
// an SSCCE.
class JsonMapExperiments {
static void main(String[] args) {
String incomingJson = """
{
"leg bone" : false,
"connected to the" : {
"arm bones " : [
{
" fizz" : "buzz",
"well hello" : "there"
}
]
}
}
"""
String fixedJson = fixWhitespaces(new JsonSlurper().parseText(incomingJson))
println fixedJson
}

static String fixWhitespaces(def jsonMap) {
def fixedMap = [:]
String regex = ""
jsonMap.each { key, value ->
String fixedKey = key.replaceAll('\\s+', '_')
String fixedValue
if(value in Map) {
fixedValue = fixWhitespaces(value)
} else {
fixedValue = value
}

fixedMap[fixedKey] = fixedValue
}

new JsonBuilder(fixedMap).toString()
}

}


When this runs, the final output is:

{"connected_to_the":"{\"arm_bones_\":\"[{ fizz=buzz, well hello=there}]\"}","leg_bone":"false"}


Which is kinda/sorta close, but not exactly what I need. Any ideas?

Answer

Given your input and this script:

def fixWhitespacesInTree(def tree) {
    switch (tree) {
        case Map:
            return tree.collectEntries { k, v ->
                [(k.replaceAll('\\s+', '_')):fixWhitespacesInTree(v)]
            }
        case Collection:
            return tree.collect { e -> fixWhitespacesInTree(e) }
        default :
            return tree
    }
}

def fixWhitespacesInJson(def jsonString) {
    def tree = new JsonSlurper().parseText(jsonString)
    def fixedTree = fixWhitespacesInTree(tree)
    new JsonBuilder(fixedTree).toString()
}


println fixWhitespacesInJson(json)

I got the following results:

{"connected_to_the":{"arm_bones_":[{"_fizz":"buzz","well_hello":"there"}]},"leg_bone":false}

I would, however, suggest that you change the regular expression \\s+ to just \\s. In the former case. if you have two JSON properties at the same level, one called " fizz" and the other called " fizz" then the translated keys will both be "_fizz" and one will overwrite the other in the final result. In the latter case, the translated keys will be "_fizz" and "__fizz" respectively, and the original content will be preserved.