smeeb smeeb - 4 months ago 24
Groovy Question

Groovy regex PatternSyntaxException when parsing GString-style variables

Groovy here. I'm being given a String with GString-style variables in it, like:

String target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'


Keep in mind, this is not intended to be used as an actual GString! That is, I'm not going to have 3 string variables (animal, role and bodyPart, respectively) that Groovy will be resolving at runtime. Instead, I'm looking to do 2 distinct things to these "target" strings:


  • I want to find all instances of these variable refs ("${*}") in the target string and replace each one with a ?; and

  • I also need to find all instances of these variable refs and obtain a list (allowing dupes) of their names (which, in the above example, would be [animal,role,bodyPart]).



My best attempt thus far:

class TargetStringUtils {
    private static final String VARIABLE_PATTERN = "\${*}"

    // Example input: 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
    // Example desired output: 'How now brown ?. The ? has oddly-shaped ?.'
    static String replaceVarsWithQuestionMarks(String target) {
        target.replaceAll(VARIABLE_PATTERN, '?')
    }

    // Example input: 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
    // Example desired output: [animal,role,bodyPart] } list of strings
    static List<String> collectVariableRefs(String target) {
        target.findAll(VARIABLE_PATTERN)
    }
}


...produces a PatternSyntaxException any time I run either method:

Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition near index 0
${*}
^


Any ideas where I'm going awry?

Answer

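The issue is that the pattern is not escaped properly. In the double-quoted string "\${*}" the backslash only stops Groovy from interpolating, so the regex engine receives ${*}, where the unescaped { is read as the start of a repetition quantifier (and * is not a wildcard in a regex). Also, findAll with just a pattern returns the whole matches (such as ${animal}), while you need the subpattern captured inside the {}.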
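
To see the escaping problem in isolation, here is a minimal sketch, assuming a plain Groovy script (the variable names broken, failed and fixed are only illustrative). It checks what the double-quoted literal from the question actually contains, and that java.util.regex rejects it while the escaped form compiles:

```groovy
import java.util.regex.Pattern
import java.util.regex.PatternSyntaxException

// In a double-quoted string, \$ only prevents GString interpolation,
// so the resulting text has no regex escaping at all.
def broken = "\${*}"
assert broken == '${*}'

// Compiling that text as a regex is what raises the exception.
def failed = false
try {
    Pattern.compile(broken)
} catch (PatternSyntaxException e) {
    failed = true
    println e.message
}
assert failed

// With $, { and } escaped for the regex engine, compilation succeeds.
def fixed = Pattern.compile(/\$\{([^{}]*)\}/)
assert fixed.matcher('${animal}').matches()
```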

Use

def target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
println target.replaceAll(/\$\{([^{}]*)\}/, '?') // => How now brown ?. The ? has oddly-shaped ?.

def lst = new ArrayList<>();
def m = target =~ /\$\{([^{}]*)\}/
(0..<m.count).each { lst.add(m[it][1]) }
println lst   // => [animal, role, bodyPart]


Inside a /\$\{([^{}]*)\}/ slashy string, you can use single backslashes to escape the special regex metacharacters, and the whole regex pattern looks cleaner.

  • \$ - will match a literal $
  • \{ - will match a literal {
  • ([^{}]*) - Group 1 capturing any characters other than { and }, 0 or more times
  • \} - a literal }.
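
Putting this back into the shape of the question's class, one possible sketch follows. It is not the only way to write it: it assumes the pattern is precompiled with the ~ operator, and it uses the GDK findAll overload that takes a closure, which receives the full match followed by each capture group (the closure parameter names fullMatch and name are illustrative).

```groovy
import java.util.regex.Pattern

class TargetStringUtils {
    // Same regex as above, compiled once; group 1 holds the variable name.
    private static final Pattern VARIABLE_PATTERN = ~/\$\{([^{}]*)\}/

    // Replaces every ${...} reference with a question mark.
    static String replaceVarsWithQuestionMarks(String target) {
        target.replaceAll(VARIABLE_PATTERN, '?')
    }

    // Collects the names inside every ${...} reference, duplicates included.
    static List<String> collectVariableRefs(String target) {
        target.findAll(VARIABLE_PATTERN) { fullMatch, name -> name }
    }
}

def target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
assert TargetStringUtils.replaceVarsWithQuestionMarks(target) == 'How now brown ?. The ? has oddly-shaped ?.'
assert TargetStringUtils.collectVariableRefs(target) == ['animal', 'role', 'bodyPart']
```

The closure form avoids the manual index loop over the matcher, at the cost of being slightly less explicit about which group is being read.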