smeeb smeeb - 1 year ago 128
Groovy Question

Groovy regex PatternSyntaxException when parsing GString-style variables

Groovy here. I'm being given a String with GString-style variables in it like:

String target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'


Keep in mind, this is not intended to be used as an actual GString!!! That is, I'm not going to have 3 string variables (animal, role and bodyPart, respectively) that Groovy will be resolving at runtime. Instead, I'm looking to do 2 distinct things to these "target" strings:


  • I want to be able to find all instances of these variable refs ("${*}") in the target string, and replace each one with a ?; and

  • I also need to find all instances of these variable refs and obtain a list (allowing dupes) of their names (which, in the above example, would be [animal,role,bodyPart])



My best attempt thus far:

class TargetStringUtils {
    private static final String VARIABLE_PATTERN = "\${*}"

    // Example input: 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
    // Example desired output: 'How now brown ?. The ? has oddly-shaped ?.'
    static String replaceVarsWithQuestionMarks(String target) {
        target.replaceAll(VARIABLE_PATTERN, '?')
    }

    // Example input: 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
    // Example desired output: [animal,role,bodyPart] (list of strings)
    static List<String> collectVariableRefs(String target) {
        target.findAll(VARIABLE_PATTERN)
    }
}


...produces a PatternSyntaxException any time I go to run either method:

Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition near index 0
${*}
^


Any ideas where I'm going awry?

Answer Source

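The issue is that you have not escaped the pattern properly: { is a regex quantifier metacharacter, which is why the engine reports an illegal repetition. In addition, findAll with only a pattern collects the whole matches, while you need to capture the subpattern inside the {}.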
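To make the escaping problem concrete, here is a minimal sketch of what the original constant holds at runtime. The variable names (broken, escaped, failed) are only illustrative, and the exact exception message may differ between Java versions.

```groovy
import java.util.regex.Pattern
import java.util.regex.PatternSyntaxException

// In a double-quoted Groovy string, \$ only stops GString interpolation.
// Groovy consumes the backslash, so it never reaches the regex engine.
String broken = "\${*}"
assert broken == '${*}'

// The regex engine therefore sees an unescaped { and rejects the pattern.
boolean failed = false
try {
    Pattern.compile(broken)
} catch (PatternSyntaxException e) {
    failed = true
    println e.description
}
assert failed

// Escaped for the regex engine, the same text compiles and matches literally.
String escaped = /\$\{\*\}/
assert '${*}' ==~ escaped
```

Note that even the escaped version only matches the literal text ${*}, because * is not a wildcard in regex syntax. Matching arbitrary names needs a character class or similar.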

Use

def target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
println target.replaceAll(/\$\{([^{}]*)\}/, '?') // => How now brown ?. The ? has oddly-shaped ?.

def lst = new ArrayList<>();
def m = target =~ /\$\{([^{}]*)\}/
(0..<m.count).each { lst.add(m[it][1]) }
println lst   // => [animal, role, bodyPart]


Inside a /\$\{([^{}]*)\}/ slashy string, you can use single backslashes to escape the special regex metacharacters, and the whole regex pattern looks cleaner.

  • \$ - will match a literal $
  • \{ - will match a literal {
  • ([^{}]*) - Group 1 capturing any characters other than { and }, 0 or more times
  • \} - a literal }.
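
Putting the pieces above together, the original utility class could be adapted roughly as follows. This is a sketch rather than the only way to do it: it assumes the closure form of findAll, which receives the full match followed by each capture group, as an alternative to looping over the matcher by index.

```groovy
class TargetStringUtils {
    // Slashy string: the backslashes go straight to the regex engine.
    // Group 1 captures the variable name between the braces.
    private static final String VARIABLE_PATTERN = /\$\{([^{}]*)\}/

    static String replaceVarsWithQuestionMarks(String target) {
        target.replaceAll(VARIABLE_PATTERN, '?')
    }

    static List<String> collectVariableRefs(String target) {
        // With a closure, findAll passes the full match and then each group;
        // returning the group yields the names instead of the whole ${...} text.
        target.findAll(VARIABLE_PATTERN) { full, name -> name }
    }
}

def target = 'How now brown ${animal}. The ${role} has oddly-shaped ${bodyPart}.'
assert TargetStringUtils.replaceVarsWithQuestionMarks(target) == 'How now brown ?. The ? has oddly-shaped ?.'
assert TargetStringUtils.collectVariableRefs(target) == ['animal', 'role', 'bodyPart']
```

Duplicates are kept, because every match contributes one entry to the list.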