user2437771 - 1 year ago
Java Question

Java Regex building

I need help building a regex for the following input, from which I have to collect the strings that match a particular pattern.

Sample Input String:

hostname ${hostname} !
ip name-server ${ip-name-server}
no ipv6 cef
voice class codec 1
codec preference 1 ${codec-pref-1} codec preference 2 ${codec-pref-2} codec preference 3 ${codec-pref-3} !
session target dns:${session-targ-DNS} dtmf-relay rtp-nte*

The output should be the strings enclosed in the format ${string}, i.e. each such string should be collected and retrieved.

I tried the code below:

public void fetchKeyword(String inputString) {
String inputString1 = inputString.replace("\n", " ");
Pattern p = Pattern.compile("\\${$1} ");
Matcher m = p.matcher(inputString1);
int i=0;

I also tried other patterns, but none gave the expected result. I got exceptions like the one below:

Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition near index 1
at java.util.regex.Pattern.error(Unknown Source)
at java.util.regex.Pattern.closure(Unknown Source)
at java.util.regex.Pattern.sequence(Unknown Source)
at java.util.regex.Pattern.expr(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.util.regex.Pattern.<init>(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at myUtil.ReplaceString.fetchKeyword(
at myUtil.ReplaceString.main(

Can anyone please help with this?

Answer Source

You can use this solution to retrieve the placeholder text:

// test string
String input = "! hostname ${hostname} ! ! ! ip name-server "
            + "${ip-name-server} no ipv6 cef ! ! "
            + "voice class codec 1 codec preference 1 ${codec-pref-1} "
            + "codec preference 2 ${codec-pref-2} codec preference 3 "
            + "${codec-pref-3} ! ! session target "
            + "dns:${session-targ-DNS} dtmf-relay rtp-nte";

// compiling pattern with one group representing the text inside ${}
Pattern p = Pattern.compile("\\$\\{(.+?)\\}");
// initializing matcher
Matcher m = p.matcher(input);
// iterating find
while (m.find()) {
    // back-referencing group 1 each find
    System.out.println(m.group(1));
}

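Since the question asks for the strings to be collected, here is a minimal self-contained sketch of the same approach that returns the captured names as a list. The class name PlaceholderExtractor and the method name extract are made up for illustration; only the pattern comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is illustrative only.
class PlaceholderExtractor {

    // Matches ${...} and captures the text between the braces in group 1.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(.+?)\\}");

    // Returns the captured names in order of appearance.
    static List<String> extract(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(input);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Shortened version of the sample input, line breaks kept on purpose.
        String input = "hostname ${hostname} !\n"
                + "ip name-server ${ip-name-server}\n"
                + "session target dns:${session-targ-DNS} dtmf-relay rtp-nte";
        System.out.println(extract(input));
        // prints [hostname, ip-name-server, session-targ-DNS]
    }
}
```

Compiling the pattern once as a constant is a design choice for repeated calls, not something the original code requires.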
  • The $1 idiom you used belongs in replacement strings (e.g. String#replaceAll), where it back-references an indexed group of the match. It has no such meaning inside the pattern itself.
  • Groups are declared in your pattern with parentheses: (X). Since Java 7 they can also be given a name: (?<name>X)
  • The index of a group is defined by the occurrence of a grouping idiom within the pattern, not by iteration of matches as you seem to assume
  • See the documentation for java.util.regex.Pattern
  • The pattern I'm showing as an example escapes the $, { and } characters with a backslash, and each backslash is doubled because it sits inside a Java string literal
  • Also worth noting, it uses a reluctant quantifier (+?) in order to match as little as possible, stopping at the first occurrence of the next known character: }
  • Finally, as stated above, group #1 is defined within the parentheses, and represents any characters (until the closing })
  • Line breaks in your input text will not impact negatively on this pattern's results as long as no line break occurs within a ${something} idiom
  • If such a case occurred, you would either need to strip the line breaks from the text before parsing, or compile your pattern with the Pattern.DOTALL flag and clean up the line breaks in the matches afterwards (the latter doesn't look like a great solution though)
  • As Thomas mentions, this pattern assumes your expression between {} will never be empty. If you do have an empty expression, it will fail by parsing everything from the start of the empty expression to the end of the next, non-empty one if applicable. So, either you are guaranteed you don't have empty expressions or you want to use .*? instead of .+? (see also Thomas' answer).
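
A few of the points above can be checked with a short sketch. The class and method names (PlaceholderNotes, unwrap, firstName, firstMatch) and the test strings are invented for illustration; the calls themselves are standard java.util.regex and String API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; all names here are illustrative only.
class PlaceholderNotes {

    // $1 in a replacement string refers to group 1 of each match.
    static String unwrap(String input) {
        return input.replaceAll("\\$\\{(.+?)\\}", "<$1>");
    }

    // Same pattern with a named group (Java 7+), read back by name.
    static String firstName(String input) {
        Matcher m = Pattern.compile("\\$\\{(?<name>.+?)\\}").matcher(input);
        return m.find() ? m.group("name") : null;
    }

    // Returns the full text of the first match of regex in input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(unwrap("dns:${session-targ-DNS}"));    // prints dns:<session-targ-DNS>
        System.out.println(firstName("hostname ${hostname} !"));  // prints hostname

        // An empty ${} makes .+? run on to the next closing brace.
        String withEmpty = "a ${} b ${x} c";
        System.out.println(firstMatch("\\$\\{(.+?)\\}", withEmpty)); // prints ${} b ${x}
        System.out.println(firstMatch("\\$\\{(.*?)\\}", withEmpty)); // prints ${}
    }
}
```

The last two lines show why .*? is the safer choice when empty expressions can occur in the input.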