Betamoo Betamoo - 5 months ago 19
Java Question

Cannot match string using regex

I am working on a regex and I wonder why this pattern

"(?<=(.*?id(( *)=)\\s[\"\']))g"


does not match the string

<input id = "g" />


in Java?

Answer

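Not only does Java not allow unbounded lookbehind, it's supposed to throw an exception if you try. The .*? and the ( *) inside your (?<=...) have no maximum length, so Pattern.compile should reject the pattern with a PatternSyntaxException. The fact that you're not seeing that exception is itself a bug.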
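
For illustration only, here is a sketch of what a lookbehind that Java does accept could look like. It assumes there are never more than 10 whitespace characters on either side of the equals sign, so every quantifier inside the lookbehind has a finite maximum length. The class name, method name, and the limit of 10 are my own choices, not anything from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundedLookbehind {
    // Assumption: at most 10 whitespace characters around '='.
    // Using {0,10} instead of * gives the lookbehind an obvious maximum length.
    static final Pattern P =
        Pattern.compile("(?<=\\bid\\s{0,10}=\\s{0,10}[\"'])g");

    // Returns the index where "g" was matched, or -1 if there is no match.
    static int indexOfValue(String source) {
        Matcher m = P.matcher(source);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // The match is only the "g"; the lookbehind text is not consumed.
        System.out.println(indexOfValue("<input id = \"g\" />")); // prints 13
    }
}
```

Note that the match itself is just the single character g; the lookbehind only checks what precedes it.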

You shouldn't be using lookbehind for that anyway. If you want to match the value of a certain attribute, the easiest, least troublesome approach is to match the whole attribute and use a capturing group to extract the value. For example:

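import java.util.regex.Matcher;
import java.util.regex.Pattern;

String source = "<input id = \"g\" />";
// Match the whole attribute; group 1 captures the text between the double quotes.
Pattern p = Pattern.compile("\\bid\\s*=\\s*\"([^\"]*)\"");
Matcher m = p.matcher(source);
if (m.find())
{
  // m.start() is where the attribute begins, not where the value begins.
  System.out.printf("Found 'id' attribute '%s' at position %d%n",
                    m.group(1), m.start());
}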

Output:

Found 'id' attribute 'g' at position 7
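
Your original pattern accepted either a single or a double quote. If you need that, one possible variation is to capture the opening quote and require the same character at the end with a backreference. This is only a sketch: the class and method names are made up for the example, and it assumes the value never contains the quote character that delimits it.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdAttribute {
    // Group 1 captures the opening quote, group 2 the value.
    // \1 requires the closing quote to be the same character as the opening one.
    private static final Pattern ID =
        Pattern.compile("\\bid\\s*=\\s*([\"'])(.*?)\\1");

    // Returns the value of the first id attribute found, if any.
    static Optional<String> idOf(String tag) {
        Matcher m = ID.matcher(tag);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(idOf("<input id = \"g\" />")); // prints Optional[g]
        System.out.println(idOf("<input id='g' />"));     // prints Optional[g]
    }
}
```

Like the pattern above, this still treats the markup as plain text, so \bid will also match inside a name such as data-id.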

Do yourself a favor and forget about lookbehinds for a while. They're tricky even when they're not buggy, and they're really not as useful as you might expect.