Nice1 - 5 months ago
Java Question

Regex to detect parentheses containing a weight

I'm having trouble getting my regex right.

I started from this link for detecting a weight:
regex to get weight

This was the expression for finding only the weight, and it worked:

([\d.]+)\s+(lbs?|oz|g|kg)


I wrote a Java method to color the dosage of medications on an HTML page. It should color all the text inside a pair of parentheses if that text contains at least one weight. (e.g. below 18: 5.5mg, over 18: 10mg)
Currently it sometimes colors the right part, but most of the time the regex either captures too much or skips a parenthesis that should be colored.

Current problem: the match also includes every word after the closing parenthesis, up to the end of the line.

Here is my current regex:

(\(.[^\(]*.\d*\,?\d+)\s?+(µg|mg|g|kg).*.\)

Here is the entire method:

private static String addDosageHighlight(String htmltext) {
    String dosage = "";
    Pattern pattern = Pattern.compile("(\\(.[^\\(]*.\\d*\\,?\\d+)\\s?+(µg|mg|g|kg).*.\\)");
    Matcher matcher = pattern.matcher(htmltext);
    // Check all occurrences
    if (matcher.find()) {
        dosage = matcher.group();
        htmltext = htmltext.replace(dosage, "<span style=\"color:magenta;\">" + dosage + "</span>");
    }
    return htmltext;
}


Examples:
medicament b (under 18: 10 g, over 18: 15 g) works well

medicament c (sometimes 15g if needed) can help

(sometimes 10 g)

All of these are detected, but the text after the closing parenthesis is colored too, up to the end of the line. I couldn't come up with an example of a parenthesis that stays uncolored although it should be colored.

Answer

You didn't specify whether you accept decimals, but judging from your regex I assume you allow decimal numbers with a comma as the decimal mark.

So, I believe that this regex will do what you are looking for:

"\\([^\\)]*\\d+(,\\d+)?\\s*(µg|mg|g|kg)[^\\)]*\\)"
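For completeness, here is a minimal sketch of how the method could look with this pattern. It is not your original code: the class name DosageHighlighter is made up for illustration, and I swapped the find()/replace() pair for Matcher.replaceAll with $0, so that every match gets wrapped rather than only the first one. The micro sign is written as \u00B5 only to keep the source independent of the file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, named only for this example.
class DosageHighlighter {

    // Pattern from the answer: one pair of parentheses whose content holds
    // a number (optional comma decimal) followed by a weight unit.
    // \u00B5 is the micro sign.
    private static final Pattern DOSAGE = Pattern.compile(
            "\\([^\\)]*\\d+(,\\d+)?\\s*(\u00B5g|mg|g|kg)[^\\)]*\\)");

    static String addDosageHighlight(String htmltext) {
        Matcher matcher = DOSAGE.matcher(htmltext);
        // $0 refers to the whole match; replaceAll processes every occurrence.
        return matcher.replaceAll("<span style=\"color:magenta;\">$0</span>");
    }

    public static void main(String[] args) {
        System.out.println(addDosageHighlight(
                "medicament b (under 18: 10 g, over 18: 15 g) works well"));
    }
}
```

Because [^\\)]* cannot cross a closing parenthesis, the match ends at the first ) after the weight instead of running on to the last ) on the line, which is what the greedy .* in your pattern does. One caveat: the bare g alternative also matches the first letter of a word that follows a number, as in (see 5 guidelines), so you may want to add a \\b after the unit group if that matters for your data.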