Sebastian Zeki - 29 days ago
Java Question

How to use nested non-capturing groups in Java regex

I have a series of lines like the following (they can come in any order):

Distal latency 4.9 N/A N/A 4.0 N/A N/A N/A N/A 6.3 4.4 N/A

% failed Chicago Classification 70 1 1 0 1 1 1 1 0 0 1

% panesophageal pressurization 0 0 0 0 0 0 0 0 0 0 0

% premature contraction 20 0 0 1 0 0 0 0 0 1 0

% rapid contraction 10 0 0 1 0 0 0 0 0 0 0

% large breaks 10 0 0 0 0 0 0 0 1 0 0

% small breaks 10 0 0 1 0 0 0 0 0 0 0


Eventually I want to extract the line title and each value into a HashMap, as follows:

Distallatency=4.9,Distallatency=N/A etc.
failedChicagoClassification1=70,failedChicagoClassification1=1,failedChicagoClassification1=1,failedChicagoClassification1=0,failedChicagoClassification1=1 etc.

and so on


My strategy to do this is:

1. Join the words of the title together by replacing the \s between them.
2. End the joined title with a character, e.g. ':', so that I can then split each line into an array on \s.
3. Loop through the array, adding each value to the Hash under the line title.


Here is what I have done so far:


Pattern match_patternSwallow2 = Pattern.compile("(?:.*\\d+\\.\\d|N\\/A|\\d*){4,50}");
Matcher matchermatch_patternSwallow2 = match_patternSwallow2.matcher(s);

while (matchermatch_patternSwallow2.find()) {
    String found = matchermatch_patternSwallow2.group(0).trim();
    System.out.println(found);

    // Join up the words so we can then split by space
    found = found.replaceAll("([A-Za-z]+)\\s", "$1_").replaceAll("\\s", ":");
    List<String> myList = new ArrayList<String>(Arrays.asList(found.split(":")));

    for (int ff = 1; ff < myList.size(); ff++) {
        mapSwallow.put(myList.get(0) + "MapSwallowsNum" + ff, myList.get(ff));
    }
}


I get no errors, but the System.out.println line only prints an empty string.

What am I doing wrong?

Answer

I can suggest the following regex to get each line that meets your criteria:

"(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$"

See the regex demo
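To make the structure of that pattern easier to follow, here is a minimal sketch that assembles it from named pieces and returns the two capture groups for a single line. The class name RegexParts, the constants VALUE, TITLE and LINE, and the helper groups are assumptions made for illustration; only the assembled regex comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexParts {
    // One value: an integer or decimal such as 70 or 4.9, or the literal N/A.
    static final String VALUE = "(?:\\d+(?:\\.\\d+)?|N/A)";
    // The title (group 1): starts with a letter, then lazily takes as little as possible.
    static final String TITLE = "([a-zA-Z].*?)";
    // (?m) lets ^ and $ match at line boundaries; \W* skips a leading "% ".
    // Group 2 is the whole run of values, each followed by optional whitespace.
    static final Pattern LINE =
        Pattern.compile("(?m)^\\W*" + TITLE + "\\s*((?:" + VALUE + "\\s*)*)$");

    // Hypothetical helper: returns {group(1), group(2)} or null if nothing matches.
    static String[] groups(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("% failed Chicago Classification 70 1 1 0");
        System.out.println(g[0]); // failed Chicago Classification
        System.out.println(g[1]); // 70 1 1 0
    }
}
```

The nested non-capturing groups only group and repeat; the two capturing groups are the only ones that get a group number.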

Once you find a match, use .group(1).replaceAll("\\s+", "") as the key, and split .group(2) with .split("\\s+") to get the values.
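If numbered keys are not required, one possible variant is to keep all values of a line in a list under a single key. The sketch below is an assumption rather than part of the original approach: the class SwallowTable, the helper parse, and the choice of a LinkedHashMap (to keep the lines in input order) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SwallowTable {
    static final Pattern LINE = Pattern.compile(
        "(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$");

    // Hypothetical helper: maps each title (whitespace removed) to its values in order.
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        Matcher m = LINE.matcher(text);
        while (m.find()) {
            String key = m.group(1).replaceAll("\\s+", "");
            String values = m.group(2).trim();
            if (values.isEmpty()) {
                continue; // a title with no values after it: nothing to store
            }
            table.put(key, new ArrayList<>(Arrays.asList(values.split("\\s+"))));
        }
        return table;
    }

    public static void main(String[] args) {
        String text = "Distal latency 4.9 N/A 4.0\n% large breaks 10 0 1";
        // prints {Distallatency=[4.9, N/A, 4.0], largebreaks=[10, 0, 1]}
        System.out.println(parse(text));
    }
}
```

A value is then addressed by title and position, for example parse(text).get("largebreaks").get(0).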

Sample code:

import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String s = "Distal latency   4.9 N/A N/A 4.0 N/A N/A N/A N/A 6.3 4.4 N/A\n\n % failed Chicago Classification  70 1 1 0 1 1 1 1 0 0 1\n\n % panesophageal pressurization  0 0 0 0 0 0 0 0 0 0 0\n\n % premature contraction  20 0 0 1 0 0 0 0 0 1 0\n\n % rapid contraction  10 0 0 1 0 0 0 0 0 0 0\n\n % large breaks  10 0 0 0 0 0 0 0 1 0 0\n\n % small breaks  10 0 0 1 0 0 0 0 0 0 0";
        // group(1) is the line title, group(2) is the run of numbers and N/A values
        Pattern match_patternSwallow2 = Pattern.compile("(?m)^\\W*([a-zA-Z].*?)\\s*((?:(?:\\d+(?:\\.\\d+)?|N/A)\\s*)*)$");
        Matcher matchermatch_patternSwallow2 = match_patternSwallow2.matcher(s);
        HashMap<String, String> mapSwallow = new HashMap<String, String>();
        while (matchermatch_patternSwallow2.find()) {
            String[] myList = matchermatch_patternSwallow2.group(2).split("\\s+");
            String p1 = matchermatch_patternSwallow2.group(1).replaceAll("\\s+", "");
            int line = 1; // 1-based position of the value, appended to the key
            for (String p2s : myList) {
                mapSwallow.put(p1 + line, p2s);
                line++;
            }
        }
        // HashMap does not preserve insertion order, so the printed order may vary
        System.out.println(mapSwallow);
    }
}
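As for why the pattern in the question prints an empty string, one likely reason is that its third alternative, \\d*, can match zero characters, so the whole pattern can succeed without consuming any input. On a line with no decimal number the first alternative fails, N\\/A does not match at the start of the line, and the empty \\d* match is what find() reports. The sketch below reproduces this on one sample line; the class EmptyMatchDemo and the helper firstMatch are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyMatchDemo {
    // The pattern from the question, unchanged.
    static final Pattern ORIGINAL =
        Pattern.compile("(?:.*\\d+\\.\\d|N\\/A|\\d*){4,50}");

    // Hypothetical helper: returns the first match reported by find(), or null.
    static String firstMatch(String input) {
        Matcher m = ORIGINAL.matcher(input);
        return m.find() ? m.group(0) : null;
    }

    public static void main(String[] args) {
        String found = firstMatch("% large breaks 10 0 0");
        // Brackets make the empty match visible: prints []
        System.out.println("[" + found + "]");
    }
}
```

Anchoring the pattern with ^ and $ and requiring a title before the values, as in the suggested regex, avoids these zero-length matches.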