flopr - 1 month ago
Java Question

RegEx to extract text between tags in Java

I need to extract the values after :70: in the following text file using a regex. A value may contain line breaks as well.

My current solution is to extract the string between :70: and the next : but this always returns only one match: the whole text between the first :70: and the last : in the file.

:32B:xxx,
:59:yyy
something
:70:ACK1
ACK2
:21:something
:71A:something
:23E:something
value
:70:ACK2
ACK3
:71A:something


How can I achieve this using Java? Ideally I want to iterate through all values, i.e. ACK1\nACK2 and ACK2\nACK3.


Thanks :)

Edit: What I'm doing right now,

Pattern pattern = Pattern.compile("(?<=:70:)(.*)(?=\n)", Pattern.DOTALL);
Matcher matcher = pattern.matcher(data);
while (matcher.find()) {
    System.out.println(matcher.group());
}

Answer

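Try this. With DOTALL, a greedy .* runs on to the last possible terminator in the input, which is why you get a single match. A reluctant .*? stops at the first colon after each :70: instead, so every block is matched separately.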

// Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
String data = ""
    + ":32B:xxx,\n"
    + ":59:yyy\n"
    + "something\n"
    + ":70:ACK1\n"
    + "ACK2\n"
    + ":21:something\n"
    + ":71A:something\n"
    + ":23E:something\n"
    + "value\n"
    + ":70:ACK2\n"
    + "ACK3\n"
    + ":71A:something\n";
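// Reluctant .*? captures up to the first following colon; the \s* before it
// keeps the trailing line break out of group 1. DOTALL lets . match line breaks.
Pattern pattern = Pattern.compile(":70:(.*?)\\s*:", Pattern.DOTALL);
Matcher matcher = pattern.matcher(data);
while (matcher.find())
    System.out.println("found=" + matcher.group(1));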

result:

found=ACK1
ACK2
found=ACK2
ACK3
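
One caveat with the pattern above: it stops at the first colon, so a value that itself contains a colon (say "note: ACK3") would be cut short. If that can happen in your data, a possible variant is to end each value at the next line that starts with a tag, or at the end of the input. The following is only a sketch. It assumes that every tag sits at the start of a line and looks like :digits/letters: (as in :32B: or :71A:). The class name Tag70Extractor and the method extract are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class Tag70Extractor {

    // ^:70:            a :70: tag at the start of a line (MULTILINE)
    // (.*?)            the value, captured reluctantly across lines (DOTALL)
    // (?=\R:[0-9A-Za-z]+:|\s*\z)
    //                  stop before a line break followed by the next tag
    //                  (assumed tag shape), or before trailing whitespace
    //                  at the end of the input
    private static final Pattern FIELD_70 = Pattern.compile(
            "^:70:(.*?)(?=\\R:[0-9A-Za-z]+:|\\s*\\z)",
            Pattern.MULTILINE | Pattern.DOTALL);

    // Returns every :70: value in document order.
    public static List<String> extract(String data) {
        List<String> values = new ArrayList<>();
        Matcher matcher = FIELD_70.matcher(data);
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String data = ""
                + ":59:yyy\n"
                + "something\n"
                + ":70:ACK1\n"
                + "ACK2\n"
                + ":21:something\n"
                + ":70:note: ACK3\n"   // value containing a colon
                + "ACK4\n";
        for (String value : extract(data)) {
            System.out.println("found=" + value);
        }
    }
}
```

On the sample data from the question this should give the same two values as before (ACK1\nACK2 and ACK2\nACK3). The difference only shows up when a value contains a colon or when the last :70: block runs to the end of the file. The \R line-break matcher needs Java 8 or later.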