Shi Zhang - 3 years ago
Java Question

Java Regex Tokenizing

new to regex here haha.

Let's say I have a string:

String toMatch = "TargetCompID=NFSC_AMD_Q\n" +
        "\n## Bin's verifix details";


Which shows up in a .cfg file as:

TargetCompID=NFSC_AMD_Q

## Bin's verifix details


I want to tokenize this into an array as:

{"TargetCompID", "NFSC_AMD_Q", "## Bin's verifix details"}


Right now I am using this working regex:

String previousRegex = "^[^=]*$" + "|" + "^#(.*?)";
String workingRegex = "(?m)^([^=]+)=(.+)\\R+^(#.*)";


EDIT



while (matcherTest.find()) {
    for (int i = 1; i < matcherTest.groupCount(); i++) {
        System.out.println(matcherTest.group(i));
    }
}


prints:

TargetCompID
NFSC_AMD_Q


but not

## Bin's verifix details


why?

also this code:

while (matcherTest.find()) {
    System.out.println(matcherTest.group());
}


only prints

TargetCompID=NFSC_AMD_Q

## Bin's verifix details


Are TargetCompID and NFSC_AMD_Q not separated because we're not calling group(i)? And why is a newline printed between the two lines?

Answer Source

You can use this regex in Java:

(?m)^([^=]+)=(.+)\R+^(#.*)


RegEx Breakdown:

  • (?m): Enable MULTILINE mode
  • ^([^=]+)=: Match one or more characters other than = from the start of a line, capture them in group #1, then match the literal =
  • (.+): Match rest of line in group #2
  • \R+: Match 1+ line breaks
  • ^(#.*): Match a full line starting with # in group #3
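
As a rough sketch of how this pattern could be wired up to build the array from the question: the class name RegexTokenizer and the tokenize method are made up for illustration, and \R assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class RegexTokenizer {
    // Same pattern as above, with backslashes doubled for a Java string literal.
    private static final Pattern ENTRY =
            Pattern.compile("(?m)^([^=]+)=(.+)\\R+^(#.*)");

    static String[] tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            // groupCount() does not count group 0 (the whole match),
            // so the bound must be inclusive to reach the last group.
            for (int i = 1; i <= m.groupCount(); i++) {
                tokens.add(m.group(i));
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String toMatch = "TargetCompID=NFSC_AMD_Q\n" + "\n## Bin's verifix details";
        // Print each captured token on its own line.
        for (String token : tokenize(toMatch)) {
            System.out.println(token);
        }
    }
}
```

Note the inclusive bound i <= groupCount(). For this pattern groupCount() returns 3 and does not count group 0, so a loop written with i < groupCount() stops after group 2 and never reaches the comment line. Calling group() with no argument returns the whole match (group 0), which still contains the = and the line breaks consumed by \R+; that would explain both the unseparated key/value text and the blank line in the second snippet.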