Sebastian Zeki - 7 months ago
Java Question

Regex to match first occurrence of a string is matching the last

I have the following list:

Acid
stuff
goo
nasty
Probable
Acid
more stuff
Probable
Acid
fff
ggg
Probable


I want to match everything between Acid and Probable. However, my regex matches only the last occurrence (Acid, fff, ggg, Probable), not the first (Acid, stuff, goo, nasty, Probable).

The calling class:

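public static void main(String[] args) throws IOException {
    // Extract the PDF text, then hand it to the extraction class
    PDFManager pdfManager = new PDFManager();
    pdfManager.setFilePath("MyFile.pdf");
    String s = pdfManager.ToText();

    if (s.contains("Thresholds")) {
        BravoaltDoc_ExtractionNonDays Sum = new BravoaltDoc_ExtractionNonDays(s);
        Sum.ExtractSumNew(s);
    }
}

The extraction class:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BravoaltDoc_ExtractionNonDays {
    String doc;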

    ArrayList<String> Day_arr = new ArrayList<String>();
    ArrayList<List<String>> Day_table2d = new ArrayList<List<String>>();
    String[] seTab3Landmarks = null;

    public BravoaltDoc_ExtractionNonDays(String doc) {
        this.doc = doc;
    }

    public String ExtractSumNew(String doc) {
        // DOTALL lets '.' match line breaks, so one match can span several lines
        Pattern Tab3Landmarks_pattern = Pattern.compile("Acid?(.*?)Probable", Pattern.DOTALL);
        Matcher matcherTab3Landmarks_pattern = Tab3Landmarks_pattern.matcher(doc);
        while (matcherTab3Landmarks_pattern.find()) {
            doc = matcherTab3Landmarks_pattern.group(1);
            seTab3Landmarks = matcherTab3Landmarks_pattern.group(1).split("\\n|\\r");
        }
        for (String n : seTab3Landmarks) {
            System.out.println(n);
        }
        return doc;
    }

}

Answer

Your code correctly finds all the matches. However, each successful find() reassigns seTab3Landmarks, so only the last match is left to print once the loop ends.

If you only want the first match, use an "if" block instead of a "while" loop (which iterates over all matches).
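As a rough illustration of that change, here is a minimal, self-contained sketch. The class name FirstMatchDemo and the method firstBlock are made up for this example and are not part of the original code. The input is a hard-coded string standing in for the PDF text, and the pattern is simplified to "Acid(.*?)Probable". It should behave roughly like the asker's method, but stops after the first match:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not from the original post.
public class FirstMatchDemo {

    // Lazily capture everything between the two markers; DOTALL lets '.' span line breaks.
    private static final Pattern BLOCK = Pattern.compile("Acid(.*?)Probable", Pattern.DOTALL);

    // Returns the non-empty lines of the first Acid...Probable block,
    // or an empty list if the text contains no such block.
    static List<String> firstBlock(String doc) {
        List<String> lines = new ArrayList<>();
        Matcher m = BLOCK.matcher(doc);
        if (m.find()) { // "if" instead of "while": only the first match is used
            for (String line : m.group(1).split("\\R")) {
                if (!line.isEmpty()) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the text extracted from the PDF
        String doc = "Acid\nstuff\ngoo\nnasty\nProbable\n"
                   + "Acid\nmore stuff\nProbable\n"
                   + "Acid\nfff\nggg\nProbable\n";
        System.out.println(firstBlock(doc)); // prints [stuff, goo, nasty]
    }
}
```

If every block is needed rather than just the first, one option is to keep the "while" loop and add each group to a list inside it instead of overwriting a single variable.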