user1650864 - 5 months ago
Java Question

Regex pattern java with commas

I have the string below, which comes from an Excel column:

"\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\""


I would like to write a regex pattern that retrieves the entire string, so that my result is exactly:

"USE CODE ""Gef, sdf"" FROM 1/7/07"


Below is what I tried

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches
{
    public static void main( String args[] ){

        // String to be scanned to find the pattern.
        String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
        String line2 = "Test asda ds asd, tesat2 . test3";

        String dpattern = "(\"[^\"]*\")(?:,(\"[^\"]*\"))*,|([^,]+),";
        // Create a Pattern object
        Pattern d = Pattern.compile(dpattern);
        Matcher md = d.matcher(line2);

        Pattern r = Pattern.compile(dpattern);

        // Now create matcher object.
        Matcher m = r.matcher(line);
        if (m.find( )) {
            System.out.println("Found value: 0 " + m.group(0) );
            // System.out.println("Found value: 1 " + m.group(1) );
            // System.out.println("Found value: 2 " + m.group(2) );
        } else {
            System.out.println("NO MATCH");
        }
    }
}


The match breaks off at the first comma, so the output is

Found value: 0 "USE CODE ""Gef,


It should be

Found value: 0 "USE CODE ""Gef, sdf"" FROM 1/7/07",


For the second line, i.e. with
Matcher m = r.matcher(line2);
the output should be

Found value: 0 "Test asda ds asd",

Answer

You may use

(?:"[^"]*(?:""[^"]*)*"|[^,])+

See the regex demo

Explanation:

  • " - leading quote
  • [^"]* - 0+ chars other than a double quote
  • (?:""[^"]*)* - 0+ sequences of "" (an escaped quote) followed by 0+ chars other than a double quote
  • " - trailing quote

OR:

  • [^,] - any char but a comma

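The whole alternation is wrapped in a non-capturing group with a quantifier, (?:...)+, so it is matched 1 or more times: the match keeps growing over quoted chunks and non-comma chars and stops at the first comma that is outside quotes.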
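To make the two alternatives easier to see, the same pattern can be assembled from named pieces. This is only a sketch: the class, constant and method names (PatternParts, QUOTED_CHUNK, NON_COMMA, firstField) are made up for illustration, and the short sample strings in main are not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternParts {
    // A quoted chunk: opening quote, non-quote chars, any number of
    // doubled quotes ("") each followed by non-quote chars, closing quote.
    static final String QUOTED_CHUNK = "\"[^\"]*(?:\"\"[^\"]*)*\"";

    // Any single character that is not a comma.
    static final String NON_COMMA = "[^,]";

    // One or more quoted chunks or non-comma chars in a row.
    static final Pattern FIELD =
            Pattern.compile("(?:" + QUOTED_CHUNK + "|" + NON_COMMA + ")+");

    // Returns the first match in the line, or null if nothing matches.
    static String firstField(String line) {
        Matcher m = FIELD.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The comma inside the quotes does not end the match.
        System.out.println(firstField("\"a \"\"b, c\"\" d\", e"));
        // Without quotes, the match ends at the first comma.
        System.out.println(firstField("plain, text"));
    }
}
```

Because the quoted alternative is listed first, a quote always starts a quoted chunk when one can be completed; the [^,] alternative only takes over for text outside quotes.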

IDEONE demo:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {
    public static void main(String[] args) {
        String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
        String line2 = "Test asda ds asd, tesat2 . test3";
        Pattern pattern = Pattern.compile("(?:\"[^\"]*(?:\"\"[^\"]*)*\"|[^,])+");
        Matcher matcher = pattern.matcher(line);
        if (matcher.find()) {                       // if is used to get the 1st match only
            System.out.println(matcher.group(0));   // "USE CODE ""Gef, sdf"" FROM 1/7/07"
        }
        Matcher matcher2 = pattern.matcher(line2);
        if (matcher2.find()) {
            System.out.println(matcher2.group(0));  // Test asda ds asd
        }
    }
}
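If every field is needed rather than just the first one, the if can be swapped for a while loop over find(). The following is a rough sketch under a few assumptions of mine: the class and method names (AllFields, fields) are hypothetical, and trimming whitespace and skipping blank chunks are choices made here, not something the question asked for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllFields {
    // Same pattern as in the answer above.
    static final Pattern FIELD =
            Pattern.compile("(?:\"[^\"]*(?:\"\"[^\"]*)*\"|[^,])+");

    // Collects every match in the line, trimming surrounding whitespace
    // and skipping chunks that are blank after trimming.
    static List<String> fields(String line) {
        List<String> result = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {                  // while walks through all matches
            String field = m.group().trim();
            if (!field.isEmpty()) {
                result.add(field);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
        for (String field : fields(line)) {
            System.out.println(field);
        }
    }
}
```

Note that the pattern needs at least one char per match, so empty fields (as in a,,b) are silently skipped. For real CSV input a dedicated CSV parser is probably the safer choice.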