KingAnjrey - 1 month ago
Java Question

Reading patterns from a file vs. string literals

I have a problem with my regexes. I use the following code to take each regex from an ArrayList, compile it, and search for a match:

public boolean match(String command){
    for (String regex : regexA) {
        System.out.println(regex);
        Pattern regPatter = Pattern.compile(regex);
        Matcher regMatcher = regPatter.matcher(command);

        if(regMatcher.find())
            return true;
    }

    return false;
}


I test it like this:

public static void main(String[] args){
    RegexMatcher reg = new RegexMatcher(new File("C:\\Users\\XXX\\Desktop\\regex.txt"));
    System.out.println(reg.match("password cisco"));
}


It prints the following:

pas[a-z]\\s*\\w+
er\\w*\\s+(?!s).*
us[a-z]*\\s+((?!cisco).)*$
tr[a-z]*\\s+i[a-z]*\\s+\\w*\\s*
f[a-z]*\\s+f.*\\s*
en[a-z]*\\s+v.*
false


So the result is false. But if I hard-code the pattern like this, it works:

public boolean match(String command){
    Pattern regPatter = Pattern.compile("pas[a-z]\\s*\\w+");
    Matcher regMatcher = regPatter.matcher(command);

    if(regMatcher.find())
        return true;

    return false;
}


So my problem is: if I pass the string literal directly to Pattern.compile() it works, but if I use the patterns read from the file, as in my match() method, it doesn't.

Answer

Your regex.txt file should contain single backslashes ("\"), not double ones, i.e. it should be:

pas[a-z]\s*\w+
er\w*\s+(?!s).*
us[a-z]*\s+((?!cisco).)*$
tr[a-z]*\s+i[a-z]*\s+\w*\s*
f[a-z]*\s+f.*\s*
en[a-z]*\s+v.*

In Java string literals, backslashes are used to "escape" special characters, e.g. "\n" results in a string containing just a single newline character, not a "\" followed by an "n".

Similarly, the double backslash "\\" in a string literal results in a string containing a single backslash. That is what you want for a regex.
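To make the difference visible, here is a minimal sketch that compares the string produced by the literal with the string you would get from a file line containing doubled backslashes. The class name EscapeDemo and its members are made up for illustration; the file content is simulated with a literal in which every backslash is doubled again.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class EscapeDemo {
    // The compiler turns each "\\" into one backslash, so this string
    // holds the regex pas[a-z]\s*\w+ (14 characters).
    static final String FROM_LITERAL = "pas[a-z]\\s*\\w+";

    // Simulates a line read from a file that contains doubled backslashes:
    // the string really holds pas[a-z]\\s*\\w+ (16 characters).
    static final String FROM_FILE = "pas[a-z]\\\\s*\\\\w+";

    // Same check as in the question: is the regex found anywhere in the input?
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println("\n".length());                          // 1
        System.out.println(FROM_LITERAL.length());                  // 14
        System.out.println(FROM_FILE.length());                     // 16
        System.out.println(found(FROM_LITERAL, "password cisco"));  // true
        System.out.println(found(FROM_FILE, "password cisco"));     // false
    }
}
```

The two extra characters in the second string are the backslashes that the regex engine then treats as literal backslashes to match.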

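The contents of a file never go through the Java compiler, so no escape processing is applied when you read them: each character arrives in your string exactly as it is stored. A backslash in the file is therefore already a single backslash in the string, which is why the file needs only single ones. With "\\s" in the file, the regex engine sees an escaped (literal) backslash followed by the letter "s", so the pattern can only match input that contains an actual backslash, and "password cisco" does not.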
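For reference, a rough sketch of loading such a file might look like the following. PatternFile and its methods are hypothetical (the question does not show how RegexMatcher reads its file), and the default path "regex.txt" is only an assumption. Each non-blank line is handed to Pattern.compile() unchanged, and the patterns are compiled once instead of on every call.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not the RegexMatcher class from the question.
class PatternFile {
    // Compiles each non-blank line as-is; no unescaping is needed or wanted.
    static List<Pattern> compileAll(List<String> lines) {
        List<Pattern> patterns = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                patterns.add(Pattern.compile(line));
            }
        }
        return patterns;
    }

    // Reads the file (UTF-8) and compiles one pattern per line.
    static List<Pattern> load(Path file) throws IOException {
        return compileAll(Files.readAllLines(file));
    }

    // True if any pattern is found anywhere in the command.
    static boolean matchesAny(List<Pattern> patterns, String command) {
        for (Pattern p : patterns) {
            if (p.matcher(command).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // The path is an assumption; pass your own as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "regex.txt");
        List<Pattern> patterns = load(file);
        System.out.println(matchesAny(patterns, "password cisco"));
    }
}
```

With the corrected regex.txt shown above, the first pattern should match "password cisco".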