Eric Ed Lohmar - 1 month ago
Java Question

Java Regex and PathMatcher

I'm writing an application in Java that displays a list of files where the first word in the filename matches a user-defined string, then deletes or rearranges them depending on some preferences. I'm currently working out a good way to find my files. Using this Java Tutorial I ended up with something like this:

Path source = Paths.get(sourceText.getText());
Path dest = Paths.get(destText.getText());

System.out.println("Source:" + source.toString());
System.out.println("P/N: " + partNoText.getText());

String matchString = "glob:**" + partNoText.getText() + "*";

System.out.println("Matching: " + matchString);

fileFinder = new FileFinder(matchString);

try {
    Files.walkFileTree(source, fileFinder);
} catch (IOException e1) {
    e1.printStackTrace();
}
for (Path path : fileFinder.getResult()) {
    System.out.println("Moving: " + path.getFileName());
    Path target = Paths.get(dest.toString() + "\\" + path.getFileName());

    try {
        Files.move(path, target, REPLACE_EXISTING);
    } catch (IOException e1) {
        e1.printStackTrace();
    }
}


where FileFinder extends SimpleFileVisitor and has this visitFile method:

public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
    System.out.println(file.toString());
    System.out.println(fileMatcher.matches(file));
    if (fileMatcher.matches(file)) {
        result.add(file);
        return FileVisitResult.CONTINUE;
    }
    return FileVisitResult.CONTINUE;
}


My issue with this is that the glob will pick up any file where the filename contains the part no. in any way. So if my file is called "12345 RevA Really Big Part 2: Electric Bugaloo" then the string would match if the user entered "1" or "123" or "Bugaloo". Ideally, it would only match if the user entered "12345".

I tried switching my matchString to "regex: .*" + partNoText + "\\b", which works in the regex test harness I modified from this other Java Tutorial. What am I doing wrong? Does PathMatcher work differently than a regular Matcher?

P.S. Any variable that has the word "Text" in it, like sourceText and partNoText, is a JTextField. Hopefully that is the only part of the code that is unclear from what I clipped out of it.

Answer

"Does PathMatcher work differently than a regular Matcher?"
Yes. A PathMatcher works with filename globbing[1], while a Matcher works with regular expressions.

See What Is a Glob? in the tutorial you linked, and compare that with the documentation for java.util.regex.Pattern.
Globbing is quite a bit more limited than regular expression matching.
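
To make the difference concrete, here is a minimal sketch contrasting Matcher.find() with a PathMatcher built from a "regex:" pattern. It assumes the default file system providers behave as they do in OpenJDK, where the regex has to match the whole path string rather than just occur somewhere in it, and where everything after the first colon (including a space) is taken as part of the pattern. The class and method names are made up for illustration, and the file name is a shortened version of the one in the question.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the question.
class RegexVersusPathMatcher {

    // Matcher.find(): true if the pattern occurs anywhere in the text.
    static boolean findAnywhere(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // PathMatcher created from a "syntax:pattern" string, applied to the path as given.
    static boolean pathMatches(String syntaxAndPattern, Path path) {
        PathMatcher pm = FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
        return pm.matches(path);
    }

    public static void main(String[] args) {
        Path name = Paths.get("12345 RevA Really Big Part.txt");

        // find() is satisfied by a partial match at the start of the name.
        System.out.println(findAnywhere(".*12345\\b", name.toString()));

        // The same regex in a PathMatcher must cover the entire path string.
        System.out.println(pathMatches("regex:.*12345\\b", name));

        // A trailing .* lets the pattern cover the rest of the name.
        System.out.println(pathMatches("regex:.*12345\\b.*", name));

        // A space after "regex:" becomes a literal leading space in the pattern.
        System.out.println(pathMatches("regex: .*12345\\b.*", name));
    }
}
```

If those assumptions hold, only the first and third calls report a match, which would explain why a pattern can look fine in a find()-based test harness and still fail in a PathMatcher.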

If you have a strict file naming convention that is rigorously adhered to, you can probably use globbing (I take back that last part of my previous comment).

Let's say your files are named as
numeric part number - space - optional revision & space - description

That is, the part number can have a variable number of digits, but the space after the part number is required and always present.

So your example "12345 RevA Really Big Part 2: Electric Bugaloo" fits that with partNum==12345, revision="RevA ", description="Really Big Part 2: Electric Bugaloo"

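Say a user enters the part number 123 (P/N: 123), stored in a variable userPN, and you construct a glob as
String glob = userPN + " *"; resulting in glob equalling "123 *"
This will not match 12345, as you desire, because the space after the 3 will not match the 4. Pass it to getPathMatcher with the "glob:" prefix and test it against file.getFileName(): without a leading ** the glob has to match the entire path it is given, directories included.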
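
As a rough sketch of how that glob could be wired into a file visitor like the one in the question: the class name PartNumberFinder is hypothetical, and the code assumes userPN is purely numeric (so it contains no glob metacharacters such as * or [) and that a space always follows the part number.

```java
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical visitor; PartNumberFinder is not a name from the question.
class PartNumberFinder extends SimpleFileVisitor<Path> {
    private final PathMatcher matcher;
    private final List<Path> result = new ArrayList<>();

    PartNumberFinder(String userPN) {
        // Assumption: userPN holds digits only, so nothing in it is special to the glob.
        this.matcher = FileSystems.getDefault().getPathMatcher("glob:" + userPN + " *");
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        // Match only the last path element; the glob has no ** to absorb directories.
        Path name = file.getFileName();
        if (name != null && matcher.matches(name)) {
            result.add(file);
        }
        return FileVisitResult.CONTINUE;
    }

    List<Path> getResult() {
        return result;
    }
}
```

An instance can be handed to Files.walkFileTree(source, finder) in place of the original FileFinder, and getResult() then holds the full paths whose file names start with the part number followed by a space.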

If there is not a required space after the part number in the filename, but what follows is always alphabetic, either the Revision or the Description, you can construct a glob as
String glob = userPN + "[A-Za-z]*"; giving glob = 123[A-Za-z]* which also won't match 12345 because an alphabetic must follow the 123 and the 4 is not in that character range. (Don't separate the ranges with commas: inside the brackets a comma is just another literal character to match.)

You can make your character range more complicated, say [A-Za-z ] for an optional space, depending on your needs, but it all really comes down to your file naming convention. You need to state that convention very precisely and adhere to it.
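
A small sketch of the character-range variant, matched against the file name only. The class and method names are invented for illustration, and userPN is again assumed to be numeric.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

// Hypothetical helper class; not part of the original code.
class PartNumberGlobs {

    // Builds "glob:<userPN>[A-Za-z ]*": the part number, then one letter or space,
    // then anything. The class is written without commas because a comma inside
    // glob brackets is matched literally.
    static PathMatcher letterOrSpaceAfter(String userPN) {
        return FileSystems.getDefault().getPathMatcher("glob:" + userPN + "[A-Za-z ]*");
    }

    // Applies the matcher to the last element of the path only.
    static boolean matchesName(PathMatcher matcher, Path file) {
        Path name = file.getFileName();
        return name != null && matcher.matches(name);
    }
}
```

With this, a user entry of 12345 should accept both "12345RevA Part.txt" and "12345 RevA Part.txt", while 123 is rejected for either name because the character after it is a digit.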


[1] A PathMatcher can use a regular expression instead of globbing if you specify the "syntax" as regex when calling FileSystem.getPathMatcher(String). This would be something like

FileSystem fs = FileSystems.getDefault();
PathMatcher pm = fs.getPathMatcher("regex:\\d{5}\\s.*");
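
If you go the regex route with a user-supplied part number, one possible shape is sketched below. The helper names are hypothetical. It assumes the matcher is applied to file.getFileName() rather than the full path, and that the regex must cover the whole name (which is how the default providers in OpenJDK behave, as far as I know), hence the trailing .* in the pattern.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the original code.
class PartNumberRegex {

    // Builds a matcher for names that start with userPN followed by a word boundary.
    static PathMatcher forPartNumber(String userPN) {
        // Pattern.quote turns any regex metacharacters in the user input into literals.
        String regex = Pattern.quote(userPN) + "\\b.*";
        // No space after the colon: it would become part of the pattern.
        return FileSystems.getDefault().getPathMatcher("regex:" + regex);
    }

    // Tests only the last path element, so directory names cannot interfere.
    static boolean matchesName(String userPN, Path file) {
        Path name = file.getFileName();
        return name != null && forPartNumber(userPN).matches(name);
    }
}
```

For a file named "12345 RevA Part.txt" this should accept 12345 and reject 1, 123 and Part, since the pattern is anchored at the start of the name and \b will not match between two digits.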