Hauke Hauke - 3 months ago 8
Java Question

Glob understanding

I need to develop a file scanner in Java with the following options/parameters:


  1. One directory

  2. One or more patterns such as *.xml, *.txt, *test.csv

  3. Switch for recursive scanning



I think the best way would be something like this:

import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;

public class FileScanningTest {

    public static void main(String[] args) throws IOException {

        String directory = "C:\\tmp\\scanning\\";
        String glob = "**/*.xml";
        Boolean rekursiv = false; // not evaluated yet

        final PathMatcher pathMatcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);

        Files.walkFileTree(Paths.get(directory), new SimpleFileVisitor<Path>() {

            // Called for every file; the matcher sees the full path.
            @Override
            public FileVisitResult visitFile(Path path, BasicFileAttributes attrs) throws IOException {
                if (pathMatcher.matches(path)) {
                    System.out.println(path);
                }
                return FileVisitResult.CONTINUE;
            }

            // Skip files that cannot be read instead of aborting the walk.
            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
                return FileVisitResult.CONTINUE;
            }
        });
    }
}


I do not understand why I have to put "**/" in front of my actual pattern. It also makes the scanning recursive. If I remove "**/", the application no longer finds anything.

https://docs.oracle.com/javase/tutorial/essential/io/fileOps.html#glob says that ** means recursive, but why does the pattern stop working when I remove it?

Can somebody give me a hint?

Thanks everyone and have a nice weekend

Answer

To find *.xml files with a glob, starting from the directory /tmp/scanning/ and optionally recursing into subdirectories, please have a look at this sample. It works like the Unix find utility. I tested it only on Ubuntu Linux; on other operating systems you should only need to change the separator in the starting path.

import java.io.*;
import java.nio.file.*;
import java.nio.file.attribute.*;

import static java.nio.file.FileVisitResult.*;

import java.util.*;


public class FileScanningTest {

    public static class Finder
            extends SimpleFileVisitor<Path> {

        private final PathMatcher matcher;
        private int numMatches = 0;

        Finder(String pattern) {
            matcher = FileSystems.getDefault()
                    .getPathMatcher("glob:" + pattern);
        }

        // Compares the glob pattern against
        // the file or directory name.
        void find(Path file) {
            Path name = file.getFileName();
            if (name != null && matcher.matches(name)) {
                numMatches++;
                System.out.println(file);
            }
        }

        // Prints the total number of
        // matches to standard out.
        void done() {
            System.out.println("Matched: "
                    + numMatches);
        }

        // Invoke the pattern matching
        // method on each file.
        @Override
        public FileVisitResult visitFile(Path file,
                                         BasicFileAttributes attrs) {
            find(file);
            return CONTINUE;
        }

        // Invoke the pattern matching
        // method on each directory.
        @Override
        public FileVisitResult preVisitDirectory(Path dir,
                                                 BasicFileAttributes attrs) {
            find(dir);
            return CONTINUE;
        }

        @Override
        public FileVisitResult visitFileFailed(Path file,
                                               IOException exc) {
            System.err.println(exc);
            return CONTINUE;
        }
    }


    public static void main(String[] args)
            throws IOException {
        boolean recursive = false;
        Path startingDir = Paths.get("/tmp/scanning");
        String pattern = "*.{html,xml}";

        Finder finder = new Finder(pattern);
        if (!recursive) {
            // Non-recursive: list only the direct children that match the pattern.
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(startingDir, pattern)) {
                for (Path entry : stream) {
                    System.out.println(entry);
                }
            } catch (IOException x) {
                throw new RuntimeException(String.format("error reading folder %s: %s",
                        startingDir,
                        x.getMessage()),
                        x);
            }
        } else {
            // Recursive: walk the whole tree and match every file name.
            Files.walkFileTree(startingDir, finder);
            finder.done();
        }

    }
}

Test (with recursive set to true)

 ~> java FileScanningTest
/tmp/scanning/dir2/test2.xml
/tmp/scanning/blah.xml
Matched: 2
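
As for why the original code needs the "**/" prefix: as far as I can tell, the matcher there is applied to the full path that visitFile receives, and in glob syntax * does not cross directory separators while ** does. The sample above avoids the prefix by matching only file.getFileName(). Here is a minimal sketch to illustrate; the class name GlobDemo and the helper matches are made up for this example.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class GlobDemo {

    // Hypothetical helper: true if the glob matches the path exactly as passed in.
    static boolean matches(String glob, Path path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(path);
    }

    public static void main(String[] args) {
        Path full = Paths.get("/tmp/scanning/dir2/test2.xml");

        // '*' stops at directory boundaries, so the full path does not match.
        System.out.println(matches("*.xml", full));               // false
        // '**' crosses directory boundaries, so the full path matches.
        System.out.println(matches("**/*.xml", full));            // true
        // Matching only the last path element works without the prefix.
        System.out.println(matches("*.xml", full.getFileName())); // true
    }
}
```

Note that walkFileTree descends into subdirectories regardless of the glob. The ** only decides which of the visited paths match, so it is not what switches recursion on or off.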

If you want to match either *.xml or test3.html, then you can use this pattern: String pattern = "{*.xml,test3.html}";
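
Putting the three requirements from the question together (one directory, several patterns, a recursive switch), one possible sketch looks like the following. It is not the only way to do it. The class MultiPatternScanner and its scan method are hypothetical names, and it assumes the individual patterns contain neither commas nor their own {...} groups, since glob groups cannot be nested.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MultiPatternScanner {

    // Hypothetical helper: returns the regular files under dir whose file name
    // matches at least one of the given globs.
    static List<Path> scan(Path dir, List<String> globs, boolean recursive) throws IOException {
        // "{a,b,c}" matches if any of the comma-separated subpatterns matches.
        String combined = "glob:{" + String.join(",", globs) + "}";
        PathMatcher matcher = dir.getFileSystem().getPathMatcher(combined);

        // Depth 1 visits only the direct children of dir.
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;

        try (Stream<Path> paths = Files.walk(dir, maxDepth)) {
            return paths
                    .filter(Files::isRegularFile)
                    // Match the file name only, so no "**/" prefix is needed.
                    .filter(p -> matcher.matches(p.getFileName()))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Scans the directory given as first argument, or the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : scan(dir, Arrays.asList("*.xml", "*.txt", "*test.csv"), true)) {
            System.out.println(p);
        }
    }
}
```

Here the recursion is controlled by the maxDepth argument of Files.walk rather than by the pattern, so the same globs can be used for both the flat and the recursive scan.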