Francesco Menzani - 1 month ago
Java Question

Get large directory content faster (java.io.File alternatives)

I've used the old, obsolete java.io.File API to list directory contents for too long.

Its performance is poor. It is:


  • Expensive, since it creates a new File object for each entry.

  • Slow, because you have to wait for the whole array to be built before you can start processing.

  • Especially wasteful if you only need to work on a subset of the content.



What are the alternatives?

Answer

Java 7's java.nio.file package can be used to improve performance.

Iterators

The DirectoryStream<T> interface can be used to iterate over a directory without preloading its content into memory. While the old API creates an array of all filenames in the folder, the new approach loads each filename (or a limited-size group of cached filenames) only when it reaches it during iteration.

To get an instance for a given Path, invoke the Files.newDirectoryStream(Path) static method. I suggest using the try-with-resources statement to close the stream properly; if you can't, remember to close it manually at the end with DirectoryStream<T>.close().

Path folder = Paths.get("...");
try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder)) {
    for (Path entry : stream) {
        // Process the entry
    }
} catch (IOException ex) {
    // An I/O problem has occurred
}
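Because entries are loaded lazily, you can stop iterating as soon as you have what you need. Here is a minimal sketch of that idea; the class FirstEntries and the method firstEntries are hypothetical names I made up for illustration, not part of the JDK.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class FirstEntries {
    // Hypothetical helper: collects at most `limit` entries, then stops iterating.
    static List<Path> firstEntries(Path folder, int limit) throws IOException {
        List<Path> result = new ArrayList<>();
        if (limit <= 0) {
            return result;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder)) {
            for (Path entry : stream) {
                result.add(entry);
                if (result.size() >= limit) {
                    break; // Leave the loop early; the stream is closed by try-with-resources
                }
            }
        }
        return result;
    }
}
```

Keep in mind that the iteration order is not specified, so "first" here only means "first returned by the stream", not first alphabetically.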

Filters

The DirectoryStream.Filter<T> interface can be used to skip groups of entries during iteration.

Since it's a @FunctionalInterface, starting with Java 8 you can implement it with a lambda expression that supplies the Filter<T>.accept(T) method, which decides whether the given directory entry should be accepted or filtered out. You then pass the newly created instance to the Files.newDirectoryStream(Path, DirectoryStream.Filter<? super Path>) static method. Alternatively, you might prefer the Files.newDirectoryStream(Path, String) static method, which takes a glob pattern and is handy for simple filename matching.

Path folder = Paths.get("...");
try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder, "*.txt")) {
    for (Path entry : stream) {
        // The entry's name always ends with .txt
    }
} catch (IOException ex) {
    // An I/O problem has occurred
}

Path folder = Paths.get("...");
try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder,
        entry -> Files.isDirectory(entry))) {
    for (Path entry : stream) {
        // The entry can only be a directory
    }
} catch (IOException ex) {
    // An I/O problem has occurred
}
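A filter can also combine several checks. Since Filter<T>.accept(T) is declared to throw IOException, the lambda may call methods such as Files.size directly. The following is only a sketch under my own assumptions: the class LargeFileLister, the method largeFiles and the minBytes parameter are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class LargeFileLister {
    // Hypothetical helper: lists regular files in `folder` that are at least `minBytes` long.
    static List<Path> largeFiles(Path folder, long minBytes) throws IOException {
        // accept(T) may throw IOException, so Files.size needs no try/catch here
        DirectoryStream.Filter<Path> filter =
                entry -> Files.isRegularFile(entry) && Files.size(entry) >= minBytes;

        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder, filter)) {
            for (Path entry : stream) {
                result.add(entry); // Only entries accepted by the filter get here
            }
        }
        return result;
    }
}
```

As far as I know, an IOException thrown by the filter during iteration reaches the caller wrapped in an unchecked DirectoryIteratorException, so you may want to catch that as well if a single unreadable entry should not abort the whole listing.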