Peter Peter - 22 days ago 6
Java Question

Access files in resources directory in JAR from Apache Spark Streaming context

I have written a Java application as a Spark Streaming job. It needs some text resources, which I have included in the JAR under a resources directory (using the default Maven directory structure). In unit tests I have no problem accessing these files, but when I run the program with spark-submit I get a FileNotFoundException. How do I access files on the classpath inside my JAR when running with spark-submit?

The code I am currently using to access my file looks roughly like this:

InputStream input;

try {
    URL url = this.getClass().getClassLoader().getResource("my file");
    if (url == null) {
        throw new IOException("file does not exist");
    }
    String path = url.getPath();
    input = new FileInputStream(path);
} catch (IOException e) {
    throw new RuntimeException(e);
}

Thanks.

Note that this is not a duplicate of "Reading a resource file from within jar" (which was suggested), because this code works when run locally. It only fails when run on a Spark cluster.

Answer

I fixed this by accessing the resources directory a different (and significantly less silly) way:

input = Settings.class.getResourceAsStream("/my file");
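
The likely reason the original code failed is that, once the application is packaged, getResource returns a jar: URL whose path looks like file:/.../app.jar!/my file. That is not a path on the file system, so FileInputStream cannot open it. In unit tests the resources are plain files under target/classes, which is why the same code works there. Below is a minimal sketch of the stream-based approach. The class name ResourceReader and the helper names openResource and readText are made up for illustration, and "/my file" is the placeholder name from the question:

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Spark or of the original application.
public class ResourceReader {

    // Opens a classpath resource as a stream. This works whether the
    // resource is a plain file in a directory or an entry inside a JAR.
    // A leading "/" makes the name absolute (relative to the classpath root).
    static InputStream openResource(Class<?> anchor, String absoluteName) throws IOException {
        InputStream in = anchor.getResourceAsStream(absoluteName);
        if (in == null) {
            throw new FileNotFoundException("resource not found on classpath: " + absoluteName);
        }
        return in;
    }

    // Reads the whole resource as UTF-8 text (assumes the resource is UTF-8).
    static String readText(Class<?> anchor, String absoluteName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(openResource(anchor, absoluteName), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "/my file" is the placeholder resource name used in the question.
        String name = args.length > 0 ? args[0] : "/my file";
        try {
            System.out.print(readText(ResourceReader.class, name));
        } catch (IOException e) {
            System.err.println("Could not read " + name + ": " + e.getMessage());
        }
    }
}
```

The try-with-resources block closes the stream even if reading fails, and the null check turns a missing resource into an explicit exception instead of a later NullPointerException.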