Michael Lihs - 1 year ago
Java Question

How to pass parameters / properties to Spark jobs with spark-submit

I am running a Spark job implemented in Java using spark-submit. I would like to pass parameters to this job, e.g. a parameter that configures the Spark application.

What I tried was using the

--conf key=value

option of the spark-submit script, but when I try to read the parameter in my Spark job, I get an exception:

Exception in thread "main" java.util.NoSuchElementException: key

Furthermore, when I print the job's configuration, I don't see my value in the output.

Further notice: Since I want to submit my Spark job via the Spark REST service, I cannot use an OS environment variable or the like.

Is there any way to achieve this?

Answer

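Since you want to use your own custom properties, you need to place them after application.jar in the spark-submit command (in the Spark example below, [application-arguments] is where your properties go). --conf is reserved for Spark configuration properties; spark-submit ignores --conf keys that do not start with spark., which is why your value never reaches the job's configuration.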

--conf: Arbitrary Spark configuration property in key=value format. For values that contain spaces wrap “key=value” in quotes (as shown).

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # options
  <application-jar> \
  [application-arguments] <--- here our app arguments

So when you run spark-submit .... app.jar key=value, your main method receives key=value as args[0].

public static void main(String[] args) {
    String firstArg = args[0]; // equals "key=value"
}

But if you want to use key-value pairs, you need to parse your application arguments yourself.

You can use the Apache Commons CLI library or an alternative.
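If a full CLI library is more than you need, the parsing step can also be done by hand. The following is only a minimal sketch, assuming every application argument has the form key=value; the class name ArgParser and the method parseArgs are made up for illustration and are not part of Spark or Commons CLI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of Spark or any library.
class ArgParser {

    // Splits each argument at its first '=' and collects the pairs in
    // insertion order. Assumes every argument looks like key=value.
    static Map<String, String> parseArgs(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String arg : args) {
            int idx = arg.indexOf('=');
            if (idx <= 0) {
                // No '=' at all, or an empty key such as "=value".
                throw new IllegalArgumentException(
                        "Expected key=value but got: " + arg);
            }
            // Everything after the first '=' is the value, so values
            // may themselves contain '=' characters.
            params.put(arg.substring(0, idx), arg.substring(idx + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        // With: spark-submit .... app.jar key=value
        // args[0] is "key=value" and params.get("key") is "value".
        Map<String, String> params = parseArgs(args);
        System.out.println(params);
    }
}
```

In the job's main method you would call parseArgs(args) once and then look up the values by key. Because these are plain application arguments, this should also work when the job is submitted through a service rather than a shell.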
