tamjd1 - 6 months ago
Scala Question

Loading Java spark config from yaml file

I have a Java Spark app in which I instantiate a `SparkConf` object with the required configurations for Spark. Currently, it looks like this:

SparkConf conf = new SparkConf()
    .set("spark.executor.memory", "8g");

The master and app name come from a yaml file which contains the app configurations, and the rest of the Spark configurations are hardcoded and set one at a time.

The yaml file also contains these key/value pairs of Spark configurations, and my other (Python) apps use them directly from there. It looks like this:

master: ...
appname: ...
spark.mesos.executor.home: '/data/spark'
spark.executor.memory: '8g'
spark.network.timeout: '420'
... other spark configs

I'm wondering if I can use these configs from the yaml file to set the Spark configs in the code automatically, using the `setAll()` method provided by `SparkConf`, instead of setting them one at a time.

This is how I'm reading the configs from the yaml file currently, but it's not working:

LinkedHashMap<String, String> sparkConf = new LinkedHashMap<>((Map<String, String>) ((Map) yaml.get("spark")).get("conf"));

How can I load the `spark: conf` section from the yaml file so it can be used by the `setAll()` method? Apparently, the method expects a Scala object of type `Traversable<Tuple2<String, String>>`.


You can add the "snakeyaml" dependency to your project to read yaml files in Java.
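For a Maven build, that dependency looks like the following (the snakeyaml coordinates are `org.yaml:snakeyaml`; the version below is just an example, pin whichever your build already uses):

```xml
<dependency>
    <groupId>org.yaml</groupId>
    <artifactId>snakeyaml</artifactId>
    <version>1.33</version>
</dependency>
```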


Now, if you have an "application.yaml" file with the configuration defined like you posted, you can read it and create the `SparkConf` with the `setAll()` method in Java as below.
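This assumes `application.yaml` nests the Spark settings under a `spark:` section with a `conf:` sub-map, which is the layout the question's lookup code implies (the `appname`/`master` values below are placeholders):

```yaml
spark:
  appname: my_app
  master: local[*]
  conf:
    spark.mesos.executor.home: '/data/spark'
    spark.executor.memory: '8g'
    spark.network.timeout: '420'
```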

import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.yaml.snakeyaml.Yaml;
import scala.collection.JavaConversions;

Yaml yaml = new Yaml();
InputStream is = MySparkApplication.class.getClassLoader().getResourceAsStream("application.yaml");
Map<String, Object> yamlParsers = (Map<String, Object>) yaml.load(is);
LinkedHashMap<String, Object> spark = (LinkedHashMap<String, Object>) yamlParsers.get("spark");
LinkedHashMap<String, String> config = (LinkedHashMap<String, String>) spark.get("conf");

SparkConf conf = new SparkConf()
        .setAppName((String) spark.get("appname"))
        .setMaster((String) spark.get("master"))
        // mapAsScalaMap converts the Java map to a scala.collection.mutable.Map,
        // which is a Traversable<Tuple2<String, String>> as setAll() expects
        .setAll(JavaConversions.mapAsScalaMap(config));
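As a quick plain-Java sanity check of the nested-map casts (no Spark or snakeyaml needed; the hand-built maps below stand in for what `yaml.load()` would return for the nested layout, so the values are placeholders):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class YamlShapeCheck {
    public static void main(String[] args) {
        // Simulate snakeyaml's output for:
        // spark: { appname: ..., master: ..., conf: { spark.executor.memory: '8g', ... } }
        LinkedHashMap<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.executor.memory", "8g");
        conf.put("spark.network.timeout", "420");

        LinkedHashMap<String, Object> spark = new LinkedHashMap<>();
        spark.put("appname", "my_app");
        spark.put("master", "local[*]");
        spark.put("conf", conf);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("spark", spark);

        // Same cast chain as in the answer's code
        LinkedHashMap<String, Object> sparkSection =
                (LinkedHashMap<String, Object>) root.get("spark");
        LinkedHashMap<String, String> config =
                (LinkedHashMap<String, String>) sparkSection.get("conf");

        System.out.println(config.get("spark.executor.memory")); // prints 8g
    }
}
```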