M.Prabhu - 3 months ago
JSON Question

Spark SQLContext output in JSON format

I have retrieved data from a PostgreSQL database using Spark's SQLContext.

Here is the sample code:

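import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

// Make sure the JDBC driver class is loaded
Class.forName(dbDriver);

// JDBC connection options passed to the DataFrame reader
Map<String, String> options = new HashMap<String, String>();
options.put("url", dbUrl);
options.put("dbtable", dbTable);
options.put("driver", dbDriver);

SparkConf conf = new SparkConf().setAppName("JAVA_SPARK")
        .setMaster("local[2]").set("spark.ui.port", "7077");

JavaSparkContext jsc = new JavaSparkContext(conf);
SQLContext sqlContext = new SQLContext(jsc);

// Load the table as a DataFrame over JDBC
DataFrame dframe = sqlContext.read().format("jdbc")
        .options(options).load();

dframe.show();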


I got the following output:

+------+---+
|  name|age|
+------+---+
|abc   | 20|
|xyz   |  4|
+------+---+


I want the output in JSON format. Is there a way to convert the DataFrame to JSON, or some other way to produce JSON output?

Answer

To convert the DataFrame to JSON, use toJSON(), which produces one JSON string per row:

JavaRDD<String> jsonRDD = dframe.toJSON().toJavaRDD();
jsonRDD.foreach(data -> {
    System.out.println(data);
});

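To save it as JSON, use the call below. Note that Spark treats the path as a directory and writes one or more part files into it, each with one JSON object per line: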

dframe.write().json("c:\\temp\\myfile.json");

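To get the rows as a List, call take() or collect(). collect() returns every row to the driver, so it is only suitable for small results; take(n) returns just the first n rows. See the Spark documentation for when to use each.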

List<String> mylist = jsonRDD.collect();
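
If you need one JSON document rather than a list of per-row strings, the collected list can be joined into a JSON array. The following is a minimal sketch in plain Java, not part of the Spark API: the class name JsonLines and the method toJsonArray are hypothetical, and the sample rows are stand-ins for what jsonRDD.collect() would return. It assumes every element is already a valid JSON object string.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark.
class JsonLines {

    // Joins per-row JSON strings into a single JSON array string.
    // Assumes each element is already valid JSON (e.g. from dframe.toJSON()).
    public static String toJsonArray(List<String> jsonRows) {
        return "[" + String.join(",", jsonRows) + "]";
    }

    public static void main(String[] args) {
        // Stand-in for: List<String> mylist = jsonRDD.collect();
        List<String> mylist = Arrays.asList(
                "{\"name\":\"abc\",\"age\":20}",
                "{\"name\":\"xyz\",\"age\":4}");

        System.out.println(toJsonArray(mylist));
    }
}
```

Because this works on a collected list, it is only appropriate when the result fits in driver memory; for larger data, writing with dframe.write().json(...) is the safer route.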