nilesh1212 - 8 months ago
JSON Question

Spark Streaming Sliding Window max and min

I am a beginner with Spark. I am working on a Spark Streaming use case where I receive JSON messages. Each message has an attribute 'value', which is a double; after parsing the JSON I get an Array[Double]. I want to find max(value) and min(value) over the last 15 sec, with the window sliding every 2 sec.
Here is my code.

val record = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicMap, StorageLevel.MEMORY_ONLY_SER_2)

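// record is a DStream of (key, message) pairs; parse each message body
val valueDtsream: DStream[Array[Double]] = record
  .map { case (_, jsonRecord) => parseJson(jsonRecord) }
  .window(Seconds(15), Seconds(2)) // last 15 sec, sliding every 2 sec

valueDtsream.foreachRDD { rdd =>
  if (!rdd.partitions.isEmpty) {
    // code to find min and max
  }
}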


Answer


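valueDtsream.transform( rdd => {
  // flatten every Array[Double] in the RDD, then compute summary statistics in one pass
  val stats = rdd.flatMap(x => x).stats
  // emit a single (min, max) pair per RDD
  rdd.sparkContext.parallelize(Seq((stats.min, stats.max)))
})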
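
A possible alternative, offered only as a sketch: reduce each parsed array to a (min, max) pair and merge the pairs with an associative function, which is the shape that `reduceByWindow` expects. The names `MinMax`, `Bounds`, `ofArray`, `merge`, `ofWindow` and `parsed` are made up for illustration and are not from the question. The Spark wiring is shown as a comment so the helper itself runs in plain Scala.

```scala
// Sketch only: MinMax and its members are illustrative names, not part of the original post.
object MinMax {
  type Bounds = (Double, Double) // (min, max)

  // Summarise one parsed message; None for an empty array, which has no min or max.
  def ofArray(values: Array[Double]): Option[Bounds] =
    if (values.isEmpty) None else Some((values.min, values.max))

  // Associative and commutative merge of two summaries.
  def merge(a: Bounds, b: Bounds): Bounds =
    (math.min(a._1, b._1), math.max(a._2, b._2))

  // Local stand-in for one window: fold the summaries of every message in it.
  def ofWindow(messages: Seq[Array[Double]]): Option[Bounds] =
    messages.flatMap(ofArray(_)).reduceOption(merge)
}

// Assumed wiring, where `parsed` is the DStream[Array[Double]] before any window is applied:
//   parsed.flatMap(a => MinMax.ofArray(a).toList)
//         .reduceByWindow(MinMax.merge, Seconds(15), Seconds(2))
```

With this shape each batch is summarised once and only the small pairs are combined across the window. Note that both the window length and the slide interval have to be multiples of the batch interval of the StreamingContext.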