Shivansh Srivastava - 1 year ago
Scala Question

How to read records from Kafka topic from beginning in Spark Streaming?

I am trying to read records from a Kafka topic using Spark Streaming.

This is my code:

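import java.util.UUID

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaConsumer {

  // ApplicationContext is expected to provide streamingContext
  import ApplicationContext._

  def main(args: Array[String]): Unit = {

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      // A fresh consumer group on every run, so no committed offsets exist
      "group.id" -> s"${UUID.randomUUID().toString}",
      "auto.offset.reset" -> "earliest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val topics = Array("pressure")
    val stream = KafkaUtils.createDirectStream[String, String](
      streamingContext,
      PreferConsistent,
      Subscribe[String, String](topics, kafkaParams)
    )
    stream.print()
    stream.map(record => (record.key, record.value)).count().print()
    streamingContext.start()
  }
}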

It displays nothing when I run this.

To check whether data is actually present in the topic, I used the command line approach, and it does display records:

bin/ \
--bootstrap-server localhost:9092 \
--topic pressure \


TimeStamp:07/13/16 15:20:45:226769,{'Pressure':'834'}
TimeStamp:07/13/16 15:20:45:266287,{'Pressure':'855'}
TimeStamp:07/13/16 15:20:45:305694,{'Pressure':'837'}

What's wrong?

Answer Source

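You're missing streamingContext.awaitTermination(). Without it, main returns as soon as the stream has been set up, so the application can exit before any batch is processed and nothing gets printed.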
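A minimal sketch of the fix, assuming the question's ApplicationContext (not shown there) supplies a StreamingContext named streamingContext. Here the context is built inline with a local master and a 5-second batch interval, both of which are assumptions; the names KafkaConsumerFixed and consumerParams are hypothetical and chosen only for this example.

```scala
import java.util.UUID

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaConsumerFixed {

  // Consumer settings as in the question. A group.id with no committed
  // offsets plus auto.offset.reset=earliest starts reading at the beginning.
  def consumerParams(groupId: String): Map[String, Object] = Map[String, Object](
    "bootstrap.servers" -> "localhost:9092",
    "key.deserializer" -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id" -> groupId,
    "auto.offset.reset" -> "earliest",
    "enable.auto.commit" -> (false: java.lang.Boolean)
  )

  def main(args: Array[String]): Unit = {
    // Assumption: stands in for whatever ApplicationContext creates.
    val conf = new SparkConf().setMaster("local[2]").setAppName("KafkaConsumerFixed")
    val streamingContext = new StreamingContext(conf, Seconds(5))

    val stream = KafkaUtils.createDirectStream[String, String](
      streamingContext,
      PreferConsistent,
      Subscribe[String, String](Array("pressure"), consumerParams(UUID.randomUUID().toString))
    )

    // Map each record to a plain (key, value) tuple before printing.
    stream.map(record => (record.key, record.value)).print()

    streamingContext.start()            // begin scheduling batches
    streamingContext.awaitTermination() // block main until the context stops
  }
}
```

The two closing calls are the relevant part: start() kicks off batch scheduling, and awaitTermination() keeps the main thread alive so the batches actually run.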
