Shankar Shankar - 4 years ago 226
Scala Question

Spark Streaming direct approach without a checkpoint location

When we use the Spark Streaming direct approach without specifying a checkpoint location, where are the offsets stored, and how?

Is there really any difference between using a checkpoint location and not specifying one?

Will there be any data loss if I don't specify a checkpoint location?

Answer Source

If you don't checkpoint, you won't be able to recover if your driver crashes. In addition, because there is no checkpoint, the Kafka offsets won't be checkpointed either, so you'll need to store them yourself.
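As a rough illustration of storing offsets yourself, here is a minimal sketch. It assumes the `spark-streaming-kafka-0-10` integration (older Spark versions with the Kafka 0.8 direct API use a different `createDirectStream` signature). The broker address, topic name, and group id are placeholders, not values from the question. The sketch reads each batch's offset ranges and commits them back to Kafka once the batch has been processed.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges, KafkaUtils}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object ManualOffsets {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("manual-offsets").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder connection settings (assumptions for this sketch).
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "example-group",
      "auto.offset.reset" -> "earliest",
      // Disable auto commit so offsets only advance when we say so.
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](Seq("example-topic"), kafkaParams)
    )

    stream.foreachRDD { rdd =>
      // Offset ranges are only available on the RDD produced directly by the stream.
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

      // Stand-in for real processing of the batch.
      println(s"records in batch: ${rdd.map(_.value).count()}")

      offsetRanges.foreach { o =>
        println(s"${o.topic} ${o.partition} ${o.fromOffset} -> ${o.untilOffset}")
      }

      // Commit to Kafka only after the batch's output has completed.
      stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Committing to Kafka this way is not transactional with your output, so on its own it gives at-least-once delivery. For exactly-once results you would typically write the offsets to your own store in the same transaction as the results, or make the output idempotent.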

Is there really any difference between using a checkpoint location and not specifying one?

That sentence doesn't make much sense. If you don't provide a checkpoint directory, there will be no checkpoint; if you do, there will be. To achieve exactly-once semantics (if required), you'll need to store offsets manually.
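For comparison, this is roughly what providing a checkpoint directory looks like. It is a sketch only: the directory path is hypothetical, and a socket text stream stands in for the Kafka direct stream to keep the example short. The key point is that the whole stream definition lives inside the function passed to `StreamingContext.getOrCreate`, so a restarted driver can rebuild the context from the checkpoint instead of starting from scratch.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointedApp {
  // Hypothetical path; in production this should be on fault-tolerant storage such as HDFS.
  val checkpointDir = "/tmp/spark-checkpoint"

  // Called only when no usable checkpoint exists in checkpointDir.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("checkpointed-app").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)

    // Placeholder source (assumption); a Kafka direct stream would be created here instead.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()

    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover the context from the checkpoint if present, otherwise build a new one.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that a checkpoint ties recovery to the serialized application code, so after changing and redeploying the application the old checkpoint generally cannot be reused.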
