We have a Spark Streaming application that consumes the Gnip compliance stream.
In the old version of the API, the compliance stream was provided by a single endpoint, but now it is split across 8 different endpoints.
We could run the same Spark application 8 times with different parameters, one per endpoint.
Is there a way in Spark Streaming to consume all 8 endpoints and merge them into one stream within the same application?
Should we use a separate streaming context for each connection, or is one context enough?
I think you are looking for Spark's union operation here.
For examples, see the related question "Concatenating datasets of different RDDs in Apache spark using scala".
The Spark documentation describes union as follows:
Return a new dataset that contains the union of the elements in the source dataset and the argument.
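
To answer the second part of the question: one streaming context should be enough, and in fact only one StreamingContext can be active in a JVM at a time. Below is a minimal sketch of how the merge might look, not a tested implementation. ComplianceReceiver is a hypothetical custom receiver that I made up for illustration, the endpoint URLs are assumed to be passed as program arguments, the 10-second batch interval is arbitrary, and authentication against Gnip is left out entirely. It creates one receiver stream per endpoint and merges them with StreamingContext.union.

```scala
import java.io.{BufferedReader, InputStreamReader}
import java.net.URL
import java.nio.charset.StandardCharsets

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.receiver.Receiver

// Hypothetical receiver (not part of Spark or Gnip): reads one endpoint line by line.
// Authentication and reconnect back-off are omitted in this sketch.
class ComplianceReceiver(url: String)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  override def onStart(): Unit = {
    // Read on a separate thread so that onStart() returns immediately.
    new Thread(s"compliance-receiver-$url") {
      override def run(): Unit = receive()
    }.start()
  }

  // Nothing to clean up here: the reading thread checks isStopped() itself.
  override def onStop(): Unit = ()

  private def receive(): Unit = {
    try {
      val in = new URL(url).openConnection().getInputStream
      val reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
      var line = reader.readLine()
      while (!isStopped() && line != null) {
        store(line) // hand the record over to Spark
        line = reader.readLine()
      }
      reader.close()
      restart("Stream ended, trying to reconnect")
    } catch {
      case e: Exception => restart(s"Error reading from $url", e)
    }
  }
}

object ComplianceUnion {
  def main(args: Array[String]): Unit = {
    // Assumption: the 8 endpoint URLs are passed on the command line.
    val endpoints: Seq[String] = args.toSeq

    val conf = new SparkConf().setAppName("compliance-union")
    val ssc = new StreamingContext(conf, Seconds(10)) // arbitrary batch interval

    // One input DStream (and therefore one receiver) per endpoint.
    val streams: Seq[DStream[String]] =
      endpoints.map(url => ssc.receiverStream(new ComplianceReceiver(url)))

    // Merge all endpoint streams into a single DStream.
    val merged: DStream[String] = ssc.union(streams)

    merged.count().print() // placeholder for the real processing

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Keep in mind that each receiver occupies one core, so with 8 receivers the application needs more than 8 cores, otherwise there is nothing left to process the received data.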