sandun Tharaka - 4 years ago
Scala Question

How to combine a whole RDD into one element?

I have an RDD like this:

A,1335952933,1
A,1335953754,0
A,1335994294,1
A,1335995228,0
B,1336001513,1
B,1336002622,0
B,1336006905,1
B,1336007462,0


rdd.first
A,1335952933,1


When I call rdd.first it returns A,1335952933,1, but I want to get the whole RDD as one element, separated by commas, like this:

rdd.first
A,1335952933,1,A,1335953754,0,A,1335994294,1,A,1335995228,0,B,1336001513,1,B,1336002622,0,B,1336006905,1,B,1336007462,0


I can do it using collect and mkString in Scala, but I have heard that collect is not a good solution for large data sets. Is there another way to do this using RDD operations?

Answer Source

but I want to get whole rdd as a one element

collect is discouraged for exactly this reason: it transfers the entire contents of the RDD to the driver application, which runs on a single machine. For a large dataset that is not feasible, because the driver would run out of memory. Turning the whole RDD into one element has the same problem, since that single element must fit in one machine's memory. So if you really want this, take the collect and mkString route, and avoid using it on large RDDs.
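
A minimal sketch of the collect and mkString route, assuming an RDD[String] with one row per element as in the question. The object name JoinRddExample, the helper joinRows, the app name, and the local master setting are all illustrative assumptions, not part of the original post. The last step, wrapping the joined string back into a one-element RDD so that first returns it, is optional and only shown because the question asks for that shape.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object JoinRddExample {
  // Hypothetical helper: pulls every row to the driver and joins them there.
  // Only safe when the whole RDD fits in driver memory.
  def joinRows(rdd: RDD[String]): String =
    rdd.collect().mkString(",")

  def main(args: Array[String]): Unit = {
    // App name and local master are assumptions for a quick local run.
    val conf = new SparkConf().setAppName("join-rdd-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Sample rows taken from the question.
      val rdd = sc.parallelize(Seq(
        "A,1335952933,1", "A,1335953754,0", "A,1335994294,1", "A,1335995228,0",
        "B,1336001513,1", "B,1336002622,0", "B,1336006905,1", "B,1336007462,0"
      ))

      val joined = joinRows(rdd)
      println(joined)

      // Optional: wrap the string in a one-element RDD so that first returns it.
      val single: RDD[String] = sc.parallelize(Seq(joined), numSlices = 1)
      println(single.first())
    } finally {
      sc.stop()
    }
  }
}
```

The joined string lives entirely on the driver, so this stays practical only for small RDDs, as the answer says.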
