Rãã Møó - 1 year ago
Scala Question

Split and choose in Scala

I found some explanations of how to do this, but I still can't get it to work.

I want to split the lines of this file:

val data=sc.textFile("hdfs://ncdc/isd-history.csv")

Each line has the form:
("949999","00338","PORTLAND (CASHMORE)","AS","","","-38.320","+141.480","+0081.0","19690724","19781113")

I want to split each line and keep only the 1st and the 3rd fields.

I have tried this:

val RDD = (data.filter(s => (s.split(',')(0) , s.split(',')(2))))

But it doesn't work.

Thank you.

Answer Source

RDD.filter filters records, not "columns": it expects a function from the record type (String, I assume, in this case) to Boolean, and drops every record for which that function returns false. Your function returns a tuple instead of a Boolean, which is why it does not compile.

You're trying to transform each record from a String into a tuple (while "filtering" out parts of that string), so you should use RDD.map instead of RDD.filter:

val RDD = data.map(s => (s.split(',')(0), s.split(',')(2)))

Or better yet, split each line only once:

val RDD = data.map(_.split(',')).map(arr => (arr(0), arr(2)))
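
To make the difference between filter and map concrete, here is a minimal sketch that runs without Spark. It assumes a plain Seq[String] as a stand-in for the RDD (Seq.filter and Seq.map accept the same kinds of functions), and it assumes each raw line is quoted CSV like the sample in the question. The object name SplitAndChoose, the helper unquote, and the value names are hypothetical and only there for illustration.

```scala
// Sketch only: a plain Seq stands in for the RDD (assumption: no Spark available).
// Seq.filter and Seq.map take the same kinds of functions as RDD.filter and RDD.map.
object SplitAndChoose {
  // One sample line from the question, assumed to be a raw quoted-CSV line.
  val lines: Seq[String] = Seq(
    "\"949999\",\"00338\",\"PORTLAND (CASHMORE)\",\"AS\",\"\",\"\",\"-38.320\",\"+141.480\",\"+0081.0\",\"19690724\",\"19781113\""
  )

  // Hypothetical helper: remove the surrounding double quotes from one field.
  def unquote(field: String): String =
    field.stripPrefix("\"").stripSuffix("\"")

  // filter keeps or drops whole records, so its function must return a Boolean.
  val withCountryAS: Seq[String] =
    lines.filter(line => unquote(line.split(',')(3)) == "AS")

  // map transforms each record, here from String to (String, String).
  // The line is split once and the resulting array is reused.
  val idAndName: Seq[(String, String)] =
    lines.map(_.split(',')).map(arr => (unquote(arr(0)), unquote(arr(2))))

  def main(args: Array[String]): Unit =
    idAndName.foreach(println)
}
```

Note that splitting on ',' is only safe while no quoted field contains a comma; if that can happen in the data, a real CSV parser would be needed instead.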