Split and choose in Scala

I found some explanations of how to do this, but I still can't get it to work.

I want to split the following data:

val data=sc.textFile("hdfs://ncdc/isd-history.csv")

Each line has the form:
("949999","00338","PORTLAND (CASHMORE)","AS","","","-38.320","+141.480","+0081.0","19690724","19781113")

I want to split each line and keep only the 1st and the 3rd fields.

I tried this:

val RDD = (data.filter(s => (s.split(',')(0) , s.split(',')(2))))

But it doesn't work.

Thank you.

Answer

RDD.filter filters records, not "columns": it expects a function from the record type (String, I assume, in this case) to Boolean, and it drops every record for which that function returns false. Your function returns a tuple rather than a Boolean, so it does not type-check.
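To illustrate what filter is actually for, here is a minimal sketch on a plain Scala Seq rather than an RDD, so it runs in any Scala REPL without Spark; the predicate has the same String => Boolean shape that RDD.filter expects. The second sample line and the name onlyAS are made up for illustration.

```scala
// Local stand-in for the RDD's records. The first line is a shortened
// version of the sample in the question; the second is hypothetical.
val lines = Seq(
  "\"949999\",\"00338\",\"PORTLAND (CASHMORE)\",\"AS\"",
  "\"949998\",\"00339\",\"SOMEWHERE\",\"US\""
)

// filter takes a String => Boolean predicate and keeps or drops
// whole records; it never changes the shape of a record.
val onlyAS: Seq[String] = lines.filter(s => s.split(',')(3) == "\"AS\"")
```

The result still contains full, unsplit lines, which is why filter cannot be used to pick out columns.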

You're trying to transform each record from a String into a tuple (while "filtering" out parts of that string), so you should use RDD.map instead of RDD.filter:

val RDD = data.map(s => (s.split(',')(0), s.split(',')(2)))

Or better yet:

val RDD = data.map(_.split(',')).map(arr => (arr(0), arr(2)))
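The same two-step map can be tried on a local collection before running it on the cluster. This is only a sketch: it assumes the raw lines carry the double quotes shown in the question, and unquote is a hypothetical helper that is not part of Spark or of the original answer.

```scala
// One record, copied from the sample line in the question.
val line = "\"949999\",\"00338\",\"PORTLAND (CASHMORE)\",\"AS\",\"\",\"\"," +
  "\"-38.320\",\"+141.480\",\"+0081.0\",\"19690724\",\"19781113\""

// Hypothetical helper: strip one pair of surrounding double quotes.
def unquote(field: String): String =
  field.stripPrefix("\"").stripSuffix("\"")

// Split once per record, then keep the 1st and 3rd fields as a tuple.
val pairs: Seq[(String, String)] = Seq(line)
  .map(_.split(','))
  .map(arr => (unquote(arr(0)), unquote(arr(2))))
```

Note that a plain split on ',' will break on any record whose quoted field itself contains a comma; if the file has such records, a real CSV parser is the safer choice.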