animal animal - 3 months ago 27
Scala Question

Case class with empty parameter

I am new to Scala and I am trying to extract a few columns from a file. Below is my code:

case class Extract(e1:String,e2:String,e3:String){
  override def toString = e1+","+e2+","+e3
}
object ScalaSpark {
  def main(args: Array[String])
  {
    val textfile = sc.textFile("/user/cloudera/xxxx/File2")
    val word = textfile.filter(x => x.length > 0).map(_.split("\\|"))
    val pid = word.filter(_.contains("SSS"))
    val pidkeys = pid.map(tuple => Extract(tuple(0),tuple(3),tuple(7)))
    val obx = word.filter(_.contains("HHH"))
    val obxkeys = obx.map(tuple => Extract(tuple(0),tuple(5)))
    val rddall = pidkeys.unionAll(obxkeys)
    rddall.saveAsTextFile("/user/xxxx/xxxx/rddsum1")
  }
}


With this code I am trying to extract 3 values from rows containing SSS and 2 values from rows containing HHH, but when I execute it I get the error below:

error: not enough arguments for method apply: (e1: String, e2: String, e3: String)Extract in object Extract.


I then tried using Opt[String] = None, but that didn't work either. I don't know how to solve this problem; please help.

EDIT:

I used Option[String]; my code is below:

case class Extract(e1:String,e2:String,e3:Option[String]){
  override def toString = e1+","+e2+","+e3
}
object ScalaSpark {
  def main(args: Array[String])
  {
    val textfile = sc.textFile("/user/cloudera/xxxx/File2")
    val word = textfile.filter(x => x.length > 0).map(_.split('|'))
    val pid = word.filter(_.contains("SSS"))
    val pidkeys = pid.map(tuple => Extract(tuple(0),tuple(5),tuple(8)))
    val obx = word.filter(_.contains("HHH"))
    val obxkeys = obx.map(tuple => Extract(tuple(0),tuple(5), None))
    val rddall = pidkeys.union(obxkeys)
    rddall.coalesce(1).saveAsTextFile("/user/xxx/xxx/rddsum1")
  }
}


but I am getting the error below:

error: type mismatch;
found : String
required: Option[String]
val pidkeys = pid.map(tuple => Header(tuple(0),tuple(5),tuple(8)))
^
<console>:38: error: type mismatch;
found : org.apache.spark.rdd.RDD[Extract]
required: org.apache.spark.rdd.RDD[Nothing]
Note: Header >: Nothing, but class RDD is invariant in type T.
You may wish to define T as -T instead. (SLS 4.5)
val rddall = pidkeys.union(obxkeys)

Answer

As far as I understand, your Extract case class has 3 parameters, and the last of them is optional.

If so, you should declare it this way:

case class Extract(s1: String, s2: String, s3: Option[String])

and construct it either as Extract("some string", "other string", Some("optional string")) or as Extract("some string", "other string", None).
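
Applied to the code in the question, this means wrapping the third value in Some(...) for the SSS rows, since tuple(8) is a plain String and the constructor now expects an Option[String]. The second error (RDD[Nothing]) is probably just a consequence of the first line failing to type-check. Below is a minimal sketch of the idea, not the asker's actual job: plain Scala collections stand in for the Spark RDDs, the sample rows and the ExtractDemo object are made up for illustration, and the toString override prints an empty field for None, which is an assumption about the desired output.

```scala
// Sketch only: plain Scala collections stand in for the Spark RDDs in the question.
case class Extract(e1: String, e2: String, e3: Option[String]) {
  // Render a missing third field as an empty string instead of "None"
  // (an assumption about the desired output format).
  override def toString: String = Seq(e1, e2, e3.getOrElse("")).mkString(",")
}

object ExtractDemo {
  // Hypothetical sample rows, not taken from the real input file.
  val lines: Seq[String] = Seq(
    "SSS|a1|a2|a3|a4|a5|a6|a7|a8",
    "HHH|b1|b2|b3|b4|b5"
  )

  // Drop empty lines and split each remaining line on the pipe character.
  val rows: Seq[Array[String]] = lines.filter(_.nonEmpty).map(_.split('|'))

  // SSS rows carry a third value, so wrap it in Some(...).
  val pidKeys: Seq[Extract] =
    rows.filter(_.contains("SSS")).map(t => Extract(t(0), t(5), Some(t(8))))

  // HHH rows have no third value, so pass None.
  val obxKeys: Seq[Extract] =
    rows.filter(_.contains("HHH")).map(t => Extract(t(0), t(5), None))

  // Both sequences now have the same element type, so they can be combined.
  val all: Seq[Extract] = pidKeys ++ obxKeys

  def main(args: Array[String]): Unit = all.foreach(println)
}
```

With Spark, the same two map calls should carry over unchanged to the pid and obx RDDs; once both have type RDD[Extract], union can combine them.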
