Scala Question

Why do I get "Unable to find encoder for type stored in a Dataset" when creating a Dataset of a custom case class?

Spark 2.0 (final) with Scala 2.11.8. The following very simple code yields the compilation error:

Error:(17, 45) Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.


import org.apache.spark.sql.SparkSession

case class SimpleTuple(id: Int, desc: String)

object DatasetTest {
  val dataList = List(
    SimpleTuple(5, "abc"),
    SimpleTuple(6, "bcd")
  )

  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder
      .master("local")
      .appName("example")
      .getOrCreate()

    val dataset = sparkSession.createDataset(dataList)
  }
}

Answer

You have to import the required encoders from sparkSession.implicits to make it work:

val sparkSession: SparkSession = ???
import sparkSession.implicits._
val dataset = sparkSession.createDataset(dataList)
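
For reference, here is a minimal sketch of the corrected program, assuming the same SimpleTuple case class and local master as in the question (dataset.show() is added only to demonstrate the result):

import org.apache.spark.sql.{Dataset, SparkSession}

// Keeping the case class at the top level (outside the object) is the safest
// place for the implicit Product encoder to be derived.
case class SimpleTuple(id: Int, desc: String)

object DatasetTest {
  val dataList = List(
    SimpleTuple(5, "abc"),
    SimpleTuple(6, "bcd")
  )

  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder
      .master("local")
      .appName("example")
      .getOrCreate()

    // Bring the implicit encoders for primitive and Product types into scope.
    import sparkSession.implicits._

    val dataset: Dataset[SimpleTuple] = sparkSession.createDataset(dataList)
    dataset.show()

    sparkSession.stop()
  }
}

Note that the import is on the concrete SparkSession instance, which is why it has to appear after the session is created rather than at the top of the file.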