Scala Question

Trouble with DecimalType converting an array of attribute name and DataType to an array of StructField in Spark

I would like to map an array of (attribute name, DataType) pairs to automatically create StructFields, but I have a problem with DecimalType. For example, if I test

val myType1 = StringType
val testString = myType1.asInstanceOf[DataType]


I have no problem. But with the lines below

val myType2 = DecimalType
val testDecimal = myType2.asInstanceOf[DataType]


I get this exception:

Exception in thread "main" java.lang.ClassCastException: org.apache.spark.sql.types.DecimalType$ cannot be cast to org.apache.spark.sql.types.DataType


I don't understand this, because from the documentation I thought that DecimalType inherited from DataType:

https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/types/DecimalType.html

So I'm looking for a common parent type of all the types in spark.sql.types.

My goal is to map something like this:

Array(("name",StringType),("size", LongType),("att3",DecimalType),("age",IntegerType))


to an array of StructField.

Does anyone have any idea?

Answer

When you write just DecimalType, you get a reference to the DecimalType companion object, not an instance of the type. The companion object (the DecimalType$ in the error message) does not extend DataType, which is why the cast fails.

val a = DecimalType
a: org.apache.spark.sql.types.DecimalType.type = org.apache.spark.sql.types.DecimalType$@156bb545

Instead, use:

val a = DecimalType(10,0)
a: org.apache.spark.sql.types.DecimalType = DecimalType(10,0)

Alternatively, you can cast an instance to DataType:

myType2(10,0).asInstanceOf[DataType]
org.apache.spark.sql.types.DataType = DecimalType(10,0)

//or if you want max precision and scale (DecimalType.Unlimited is deprecated since Spark 1.5; DecimalType.SYSTEM_DEFAULT yields the same DecimalType(38,18))

myType2.Unlimited.asInstanceOf[DataType]
org.apache.spark.sql.types.DataType = DecimalType(38,18)
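
To complete the picture, here is a minimal sketch of the mapping the question asks for, assuming Spark 2.x with spark-sql on the classpath (the column names are the ones from the question). Since DecimalType is instantiated with an explicit precision and scale, every element of the array is already a DataType, so no asInstanceOf cast is needed:

import org.apache.spark.sql.types._

// (attribute name, DataType) pairs; note DecimalType is instantiated,
// not used as a bare companion object
val columns: Array[(String, DataType)] = Array(
  ("name", StringType),
  ("size", LongType),
  ("att3", DecimalType(10, 0)),
  ("age", IntegerType)
)

// map each pair to a StructField, then wrap the result in a StructType,
// the usual form in which Spark expects a schema
val fields: Array[StructField] =
  columns.map { case (name, dataType) => StructField(name, dataType, nullable = true) }

val schema: StructType = StructType(fields)

The explicit Array[(String, DataType)] annotation makes the compiler check every element against DataType, the common parent type the question was looking for.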