a.moussa - 7 days ago
Scala Question

How to properly map an Array of names and types to an Array of StructField

The simple problem I have is this:

val t = StructField("attName", DecimalType,true)

type mismatch, expected: DataType, actual: DecimalType.type


I want to create a case class that can be used to automatically generate an array of StructField. So I started by trying this:

case class MyClass1(attributeName: String, attributeType: DataType)


In order to create my test1:

val test1 = Array(
  MyClass1("att1", StringType),
  MyClass1("att2", IntegerType),
  MyClass1("att3", DecimalType)
)


This is wrong because the third line, with DecimalType, throws an error (DecimalType is not a DataType). So I tried a second case class like this:

case class MyClass2[T](attributeName: String, attributeType: T)

val test2 = Array(
  MyClass2("att1", StringType),
  MyClass2("att2", IntegerType),
  MyClass2("att3", DecimalType)
)


Now this is OK, but the lines below do not work because DecimalType is not a DataType.

val myStruct = test2.map(x =>
  StructField(x.attributeName, x.attributeType, true)
)


So here is my question: how do I create a StructField with DecimalType, and do you think my case class is a good approach? Thanks.

Answer

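Unlike StringType or IntegerType, DecimalType is not a singleton - it has to be initialized with a given precision and scale. A bare DecimalType refers to the companion object (hence DecimalType.type in the error message), which is not a DataType: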

import org.apache.spark.sql.types.{DataType, DecimalType}

DecimalType(38, 10): DataType
// org.apache.spark.sql.types.DataType = DecimalType(38,10)
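
Applied to the case class from the question, the fix could look roughly like the sketch below. The precision and scale (10, 2) are assumed values chosen only for illustration; pick whatever matches your data. It is written to be pasted into spark-shell or a Scala REPL with Spark on the classpath.

```scala
import org.apache.spark.sql.types._

// Same shape as MyClass1 in the question: the type parameter is not needed
// once DecimalType is instantiated, because the instance is a DataType.
case class MyClass1(attributeName: String, attributeType: DataType)

val test1 = Array(
  MyClass1("att1", StringType),
  MyClass1("att2", IntegerType),
  MyClass1("att3", DecimalType(10, 2)) // assumed precision = 10, scale = 2
)

// Map each (name, type) pair to a nullable StructField.
val myStruct: Array[StructField] = test1.map(x =>
  StructField(x.attributeName, x.attributeType, nullable = true)
)
```

With MyClass1 typed on DataType, the generic MyClass2[T] is no longer necessary, and the map to StructField compiles because every attributeType is statically a DataType.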

That being said, StructField is already a case class and provides default arguments for nullable (true) and metadata, so another representation seems superfluous.
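
As a minimal sketch of that suggestion, the fields can be declared directly and wrapped in a StructType, relying on the defaults for nullable and metadata. The attribute names are taken from the question, and the decimal precision and scale are the same illustrative values used above.

```scala
import org.apache.spark.sql.types._

// nullable defaults to true and metadata to Metadata.empty,
// so only the name and the data type need to be given.
val fields = Array(
  StructField("att1", StringType),
  StructField("att2", IntegerType),
  StructField("att3", DecimalType(38, 10))
)

// Wrap the fields in a schema that can be passed to Spark SQL APIs.
val schema = StructType(fields)
```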