Bharath K - 1 year ago
Scala Question

DecimalType issue while creating Dataframe

When I try to create a DataFrame using a decimal type, it throws the error below.

I am performing the following steps:

import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import org.apache.spark.sql.types.StringType;
import org.apache.spark.sql.types.DataTypes;

//created a DecimalType
val DecimalType = DataTypes.createDecimalType(15,10)

//Created a schema

val sch = StructType(StructField("COL1",StringType,true)::StructField("COL2",DecimalType,true)::Nil)

val src = sc.textFile("test_file.txt")
val row = src.map(x => x.split(",")).map(x => Row.fromSeq(x))
val df1= sqlContext.createDataFrame(row,sch)

df1 is created without any errors. But when I issue the df1.collect() action, it gives me the below error:

scala.MatchError: 0 (of class java.lang.String)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toCatalystImpl(CatalystTypeConverters.scala:326)

test_file.txt content:


Is there any issue with the way that I am creating DecimalType?

Answer Source

You should have an instance of BigDecimal to convert to DecimalType.
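That is where the scala.MatchError comes from: Catalyst's DecimalConverter pattern-matches on the runtime type of each value, and a plain String read from the text file matches none of its cases. Roughly like this (a simplified sketch for illustration, not Spark's actual source):

```scala
// Simplified, hypothetical sketch of Catalyst's DecimalConverter:
// it matches on the runtime type of the incoming value.
def toCatalystDecimal(v: Any): BigDecimal = v match {
  case d: scala.math.BigDecimal => d
  case d: java.math.BigDecimal  => scala.math.BigDecimal(d)
  // There is no case for String, so a raw "0" from the text file
  // fails with: scala.MatchError: 0 (of class java.lang.String)
}

toCatalystDecimal(BigDecimal("0.5")) // converts fine
// toCatalystDecimal("0")            // throws scala.MatchError
```

So the fix is to convert the split string into a BigDecimal before building the Row: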

val DecimalType = DataTypes.createDecimalType(15, 10)
val sch = StructType(StructField("COL1", StringType, true) :: StructField("COL2", DecimalType, true) :: Nil)

val src = sc.textFile("test_file.txt")
val row = src.map(x => x.split(",")).map(x => Row(x(0), BigDecimal.decimal(x(1).toDouble)))

val df1 = spark.createDataFrame(row, sch)
df1.collect().foreach { println }

The resulting schema looks like this:

 |-- COL1: string (nullable = true)
 |-- COL2: decimal(15,10) (nullable = true)
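One caveat: BigDecimal.decimal(x(1).toDouble) detours through Double, which only carries roughly 15-17 significant decimal digits. That is enough for the decimal(15,10) column here, but BigDecimal(x(1)) parses the string exactly at any precision. A small sketch of the difference (the 20-digit literal is a hypothetical value, wider than the column in the post, just to make the rounding visible):

```scala
// The Double detour rounds once the value exceeds Double's ~15-17
// significant decimal digits; BigDecimal(s) parses the text exactly.
val s = "0.12345678901234567890" // 20 significant digits (hypothetical)

val viaDouble = BigDecimal.decimal(s.toDouble) // rounded to Double precision
val direct    = BigDecimal(s)                  // exact

// viaDouble != direct; direct preserves all 20 digits
```

If the input might ever outgrow Double precision, prefer BigDecimal(x(1)) in the map above.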