satyambansal117 satyambansal117 - 9 months ago 63
Scala Question

Get struct-typed elements of a Row by name in Spark (Scala)

In a DataFrame object in Apache Spark (I'm using the Scala interface), if I'm iterating over its Row objects, is there any way to extract structure values by name?

I am using the code below to extract values by name, but I don't know how to read a struct value.

If the values were of type String, we could do this:

val id=row.getAs[Long]("id")
val values=row.getAs[String]("slotSize")
val fields=row.getAs[String](values)

But in my case, values has the schema below:

|-- v1: struct (nullable = true)
|    |-- level1: string (nullable = true)
|    |-- level2: string (nullable = true)
|    |-- level3: string (nullable = true)
|    |-- level4: string (nullable = true)
|    |-- level5: string (nullable = true)

What should I replace this line with to make the code work, given that values has the structure above?


Answer Source

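You can access the struct elements by first extracting another Row from the top-level Row (structs are modeled as nested Row objects in Spark). In the line below, "struct" stands for the name of your struct column, which would be "v1" for the schema in the question: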

val level1 = row.getAs[Row]("struct").getAs[String]("level1")
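
For a fuller picture, here is a minimal sketch of the same idea run against a small local DataFrame. It assumes a schema like the one in the question, trimmed to two nested fields, with the struct column named "v1". The object name StructFieldByName and the helpers level1Of and run are hypothetical names introduced only for this illustration. Because the struct column is nullable, the sketch checks for null before reading the nested field.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object StructFieldByName {
  // Assumed schema: a long "id" plus a nullable struct column "v1"
  // (only two of the question's five nested fields, for brevity).
  val v1Type: StructType = StructType(Seq(
    StructField("level1", StringType, nullable = true),
    StructField("level2", StringType, nullable = true)))

  val schema: StructType = StructType(Seq(
    StructField("id", LongType, nullable = false),
    StructField("v1", v1Type, nullable = true)))

  // Reads v1.level1 by name; returns None if the struct or the field is null.
  def level1Of(row: Row): Option[String] = {
    val idx = row.fieldIndex("v1")
    if (row.isNullAt(idx)) None
    else Option(row.getStruct(idx).getAs[String]("level1"))
  }

  // Builds a two-row DataFrame locally and maps each id to its v1.level1 value.
  def run(): Map[Long, Option[String]] = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("struct-by-name")
      .getOrCreate()
    try {
      val data = Seq(Row(1L, Row("a", "b")), Row(2L, null))
      val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
      df.collect().map(r => r.getAs[Long]("id") -> level1Of(r)).toMap
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit =
    run().foreach { case (id, level1) => println(s"$id -> $level1") }
}
```

If only one nested field is needed, selecting it up front with df.select("v1.level1") before iterating is another option, since Spark resolves dotted names into struct fields.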