Georg Heiler - 5 days ago
Scala Question

Spark custom estimator access to Param[T]

I am building a custom estimator for Spark. Unfortunately, there seems to be something wrong in how I access the Param[T] default params for the estimator. Here is a minimal example which compares a Transformer with an Estimator. The Estimator, which has access to the same parameter

trait PreprocessingParam2s extends Params {
  final val isInList = new Param[Array[String]](this, "isInList", "list of isInList items")
}


is called like

new ExampleEstimator().setIsInList(Array("def", "ABC")).fit(dates).transform(dates).show


in order to perform

dataset
  .withColumn("isInList", when('ISO isin ($(isInList): _*), 1).otherwise(0))


But unlike the Transformer, which works fine, the Estimator fails with

java.util.NoSuchElementException: Failed to find a default value for isInList


https://gist.github.com/geoHeil/218683c6b0f91bc76f71cb652cd746b8

What is wrong here?

Answer

Try adding a getter and a setDefault call to your parameter trait:

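import org.apache.spark.ml.param.{Param, Params}

trait PreprocessingParam2s extends Params {
  final val isInList = new Param[Array[String]](this, "isInList", "list of isInList items")

  // Placeholder default (an assumption, not from the question); substitute your own value.
  setDefault(isInList, Array.empty[String])

  /** @group getParam */
  final def getIsInList: Array[String] = $(isInList)
}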
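
To see why the exception appears, it may help to look at the lookup order. As far as I understand it, $(param) first looks for an explicitly set value, then falls back to a default registered with setDefault, and throws NoSuchElementException only when neither exists. The following is a toy sketch of that order in plain Scala. It is not Spark's implementation: ToyParams, its string-keyed maps, and ToyParamsDemo are hypothetical names made up for illustration.

```scala
import scala.collection.mutable
import scala.util.Try

// Toy stand-in for a parameter holder; NOT Spark's Params trait.
class ToyParams {
  private val explicitValues = mutable.Map.empty[String, Any]
  private val defaultValues  = mutable.Map.empty[String, Any]

  // Store a value set by the caller.
  def set(name: String, value: Any): this.type = { explicitValues(name) = value; this }

  // Register a fallback used when no value was set explicitly.
  def setDefault(name: String, value: Any): this.type = { defaultValues(name) = value; this }

  // Explicit value first, then the default, otherwise fail.
  def getOrDefault[T](name: String): T =
    explicitValues.get(name)
      .orElse(defaultValues.get(name))
      .getOrElse(throw new NoSuchElementException(s"Failed to find a default value for $name"))
      .asInstanceOf[T]
}

object ToyParamsDemo {
  def main(args: Array[String]): Unit = {
    // Nothing set and no default registered: the lookup throws.
    val noDefault = new ToyParams
    println(Try(noDefault.getOrDefault[Seq[String]]("isInList")).isFailure)

    // With a default registered, the lookup succeeds before anything is set.
    val withDefault = new ToyParams().setDefault("isInList", Seq.empty[String])
    println(withDefault.getOrDefault[Seq[String]]("isInList"))

    // An explicitly set value takes precedence over the default.
    withDefault.set("isInList", Seq("def", "ABC"))
    println(withDefault.getOrDefault[Seq[String]]("isInList"))
  }
}
```

Because the setDefault call sits in the shared trait, every class that mixes the trait in starts out with a fallback value, so the lookup has something to return even on an instance where the setter was never called.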