Martin Tang - 4 months ago
Python Question

Converting Python to Scala in Spark ML?

The question is about "Logistic regression with spark ml (data frames)".

I want to convert the following Python code to Scala.

Python:

[stage.coefficients for stage in model.stages
if isinstance(stage, LogisticRegressionModel)]


Scala (my attempt):

for (stage <- model.stages) {
  if (stage.isInstanceOf[LogisticRegressionModel]) {
    val a = Array(stage.coefficients)
  }
}


I have already checked stage.isInstanceOf[LogisticRegressionModel], which returned true. However, stage.coefficients fails with the error message "value coefficients is not a member of org.apache.spark.ml.Transformer".

If I evaluate just stage, it returns

org.apache.spark.ml.Transformer= logreg 382456482


Why is the type different when isInstanceOf returns true? What should I do? Thanks.

Answer

Why is the type different when isInstanceOf returns true?

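Scala is a statically typed language, and stages is an Array[Transformer], so each element you access has the static type Transformer. isInstanceOf only checks the runtime class of the object; it does not change the static type of stage. Transformers in general have no coefficients, hence the compile error.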
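To make the static-versus-runtime distinction concrete, here is a minimal sketch in plain Scala. Stage, TokenizerStage and LogRegStage are hypothetical stand-ins for Transformer and LogisticRegressionModel, not Spark classes, so the snippet runs without Spark on the classpath.

```scala
// Hypothetical stand-ins (assumptions for illustration, not Spark classes).
trait Stage
class TokenizerStage extends Stage
class LogRegStage(val coefficients: Seq[Double]) extends Stage

object StaticTypeDemo {
  val stages: Array[Stage] =
    Array(new TokenizerStage, new LogRegStage(Seq(0.5, -1.2)))

  // Static type: Stage. Runtime class: LogRegStage.
  val stage: Stage = stages(1)

  // The runtime check succeeds...
  val isLogReg: Boolean = stage.isInstanceOf[LogRegStage]

  // ...but the compiler still sees a Stage, so this would not compile:
  // stage.coefficients  // value coefficients is not a member of Stage

  // A type pattern tests the runtime class and binds `lr`
  // with the narrower static type in one step.
  val coefficients: Option[Seq[Double]] = stage match {
    case lr: LogRegStage => Some(lr.coefficients)
    case _               => None
  }
}
```

The match plays the role of the isinstance check in the Python version, with the difference that the narrowed type is only visible through the name bound in the case clause.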

What should I do?

Be specific about the types: pattern match on LogisticRegressionModel, so that the matched value has that static type.

import org.apache.spark.ml.classification.LogisticRegressionModel

model.stages.collect { 
  case lr: LogisticRegressionModel => lr.coefficients
}.headOption
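Note that the Python comprehension returns a list of every match, while .headOption keeps only the first match wrapped in an Option. The sketch below illustrates that difference on a plain Array[Any]; the values and the name CollectDemo are made up for the example and have nothing to do with Spark.

```scala
object CollectDemo {
  // Mixed-type array, standing in for a pipeline with several kinds of stages.
  val items: Array[Any] = Array("tokenizer", 42, 3.14, 7)

  // collect applies the partial function to every element where it is defined
  // and drops the rest: the equivalent of the filtered list comprehension.
  val allInts: Array[Int] = items.collect { case i: Int => i } // Array(42, 7)

  // headOption returns the first match, if any.
  val firstInt: Option[Int] = allInts.headOption // Some(42)

  // With no matching element the result is None rather than an exception.
  val firstBool: Option[Boolean] =
    items.collect { case b: Boolean => b }.headOption // None
}
```

If the pipeline can contain more than one logistic regression stage and you want all coefficient vectors, as in the Python code, drop .headOption and keep the array returned by collect.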