Kratos - 2 months ago
Scala Question

Compilation error saving model written in Scala, Apache Spark

I am running the example source code provided by Apache Spark to create an FPGrowth model. I want to save the model for future use, so I added the final line of this code (model.save):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.mllib.util._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql._
import java.io._
import scala.collection.mutable.Set


object App {

  def main(args: Array[String]) {

    val conf = new SparkConf().setAppName("prediction").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val data = sc.textFile("FPFeatureSeries.txt")

    val transactions: RDD[Array[String]] = data.map(s => s.trim.split(' '))
    val fpg = new FPGrowth()
      .setMinSupport(0.1)
      .setNumPartitions(10)
    val model = fpg.run(transactions)

    val minConfidence = 0.8
    model.generateAssociationRules(minConfidence).collect().foreach { rule =>
      if (rule.confidence > minConfidence) {
        println(
          rule.antecedent.mkString("[", ",", "]")
            + " => " + rule.consequent.mkString("[", ",", "]")
            + ", " + rule.confidence)
      }
    }
    model.save(sc, "FPGrowthModel")
  }
}


The problem is that I get a compilation error: value save is not a member of org.apache.spark.mllib.fpm.FPGrowth

I have tried including libraries and copying the exact examples from the documentation, but I am still getting the same error.

I am using Spark 2.0.0 and Scala 2.10.
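(For context: save/load support for FPGrowthModel was only added in Spark 2.0.0, and Spark 2.0.0 artifacts are built for Scala 2.11, so the combination "Spark 2.0.0 with Scala 2.10" suggests the build may actually be compiling against an older spark-mllib where save does not exist. A build.sbt sketch with version numbers that are assumptions to be adapted to your setup:)

```scala
// build.sbt -- hypothetical build definition; align these versions with your cluster
scalaVersion := "2.11.8" // Spark 2.0.0 is published for Scala 2.11

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "2.0.0" % "provided",
  "org.apache.spark" %% "spark-mllib" % "2.0.0" % "provided"
)
```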

Answer

I had the same issue. I used this to save the model:

sc.parallelize(Seq(model), 1).saveAsObjectFile("path")

and this to load it (shown here for a LinearRegressionModel; substitute the type of your own model):

val linRegModel = sc.objectFile[LinearRegressionModel]("path").first()

This might help: what-is-the-right-way-to-save-load-models-in-spark-pyspark
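Adapted to the FPGrowth code in the question, the same workaround could look like the following sketch. FPGrowthModel is generic in its item type (String here, since the transactions are RDD[Array[String]]), and the path name is an assumption; `sc` and `model` are the values from the question's code:

```scala
import org.apache.spark.mllib.fpm.FPGrowthModel

// Save: wrap the trained model in a one-element RDD and write it
// out with Java object serialization
sc.parallelize(Seq(model), 1).saveAsObjectFile("FPGrowthModel")

// Load: read the serialized RDD back and take its single element
val sameModel = sc.objectFile[FPGrowthModel[String]]("FPGrowthModel").first()
```

This sidesteps the missing save method entirely, at the cost of a serialization format that is tied to the Scala/Spark versions used to write it.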
