Guforu - 2 months ago
Scala Question

Implementation of ALS in Spark

I am now working with

as implemented in Spark. In the directory
there are two different packages, ml and mllib. Both packages have the subfolder
and, in that folder, the class
(mllib additionally has MatrixFactorizationModel.scala).

My question is: what is the difference between the two?
For example, I found a tutorial online on using ALS in Apache Spark, and it uses the mllib package. When can I use the ml package? Why do we need two different packages, ml and mllib?


Spark MLlib is currently being reworked. The old classes, which work on RDDs, are in the mllib package; the new ones are in ml. The new classes are based on DataFrames and could be faster thanks to Tungsten optimisation.

In general, you should use the ml package whenever possible, because the mllib package will be deprecated and removed in the future.
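To make the difference concrete, here is a rough sketch that fits ALS on the same tiny data set through both packages. The ratings, the object and method names, the local master setting, and the parameter values (rank 2, 5 iterations, regularisation 0.1) are all made up for illustration; only the Spark classes and calls themselves come from the library. It assumes a Spark 2.x or later build with spark-mllib on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.ml.recommendation.{ALS => MlALS}
import org.apache.spark.mllib.recommendation.{ALS => MllibALS, Rating}

object AlsPackagesSketch {
  // Made-up (user, item, rating) triples, shared by both variants.
  val triples: Seq[(Int, Int, Float)] =
    Seq((0, 0, 5.0f), (0, 1, 1.0f), (1, 0, 4.0f), (1, 1, 1.0f), (2, 1, 5.0f))

  // Local session for illustration only; master and app name are assumptions.
  def localSession(): SparkSession =
    SparkSession.builder().master("local[*]").appName("als-packages-sketch").getOrCreate()

  // New API (ml): DataFrame in, Estimator.fit gives a model, model.transform gives a DataFrame.
  def predictWithMl(spark: SparkSession): DataFrame = {
    import spark.implicits._
    // "user", "item" and "rating" are the default column names of ml.recommendation.ALS.
    val ratings = triples.toDF("user", "item", "rating")
    val model = new MlALS().setRank(2).setMaxIter(5).setRegParam(0.1).fit(ratings)
    model.transform(ratings) // adds a "prediction" column
  }

  // Old API (mllib): RDD[Rating] in, MatrixFactorizationModel out.
  def predictWithMllib(spark: SparkSession): Double = {
    val ratings = spark.sparkContext.parallelize(
      triples.map { case (u, i, r) => Rating(u, i, r.toDouble) })
    val model = MllibALS.train(ratings, 2, 5, 0.1) // rank, iterations, lambda
    model.predict(0, 0) // predicted rating of user 0 for product 0
  }

  def main(args: Array[String]): Unit = {
    val spark = localSession()
    predictWithMl(spark).show()
    println(predictWithMllib(spark))
    spark.stop()
  }
}
```

The ml variant plugs into Pipelines and the other DataFrame-based tools, while the mllib variant returns the MatrixFactorizationModel mentioned in the question.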

Edit: I don't have a link to a full tutorial, but here is the ALS code I use:

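import org.apache.spark.ml.recommendation.ALS

val als = new ALS()

// trainingDataFrame is a placeholder name for the DataFrame used for training
val model = als.fit(trainingDataFrame)
val predictions = model.transform(dataFrameToPredict)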
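For a slightly fuller picture, the following sketch sets the column names and hyperparameters explicitly and then scores the predictions with RegressionEvaluator. The column names (userId, movieId, rating), the data, and the parameter values are hypothetical and not taken from my real code, and the RMSE here is computed on the training data only to keep the example short.

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.SparkSession

object ConfiguredAlsSketch {
  // Fits ALS on made-up data and returns the RMSE on that same data.
  def trainingRmse(spark: SparkSession): Double = {
    import spark.implicits._
    val ratings = Seq(
      (0, 10, 5.0f), (0, 11, 1.0f), (1, 10, 4.0f), (1, 11, 2.0f), (2, 11, 5.0f)
    ).toDF("userId", "movieId", "rating")

    // Column names must match the DataFrame; the values below are arbitrary.
    val als = new ALS()
      .setUserCol("userId")
      .setItemCol("movieId")
      .setRatingCol("rating")
      .setRank(2)
      .setMaxIter(5)
      .setRegParam(0.1)

    val model = als.fit(ratings)
    val predictions = model.transform(ratings) // adds a "prediction" column

    new RegressionEvaluator()
      .setMetricName("rmse")
      .setLabelCol("rating")
      .setPredictionCol("prediction")
      .evaluate(predictions)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("configured-als-sketch")
      .getOrCreate()
    println(s"Training RMSE: ${trainingRmse(spark)}")
    spark.stop()
  }
}
```

In practice you would split the ratings into training and test sets and evaluate on the held-out part.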