majitux - 1 month ago
Java Question

Apache Spark Streaming K-means: how many iterations run on the same data?

I'm new to Spark. I'm reading the code to understand how K-means works in Spark Streaming. I can't work out where the number of iterations that the algorithm performs on the same group of data is set, and I can't find the source file that contains this information.

Can you help me, please?

Thank you

This is the file I was looking for: /spark-1.5.0/mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala. Its run method contains a while loop that uses a variable called iteration, and Spark writes that value to the log for each run.
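For anyone who wants to see the general shape of such a loop without opening the Spark sources, here is a minimal plain-Java sketch. It is not Spark's code: the class name KMeansIterationSketch, the run signature, the 1-D points and the epsilon test are all assumptions made for illustration. It only shows how an iteration counter, a maximum-iterations limit and a convergence check can work together.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is NOT the Spark implementation.
public class KMeansIterationSketch {

    /** Final centers plus the number of iterations that were actually performed. */
    static final class Result {
        final double[] centers;
        final int iterations;

        Result(double[] centers, int iterations) {
            this.centers = centers;
            this.iterations = iterations;
        }
    }

    /** Runs Lloyd-style updates on 1-D points until convergence or maxIterations. */
    static Result run(double[] points, double[] initialCenters, int maxIterations, double epsilon) {
        double[] centers = Arrays.copyOf(initialCenters, initialCenters.length);
        int iteration = 0;
        boolean converged = false;

        while (iteration < maxIterations && !converged) {
            double[] sums = new double[centers.length];
            int[] counts = new int[centers.length];

            // Assignment step: add each point to the running sum of its nearest center.
            for (double p : points) {
                int best = 0;
                for (int c = 1; c < centers.length; c++) {
                    if (Math.abs(p - centers[c]) < Math.abs(p - centers[best])) {
                        best = c;
                    }
                }
                sums[best] += p;
                counts[best]++;
            }

            // Update step: move each center to the mean of its points.
            // The run counts as converged when no center moved by more than epsilon.
            converged = true;
            for (int c = 0; c < centers.length; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                double updated = sums[c] / counts[c];
                if (Math.abs(updated - centers[c]) > epsilon) {
                    converged = false;
                }
                centers[c] = updated;
            }
            iteration++;
        }
        return new Result(centers, iteration);
    }

    public static void main(String[] args) {
        double[] points = {1.0, 2.0, 3.0, 10.0, 11.0, 12.0};
        Result r = run(points, new double[] {0.0, 5.0}, 20, 1e-4);
        System.out.println("iterations = " + r.iterations
                + ", centers = " + Arrays.toString(r.centers));
    }
}
```

With the sample data in main, the loop stops after 3 iterations even though the limit is 20, because the centers stop moving. Lowering the limit to 2 would cut the run short at 2 iterations.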

Answer

When you create a KMeans instance, you can set the maximum number of iterations:

new KMeans().setMaxIterations(iterations)

It then uses that value as the upper limit on the number of iterations for each training run. A run can stop earlier if the cluster centers converge.