George C - 4 years ago
Apache Configuration Question

Apache Flink ALS with ids in Long instead of Int

I am trying out the ALS code in Flink version 1.1.3, in a project generated using:

mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-quickstart-scala \
  -DarchetypeVersion=1.1.3 \
  -DgroupId=org.apache.flink.quickstart \
  -DartifactId=flink-scala-project \
  -Dversion=0.1 \
  -Dpackage=org.apache.flink.quickstart

I am following the example code in: and changed the Int to Long in the DataSet:

val env = ExecutionEnvironment.getExecutionEnvironment
val csvInput: DataSet[(Long, Long, Double)] = env.readCsvFile[(Long, Long, Double)]("tmp-contactos.csv")

// Setup the ALS learner
val als = ALS()

// Set the other parameters via a parameter map
val parameters = ParameterMap()
  .add(ALS.Lambda, 0.9)
  .add(ALS.Seed, 42L)

// Calculate the factorization
als.fit(csvInput, parameters)

But it throws at runtime:

Exception in thread "main" java.lang.RuntimeException: There is no FitOperation defined for which trains on a DataSet[(Long, Int, Double)]
at org.apache.flink.quickstart.BatchJob$.main(BatchJob.scala:119)
at org.apache.flink.quickstart.BatchJob.main(BatchJob.scala)

Is it possible to use Longs instead of Ints?

I searched and found this for the 0.9 version but nothing for 1.1.3:

Answer

So far this is not officially supported, but I've created a branch where I've fixed this limitation. You can try out this branch. I'll contribute it to Flink, so it should become part of master in the near future.
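Until Long ids are supported, one possible workaround is to remap the Long ids to dense Int ids before training and keep the mapping so predictions can be translated back. The sketch below is only an illustration in plain Scala collections, not Flink code: `IdRemap`, `buildIndex` and `remap` are hypothetical names, and it assumes the number of distinct user ids and item ids each fits in an Int. For a large distributed DataSet the same idea would need a join against an index DataSet rather than an in-memory Map.

```scala
// Hypothetical helper, not part of Flink: remaps arbitrary Long ids to dense Int ids.
object IdRemap {

  // Assign each distinct Long id an Int index, in order of first appearance.
  def buildIndex(ids: Seq[Long]): Map[Long, Int] =
    ids.distinct.zipWithIndex.toMap

  // Convert (Long, Long, Double) ratings into (Int, Int, Double) triples,
  // returning the user and item mappings alongside the remapped ratings.
  def remap(
      ratings: Seq[(Long, Long, Double)]
  ): (Seq[(Int, Int, Double)], Map[Long, Int], Map[Long, Int]) = {
    val userIndex = buildIndex(ratings.map(_._1))
    val itemIndex = buildIndex(ratings.map(_._2))
    val remapped = ratings.map { case (user, item, rating) =>
      (userIndex(user), itemIndex(item), rating)
    }
    (remapped, userIndex, itemIndex)
  }
}
```

The remapped triples have the (Int, Int, Double) shape, so they could then be turned into a DataSet and passed to als.fit. Inverting the two maps (for example with `userIndex.map(_.swap)`) recovers the original Long ids afterwards.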
