germanium - 4 years ago
Scala Question

Spark: unpersist RDDs for which I have lost the reference

How can I unpersist RDDs that were generated by an MLlib model and for which I don't have a reference?

I know that in PySpark you can unpersist all DataFrames with

sqlContext.clearCache()

Is there something similar for RDDs in the Scala API? Furthermore, is there a way to unpersist only some RDDs without having to unpersist all of them?

Answer Source

You can call

val rdds = sparkContext.getPersistentRDDs // result is Map[Int, RDD[_]]

and then filter the values to select the ones you want (1):

rdds.filter(x => filterLogic(x._2)).foreach(x => x._2.unpersist())

(1) - written by hand, without a compiler - sorry if there's some error, but there shouldn't be ;)
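
For completeness, here is a slightly fuller sketch of the same approach that you can paste into the spark-shell (where sc is already defined). The filterLogic predicate is hypothetical: this version matches persistent RDDs by name, but MLlib does not necessarily name the RDDs it caches, so substitute whatever criterion identifies the ones you want to drop.

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical predicate: match cached RDDs whose name (which may be null)
// contains a given substring. Adapt this to your own selection criterion.
def filterLogic(rdd: RDD[_]): Boolean =
  Option(rdd.name).exists(_.toLowerCase.contains("als"))

def unpersistMatching(sc: SparkContext): Unit = {
  // getPersistentRDDs returns Map[Int, RDD[_]] (RDD id -> cached RDD)
  sc.getPersistentRDDs
    .values
    .filter(filterLogic)
    .foreach(_.unpersist(blocking = false)) // non-blocking unpersist
}

Calling unpersistMatching(sc) then unpersists every cached RDD that matches the predicate and leaves the rest alone, which answers the "only some RDDs" part of the question.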
