javadba - 2 months ago
Python Question

Show partitions on a pyspark RDD

The pyspark RDD documentation does not show any method(s) to display partition information for an RDD.

Is there any way to get that information without executing an additional step, e.g.:

myrdd.mapPartitions(lambda x: iter([1])).sum()

The above does work, but it seems like extra effort.


I missed it; it's very simple:

myrdd.getNumPartitions()

Not used to the Java-ish getFooMethod() naming anymore ;)