val animals = sc.parallelize(List("cat", "dog", "tiger", "lion", "gnu", "crocodile", "ant", "whale", "dolphin", "spider"), 3)
animals.foreachPartition(x => println(x.mkString(", ") + " are animals"))
lion, gnu, crocodile are animals
cat, dog, tiger are animals
ant, whale, dolphin, spider are animals
animals: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD at parallelize at <console>:20
16/05/17 09:33:32 [WARN] o.a.t.k.p.v.s.KernelOutputStream - Suppressing empty output: ''
Generally speaking, you don't. Even outside Jupyter, any output produced inside an action or transformation is written wherever the executing worker sends its standard output, so unless you run in local mode it won't show up in your local shell.
If you want to reliably inspect part of the data, fetch it to the driver (for example with `take` or `collect`) and inspect it locally.
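A minimal sketch of what that could look like, assuming the same shell session as in the question (so `sc` and the `animals` RDD already exist). The sample size of 5 is arbitrary, and the second part is only sensible when the whole dataset fits in driver memory:

```scala
// Assumes the `animals` RDD from the question and a live SparkContext.

// Pull a small, bounded sample to the driver and print it there.
// take(n) returns a local Array[String], so println runs in the driver.
animals.take(5).foreach(println)

// Reproduce the per-partition view on the driver: glom() turns every
// partition into an Array, and collect() ships those arrays back.
// Only safe when the full dataset fits in driver memory.
animals.glom().collect().zipWithIndex.foreach { case (part, i) =>
  println(s"partition $i: ${part.mkString(", ")} are animals")
}
```

Because both `println` calls run in the driver process, the output goes to the same place as any other driver-side output, which is what the notebook captures.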
As a side note, I would avoid printing anyway. Unlike logging, it is not easily configurable, and it can become a serious bottleneck in your code.
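If you do need per-partition diagnostics, a logger is one alternative. The sketch below assumes SLF4J is on the classpath (it normally ships with Spark) and uses a made-up logger name, `"animals"`. The messages end up in the executor logs, not in the notebook, and whether `INFO` lines are kept depends on your logging configuration:

```scala
import org.slf4j.LoggerFactory

animals.foreachPartition { part =>
  // Look the logger up inside the closure so it is created on the executor
  // and never has to be serialized together with the task.
  val log = LoggerFactory.getLogger("animals") // hypothetical logger name
  log.info(part.mkString(", ") + " are animals")
}
```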