How to access SparkContext in pyspark script

The Stack Overflow question "How to run script in Pyspark and drop into IPython shell when done?" explains how to launch a pyspark script:

%run -d

But how do we access the existing Spark context?

Just creating a new one does not work:

----> sc = SparkContext("local", 1)

ValueError: Cannot run multiple SparkContexts at once; existing
SparkContext(app=PySparkShell, master=local) created by <module> at

But trying to use an existing one raises the question: which existing one?

In [50]: for s in filter(lambda x: 'SparkContext' in repr(x[1]) and len(repr(x[1])) < 150, locals().iteritems()):
    print s
('SparkContext', <class 'pyspark.context.SparkContext'>)

That is, only the SparkContext class is in scope; there is no variable holding a SparkContext instance.
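The repr-based filter above matches the class object, not an instance. A stricter check can use isinstance instead. The following is only a sketch: DummyContext is a hypothetical stand-in for pyspark's SparkContext so that it runs without Spark, and find_instances is an illustrative helper, not part of any library.

```python
class DummyContext(object):
    """Hypothetical stand-in for pyspark.SparkContext (assumption for this sketch)."""


def find_instances(namespace, cls):
    """Return the sorted names in `namespace` that are bound to instances of `cls`."""
    return sorted(name for name, value in namespace.items()
                  if isinstance(value, cls))


# Only the class is in scope: the class is not an instance of itself.
only_class = {"DummyContext": DummyContext}

# An instance bound to a name such as `sc` is picked up.
with_instance = {"DummyContext": DummyContext, "sc": DummyContext()}

no_hits = find_instances(only_class, DummyContext)
hits = find_instances(with_instance, DummyContext)
print(no_hits, hits)
```

In a real session the equivalent call would presumably be find_instances(globals(), SparkContext); an empty result would confirm that no instance is bound to a name in that namespace.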

Answer Source

A standalone Python script for word count: create a reusable Spark context with contextlib's contextmanager.

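from contextlib import contextmanager
from pyspark import SparkConf, SparkContext

# Placeholder settings (assumed values, not from the original); adjust as needed.
SPARK_MASTER = "local"
SPARK_APP_NAME = "WordCount"
SPARK_EXECUTOR_MEMORY = "1g"


@contextmanager
def spark_manager():
    conf = SparkConf().setMaster(SPARK_MASTER) \
                      .setAppName(SPARK_APP_NAME) \
                      .set("spark.executor.memory", SPARK_EXECUTOR_MEMORY)
    spark_context = SparkContext(conf=conf)
    try:
        yield spark_context
    finally:
        # Stop the context on exit so a later run can create a new one.
        spark_context.stop()


with spark_manager() as context:
    File = "/home/ramisetty/sparkex/"  # Should be some file on your system
    textFileRDD = context.textFile(File)
    # Transformations are lazy; call an action (e.g. wordCounts.collect()) to run the job.
    wordCounts = textFileRDD.flatMap(lambda line: line.split()).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a+b)

print("WordCount - Done")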
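How a contextlib.contextmanager generator pairs setup with cleanup can be tried without a Spark installation. This is a minimal sketch, not the answer's code: FakeContext and fake_manager are hypothetical names, and FakeContext only imitates the stop() method of a real SparkContext.

```python
from contextlib import contextmanager


class FakeContext(object):
    """Hypothetical stand-in for SparkContext; records whether stop() was called."""

    def __init__(self, app_name):
        self.app_name = app_name
        self.stopped = False

    def stop(self):
        self.stopped = True


@contextmanager
def fake_manager(app_name):
    context = FakeContext(app_name)
    try:
        yield context      # the body of the `with` block runs here
    finally:
        context.stop()     # runs on normal exit and when the body raises


with fake_manager("wordcount") as ctx:
    inside = ctx.stopped   # still False while the block is running

after = ctx.stopped        # True once the block has exited
print(inside, after)
```

The try/finally around the yield is what guarantees the cleanup step; without it, an exception in the with block would skip stop().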

to launch:

