javadba - 1 year ago
Scala Question

How to set hadoop configuration values from pyspark

The Scala version of SparkContext has the property

sc.hadoopConfiguration

I have successfully used that to set Hadoop properties (in Scala):



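In Scala, that usage looks like the sketch below; the property key and value here are made-up placeholders, not real Hadoop settings:

```scala
// Set a Hadoop property on the Configuration shared by the SparkContext.
// "my.mapreduce.setting" and "someVal" are placeholder strings.
sc.hadoopConfiguration.set("my.mapreduce.setting", "someVal")

// Reading it back returns the value just set (or null if unset).
val v = sc.hadoopConfiguration.get("my.mapreduce.setting")
```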
However, the Python version of SparkContext lacks that accessor. Is there any way to set Hadoop configuration values on the Hadoop Configuration used by the pyspark context?

Answer Source

I looked into the pyspark source code and there is no direct equivalent. Instead, some specific methods support passing in a dict of (key, value) pairs:

fileLines = sc.newAPIHadoopFile(
    'dev/*',
    'org.apache.hadoop.mapreduce.lib.input.TextInputFormat',
    'org.apache.hadoop.io.LongWritable',
    'org.apache.hadoop.io.Text',
    # the conf dict feeds (key, value) pairs into the Hadoop Configuration;
    # the arguments after the path are reconstructed from newAPIHadoopFile's
    # signature (inputFormatClass, keyClass, valueClass, ..., conf), and the
    # recursive-input property below is an illustrative choice
    conf={'mapreduce.input.fileinputformat.input.dir.recursive': 'true'}
).count()