
How to set specific Hadoop version for Spark, Python

I need help with setting a specific Hadoop version in my Spark config. I've read that you can use the hadoop.version property, but I can't find where to set it.

I need to change it from the current/default version to 2.8.0. I'm coding in PyCharm. Please help, preferably with a step-by-step guide.


Answer

For Apache Hadoop 2.7.X and later, you can build Spark against a specific Hadoop version by passing the hadoop.version property to the Maven build:

 ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.0 -DskipTests clean package

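Since the -Phadoop-2.7 profile covers Hadoop 2.7.x and later, the same command should work for the 2.8.0 version asked about by changing the hadoop.version property (a sketch; exact profile names can vary between Spark releases, so check the build documentation for your release):

```shell
./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.8.0 -DskipTests clean package
```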

Alternatively, you can edit the hadoop.version property in the pom.xml of your downloaded Spark source distribution before running the Maven build, so that the build is done against the version you want.
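For reference, the property lives in the &lt;properties&gt; section of Spark's top-level pom.xml; a minimal sketch of the fragment to change (the surrounding properties vary by Spark release):

```xml
<properties>
  <hadoop.version>2.8.0</hadoop.version>
</properties>
```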


Take a look at this post for step-by-step guidance.
