I need help setting a specific Hadoop version in my Spark config. I read somewhere that you can use the hadoop.version property, but it doesn't say where to find or set it.
I need to change it from the current/default version to 2.8.0. I'm coding in PyCharm. Please help, preferably with a step-by-step guide.
You can select the Hadoop version at build time via the hadoop.version Maven property. For Apache Hadoop 2.7.x and later, build Spark like this:
./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.0 -DskipTests clean package
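For your target of 2.8.0, a common approach (a hedged sketch, since Spark 2.x ships no dedicated hadoop-2.8 profile) is to reuse the hadoop-2.7 profile and override hadoop.version:

```shell
# Build Spark against Hadoop 2.8.0: reuse the hadoop-2.7 profile
# and override the exact version via the hadoop.version property.
# Run from the root of the Spark source distribution.
./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.8.0 -DskipTests clean package
```

This is a build-configuration fragment; verify the profile name against the pom.xml of the Spark version you actually downloaded.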
Alternatively, you can edit the pom.xml of your downloaded Spark distribution before running the Maven build, so the build uses the version you want:
<profile>
  <id>hadoop2.8</id>
  <properties>
    <hadoop.version>2.8</hadoop.version>
    ...
  </properties>
</profile>
Take a look at this post for step-by-step guidance.
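Since you are working in PyCharm, here is a hedged sketch of how you could confirm which Hadoop version your PySpark build actually links against, by asking Hadoop's own VersionInfo class through the JVM gateway (assumes pyspark is installed and on your interpreter's path; `_jvm` is an internal attribute, so treat this as a diagnostic, not a stable API):

```python
# Sketch: print the Hadoop version bundled with the active PySpark build.
# Assumes pyspark is installed in the PyCharm project interpreter.
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").appName("hadoop-check").getOrCreate()

# org.apache.hadoop.util.VersionInfo is Hadoop's version-reporting class,
# reachable through the Py4J JVM gateway on the SparkContext.
print(spark.sparkContext._jvm.org.apache.hadoop.util.VersionInfo.getVersion())

spark.stop()
```

If the printed version is not 2.8.0 after rebuilding, PyCharm is most likely still pointing at the old Spark installation, so check your SPARK_HOME and interpreter settings.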