javadba - 3 months ago
Scala Question

How to avoid requiring any native libraries for compression in Spark

We are doing a POC on a variety of server machines/architectures. We do not have the ability to rebuild native compression libraries for all of them.

Which codec is software only, with no native library required? The default, snappy, gives the following error:

Caused by: java.lang.IllegalArgumentException
at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:152)

Answer
The lz4 codec is software only, and it is working for us.

To configure it, add the following line to $SPARK_HOME/conf/spark-defaults.conf:

 spark.io.compression.codec lz4
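
If editing spark-defaults.conf on every machine is not practical, the same property can also be set per application on the SparkConf. This is only a minimal sketch, assuming Spark is on the classpath; the object name Lz4CodecConf and the application name are hypothetical, chosen for illustration:

```scala
import org.apache.spark.SparkConf

object Lz4CodecConf {
  // Build a SparkConf that selects lz4 for Spark's internal I/O compression
  // (the same key as in spark-defaults.conf above).
  def build(): SparkConf =
    new SparkConf()
      .setAppName("lz4-codec-poc") // hypothetical application name
      .set("spark.io.compression.codec", "lz4")

  def main(args: Array[String]): Unit = {
    val conf = build()
    // Print the effective setting so it can be checked before the job starts.
    println(conf.get("spark.io.compression.codec"))
  }
}
```

The conf returned by build() would then be passed to the SparkContext (or SparkSession builder) when the application starts. The property can likewise be passed on the command line with spark-submit --conf spark.io.compression.codec=lz4.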