Python Question

PySpark logging with log4j ConversionPattern not working

I am having a problem with logging in Spark (PySpark) when changing the format of the logs with log4j. I have edited the ConversionPattern in log4j.properties, but it is not working properly. When writing logs, log4j only uses the first letter of the converter I am trying to use. For example, I cannot use %replace, because it only picks up %r and then outputs 'eplace' after the output of %r. What am I doing wrong? Here is the current output I am getting:

2016-06-20 10:06:59,095 hostname="" client_ip="127.0.0.1" service_name="" event_type="" event_status="" event_severity="INFO{WARN=medium,DEBUG=info,ERROR=high,TRACE=info,INFO=info,FATAL=critical}" event_description="[INFO] Spark - Slf4jLogger: Slf4jLogger started


As you can see, after event_severity, it does not replace the level like it is supposed to.

Below is my log4j.properties file. I am running Python 2.7 and Spark 1.6.1 on CentOS 7.

# Set everything to be logged to the console
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} hostname="" client_ip="127.0.0.1" service_name="" event_type="" event_status="" event_severity="%p{WARN=medium,DEBUG=info,ERROR=high,TRACE=info,INFO=info,FATAL=critical}" event_description="[%p] Spark - %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR

# SPARK-9183: Settings to avoid annoying messages when looking up nonexistent UDFs in SparkSQL with Hive support
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR


I have tried using EnhancedPatternLayout, but it didn't seem to do anything.

Answer

Have you tried Log4j 2? Log4j 2 is actively maintained, while Log4j 1.x is end of life.
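The level map on %p and the %replace converter are Log4j 2 PatternLayout features; Log4j 1.2's PatternLayout does not know them, which is why it stops after %p / %r and prints the rest of the option literally. As a rough sketch, here is what your console appender could look like in Log4j 2's properties format, saved as log4j2.properties (assuming Log4j 2.6 or later on the classpath; older 2.x releases need slightly different property-file syntax):

# Sketch of a log4j2.properties (Log4j 2 properties format, 2.6+)
status = warn

# Console appender on stderr, mirroring the original ConsoleAppender
appender.console.type = Console
appender.console.name = console
appender.console.target = SYSTEM_ERR
appender.console.layout.type = PatternLayout
# The level map on %p (and converters such as %replace) are interpreted here
appender.console.layout.pattern = %d{ISO8601} hostname="" client_ip="127.0.0.1" service_name="" event_type="" event_status="" event_severity="%p{WARN=medium,DEBUG=info,ERROR=high,TRACE=info,INFO=info,FATAL=critical}" event_description="[%p] Spark - %c{1}: %m%n

# Root logger at INFO, writing to the console appender
rootLogger.level = info
rootLogger.appenderRef.console.ref = console

# Example of quieting one of the noisy third-party loggers, as in the original file
logger.jetty.name = org.spark-project.jetty
logger.jetty.level = warn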

There is an adapter module, log4j-1.2-api, that you can add to the classpath; it routes application calls made against the Log4j 1.2 API to the Log4j 2 implementation. If the application relies on Log4j 1.2 internals, the adapter may not be enough.
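For PySpark on Spark 1.6 that would mean getting the Log4j 2 jars (log4j-api, log4j-core, the log4j-1.2-api adapter, and the log4j-slf4j-impl binding, since Spark also logs through SLF4J) onto the driver and executor classpaths and pointing Log4j 2 at your configuration file. The sketch below uses placeholder paths and versions; whether this cleanly overrides the Log4j 1.2 that ships with Spark 1.6 is something you would have to verify yourself.

# Placeholder paths and versions; substitute wherever you keep the Log4j 2 jars and config
LOG4J2_JARS="/opt/log4j2/log4j-api-2.6.2.jar:/opt/log4j2/log4j-core-2.6.2.jar:/opt/log4j2/log4j-1.2-api-2.6.2.jar:/opt/log4j2/log4j-slf4j-impl-2.6.2.jar"

spark-submit \
  --driver-class-path "$LOG4J2_JARS" \
  --conf spark.executor.extraClassPath="$LOG4J2_JARS" \
  --conf spark.driver.extraJavaOptions="-Dlog4j.configurationFile=file:/opt/log4j2/log4j2.properties" \
  --conf spark.executor.extraJavaOptions="-Dlog4j.configurationFile=file:/opt/log4j2/log4j2.properties" \
  your_job.py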

Recently, work has started in Log4j 2 to support configuration files in the Log4j 1.2 format.

Separately, all projects are encouraged to migrate to Log4j 2, since Log4j 1.2 is broken in Java 9.

It seems that Spark's migration to Log4j 2 is in progress.