Serendipity - 8 months ago
Scala Question

Spark2 - LogisticRegression training finished but the result is not converged because: line search failed

While training a Logistic Regression classifier I get the following error:

2016-08-16 20:50:23,833 ERROR [main] optimize.LBFGS (Logger.scala:error(27)) - Failure! Resetting history: breeze.optimize.FirstOrderException: Line search zoom failed
2016-08-16 20:50:24,009 INFO [main] optimize.StrongWolfeLineSearch (Logger.scala:info(11)) - Line search t: 0.9 fval: 0.4515497761131565 rhs: 0.45154977611314895 cdd: 3.4166889881493167E-16


The program then continues for a while, after which I encounter this error:

2016-08-16 20:50:24,365 ERROR [main] optimize.LBFGS (Logger.scala:error(27)) - Failure again! Giving up and returning. Maybe the objective is just poorly behaved?
2016-08-16 20:50:24,367 WARN [main] classification.LogisticRegression (Logging.scala:logWarning(66)) - LogisticRegression training finished but the result is not converged because: line search failed!
2016-08-16 20:50:27,143 INFO [main] optimize.StrongWolfeLineSearch (Logger.scala:info(11)) - Line search t: 0.4496001808762097 fval: 0.5641490068577 rhs: 0.6931115872739131 cdd: 0.01924752705390458
2016-08-16 20:50:27,143 INFO [main] optimize.LBFGS (Logger.scala:info(11)) - Step Size: 0.4496
2016-08-16 20:50:27,144 INFO [main] optimize.LBFGS (Logger.scala:info(11)) - Val and Grad Norm: 0.564149 (rel: 0.186) 0.622296
2016-08-16 20:50:27,181 INFO [main] optimize.LBFGS (Logger.scala:info(11)) - Step Size: 1.000
2016-08-16 20:50:27,181 INFO [main] optimize.LBFGS (Logger.scala:info(11)) - Val and Grad Norm: 0.484949 (rel: 0.140) 0.285684
2016-08-16 20:50:27,226 INFO [main] optimize.LBFGS (Logger.scala:info(11)) - Step Size: 1.000
2016-08-16 20:50:27,226 INFO [main] optimize.LBFGS (Logger.scala:info(11)) - Val and Grad Norm: 0.458425 (rel: 0.0547) 0.0789000
2016-08-16 20:50:27,263 INFO [main] optimize.LBFGS (Logger.scala:info(11)) - Step Size: 1.000


But then the training continues.

Even though the training appears to complete successfully (I get a model, run predictions on the test set, validate the classifier, etc.), I'm worried about this error.
Any idea what the error means? Any recommendations on how to overcome it? (I use 10,000 as the maximum number of iterations.)

Answer

The problem lies with the LBFGS optimizer, which the Logistic Regression algorithm uses.

This error most likely occurs when the gradient is wrong or the convergence tolerance is set too tight.
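To make the "tolerance too tight" case concrete, here is a minimal plain-Scala sketch. It is not Breeze's or Spark's implementation: the toy objective, the fixed step size, and the relative-improvement stopping rule are all assumptions chosen for illustration. It counts how many gradient-descent steps run before the change in the objective drops below the tolerance.

```scala
object ToleranceSketch {
  // Toy objective (hypothetical): minimum value 1.0 at x = 3.0.
  def f(x: Double): Double = (x - 3.0) * (x - 3.0) + 1.0
  def grad(x: Double): Double = 2.0 * (x - 3.0)

  // Fixed-step gradient descent. Returns the number of steps taken before
  // |f(prev) - f(cur)| <= tol * max(|f(cur)|, 1.0), or maxIter if that never holds.
  // This stopping rule is a simplified stand-in, not the check Breeze uses.
  def iterations(tol: Double, maxIter: Int = 1000): Int = {
    var x = 0.0
    var prev = f(x)
    var k = 0
    var converged = false
    while (k < maxIter && !converged) {
      x -= 0.1 * grad(x)
      val cur = f(x)
      converged = math.abs(prev - cur) <= tol * math.max(math.abs(cur), 1.0)
      prev = cur
      k += 1
    }
    k
  }

  def main(args: Array[String]): Unit =
    Seq(0.1, 1e-6, 0.0).foreach(t => println(s"tol=$t -> ${iterations(t)} steps"))
}
```

With a tolerance of 0.0 the loop only stops once the objective no longer changes at all in double precision, so the last steps are spent on differences at the level of rounding error. That is plausibly the regime in which a line search can no longer find a measurable decrease, as in the first log excerpt, where fval and rhs differ only around the 14th decimal place.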

In my case, I was running the algorithm as follows:

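import org.apache.spark.ml.classification.LogisticRegression

// Configuration that triggered the line search failure (note setTol(0.0)).
val lr = new LogisticRegression()
  .setFitIntercept(true)
  .setRegParam(0.3)
  .setMaxIter(100000)
  .setTol(0.0)
  .setStandardization(true)
  .setWeightCol("classWeightCol")
  .setLabelCol("label")
  .setFeaturesCol("features")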

Here the convergence tolerance was set to 0 (setTol(0.0)). The Spark documentation states:

"Smaller value will lead to higher accuracy with the cost of more iterations. Default is 1E-6. "

However, after changing the setter to setTol(0.1), the line search error no longer occurs.
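For reference, a sketch of the adjusted setup with the tolerance exposed as a parameter. The object name FitWithTolerance, the fit helper, and the training DataFrame are hypothetical. The DataFrame is assumed to contain the "features", "label" and "classWeightCol" columns used above. The value 0.1 is what worked in my case, while 1e-6 is the documented default; whether the default also avoids the error on a given data set is not something this answer verifies.

```scala
import org.apache.spark.ml.classification.{LogisticRegression, LogisticRegressionModel}
import org.apache.spark.sql.DataFrame

object FitWithTolerance {
  // `training` is an assumed input with "features", "label" and "classWeightCol" columns.
  def fit(training: DataFrame, tol: Double = 0.1): LogisticRegressionModel = {
    val lr = new LogisticRegression()
      .setFitIntercept(true)
      .setRegParam(0.3)
      .setMaxIter(100000)
      .setTol(tol) // non-zero tolerance lets the optimizer stop before it hits rounding noise
      .setStandardization(true)
      .setWeightCol("classWeightCol")
      .setLabelCol("label")
      .setFeaturesCol("features")

    val model = lr.fit(training)

    // The training summary shows how many iterations ran and how the objective evolved.
    val summary = model.summary
    println(s"iterations used: ${summary.totalIterations}")
    println(s"last objective values: ${summary.objectiveHistory.takeRight(3).mkString(", ")}")
    model
  }
}
```

Checking totalIterations and the tail of objectiveHistory is one way to judge whether a looser tolerance stopped training too early: if the last few objective values are still dropping noticeably, a smaller tolerance may be worth trying.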