David Duncan - 3 months ago 22

Python Question

I have a bit of code that fits theoretical prediction to experimental data, and I want to run a LMA (Levenberg-Marquardt Algorithm) to fit the theory to experiment. However the calculations are non-trivial, with each model taking ~10-30 minutes to calculate on a single processor, however the problem is embarrassingly parallelisable and the code is currently set up to submit the different components (of a single itteration) to a cluster computer (this calculation still takes ~1-2 minutes).

Now this submission script is set up within a callable function within python - so for setting it up with the scipy LMA (scipy.optimise.leastsq) it is relatively trivial - however the scipy LMA will, I imagine, pass each individual calculation (for gauging the gradient) in serial, and wait for the return, whereas I'd prefer the LMA to send an entire set of calculations at a time, and then await the return. The python submission script looks a bit like:

`def submission_script(number_iterations,number_parameters,value_parameters):`

fitness_parameter = [0]*number_iterations

<fun stuff>

return (fitness_parameter)

Where the "value_parameters" is a nested list of dimensions [number_iterations][number_parameters] which contains the variables that are to be calculated for each model, "number_parameters" is the number of parameters that are to be fitted, "number_iterations" is the number of models to be calculated (so each step, to gauge the gradient, the LMA calculates 2*number_parameters models), and "fitness_parameter" is the value that has to be minimised (and has the dimensions [iterations]).

Now, obviously, I could write my own LMA, but that is a little bit of reinventing the wheel - I was wondering if there was anything out there that would satisfy my needs (or if the scipy LMA can be used in this way).

A Gauss-Newton algorithm

Answer

The scipy.optimize.leastsq function gives you the opportunity to provide a function `J`

for evaluating the jacobian for a given parameter vector. You could implement a multiprocessing solution for calculating this matrix instead of having scipy.optimize.leastsq approximate it by serially calling your function `f`

.

Unfortunately the LMA implementation in scipy uses separate functions for `f`

and `J`

. You may want to cache information you calculate in `f`

in order to reuse it in `J`

if it is called with the same parameter vecor. Alternatively you can implement a own LMA version that uses a single `fJ`

call.