user3191569 user3191569 - 11 months ago 56
Python Question

Multiprocessing python function for numerical calculations

I'm hoping to get some help parallelising my Python code. I've been struggling with it for a while and run into errors whichever way I try. Running the code currently takes about 2-3 hours to complete. The code is given below:

import numpy as np
from scipy.constants import Boltzmann as kb, elementary_charge as e
import multiprocessing
from functools import partial

Tc = 9.2
x = []
g = []
def Delta(T):
    """Delta(T) takes a temperature as input and calculates a
    temperature-dependent variable based on Tc, which is defined as a
    global parameter."""
    d0 = (np.pi/1.78)*kb*Tc
    D0 = d0*(np.sqrt(1-(T**2/Tc**2)))
    return D0

def element_in_sum(T, n, phi):
    D = Delta(T)
    matsubara_frequency = (np.pi * kb * T) * (2*n + 1)
    factor_d = np.sqrt((D**2 * np.cos(phi/2)**2) + matsubara_frequency**2)
    element = ((2 * D * np.cos(phi/2))/ factor_d) * np.arctan((D * np.sin(phi/2))/factor_d)
    return element

def sum_elements(T, M, phi):
    """sum_elements(T, M, phi) is the most computationally heavy part
    of the calculations; the larger the M value, the more accurate the
    results are.

    T: temperature
    M: number of steps for matrix calculation; the larger, the more accurate the calculation
    phi: the phase of the system, can be between 0 and pi
    """
    X = list(np.arange(0,M,1))
    Y = [element_in_sum(T, n, phi) for n in X]
    return sum(Y)

def KO_1(M, T, phi):
    Iko1Rn = (2 * np.pi * kb * T /e) * sum_elements(T, M, phi)
    return Iko1Rn

def main():
    for j in range(1, 92):
        T = 0.1*j
        for i in range(1, 314):
            phi = 0.01*i
            pool = multiprocessing.Pool()
            result = pool.apply_async(KO_1,args=(26000, T, phi,))
        A = max(g)
        del g[:]

My approach was to send the KO_1 function to a multiprocessing pool, but I get either an error or a "too many files open" failure. Any help is greatly appreciated, and if multiprocessing is the wrong approach I would love any guidance.

Answer Source

I haven't tested your code, but you can do several things to improve it.

First of all, don't create arrays unnecessarily. sum_elements creates three array-like objects when a single generator would do: np.arange creates a NumPy array, the list function then copies it into a list object, and the list comprehension builds yet another list. The function allocates three sequences of length M where none is needed.

The correct way to implement it (in python3) would be:

def sum_elements(T, M, phi):
    return sum(element_in_sum(T, n, phi) for n in range(0, M, 1))

If you use Python 2, replace range with xrange. This tip will probably help in any Python script you write.
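To check that the generator version is a drop-in replacement, here is a minimal sketch that compares it with the list-based version from the question. It is an illustration under assumptions, not the asker's exact code: it uses the standard-library math module instead of numpy so it runs without extra packages, hard-codes the Boltzmann constant as KB, and the names delta, sum_elements_lists and sum_elements_generator are made up for the comparison.

```python
import math

KB = 1.380649e-23  # Boltzmann constant in J/K (assumption: hard-coded instead of scipy.constants)
TC = 9.2           # critical temperature from the question, in K


def delta(T):
    """Temperature-dependent gap, same formula as Delta(T) in the question."""
    d0 = (math.pi / 1.78) * KB * TC
    return d0 * math.sqrt(1 - T**2 / TC**2)


def element_in_sum(T, n, phi):
    """One term of the sum, written with math instead of numpy."""
    D = delta(T)
    matsubara_frequency = math.pi * KB * T * (2 * n + 1)
    factor_d = math.sqrt((D * math.cos(phi / 2)) ** 2 + matsubara_frequency**2)
    return (2 * D * math.cos(phi / 2) / factor_d) * math.atan(
        D * math.sin(phi / 2) / factor_d
    )


def sum_elements_lists(T, M, phi):
    """Question-style version: builds two intermediate lists of length M."""
    X = list(range(M))
    Y = [element_in_sum(T, n, phi) for n in X]
    return sum(Y)


def sum_elements_generator(T, M, phi):
    """Answer-style version: feeds the terms to sum() one at a time."""
    return sum(element_in_sum(T, n, phi) for n in range(M))


if __name__ == "__main__":
    # Both versions add the same terms in the same order.
    print(sum_elements_lists(4.2, 500, 1.0) == sum_elements_generator(4.2, 500, 1.0))
```

The generator mainly saves memory and allocation time. The per-term arithmetic is unchanged, so most of the run time still has to come down through parallelism.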

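Also, try to use multiprocessing more effectively. Your loop creates a new multiprocessing.Pool on every iteration and never closes it, which is the most likely cause of the "too many files open" error, since each pool starts its own worker processes and pipes. Instead, create a single multiprocessing.Pool object once and submit all the work through it.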

The main function should look like this:

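def job(args):
    # Unpack the (i, j) tuple: imap passes a single argument per task.
    i, j = args
    T = 0.1*j
    phi = 0.01*i
    return KO_1(26000, T, phi)

def main():
    # Create the pool once; the with-block (Python 3.3+) shuts the workers down afterwards.
    with multiprocessing.Pool(processes=4) as pool:  # You can change this number
        # For each temperature index j, keep the maximum over all phases i.
        x = [max(pool.imap(job, ((i, j) for i in range(1, 314))))
             for j in range(1, 92)]
    return x

if __name__ == "__main__":
    main()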

Notice that I used a tuple in order to pass multiple arguments to job.
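The single-pool pattern can be tried without waiting hours by swapping in a cheap stand-in for the real calculation. The sketch below is only an illustration: toy_job and max_per_temperature are hypothetical names, and toy_job returns T * sin(phi) rather than the KO_1 result. Passing the mapping function as an argument lets the same code run serially with the built-in map, which makes it easy to confirm that the pool returns the same values.

```python
import math
import multiprocessing


def toy_job(args):
    """Hypothetical stand-in for job(): cheap and picklable, takes one (i, j) tuple."""
    i, j = args
    T = 0.1 * j
    phi = 0.01 * i
    return T * math.sin(phi)


def max_per_temperature(map_fn, i_values, j_values):
    """For each j, map toy_job over every i and keep the largest result."""
    maxima = []
    for j in j_values:
        tasks = [(i, j) for i in i_values]
        maxima.append(max(map_fn(toy_job, tasks)))
    return maxima


def main():
    i_values = range(1, 314)
    j_values = range(1, 92)
    # One pool for the whole run; the with-block tears the workers down on exit.
    with multiprocessing.Pool(processes=4) as pool:
        parallel = max_per_temperature(pool.imap, i_values, j_values)
    # Serial reference run using the built-in map.
    serial = max_per_temperature(map, i_values, j_values)
    print(parallel == serial)


if __name__ == "__main__":
    main()
```

The if __name__ == "__main__" guard matters on platforms where worker processes are started by re-importing the main module. Without it, each worker would try to start a pool of its own.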