user52705 user52705 - 27 days ago 20
Python Question

Cython numpy array with Openmp (No GIL)

I am not sure if this has been addressed previously, i tried searching but didnt find exactly what i was looking for. I would like to get the following code working (name of file being some_CD.pyx)

import numpy as np
cimport numpy as np
cimport cython
from cython.parallel import *

ctypedef np.float64_t DTYPE

cdef DTYPE evaluate_objective(np.ndarray[DTYPE, ndim=2] A,np.ndarray[DTYPE, ndim=1] b,np.ndarray[DTYPE, ndim=1] x):
return x.dot(A.dot(x) - b.dot(x))

cpdef DTYPE coordinate_descent(np.ndarray[DTYPE, ndim=2] A,np.ndarray[DTYPE, ndim=1] b,int nDim, int nIter ):
cdef int i, iter
cdef np.ndarray[DTYPE,
ndim=1] x = \
np.zeros(nDim, )
cdef DTYPE temp
for iter in prange(nIter,nogil=True):
i = (iter%nDim)
temp = (b[i]-( A[:,i].dot(x)*x[i] ) + (A[i,i]*x[i]*x[i]))/A[i,i]
x[i] = np.max([0,np.min([temp,1])])
return evaluate_objective(A,b,x)


My setup.py looks as following

from distutils.core import setup, Extension
from Cython.Build import cythonize
import numpy

ext_modules = [
Extension(
"some_CD",
["some_CD.pyx"],
extra_compile_args=['-fopenmp'],
extra_link_args=['-fopenmp'],
)
]

setup(
name='some_parallel',
ext_modules=cythonize(ext_modules),
include_dirs=[numpy.get_include()]
)


There are many things i am not sure in this code. First of all whether am i using numpy arrays the right way? Is using float64, np.int typed variables allowed within prange?

Answer

This loop can't be parallelized because it's an iterative algorithm. In other words, later iterations depend on the result of the earlier iterations.

Ignoring this problem, here's how you would change the code..

Without the GIL you can only use basic indexing (no slicing) and a very small number of Python functions special cased by Cython (mainly those listed here). So you just have to make the body of the prange more plain and explicit.

for iter in prange(nIter, nogil=True):
    i = iter % nDim

    Ai_dot_x = 0
    for j in range(x.shape[0]):
        Ai_dot_x += A[j,i] * x[j]

    temp = (b[i] - Ai_dot_x*x[i] + A[i,i]*x[i]*x[i]) / A[i,i]
    x[i] = max(0, min(temp, 1))

Ai_dot_x is an additional temporary variable that you have to cdef.

Comments