M-V - 3 months ago 21

Python Question

I implemented the computation of the average RGB value of a Python Imaging Library (PIL) image in two ways:

```
def getAverageRGB(image):
    """
    Given PIL Image, return average value of color as (r, g, b)
    """
    # no. of pixels in image
    npixels = image.size[0]*image.size[1]
    # get colors as [(cnt1, (r1, g1, b1)), ...]
    cols = image.getcolors(npixels)
    # get [(c1*r1, c1*g1, c1*b1), ...]
    sumRGB = [(x[0]*x[1][0], x[0]*x[1][1], x[0]*x[1][2]) for x in cols]
    # calculate (sum(ci*ri)/np, sum(ci*gi)/np, sum(ci*bi)/np)
    # the zip gives us [(c1*r1, c2*r2, ...), (c1*g1, c2*g2, ...), ...]
    avg = tuple([sum(x)/npixels for x in zip(*sumRGB)])
    return avg
```

```
import numpy as np

def getAverageRGBN(image):
    """
    Given PIL Image, return average value of color as (r, g, b)
    """
    # get image as numpy array
    im = np.array(image)
    # get shape (numpy orders the axes as rows, columns, channels)
    h, w, d = im.shape
    # change shape to one row per pixel
    im.shape = (h*w, d)
    # get average of each channel
    return tuple(np.average(im, axis=0))
```

I was surprised to find that #1 runs about 20% faster than #2.

Am I using numpy correctly? Is there a better way to implement the average computation?

Answer

Surprising indeed.

You may want to use:

```
tuple(im.mean(axis=0))
```

to compute your mean `(r, g, b)`, but I doubt it's going to improve things a lot. Have you tried profiling `getAverageRGBN` to find the bottleneck?
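One rough way to look for the bottleneck is to time the two stages of `getAverageRGBN` separately: the conversion to an array and the averaging itself. The sketch below is only an illustration. `average_rgb` and `time_stages` are hypothetical helper names, not part of PIL or NumPy, and the synthetic array stands in for a real image.

```python
import timeit

import numpy as np


def average_rgb(arr):
    """Mean of each channel of a (height, width, depth) array, as a tuple."""
    # Flatten the two spatial axes, then average down the pixel axis.
    return tuple(arr.reshape(-1, arr.shape[-1]).mean(axis=0))


def time_stages(image, number=20):
    """Hypothetical helper: average seconds per call for each stage."""
    arr = np.array(image)
    # Stage 1: converting the image object into a numpy array.
    convert = timeit.timeit(lambda: np.array(image), number=number) / number
    # Stage 2: computing the per-channel mean on the converted array.
    average = timeit.timeit(lambda: average_rgb(arr), number=number) / number
    return {"convert": convert, "average": average}


# Synthetic stand-in for an image (assumption: 640x480, 8-bit RGB).
fake_image = np.zeros((480, 640, 3), dtype=np.uint8)
fake_image[..., 0] = 200  # fill the red channel only

print(average_rgb(fake_image))
print(time_stages(fake_image))
```

Passing a PIL `Image` to `time_stages` instead of the synthetic array should show how much of the total time goes into `np.array(image)` compared with the mean. The numbers will depend on the image size and the machine.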