AboodXD - 3 years ago
Python Question

Why is this loop so slow in Cython?

This code rearranges the bits in a 534x713 RGBA4 texture.

cpdef bytes toDDSrgba4(bytearray data):
    cdef bytes new_data = b''

    cdef int pixel
    cdef int red
    cdef int green
    cdef int blue
    cdef int alpha
    cdef int new_pixel
    cdef int i

    for i in range(len(data) // 2):
        pixel = int.from_bytes(data[2*i:2*i+2], "big")

        red = (pixel >> 12) & 0xF
        green = (pixel >> 8) & 0xF
        blue = (pixel >> 4) & 0xF
        alpha = pixel & 0xF

        new_pixel = (red << 8) | (green << 4) | blue | (alpha << 12)

        new_data += (new_pixel).to_bytes(2, "big")

    return new_data


It's no faster than its pure-Python equivalent, which is this:

def toDDSrgba4(data):
    new_data = b''

    for i in range(len(data) // 2):
        pixel = int.from_bytes(data[2*i:2*i+2], "big")

        red = (pixel >> 12) & 0xF
        green = (pixel >> 8) & 0xF
        blue = (pixel >> 4) & 0xF
        alpha = pixel & 0xF

        new_pixel = (red << 8) | (green << 4) | blue | (alpha << 12)

        new_data += (new_pixel).to_bytes(2, "big")

    return new_data


Both of them are really slow.

I have written some much more complex swizzle code that isn't even optimized, tested it on this same texture, and it's still way faster than this.

Answer Source

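You're appending to a bytes object with +=. That's really slow: bytes objects are immutable, so each += has to copy the whole existing bytes object into a new one, and the total work grows roughly quadratically with the size of the texture.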

Don't do that. One better option would be to use a bytearray, and only build a bytes object out of the bytearray at the end.
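
As a rough sketch of that suggestion (not a tuned implementation), the pure-Python version could accumulate into a bytearray and convert once at the end. The name to_dds_rgba4_bytearray is made up for illustration; the bit shuffling is copied unchanged from the question.

```python
def to_dds_rgba4_bytearray(data):
    # Hypothetical name; same per-pixel logic as the question's function.
    # A bytearray is mutable, so += extends it in place instead of
    # copying everything accumulated so far.
    new_data = bytearray()

    for i in range(len(data) // 2):
        pixel = int.from_bytes(data[2*i:2*i+2], "big")

        red = (pixel >> 12) & 0xF
        green = (pixel >> 8) & 0xF
        blue = (pixel >> 4) & 0xF
        alpha = pixel & 0xF

        new_pixel = (red << 8) | (green << 4) | blue | (alpha << 12)

        new_data += new_pixel.to_bytes(2, "big")

    # Build the immutable bytes object only once, at the end.
    return bytes(new_data)
```

The same change should carry over to the Cython version by declaring new_data as a bytearray and returning bytes(new_data); the remaining int.from_bytes and to_bytes calls are still ordinary Python calls there, so they are likely the next thing to look at if it is still too slow.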
