Ryex Ryex - 1 month ago 4
Python Question

Increasing speed of python list operations and comparisons in a custom table class

I'm using the following class to create a table and I need to find a way to not only make it faster but make interactions with it faster:

class Table(object):
    """a three dimensional table object"""
    def __init__(self, xsize=1, ysize=1, zsize=1):
        self.xsize = xsize
        self.ysize = ysize
        self.zsize = zsize
        self.data = [0] * (xsize * ysize * zsize)

    def __getitem__(self, key):
        x, y, z = self.__extractIndices(key)
        return self.data[x + self.xsize * (y + self.ysize * z)]

    def __setitem__(self, key, value):
        x, y, z = self.__extractIndices(key)
        self.data[x + self.xsize * (y + self.ysize * z)] = value

    def __extractIndices(self, key):
        x = y = z = 0
        if (self.ysize > 1):
            if (self.zsize > 1):
                if len(key) != 3:
                    raise IndexError
                else:
                    x, y, z = key
            elif len(key) != 2:
                raise IndexError
            else:
                x, y = key
        elif not isinstance(key, int):
            raise IndexError
        else:
            x = key
        return (x, y, z)

    def resize(self, xsize=1, ysize=1, zsize=1):
        """resize the table preserving data"""
        oldlist = list(self.data)
        self.data = [0] * (xsize * ysize * zsize)
        self.xsize = xsize
        self.ysize = ysize
        self.zsize = zsize
        # copy as many old entries as fit into the new flat list
        for i in range(min(len(oldlist), len(self.data))):
            self.data[i] = oldlist[i]


At one point I need to check, for each z layer, whether the data in two tables is equal, so I did this. self.data and self.map.data are instances of the Table class above.

for x in range(self.map.width - 1):
    for y in range(self.map.height - 1):
        tempflag = False
        # layer 1
        if self.data[x, y, 0] != self.map.data[x, y, 0]:
            tempflag = True
            layer1flag = True
        # layer 2
        if self.data[x, y, 1] != self.map.data[x, y, 1]:
            tempflag = True
            layer2flag = True
        # layer 3
        if self.data[x, y, 2] != self.map.data[x, y, 2]:
            tempflag = True
            layer3flag = True
        # copy the data if it changed
        if tempflag:
            self.data = copy.deepcopy(self.map.data)
            previewflag = True


Clearly this is the slowest way I could conceivably do this, and some of the tables I'm comparing have 200 * 200 * 3 = 120,000 entries. I NEED this to be as fast as possible.

I've considered rewriting the above comparison to slice all the entries for one z, like so:

tempflag = False
# layer 1
slicepoint1 = 0
slicepoint2 = self.data.xsize * self.data.ysize * 1
data1 = self.data.data[slicepoint1:slicepoint2]
data2 = self.map.data.data[slicepoint1:slicepoint2]
if data1 != data2:
    tempflag = True
    layer1flag = True
# layer 2
slicepoint1 = self.data.xsize * self.data.ysize * 1
slicepoint2 = self.data.xsize * self.data.ysize * 2
data1 = self.data.data[slicepoint1:slicepoint2]
data2 = self.map.data.data[slicepoint1:slicepoint2]
if data1 != data2:
    tempflag = True
    layer2flag = True
# layer 3
slicepoint1 = self.data.xsize * self.data.ysize * 2
slicepoint2 = self.data.xsize * self.data.ysize * 3
data1 = self.data.data[slicepoint1:slicepoint2]
data2 = self.map.data.data[slicepoint1:slicepoint2]
if data1 != data2:
    tempflag = True
    layer3flag = True
# copy the data if it changed
if tempflag:
    self.data = copy.deepcopy(self.map.data)
    previewflag = True


While this seems like it would be faster, it still seems like it could be significantly improved. For example, could I not use numpy to build the data list inside the Table class?

I need this class and this check to run as fast as they possibly can.

It would also be nice if using numpy let me loop through the table really fast, so I could use its data for blit operations to build a tilemap.

I do need to keep the general interface of the Table class, particularly the fact that the table data is stored in self.data.

In summary: can the speed of these operations be increased by using numpy? If so, how can I do it?

Answer

This is definitely an application for NumPy! It will not only speed up your code, it will also simplify it considerably, because indexing and comparison are already handled by NumPy. You will have to read a tutorial to learn NumPy; here are just a few hints to get you going in this case.

Usually, I would simply derive from numpy.ndarray to define a custom array class, but you stated that you definitely need the data attribute, which clashes with numpy.ndarray.data. Your class simplifies to

import numpy

class Table(object):
    def __init__(self, xsize=1, ysize=1, zsize=1):
        self.data = numpy.zeros((xsize, ysize, zsize))

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def resize(self, xsize=1, ysize=1, zsize=1):
        # This only works for increasing the size of the data,
        # but is easy to adapt to other cases
        newdata = numpy.zeros((xsize, ysize, zsize))
        shape = self.data.shape
        newdata[:shape[0], :shape[1], :shape[2]] = self.data
        self.data = newdata
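
The comment in resize mentions adapting it to other cases. One possible way to do that, shown here only as a sketch, is to copy just the region where the old and new shapes overlap, so shrinking works as well as growing. The `dtype=int` argument is an assumption made to match the integer zeros of the original list-based table; it is not part of the class above.

```python
import numpy


class Table(object):
    """Three-dimensional table backed by a NumPy array (sketch)."""

    def __init__(self, xsize=1, ysize=1, zsize=1):
        # dtype=int is an assumption: the list-based table held integer zeros
        self.data = numpy.zeros((xsize, ysize, zsize), dtype=int)

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def resize(self, xsize=1, ysize=1, zsize=1):
        """Resize the table, keeping whatever fits in the overlap."""
        newdata = numpy.zeros((xsize, ysize, zsize), dtype=self.data.dtype)
        # Size of the region shared by the old and new shapes on each axis
        nx, ny, nz = (min(old, new)
                      for old, new in zip(self.data.shape, newdata.shape))
        newdata[:nx, :ny, :nz] = self.data[:nx, :ny, :nz]
        self.data = newdata


table = Table(4, 3, 2)
table[1, 2, 0] = 7
table.resize(2, 3, 3)    # shrink along x, grow along z
print(table[1, 2, 0])    # 7: the value at a surviving coordinate is kept
print(table.data.shape)  # (2, 3, 3)
```

Unlike the flat-list version, this keeps each value at its (x, y, z) coordinate after resizing, since the copy is done per axis rather than by flat position.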

Your comparison code simplifies to

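# self.data and self.map.data are Table instances; compare their arrays
changed = self.data.data != self.map.data.data
# one flag per z layer (assumes zsize == 3)
layerflags = changed.reshape(-1, 3).any(axis=0)
if layerflags.any():
    self.data.data[:] = self.map.data.data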

And it will be much faster too!
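
To try the per-layer comparison in isolation, here is a small self-contained sketch. The arrays `current` and `updated` are hypothetical stand-ins for the two tables' underlying arrays, and `any(axis=(0, 1))` is used as an alternative to the reshape so that it does not depend on there being exactly three layers. The `numpy.argwhere` call is an extra suggestion, not something from the question: it lists the coordinates of changed tiles, which might be useful for re-blitting only those tiles.

```python
import numpy

# Hypothetical 200 x 200 x 3 maps standing in for the two tables' arrays
current = numpy.zeros((200, 200, 3), dtype=int)
updated = current.copy()
updated[10, 20, 1] = 5  # change a single tile on layer 2 (z index 1)

# Boolean array that is True wherever the two maps differ
changed = current != updated

# Collapse the x and y axes: one "has this layer changed?" flag per z layer
layerflags = changed.any(axis=(0, 1))
print(layerflags.tolist())     # [False, True, False]

# Coordinates (x, y, z) of every changed tile
changed_tiles = numpy.argwhere(changed)
print(changed_tiles.tolist())  # [[10, 20, 1]]

if layerflags.any():
    # In-place copy into the existing array; no deepcopy needed
    current[:] = updated

print(numpy.array_equal(current, updated))  # True
```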
