minion minion - 1 month ago 16
Python Question

How to know which segment a value reside in

L = [('key1', 14), ('key2', 20), ('key3', 13), ('key4', 10), ('key5', 11)]


Assume I got a
list
like the above. The the second element of each tuple conceptually represents the size of a block memory. These block memories are adjacent and sequential. Their relationship could be represented as :


0____14____14+20____14+20+13____14+20+13+10____14+20+13+10+11


Now the user will give me the 'absolute address' and I should return a key to him to decrypt the whole memory block. For instance, if he wants to access the address '10', then I should return key1 for him. If he wants to access '15', and then I should return key2.

My implementation now has to search from the beginning every time. In reality, there are many memory blocks and users want to access all the time. The performance is bad.

L = [('key1', 14), ('key2', 20), ('key3', 13), ('key4', 10), ('key5', 11)]
def get_key(address):
c = 0
offset = 0
for i in L:
if address < i[1] + offset:
break
c += 1
offset += i[1]
return c

print(get_key(15))


How could I improve the performance?


The data structure doesn't have
to be a
list
(L). But the known things should be (1) the block size,
(2) the key, (3)user will access the 'absolute address'.


As per Burhan Khalid's instruction, the final code is as follows (see also How to find the cumulative sum of numbers in a list?):

from bisect import bisect
from itertools import accumulate

L = [('key1', 14), ('key2', 20), ('key3', 13), ('key4', 10), ('key5', 11)]
markers = [i[0] for i in L]
boundaries = list(accumulate([i[1] for i in L]))

##offsets = []
##offsets.append(boundaries[0])
##for offset in boundaries[1:]:
## offsets.append(sum(offsets)+offset)

def buckets(value, boundaries, markers):
i = bisect(boundaries, value)
return markers[i]

print(buckets(67, boundaries, markers))

Answer

You can use an array bisection algorithm to drop the requested amount into the right offset.

The documentation for the bisect module provides the following example which can be modified easily for this:

>>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
        i = bisect(breakpoints, score)
        return grades[i]

>>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
['F', 'A', 'C', 'C', 'B', 'A', 'A']

Here is how you would use it (untested):

markers = [i[0] for i in initial_list_of_tuples]
boundaries = [i[1] for i in initial_list_of_tuples]
offsets = []
offsets.append(boundaries[0])

for offset in boundaries[1:]:
   offsets.append(sum(offsets)+offset)

def buckets(value, boundaries, markers):
   i = bisect(boundaries, value)
   return markers[i]
Comments