Merlin - 9 months ago 38

Python Question

Doc's are lacking an example.... How do you use

`bisect.insort_left`

Trying to insert based on key.

`bisect.insort_left(data, ('brown', 7))`

this puts insert at data[0].

From docs....

`bisect.insort_left(list, item[, lo[, hi]])`

Insert item in list in sorted order. This is equivalent to

list.insert(bisect.bisect_left(list, item, lo, hi), item). This assumes that

list is already sorted.

>>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]

>>> data.sort(key=lambda r: r[1])

>>> keys = [r[1] for r in data] # precomputed list of keys

>>> data[bisect_left(keys, 0)]

('black', 0)

>>> data[bisect_left(keys, 1)]

('blue', 1)

>>> data[bisect_left(keys, 5)]

('red', 5)

>>> data[bisect_left(keys, 8)]

('yellow', 8)

>>>

I want to put

`('brown', 7)`

`('red', 5)`

`data`

`bisect.insort_left`

`bisect.insort_left(data, ('brown', 7))`

`('brown', 7)`

`data[0]`

Answer

This does essentially the same thing the `SortedCollection`

recipe does that the `bisect`

documentation mentions at the end that supports a key-function.

What's being done is a separate sorted `keys`

list is maintained in parallel with the sorted `data`

list to improve performance (it's faster than creating the keys list before each insertion, but keeping it around and updating it isn't a strict requirement). The ActiveState recipe encapsulated this for you within a class, but in the code below they're just two separate independent lists being passed around (so it'd be easier for them to get out of sync than it would be if they were both held in an instance of the recipe's class).

```
from bisect import bisect_left
# Based on code in the SortedCollection recipe:
# http://code.activestate.com/recipes/577197-sortedcollection/
def insert(seq, keys, item, keyfunc):
"""Insert an item into the sorted list using separate corresponding
keys list and a keyfunc to extract key from each item.
"""
k = keyfunc(item) # get key
i = bisect_left(keys, k) # determine where to insert item
keys.insert(i, k) # insert key of item in keys list
seq.insert(i, item) # insert the item itself in the corresponding spot
# initialize data and keys lists
data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
data.sort(key=lambda r: r[1]) # sort data by key value
keys = [r[1] for r in data] # initialize keys list
print data
# --> [('black', 0), ('blue', 1), ('red', 5), ('yellow', 8)]
insert(data, keys, ('brown', 7), keyfunc=lambda x: x[1])
print data
# --> [('black', 0), ('blue', 1), ('red', 5), ('brown', 7), ('yellow', 8)]
```

**Follow Up**

You simply can't use the `bisect.insort_left()`

function to do this because the way it was written doesn't support a key-function and instead just compares the whole item passed to it to insert, `x`

, with one of the whole items in the array in its `if a[mid] < x:`

statement. You can see exactly what I mean by looking at the source for the `bisect`

module in `Lib/bisect.py`

.

Here's the relevant excerpt:

```
def insort_left(a, x, lo=0, hi=None):
"""Insert item x in list a, and keep it sorted assuming a is sorted.
If x is already in a, insert it to the left of the leftmost x.
Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.
"""
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if a[mid] < x: lo = mid+1
else: hi = mid
a.insert(lo, x)
```

You could modify the above to have a key-function argument and use it:

```
def my_insort_left(a, x, lo=0, hi=None, keyfunc=lambda v: v):
x_key = keyfunc(x)
. . .
if keyfunc(a[mid]) < x_key: lo = mid+1
. . .
```

...and call it like this:

```
my_insort_left(data, ('brown', 7), key=lambda v: v[1])
```

*Actually*, if you're going to write a custom function, for the sake of more efficiency at the expense of generality not required, you could dispense with the adding of a generic key function argument and just hardcoded things to operate the way needed with the data format you're using:

```
def my_insort_left(a, x, lo=0, hi=None):
x_key = x[1] # key on second element of each item sequence
. . .
if a[mid][1] < x_key: lo = mid+1
. . .
```

...called this way:

```
my_insort_left(data, ('brown', 7))
```