StrikeW - 1 year ago 52

C Question

Here are two implementation of "forgetful" binary search, since they don't check for the exact match until they're finished.

1)

`int bsearch1(int A[], int n, int target)`

{

int low = 0, high = n - 1, mid = 0;

while (low < high)

{

mid = (low + high) >> 1;

if (target > A[mid])

low = mid + 1;

else

high = mid;

}

if (low == high)

{

if (A[low] == target)

return low;

}

return -1;

}

2)

`int bsearch2(int A[], int n, int target)`

{

int low = 0, high = n - 1, mid = 0;

while (low < high)

{

mid = (low + high) >> 1;

if (target < A[mid])

high = mid - 1;

else

low = mid;

}

if (low == high)

{

if (A[low] == target)

return low;

}

return -1;

}

NOTES:

`n`

`target`

`bsearch1`

`bsearch2`

`bsearch2`

`bsearch1`

`bsearch2`

`bsearch2`

EDIT:

I have evaluated the whole process of the example A=[1,3,5,6], target=5:

1.low = 0, high = 3, mid = 1, A[mid] = 3

2.low = 1, high = 3, mid = 2, A[mid] = 5

3.low = 2, high = 3, mid = 2, A[mid] = 5

...

n.low = 2, high = 3, mid = 2, A[mid] = 5

I found that

`bsearch2`

`low == high`

`low`

`high`

`low == high`

`bsearch1`

Answer Source

Your second algorithm suffers from a repeating cycle as soon as you encounter a partition where `high == (low+1)`

. When that happens, you essentially have `mid = (low + low + 1)/2`

, which is equivalent to `(2*low)/2 + 1/2`

. With integer division, this results in `mid = low + 0`

. Since your only movement on the low side is `low = mid`

, but they're already equivalent, you have an infinite loop.

The reason this does *not* happen on the first implementation is the direction of the integer division loss. It is always *down*. Therefore `high`

moving *down* does not suffer from this, and in-fact actually takes *advantage* of it.

To account for this in `bsearch2`

the same way `bsearch1`

takes advantage of the natural low-direction bias, the disgruntled rounding has to be accounted for in the mid-point calculation so it *always* moves in favor of the high-side. To do that, force the error out by biasing the calculation in the opposite direction. I.e. for `bsearch2`

, do this:

```
mid = (low + high + 1) >> 1;
```

and truth be told, to avoid overflow, this really should be

```
mid = low + ((high - low + 1) >> 1);
```

This will achieve the same effect adjusting `bsearch2`

midpoints that `bsearch1`

does. An example is worth noting:

```
#include <stdio.h>
int bsearch2(int A[], int n, int target)
{
int low = 0, high = n - 1, mid = 0;
while (low < high)
{
mid = low + ((high - low + 1) >> 1);
if (target < A[mid])
high = mid - 1;
else
low = mid;
}
if (low == high)
{
if (A[low] == target)
return low;
}
return -1;
}
int main()
{
// build a sorted array from 1...20
int A[20];
for (int i=0; i<sizeof(A)/sizeof(*A); ++i)
A[i] = i+1;
for (int i=0; i<=sizeof(A)/sizeof(*A)+1; ++i)
printf("Search for %d found index %d\n", i, bsearch2(A, sizeof(A)/sizeof(*A), i));
return 0;
}
```

**Output**

```
Search for 0 found index -1
Search for 1 found index 0
Search for 2 found index 1
Search for 3 found index 2
Search for 4 found index 3
Search for 5 found index 4
Search for 6 found index 5
Search for 7 found index 6
Search for 8 found index 7
Search for 9 found index 8
Search for 10 found index 9
Search for 11 found index 10
Search for 12 found index 11
Search for 13 found index 12
Search for 14 found index 13
Search for 15 found index 14
Search for 16 found index 15
Search for 17 found index 16
Search for 18 found index 17
Search for 19 found index 18
Search for 20 found index 19
Search for 21 found index -1
```

I hope that made sense.