bjd2385 - 1 year ago 42

Perl Question

I'm a little confused by the following results and I was hoping some of you might be able to shed some light on why linear search appears to be faster than both binary and interpolation in Perl.

`Benchmark: timing 1000000 iterations of Binary, Interpolation, Linear...`

Binary: 17 wallclock secs (16.33 usr + 0.00 sys = 16.33 CPU) @ 61236.99/s (n=1000000)

Interpolation: 4 wallclock secs ( 3.65 usr + 0.00 sys = 3.65 CPU) @ 273972.60/s (n=1000000)

Linear: 2 wallclock secs ( 1.52 usr + 0.00 sys = 1.52 CPU) @ 657894.74/s (n=1000000)

Each of the functions are below. I'm trying to write a bunch of well known algorithms and following along in

`sub LinearSearch {`

# Search linearly for a value

my $val = $_[0];

my $arrptr = $_[1];

for (my $i=0; $i<ARR_LENGTH; ++$i) {

if ($arrptr->[$i] == $val) {

return $i;

}

}

return -1;

}

sub BinarySearch {

my $val = $_[0];

my $arrptr = $_[1];

my $low = 0;

my $high = ARR_LENGTH; # to be modified

while ($low <= $high) {

my $middle = int(($low + $high) / 2);

my $midValue = $arrptr->[$middle];

if ($midValue < $val) {

$low = $middle + 1;

} elsif ($midValue > $val) {

$high = $middle - 1;

} else {

return $middle;

}

}

return -1;

}

sub InterpolationSearch {

my $val = $_[0];

my $arrptr = $_[1];

my $low = 0;

my $high = ARR_LENGTH; # to be modified

while ($val >= $arrptr->[$low] && $val <= $arrptr->[$high]) {

# solve for the middle value again

my $middle = int($low + ($high - $low)*(($val - @{$arrptr}[$low])

/ (@{$arrptr}[$high] - @{$arrptr}[$low] + 1)));

my $middleVal = $arrptr->[$middle];

if ($middleVal < $val) {

$low = $middle + 1;

} elsif ($middleVal > $val) {

$high = $middle - 1;

} else {

return $middle;

}

}

return -1; # Not found

}

Additionally,

`ARR_LENGTH`

`use constant ARR_LENGTH => 10_000;`

at the outset. It's really just odd that a binary search would take so long, and then interpolation less so, but still twice as long as Linear search.

Benchmarking code (just what I found online):

`my @array = OrderedArray();`

my $random_val = $array[int(rand(ARR_LENGTH))];

timethese(1_000_000, {

Interpolation => 'InterpolationSearch($random_val, \@array)',

Binary => 'BinarySearch($random_val, \@array)',

Linear => 'LinearSearch($random_val, \@array)' }

);

where

`OrderedArray()`

`sub OrderedArray {`

# Create a random ordered array

my @arr;

for (my $i=1; $i<=ARR_LENGTH; ++$i) {

push @arr, $i;

}

return @arr;

}

Answer Source

When you pass a string to timethese instead of a sub ref, your @array and $random_val variables aren't in the scope where Benchmark evals it. So it is not actually running with the data you specify.

Try running it as:

```
use Benchmark 'timethese';
use strict;
use warnings 'all';
use constant ARR_LENGTH => 10000;
my @array = OrderedArray();
my $random_val = $array[int(rand(ARR_LENGTH))];
timethese(
-5,
{
'Interpolation' => sub { InterpolationSearch($random_val, \@array) },
'Binary' => sub { BinarySearch($random_val, \@array) },
'Linear' => sub { LinearSearch($random_val, \@array) },
}
);
```

Enabling warnings reveals an error in InterpolationSearch. Enabling warnings and setting $random_val to ARR_LENGTH +1 reveals an error in BinarySearch. You might consider writing some test cases and validating your code before you worry about benchmarking.

You might prefer cmpthese to timethese; I don't find timethese output as helpful.