Jord123 Jord123 - 4 months ago 23
MySQL Question

Binary search in a MySQL table with deleted rows PHP

I have a large MySQL table with sorted data. When I need to find a starting point, I perform a binary search to find the lower bound ID (auto increment). The only problem is once some data is deleted, I need to look at the first existing row with a lower ID if the ID given by the algorithm doesn't exist. How should I modify this code to achieve that?

$l = 1;
$h = $max; //SELECT MAX(id)
while ($h - $l > 1){
$m = ($h + $l) / 2;
$q = mysqli_query($db, "SELECT col FROM tab WHERE id=". floor($m));
$result = array();
while($result[] = mysqli_fetch_row($q)){;}
if ($result[0][0] < $val) $l = $m;
else $h = $m;
echo round($m);

For example I want to find which rows have the value of col greater than 12345 and the table has max ID 10000. I start by looking at row 5000, where the col = 9000, then 7500 (col = 13000), then 6250 has been deleted, so I start looking for the 1st existing row with ID < 6250 and I find that 6245 has col = 10500. Now I'm looking between IDs 6873 and 7500 etc.


The right way to do this

So you have a table like this:

| ID  | col |
| 1   | 15  |
| 3   | 155 |
| 18  | 9231|
| 190 |14343|
| 500 |16888|

You can get find 14343 with the following query:

SELECT ID, col FROM the_table WHERE col>12345 LIMIT 1;

To make it faster, you'd need to add an index (index word is worth googling)

ALTER TABLE `the_table` ADD INDEX `col` (`col`);

After that mysql will create a tree structure internally and will be doing binary searches on it for you.

This will be working much faster as you'll avoid multiple network roundtrips + other per request expenses (query parsing, optimization, all the locks & mutexes, ...)

Answer to your question

I need to look at the first existing row with a lower ID

E.g. you'd like to get first row with an ID < than 300, you do this (limit is what makes the query return only 1 result):

SELECT col FROM the_table WHERE ID < 300 LIMIT 1;