Vor Vor - 3 months ago 8
MySQL Question

Update rows if total column sum is less than X

I'm trying to figure out how to update rows in the table if their total size less than

x
.

Here is my setup:

create table test_limit (
id int not null auto_increment primary key,
folder varchar(255),
status varchar(32) DEFAULT 'awaiting',
size bigint unsigned default 0,
request_id varchar(32)
)
ENGINE=InnoDB;

insert into test_limit
(folder, status, size)
values
('/tmp/AAA/bar', 'awaiting', 200 ),
('/tmp/AAA/bar', 'awaiting', 200 ),
('/tmp/AAA/bar', 'awaiting', 200 ),
('/tmp/BBB/bar', 'awaiting', 200 ),
('/tmp/BBB/bar', 'awaiting', 200 );


I have a table with 5 rows, each row has a size what I want to do is to update a group of rows that:


  • has the same
    folder
    value

  • status is not in
    in_progress
    or
    created

  • has total limit of 400



I came up with the following update command:

SET @request_id='bbb';
UPDATE test_limit t1
JOIN
( SELECT folder FROM test_limit WHERE status = 'awaiting' GROUP BY folder limit 1) t2
ON t1.folder = t2.folder
LEFT JOIN
( SELECT folder FROM test_limit WHERE status IN ('in_progress', 'created') GROUP BY folder limit 1) t3
ON t1.folder = t3.folder
JOIN
( SELECT id, @total := @total + size AS total FROM (test_limit, (select @total := 0) t) WHERE @total < 400 and status='awaiting') t4
ON t1.id=t4.id
SET t1.status = 'in_progress',
t1.request_id = @request_id
WHERE t1.status = 'awaiting' AND t3.folder is NULL;


But the problem is it is working first time, but doesn't work any other times:

mysql> select * from test_limit;
+----+--------------+-------------+------+------------+
| id | folder | status | size | request_id |
+----+--------------+-------------+------+------------+
| 1 | /tmp/AAA/bar | in_progress | 200 | bbb |
| 2 | /tmp/AAA/bar | in_progress | 200 | bbb |
| 3 | /tmp/AAA/bar | awaiting | 200 | NULL |
| 4 | /tmp/BBB/bar | awaiting | 200 | NULL |
| 5 | /tmp/BBB/bar | awaiting | 200 | NULL |
+----+--------------+-------------+------+------------+
5 rows in set (0.07 sec)


UPDATE:

The above result is correct for the first run. What I want to achieve in the second run (say request_id = 'aaa' ):

mysql> select * from test_limit;
+----+--------------+-------------+------+------------+
| id | folder | status | size | request_id |
+----+--------------+-------------+------+------------+
| 1 | /tmp/AAA/bar | in_progress | 200 | bbb |
| 2 | /tmp/AAA/bar | in_progress | 200 | bbb |
| 3 | /tmp/AAA/bar | awaiting | 200 | NULL |
| 4 | /tmp/BBB/bar | in_progress | 200 | aaa |
| 5 | /tmp/BBB/bar | in_progress | 200 | aaa |
+----+--------------+-------------+------+------------+
5 rows in set (0.07 sec)


And in the third run it should not update anything because all the values are "in_progress".

How can I achieve this?

Answer

got it took me a while to think through the logic. Here is the sql fiddle http://sqlfiddle.com/#!9/227dd0/1

UPDATE test_limit u
JOIN
(
  SELECT
    t1.*
    ,f.NonAwaitingFolderTotal
    ,(@runtot := @runtot + t1.size) as RunningTotal
  FROM
    (
      SELECT
        folder
        ,SUM(CASE WHEN status <> 'awaiting' THEN size ELSE 0 END) as NonAwaitingFolderTotal
      FROM
        test_limit t
      GROUP BY
        folder
      HAVING
        SUM(CASE WHEN status <> 'awaiting' THEN size ELSE 0 END) <= 400
      ORDER BY
        NonAwaitingFolderTotal, folder
      LIMIT 1
    ) f
    INNER JOIN test_limit t1
    ON f.folder = t1.folder
    CROSS JOIN (SELECT @runtot:=0) var
  WHERE
    t1.status = 'awaiting'
)  t2
ON u.id = t2.id
AND (t2.NonAwaitingFolderTotal + t2.RunningTotal) <= 400
SET
  u.status = 'in_progress'
  ,u.request_id = @request_id
;

The logic goes like this

  • find out folder to use and find the Non Awaiting Total Size that is currently in that folder. Then select a folder by the lowest non awaiting size (in_progress,created) and if tied by folder name and then limit 1.
  • Get a Running Total of All awaiting records in that folder to be used to determine which rows can be updated before hitting the max allowed.
  • Do the update with a join to the results of the running total query where Total Size of the NonAwaiting records + the running total of that record are less than the 400 maximum.

And just because I want to keep this around somewhere the main issue was the running total you where using wasn't grouped by the right level. here are a few running total & row number functions I worked through thinking about it.

SELECT 
  *
  ,(@foldercount := IF(@prevfolder=folder,@foldercount,@foldercount+1)) as FolderNum
  ,(@rownum := @rownum + 1) as RowNum
  ,(@grouprownum := IF(@prevfolder=folder,@grouprownum+1,1)) as GroupRowNum
  ,(@total := IF(@prevfolder=folder,@total + t.size,t.size)) as GroupRunningTotal
  ,(@GroupAwaitRunningTotal := IF(
        @prevfolder=folder
        ,IF(t.status = 'awaiting',@GroupAwaitRunningTotal + t.size,@GroupAwaitRunningTotal)
        ,IF(t.status = 'awaiting',t.size,0)
      )
   ) as GroupAwaitRunningTotal
   ,(@GroupNonAwaitRunningTotal := IF(
        @prevfolder=folder
        ,IF(t.status != 'awaiting',@GroupNonAwaitRunningTotal + t.size,@GroupNonAwaitRunningTotal)
        ,IF(t.status != 'awaiting',t.size,0)
      )
   ) as GroupNonAwaitRunningTotal
  ,(@runtot := @runtot + t.size) as RunningTotal
  ,@prevfolder:=folder
FROM 
  test_limit t
  CROSS JOIN
  (SELECT @prevfolder:=NULL, @GroupAwaitRunningTotal := 0
     ,@GroupNonAwaitRunningTotal := 0
     ,@total:=0, @rownum:=0, @grouprownum:=0, @runtot:=0, @foldercount:=0) var