JW52761 JW52761 - 11 months ago 43
MySQL Question

MySQL Summation of Group inside larger Recordset

I have an interesting situation. I will post the SQL then explain.

`c`.`clusterName` AS `clusterName`,
DATE_FORMAT(`p`.`insertedTS`, '%Y-%m-%d') AS `insertedDate`,
MAX(`p`.`maxMemory`) AS `peakMaxMemory`,
MIN(`p`.`maxMemory`) AS `minMaxMemory`,
MAX(`p`.`usedMem`) AS `peakUsedMem`,
MIN(`p`.`usedMem`) AS `minUsedMem`,
MAX(`p`.`availMem`) AS `peakAvailMem`,
MIN(`p`.`availMem`) AS `minAvailMem`,
MAX(`p`.`numPoweredOnVms`) AS `peakPoweredOnVms`,
MIN(`p`.`numPoweredOnVms`) AS `minPoweredOnVms`,
(MAX(`p`.`maxMemory`) - MIN(`p`.`maxMemory`)) AS `deltaMaxMemory`,
(MAX(`p`.`usedMem`) - MIN(`p`.`usedMem`)) AS `deltaUsedMem`,
(MAX(`p`.`availMem`) - MIN(`p`.`availMem`)) AS `deltaAvailMem`,
(MAX(`p`.`numPoweredOnVms`) - MIN(`p`.`numPoweredOnVms`)) AS `deltaPoweredOnVms`
(`vtrend`.`clusters` `c`
JOIN `vtrend`.`capacityDataRawPOSH` `p` ON ((`c`.`clusterID` = `p`.`clusterID`)))
GROUP BY `c`.`clusterName` , DATE_FORMAT(`p`.`insertedTS`, '%Y-%m-%d')
ORDER BY `c`.`clusterName`

The results look like this:


So what I am trying to accomplish (and failing) is to add a column at the end that will show wither the numPoweredOnVms count has increased or decreased over the day. Since each of the records above represents the grouping of an entire day, which the data is entered every 15 minutes, I'm hoping to be able to look at numPoweredOnVms for the first record and the day and subtract the last record of the day from that and either get a positive (increase) or negative (decrease) count that I can use an IF statement on to present the "increase" or "decrease" text with.

Here's a sample of the raw data:

739,"2016-09-20 16:50:12",29,384,221,111,111,111,268032,178688,3394,17
976,"2016-09-20 21:50:42",29,384,221,111,111,111,268032,178688,4072,17
1074,"2016-09-20 22:13:07",29,384,221,111,111,111,268032,178688,4683,17
1172,"2016-09-20 22:35:36",29,384,221,111,111,111,268032,178688,3916,17
1270,"2016-09-20 22:58:01",29,384,221,111,111,111,268032,178688,3857,17
1365,"2016-09-20 23:21:06",29,384,221,111,111,111,268032,178688,7867,17
1463,"2016-09-20 23:43:37",29,384,221,111,111,111,268032,178688,3971,17
1575,"2016-09-21 00:11:05",29,384,221,111,111,111,268032,178688,8196,17
2149,"2016-09-21 19:13:33",29,384,221,111,111,111,268032,178688,3758,17
2495,"2016-09-21 21:01:48",29,384,221,111,111,111,268032,178688,3927,16

So basically I would want to be able to pull the numPoweredOnVms from the first record and the last record of the day and subtract them. Now the fun part is how to do this for multiple clusters over multiple days? I'm thinking that I would maybe need to do a separate view of the raw data and calculate that record by record then use that view as the backing for this query, just not sure.

Answer Source

Use your original query as a subquery, and have it return the first and last timestamps from each day. Then join back with the table to get the value of that column at each of these times.

SELECT c.clusterName, p.*,
       CASE SIGN(plast.numPoweredOnVms  > pfirst.numPoweredOnVms)
            WHEN -1 THEN 'Decrease'
            WHEN 0 THEN 'No change'
            ELSE 'Increase'
       END AS changePoweredOnVms
FROM vtrend.clusters AS c
        DATE(`insertedTS`) AS `insertedDate`,
        MAX(`maxMemory`) AS `peakMaxMemory`,
        MIN(`maxMemory`) AS `minMaxMemory`,
        MAX(`usedMem`) AS `peakUsedMem`,
        MIN(`usedMem`) AS `minUsedMem`,
        MAX(`availMem`) AS `peakAvailMem`,
        MIN(`availMem`) AS `minAvailMem`,
        MAX(`numPoweredOnVms`) AS `peakPoweredOnVms`,
        MIN(`numPoweredOnVms`) AS `minPoweredOnVms`,
        (MAX(`maxMemory`) - MIN(`maxMemory`)) AS `deltaMaxMemory`,
        (MAX(`usedMem`) - MIN(`usedMem`)) AS `deltaUsedMem`,
        (MAX(`availMem`) - MIN(`availMem`)) AS `deltaAvailMem`,
        (MAX(`numPoweredOnVms`) - MIN(`numPoweredOnVms`)) AS `deltaPoweredOnVms`
        MIN(insertedTS) AS firstTS,
        MAX(insertedTS) AS lastTS
    FROM `vtrend`.`capacityDataRawPOSH`
    GROUP BY clusterID, insertedDate) AS p ON p.clusterID = c.clusterID
JOIN vtrend.capacityDataRawPOSH AS pfirst ON pfirst.clusterID = p.clusterID AND pfirst.insertedTS = p.firstTS
JOIN vtrend.capacityDataRawPOSH AS plast ON plast.clusterID = p.clusterID AND plast.insertedTS = p.lastTS

Also, use DATE(insertedTS) rather than DATE_FORMAT(insertedTS, '%Y-%m-%d') to convert a timestamp to a date. %Y-%m-%d is the default format used when displaying dates.