webeno webeno - 1 year ago 46
SQL Question

Cumulative sum of unique elements sorted by other element in a nested array

I have the following MySQL query (

is in Unix time, obviously):

SELECT usr_id, CONCAT(YEAR(FROM_UNIXTIME(timestamp)), "/", MONTH(FROM_UNIXTIME(timestamp)), "/", DAY(FROM_UNIXTIME(timestamp))) as date_stamp
FROM table

This produces something like this:

$arr = array(
array('usr_id'=>3, 'date_stamp'=>'2011/6/6'),
array('usr_id'=>2, 'date_stamp'=>'2011/6/20'),
array('usr_id'=>2, 'date_stamp'=>'2011/6/20'), // same id and date as above
array('usr_id'=>5, 'date_stamp'=>'2011/6/20'), // same date as above
array('usr_id'=>1, 'date_stamp'=>'2011/6/21'),
array('usr_id'=>4, 'date_stamp'=>'2011/6/21'), // same date as above
array('usr_id'=>2, 'date_stamp'=>'2011/6/21'), // same date as above...
//... and same id as a day before

I would like to turn it into something like this:

$arr = array(
array('sum'=>1, 'date_stamp'=>'2011/6/6'),
array('sum'=>3, 'date_stamp'=>'2011/6/20'), // +2 as one of the 3...
//... for this date was a duplicate
array('sum'=>5, 'date_stamp'=>'2011/6/21'), // +2 as one of the 3...
//... was already there on a different day

This is what I've tried, but I only later realised that it's only taking care of uniqueness for a given date, not the entire array as I intend it to do:

$sum = 0;
$tempRes = array();
$result = array();
$date = null;
foreach($arr as $row)
$date = $row['date_stamp'];
if (!in_array($row['usr_id'], $tempRes))
$tempRes[$date][] = $row2['usr_id'];
foreach ($tempRes as $date2 => $ids)
$sum += count($ids);
$result[] = array($date2, $sum);

Basically the purpose is to produce the cumulative sum of the number of
's per day and ensure the same
is only counted as one throughout the entire array, ie. make a cumulative sum of unique
' sorted by day.

If you have ideas of better optimising the MySQL query, that is welcome too.

EDIT: I would like the "cumulation" to happen over the entire array, not just per day, as in my example output, ie. day 1 is 1, day 2 is 3 (1+2), day 3 is 5 (3+2)... etc.

Answer Source

You can first group unique users by day like this:

foreach ($arr as $item) {
    $days[$item['date_stamp']][$item['usr_id']] = 1; // value is irrelevant

Then you can create an array of all the users, union each day's users onto it and count the result of that to get the cumulative sum.

$all_users = array();
foreach ($days as $day => $users) {
    $all_users = $all_users + $users;
    $result[] = array('sum' => count($all_users), 'date_stamp' => $day);