webeno webeno - 7 months ago 14
SQL Question

Cumulative sum of unique elements sorted by other element in a nested array

I have the following MySQL query (

timestamp
is in Unix time, obviously):

SELECT usr_id, CONCAT(YEAR(FROM_UNIXTIME(timestamp)), "/", MONTH(FROM_UNIXTIME(timestamp)), "/", DAY(FROM_UNIXTIME(timestamp))) as date_stamp
FROM table
ORDER BY YEAR(FROM_UNIXTIME(timestamp)), MONTH(FROM_UNIXTIME(timestamp)), DAY(FROM_UNIXTIME(timestamp));


This produces something like this:

$arr = array(
array('usr_id'=>3, 'date_stamp'=>'2011/6/6'),
array('usr_id'=>2, 'date_stamp'=>'2011/6/20'),
array('usr_id'=>2, 'date_stamp'=>'2011/6/20'), // same id and date as above
array('usr_id'=>5, 'date_stamp'=>'2011/6/20'), // same date as above
array('usr_id'=>1, 'date_stamp'=>'2011/6/21'),
array('usr_id'=>4, 'date_stamp'=>'2011/6/21'), // same date as above
array('usr_id'=>2, 'date_stamp'=>'2011/6/21'), // same date as above...
//... and same id as a day before
);


I would like to turn it into something like this:

$arr = array(
array('sum'=>1, 'date_stamp'=>'2011/6/6'),
array('sum'=>3, 'date_stamp'=>'2011/6/20'), // +2 as one of the 3...
//... for this date was a duplicate
array('sum'=>5, 'date_stamp'=>'2011/6/21'), // +2 as one of the 3...
//... was already there on a different day
);


This is what I've tried, but I only later realised that it's only taking care of uniqueness for a given date, not the entire array as I intend it to do:

$sum = 0;
$tempRes = array();
$result = array();
$date = null;
foreach($arr as $row)
{
$date = $row['date_stamp'];
if (!in_array($row['usr_id'], $tempRes))
$tempRes[$date][] = $row2['usr_id'];
}
foreach ($tempRes as $date2 => $ids)
{
$sum += count($ids);
$result[] = array($date2, $sum);
}


Basically the purpose is to produce the cumulative sum of the number of
usr_id
's per day and ensure the same
usr_id
is only counted as one throughout the entire array, ie. make a cumulative sum of unique
usr_id
' sorted by day.

If you have ideas of better optimising the MySQL query, that is welcome too.

EDIT: I would like the "cumulation" to happen over the entire array, not just per day, as in my example output, ie. day 1 is 1, day 2 is 3 (1+2), day 3 is 5 (3+2)... etc.

Answer

You can first group unique users by day like this:

foreach ($arr as $item) {
    $days[$item['date_stamp']][$item['usr_id']] = 1; // value is irrelevant
}

Then you can create an array of all the users, union each day's users onto it and count the result of that to get the cumulative sum.

$all_users = array();
foreach ($days as $day => $users) {
    $all_users = $all_users + $users;
    $result[] = array('sum' => count($all_users), 'date_stamp' => $day);
}
Comments