Lee Lee - 20 days ago 8
MySQL Question

User-variable to select value from previous row fine in MySQL 5.5 but in 5.6 it seems to be running the SELECT part of the query in a different order

I have a rather unusual issue which has come up when moving to a staging server from a local environment. The MySQL version is different but I am unable to find any reason why this issue would have occurred.

The code I am using is:

DECLARE _first_event_time datetime;

SELECT event_time INTO _first_event_time
FROM session_events
WHERE session_id = _session_id
ORDER BY event_id ASC LIMIT 0,1;

-- Select Data

SELECT event_id,
event_time,
timediff(event_time, _first_event_time) AS event_time_from_start,
(@previous := unix_timestamp(event_time) - @previous) AS event_duration,
(@previous := unix_timestamp(event_time)) AS previous_event_timestamp
FROM session_events e,

(SELECT @previous:=0) c
ORDER BY event_id ASC LIMIT 0, 400;


What I would expect is in the following results, which is what I am getting from 5.5 running on my DEV version running MAMP:

+----------+------------------+-----------------------+----------------+--------------------------+
| event_id | event_time | event_time_from_start | event_duration | previous_event_timestamp |
+----------+------------------+-----------------------+----------------+--------------------------+
| 5074 | 24/09/2015 20:35 | 00:00:00 | 1443123315 | 1443123315 |
| 5075 | 24/09/2015 20:35 | 00:00:02 | 2 | 1443123317 |
| 5076 | 24/09/2015 20:35 | 00:00:03 | 1 | 1443123318 |
| 5077 | 24/09/2015 20:35 | 00:00:03 | 0 | 1443123318 |
| 5078 | 24/09/2015 20:35 | 00:00:04 | 1 | 1443123319 |
| 5079 | 24/09/2015 20:35 | 00:00:05 | 1 | 1443123320 |
| 5080 | 24/09/2015 20:35 | 00:00:06 | 1 | 1443123321 |
| 5081 | 24/09/2015 20:35 | 00:00:06 | 0 | 1443123321 |
| 5082 | 24/09/2015 20:35 | 00:00:07 | 1 | 1443123322 |
| 5083 | 24/09/2015 20:35 | 00:00:08 | 1 | 1443123323 |
| 5084 | 24/09/2015 20:35 | 00:00:09 | 1 | 1443123324 |
| 5085 | 24/09/2015 20:35 | 00:00:10 | 1 | 1443123325 |
+----------+------------------+-----------------------+----------------+--------------------------+


But instead the data is coming out like this (please note that the data values are different as I have created the same values on both servers but you can see where the data is being offset, the code in the stored procedure is identical on both servers):

+----------+------------------+-----------------------+----------------+--------------------------+
| event_id | event_time | event_time_from_start | event_duration | previous_event_timestamp |
+----------+------------------+-----------------------+----------------+--------------------------+
| 5307 | 24/09/2015 20:32 | 00:00:00 | -17 | 1443123174 |
| 5308 | 24/09/2015 20:33 | 00:00:07 | 1443123181 | 1443123181 |
| 5309 | 24/09/2015 20:33 | 00:00:08 | -7 | 1443123182 |
| 5310 | 24/09/2015 20:33 | 00:00:09 | 2 | 1443123183 |
| 5311 | 24/09/2015 20:33 | 00:00:11 | 3 | 1443123185 |
| 5312 | 24/09/2015 20:33 | 00:00:12 | 3 | 1443123186 |
| 5313 | 24/09/2015 20:33 | 00:00:13 | 2 | 1443123187 |
| 5314 | 24/09/2015 20:33 | 00:00:14 | 2 | 1443123188 |
| 5315 | 24/09/2015 20:33 | 00:00:15 | 2 | 1443123189 |
| 5316 | 24/09/2015 20:33 | 00:00:15 | 1 | 1443123189 |
| 5317 | 24/09/2015 20:33 | 00:00:17 | 2 | 1443123191 |
| 5318 | 24/09/2015 20:33 | 00:00:18 | 18 | 1443123192 |
+----------+------------------+-----------------------+----------------+--------------------------+


The timezone shouldn't make a difference as this is being run from data and a stored procedure. I have also tried changing from a timestamp to just using a datediff using the follow:

, ( TIME_TO_SEC(TIMEDIFF(event_time, @previous))) AS event_duration
, ( @previous := event_time ) AS previous_event_timestamp


Looking into this a bit further it seems to be that issue is somewhere with the ORDER BY clause, it seems to be selecting the results in an unsorted order when populating the @previous field on the initial run through on this server as if I remove the ORDER BY clause and just set @previous to event_time then the order is the one that is used that is causing the issue.

If I run just a select statement without the @previous and no ORDER BY clause the dataset returned by the 5.6 server seems to match the unusual order being shown in the 2nd example where the @previous seems to be picking random values.

I can only thing that the part of the SELECT statement that is running through the data and setting the @previous value is running first then when it returns to data it then runs the ORDER BY which is where the issue is coming from.

Looking through the MySQL manual I have not found anything (at least not what I have been able to attribute to why this is happening) between 5.5 and 5.6. I presume there is a way to get this working.

Any suggestions would be greatly appreciated as I can not figure out why it is electing in the order it is prior to the ORDER BY clause.

Answer

MySQL explicitly warns against using variables and assigning them in different expressions in the same select. I would suggest you do this using a subquery:

SELECT t.*,
       (unix_timestamp(event_time) - previous_event_timestamp) as event_duration
FROM (SELECT event_id, event_time,
             timediff(event_time, _first_event_time) AS event_time_from_start,
             if((@pp := @p) = null, null, -- never happens
                if(@p := unix_timestamp(event_time), @pp, @pp)
               ) AS previous_event_timestamp
      FROM session_events e CROSS JOIN
           (SELECT @p := 0, @pp := 0) c
      ORDER BY event_id ASC
      LIMIT 0, 400
     ) t;

The if logic is just a way of putting the assignments in the same statement. The first assigns @pp to be the current previous value (in @p). The second reassign @p, and then returns @pp.

The outer query then does the calculations on the value.

The documentation says:

As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement.

It is best to avoid "features" that the documentation explicitly warns against using.