Logan Logan - 2 months ago 11
MySQL Question

How do you display data from multiple entries in a mysql table that are JOINED with another via a single value match?

I need to select and display information from a pair of MySQL tables but the syntax eludes me. Specifically, I need to JOIN the data from the cwd_user table with the data from the cwd_user_attribute table on the field cwd_user.id == cwd_user_attribute.user_id, but I also need to display values from several entries in the cwd_user_attribute table in a single line. It's the latter that eludes me. Here are the gory details:

Given two tables:

mysql (crowd@prod:crowddb)> desc cwd_user;
+---------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+-------+
| id | bigint(20) | NO | PRI | NULL | |
| user_name | varchar(255) | NO | | NULL | |
| active | char(1) | NO | MUL | NULL | |
| created_date | datetime | NO | | NULL | |
| updated_date | datetime | NO | | NULL | |
| display_name | varchar(255) | YES | | NULL | |
| directory_id | bigint(20) | NO | MUL | NULL | |
+---------------------+--------------+------+-----+---------+-------+


mysql (crowd@prod:crowddb)> desc cwd_user_attribute;
+-----------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------+--------------+------+-----+---------+-------+
| id | bigint(20) | NO | PRI | NULL | |
| user_id | bigint(20) | NO | MUL | NULL | |
| directory_id | bigint(20) | NO | MUL | NULL | |
| attribute_name | varchar(255) | NO | | NULL | |
| attribute_value | varchar(255) | YES | | NULL | |
+-----------------------+--------------+------+-----+---------+-------+


Assume that there are up to seven possible values for cwd_user_attribute.attribute_name and I'm interested in four of them: lastAuthenticated, Team, Manager Notes. Example:

mysql (crowd@prod:crowddb)> select * from cwd_user_attribute where user_id = (select id from cwd_user where user_name = 'gspinrad');
+---------+---------+--------------+-------------------------+----------------------------------+
| id | user_id | directory_id | attribute_name | attribute_value |
+---------+---------+--------------+-------------------------+----------------------------------+
| 65788 | 32844 | 1 | invalidPasswordAttempts | 0 |
| 65787 | 32844 | 1 | lastAuthenticated | 1473360428804 |
| 65790 | 32844 | 1 | passwordLastChanged | 1374005378040 |
| 65789 | 32844 | 1 | requiresPasswordChange | false |
| 4292909 | 32844 | 1 | Team | Engineering - DevOps |
| 4292910 | 32844 | 1 | Manager | Matt Karaffa |
| 4292911 | 32844 | 1 | Notes | Desk 32:2:11 |
+---------+---------+--------------+-------------------------+----------------------------------+
5 rows in set (0.00 sec)


I can get a list of the users sorted by lastAuthenticated with this query:

SELECT cwd_user.user_name, cwd_user.id, cwd_user.display_name, from_unixtime(cwd_user_attribute.attribute_value/1000) as last_login FROM cwd_user JOIN cwd_directory ON cwd_user.directory_id = cwd_directory.id JOIN cwd_user_attribute ON cwd_user.id = cwd_user_attribute.user_id AND cwd_user_attribute.attribute_name='lastAuthenticated' WHERE DATEDIFF((NOW()), (from_unixtime(cwd_user_attribute.attribute_value/1000))) > 90 and cwd_user.active='T' order by last_login limit 4;


Result:

+-----------------------+---------+-----------------------+---------------------+
| user_name | id | display_name | last_login |
+-----------------------+---------+-----------------------+---------------------+
| jenkins-administrator | 1605636 | Jenkins Administrator | 2011-10-27 17:28:05 |
| sonar-administrator | 1605635 | Sonar Administrator | 2012-02-06 15:59:59 |
| jfelix | 1605690 | Joey Felix | 2012-02-06 19:15:15 |
| kbitters | 3178497 | Kitty Bitters | 2013-09-03 10:09:59 |


What I need to add to the output is the value of cwd_user_attribute.attribute_value where cwd_user_attribute.attribute_name is Team, Manager, and/or Notes. The output would look something like this:

+-----------------------+---------+-----------------------+-------------------------------------------------------------------+
| user_name | id | display_name | last_login | Team | Manager | Notes |
+-----------------------+---------+-----------------------+-------------------------------------------------------------------+
| jenkins-administrator | 1605636 | Jenkins Administrator | 2011-10-27 17:28:05 | Internal | Internal | |
| sonar-administrator | 1605635 | Sonar Administrator | 2012-02-06 15:59:59 | Internal | Internal | |
| jfelix | 1605690 | Joey Felix | 2012-02-06 19:15:15 | Hardware Eng. | Gary Spinrad | Desk 32:1:51 |
| kbitters | 3178497 | Kitty Bitters | 2013-09-03 10:09:59 | Software QA | Matt Karaffa | Desk 32:2:01 |
+-----------------------+---------+-----------------------+-------------------------------------------------------------------+

Answer

You can achieve that result with an additional LEFT JOIN with the attribute table. Then use GROUP BY and aggregated CASE statements to pivot the result (rows to columns).

SELECT
    cwd_user.user_name,
    cwd_user.id,
    cwd_user.display_name,
    from_unixtime(cwd_user_attribute.attribute_value/1000) as last_login,
    MIN(CASE WHEN attr2.attribute_name = 'TEAM'    THEN attr2.attribute_value END) as Team,
    MIN(CASE WHEN attr2.attribute_name = 'Manager' THEN attr2.attribute_value END) as Manager,
    MIN(CASE WHEN attr2.attribute_name = 'Notes'   THEN attr2.attribute_value END) as Notes
FROM cwd_user 
JOIN cwd_user_attribute
  ON  cwd_user.id = cwd_user_attribute.user_id
  AND cwd_user_attribute.attribute_name='lastAuthenticated'
LEFT JOIN cwd_user_attribute attr2
  ON  cwd_user.id = attr2.user_id
  AND attr2.attribute_name IN ('Team', 'Manager', 'Notes')
WHERE DATEDIFF((NOW()), (from_unixtime(cwd_user_attribute.attribute_value/1000))) > 90
  and cwd_user.active='T'
GROUP BY cwd_user.id
order by last_login limit 4

With strict mode you would need to list all not aggregated colums in the GROUP BY clause

GROUP BY 
    cwd_user.user_name,
    cwd_user.id,
    cwd_user.display_name,
    cwd_user_attribute.attribute_value

Another way is just to use three LEFT JOINs (one join per attribute name):

SELECT
    cwd_user.user_name,
    cwd_user.id,
    cwd_user.display_name,
    from_unixtime(cwd_user_attribute.attribute_value/1000) as last_login,
    attr_team.attribute_value as Team,
    attr_manager.attribute_value as Manager,
    attr_notes.attribute_value as Notes
FROM cwd_user 
JOIN cwd_user_attribute
  ON  cwd_user.id = cwd_user_attribute.user_id
  AND cwd_user_attribute.attribute_name='lastAuthenticated'
LEFT JOIN cwd_user_attribute attr_team
  ON  cwd_user.id = attr2.user_id
  AND attr2.attribute_name = 'Team'
LEFT JOIN cwd_user_attribute attr_manager
  ON  cwd_user.id = attr2.user_id
  AND attr2.attribute_name = 'Manager'
LEFT JOIN cwd_user_attribute attr_notes
  ON  cwd_user.id = attr2.user_id
  AND attr2.attribute_name = 'Notes'
WHERE DATEDIFF((NOW()), (from_unixtime(cwd_user_attribute.attribute_value/1000))) > 90
  and cwd_user.active='T'
order by last_login limit 4

Note: I have removed the join with directory table because you seem not to use it. Add it again, if you need it for filtering.

Note 2: Some attributes that you often use for a search (like lastAuthenticated) should be converted to indexed columns in the users table to improve the search performance.

Comments