fishbone fishbone - 4 years ago 97
SQL Question

Merging related data when a column should become unique

Given a mysql-database with tables as follows:

author:
+----+----------+
| id | name |
+----+----------+
| 1 | John |
| 2 | Peter |
| 3 | Peter |
+----+----------+

article:
+----+-----------+------+
| id | author_id | text |
+----+-----------+------+
| 1 | 2 | ... |
| 2 | 3 | ... |
| 3 | 3 | ... |
+----+-----------+------+


The author-table's name-column wasn't set to unique by accident. Now I have to "merge" related articles into one of the related authors, i.e. set author_id of articles 2 and 3 to 2. I want to make the name-column unique afterwards.

I cannot reassign the articles manually, because there are too many affected records. But I thought there may be a ready solution / snippet for this problem.

Answer Source

To update your article table, this will do the trick:

update article art
   set art.author_id = (select min(aut.id)
                          from author aut
                         where aut.name = (select a.name
                                             from author a
                                            where a.id = art.author_id));

select * from article;    
+ ------- + -------------- + --------- +
| id      | author_id      | text      |
+ ------- + -------------- + --------- +
| 1       | 2              |           |
| 2       | 2              |           |
| 3       | 2              |           |
+ ------- + -------------- + --------- +
3 rows

if you prefer a more compact update (and more optimized), then you can use this one, that works the same way:

update article art
   set art.author_id = (select min(aut.id)
                          from author aut
                         inner join author a on a.name = aut.name
                         where a.id = art.author_id);

Finally, to delete the extra authors, you need

delete a
  from author a
 inner join (
    select name, min(id) as min -- this subquery returns all repeated names and their smallest id
      from author
     group by name
    having count(*) > 1) repeated on repeated.name = a.name
 where a.id > repeated.min;     -- delete all repeateds except the first one

select * from author;    
+ ------- + --------- +
| id      | name      |
+ ------- + --------- +
| 1       | John      |
| 2       | Peter     |
+ ------- + --------- +
2 rows

This works for any number of repeated sets of authors.

Hope this helps

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download