RkdL RkdL - 5 months ago 14
SQL Question

concat fields that have a duplicate key in common and remove those dupes leaving one

This is bugging me for the last week now, and with basic knowledge i simply can't get to to a solution that brings us where we need to be. I've searched on updateing, group_concat, concat, joins, but i cant get the result we need.

The problem is as follows:

we have 2 columns that we work with to get to one result we need:

The DB we're working on:

+------------------------------------+-------------+
| id | value |
+------------------------------------+-------------+
| D44172cb5c086c19a4c0286270916bc52 | A |
| D44172cb5c086c19a4c0286270916bc52 | C |
| D44172cb5c086c19a4c0286270916bc52 | B |
| D87d93570cbc9a3edc18601e0aff6e261 | D |
| D87d93570cbc9a3edc18601e0aff6e261 | A |
| D87d93570cbc9a3edc18601e0aff6e261 | F |
+------------------------------------+-------------+


What we expect to get:

+-------------------------------------+-------------+
| id | value |
+-------------------------------------+-------------+
| D44172cb5c086c19a4c0286270916bc52 | A|C|B |
| D87d93570cbc9a3edc18601e0aff6e261 | D|A|F |
+-------------------------------------+-------------+


we're currently using a SELECT command that works and is giving the data as expected in a temporary(since we're only selecting) column named valuenew:

SELECT * GROUP_CONCAT(value SEPARATOR '|') AS valuenew FROM db.table GROUP BY id ;


How do we make a (
UPDATE
or
DELETE
) query that deletes the duplicates, after it concatenated the needed values? Because, using
GROUP_CONCAT
the returned number of rows will be less than the actual rows present.

We need to combine the concatenated values in a later stage with other columns so the rows need to be equal to the returned rows.

SOLUTION

CREATE TABLE IF NOT EXISTS `temp`
SELECT *, GROUP_CONCAT(`value` SEPARATOR '|') AS `values`
FROM `db`.`table`
GROUP BY `id`;

TRUNCATE TABLE `db`.`table`;

UPDATE `temp` SET value = values;
ALTER TABLE `temp` DROP COLUMN values;

INSERT INTO `db`.`table`([all column titles here])
SELECT *
FROM `db`.`temp`
;

DROP TABLE `db`.`temp`;

Answer

If you really must, this is probably the most direct way of doing it.

CREATE TABLE `db`.`temp`
SELECT `id`, GROUP_CONCAT(`value` SEPARATOR '|') AS `value`
FROM `db`.`table` 
GROUP BY `id`;

TRUNCATE TABLE db.table;

INSERT INTO `db`.`table`(`id`, `value`)
SELECT `id`, `value` 
FROM `db`.`temp`
;

DROP TABLE `db`.`temp`

Of course, if there are other fields, you'll have to account for them appropriately to prevent data loss. You could also make the temp table a true TEMPORARY table, but if you lose connection between the truncate and the re-insert you'd lose all data.

Edit: Also make sure the value field can hold the longest GROUP_CONCAT() result.