Bill Zeller - 6 months ago
MySQL Question

MySQL GROUP_CONCAT escaping

(NOTE: This question is not about escaping queries, it's about escaping results)

I'm using GROUP_CONCAT to combine multiple rows into a comma delimited list. For example, assume I have the two (example) tables:

CREATE TABLE IF NOT EXISTS `Comment` (
  `id` int(11) unsigned NOT NULL auto_increment,
  `post_id` int(11) unsigned NOT NULL,
  `name` varchar(255) collate utf8_unicode_ci NOT NULL,
  `comment` varchar(255) collate utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `post_id` (`post_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=6 ;

INSERT INTO `Comment` (`id`, `post_id`, `name`, `comment`) VALUES
(1, 1, 'bill', 'some comment'),
(2, 1, 'john', 'another comment'),
(3, 2, 'bill', 'blah'),
(4, 3, 'john', 'asdf'),
(5, 4, 'x', 'asdf');


CREATE TABLE IF NOT EXISTS `Post` (
  `id` int(11) NOT NULL auto_increment,
  `title` varchar(255) collate utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=7 ;

INSERT INTO `Post` (`id`, `title`) VALUES
(1, 'first post'),
(2, 'second post'),
(3, 'third post'),
(4, 'fourth post'),
(5, 'fifth post'),
(6, 'sixth post');


I want to list all posts, each with the usernames of everyone who commented on it:

SELECT
  Post.id AS post_id, Post.title AS title, GROUP_CONCAT(name)
FROM Post
LEFT JOIN Comment ON Comment.post_id = Post.id
GROUP BY Post.id


gives me:

post_id  title        GROUP_CONCAT(name)
1        first post   bill,john
2        second post  bill
3        third post   john
4        fourth post  x
5        fifth post   NULL
6        sixth post   NULL


This works great, except that if a username contains a comma it will ruin the list of users. Does MySQL have a function that will let me escape these characters? (Please assume usernames can contain any characters, since this is only an example schema)

Answer

If there's some other character that's illegal in usernames, you can specify a different separator character using a little-known syntax:

...GROUP_CONCAT(name SEPARATOR '|')...

But what if you want to allow pipes too, or any character at all?

Escape the separator character, for example with a backslash, but first escape the backslashes themselves:

group_concat(replace(replace(name, '\\', '\\\\'), '|', '\\|') SEPARATOR '|')

This will:

  1. escape any backslashes with another backslash
  2. escape the separator character with a backslash
  3. concatenate the results with the separator character
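
To see what the escaped output looks like, here is a minimal Python sketch that mirrors the three steps above on the client side. The function name `escape_and_join` is hypothetical and only for illustration; in practice the SQL expression does this work inside MySQL.

```python
def escape_and_join(names, sep="|"):
    # Hypothetical helper, not part of MySQL: mirrors the SQL expression above.
    escaped = []
    for name in names:
        name = name.replace("\\", "\\\\")     # 1. double every backslash
        name = name.replace(sep, "\\" + sep)  # 2. put a backslash before the separator
        escaped.append(name)
    return sep.join(escaped)                  # 3. join with the bare separator


# The three names are: bill, jo|hn, a\b
print(escape_and_join(["bill", "jo|hn", "a\\b"]))
```

The order matters: if the separator were escaped first, the backslashes added in that step would then be doubled by the backslash step and the result could not be decoded unambiguously.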

To get the unescaped results, do the same thing in the reverse order:

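  1. split the results at each separator character that is not escaped. This is a little tricky: a separator is escaped only when it is preceded by an odd number of backslashes, so you want to split where it is preceded by an even number (including zero). This regex matches those separators:
    (?<!\\)(?:\\\\)*\|
     Note that the match includes any backslash pairs directly before the separator. They belong to the field on the left, so keep them when splitting.
  2. replace all escaped separator chars with literals, i.e. replace \| with |
  3. replace all double backslashes with single backslashes, i.e. replace \\ with \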
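
The decoding steps above can also be done in one left-to-right pass instead of a regex split followed by two replaces. The following is a sketch in Python under that assumption, not the answer's exact method; `split_escaped` is a hypothetical name, and it expects input produced by the backslash escaping described above.

```python
def split_escaped(joined, sep="|"):
    # Hypothetical helper: splits on unescaped separators and removes the
    # backslash escaping in a single scan.
    fields, current = [], []
    i = 0
    while i < len(joined):
        ch = joined[i]
        if ch == "\\" and i + 1 < len(joined):
            # A backslash escapes the next character: keep that character literally.
            current.append(joined[i + 1])
            i += 2
        elif ch == sep:
            # An unescaped separator ends the current field.
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


# Input is bill|jo\|hn|a\\b as returned by the query
print(split_escaped("bill|jo\\|hn|a\\\\b"))
```

Posts without comments come back as NULL rather than a string, so check for that before calling the function.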