JCZ JCZ - 5 months ago 151
MySQL Question

Sorting a subtree in a closure table hierarchical-data structure

I would like to ask you to help me with the problem with sorting of the hierarchical data structure stored as a closure table.

I wanted to use this structure to store my website menu. Everything works fine, but the problem is that I do not know how to sort the exact subtree in a custom order. At the moment the tree is sorted in the order in which the items were added to the database.

My structure is based on Bill Karwin's article about Closure Tables and some other posts.

Here is my MySQL database structure with some DEMO data:

--
-- Table `category`
--

CREATE TABLE IF NOT EXISTS `category` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) COLLATE utf8_czech_ci NOT NULL,
`active` tinyint(1) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;


INSERT INTO `category` (`id`, `name`, `active`) VALUES
(1, 'Cat 1', 1),
(2, 'Cat 2', 1),
(3, 'Cat 1.1', 1),
(4, 'Cat 1.1.1', 1),
(5, 'Cat 2.1', 1),
(6, 'Cat 1.2', 1),
(7, 'Cat 1.1.2', 1);

--
-- Table `category_closure`
--

CREATE TABLE IF NOT EXISTS `category_closure` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`ancestor` int(11) DEFAULT NULL,
`descendant` int(11) DEFAULT NULL,
`depth` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `fk_category_closure_ancestor_category_id` (`ancestor`),
KEY `fk_category_closure_descendant_category_id` (`descendant`)
) ENGINE=InnoDB;

INSERT INTO `category_closure` (`id`, `ancestor`, `descendant`, `depth`) VALUES
(1, 1, 1, 0),
(2, 2, 2, 0),
(3, 3, 3, 0),
(4, 1, 3, 1),
(5, 4, 4, 0),
(7, 3, 4, 1),
(8, 1, 4, 2),
(10, 6, 6, 0),
(11, 1, 6, 1),
(12, 7, 7, 0),
(13, 3, 7, 1),
(14, 1, 7, 2),
(16, 5, 5, 0),
(17, 2, 5, 1);


Here is my SELECT query for one tree:

SELECT c2.*, cc2.ancestor AS `_parent`
FROM category AS c1
JOIN category_closure AS cc1 ON (cc1.ancestor = c1.id)
JOIN category AS c2 ON (cc1.descendant = c2.id)
LEFT OUTER JOIN category_closure AS cc2 ON (cc2.descendant = c2.id AND cc2.depth = 1)
WHERE c1.id = __ROOT__ AND c1.active = 1
ORDER BY cc1.depth


For the DEMO instance with __ROOT_ = 1 that query gets:

id name active _parent
1 Cat 1 1 NULL
3 Cat 1.1 1 1
6 Cat 1.2 1 1
4 Cat 1.1.1 1 3
7 Cat 1.1.2 1 3


But what if I for example need to change the order of Cat 1.1 and Cat 1.2 (according to name, or some custom order)?

I have seen some breadcrumbs solution (how to sort by breadcrumbs), but I do not know how to generate and change them.

Answer

This question comes up frequently not only for Closure Table but also for other methods of storing hierarchical data. It's not easy in any of the designs.

The solution I've come up with for Closure Table involves one additional join. Every node in the tree joins to the chain of its ancestors, like a "breadcrumbs" type query. Then use GROUP_CONCAT() to collapse the breadcrumbs into a comma-separated string, sorting the id numbers by depth in the tree. Now you have a string by which you can sort.

SELECT c2.*, cc2.ancestor AS `_parent`,
  GROUP_CONCAT(breadcrumb.ancestor ORDER BY breadcrumb.depth DESC) AS breadcrumbs
FROM category AS c1
JOIN category_closure AS cc1 ON (cc1.ancestor = c1.id)
JOIN category AS c2 ON (cc1.descendant = c2.id)
LEFT OUTER JOIN category_closure AS cc2 ON (cc2.descendant = c2.id AND cc2.depth = 1)
JOIN category_closure AS breadcrumb ON (cc1.descendant = breadcrumb.descendant)
WHERE c1.id = 1/*__ROOT__*/ AND c1.active = 1
GROUP BY cc1.descendant
ORDER BY breadcrumbs;

+----+------------+--------+---------+-------------+
| id | name       | active | _parent | breadcrumbs |
+----+------------+--------+---------+-------------+
|  1 | Cat 1      |      1 |    NULL | 1           |
|  3 | Cat  1.1   |      1 |       1 | 1,3         |
|  4 | Cat  1.1.1 |      1 |       3 | 1,3,4       |
|  7 | Cat 1.1.2  |      1 |       3 | 1,3,7       |
|  6 | Cat 1.2    |      1 |       1 | 1,6         |
+----+------------+--------+---------+-------------+

Caveats:

  • The id values should have uniform length, because sorting "1,3" and "1,6" and "1,327" might not give the order you intend. But sorting "001,003" and "001,006" and "001,327" would. So you either need to start your id values at 1000000+, or else use ZEROFILL for ancestor and descendant in the category_closure table.
  • In this solution the display order depends on the numeric order of category id's. That numeric order of id values may not represent the order you want to display the tree. Or you may want the freedom to change the display order irrespective of the numeric id values. Or you may want the same category data to appear in more than one tree, each with different display order.
    If you need more freedom, you need to store the sort-order values separately from the id's, and the solution gets even more complex. But in most projects, it's acceptable to use a short-cut, giving the category id's double-duty as the tree display order.

Re your comment:

Yes, you could store "sibling sort order" as another column in the closure table, then use that value instead of ancestor to build the breadcrumbs string. But if you do that, you end up with a lot of data redundancy. That is, a given ancestor is stored on multiple rows, one for each path descending from it. So you have to store the same value for sibling sort order on all of those rows, which creates the risk of an anomaly.

The alternative would be to create another table, with only one row per distinct ancestor in the tree, and join to that table to get the sibling order.

CREATE TABLE category_closure_order (
  ancestor INT PRIMARY KEY,
  sibling_order SMALLINT UNSIGNED NOT NULL DEFAULT 1
);

SELECT c2.*, cc2.ancestor AS `_parent`,
  GROUP_CONCAT(o.sibling_order ORDER BY breadcrumb.depth DESC) AS breadcrumbs
FROM category AS c1
JOIN category_closure AS cc1 ON (cc1.ancestor = c1.id)
JOIN category AS c2 ON (cc1.descendant = c2.id)
LEFT OUTER JOIN category_closure AS cc2 ON (cc2.descendant = c2.id AND cc2.depth = 1)
JOIN category_closure AS breadcrumb ON (cc1.descendant = breadcrumb.descendant)
JOIN category_closure_order AS o ON breadcrumb.ancestor = o.ancestor
WHERE c1.id = 1/*__ROOT__*/ AND c1.active = 1
GROUP BY cc1.descendant
ORDER BY breadcrumbs;

+----+------------+--------+---------+-------------+
| id | name       | active | _parent | breadcrumbs |
+----+------------+--------+---------+-------------+
|  1 | Cat 1      |      1 |    NULL | 1           |
|  3 | Cat  1.1   |      1 |       1 | 1,1         |
|  4 | Cat  1.1.1 |      1 |       3 | 1,1,1       |
|  7 | Cat 1.1.2  |      1 |       3 | 1,1,2       |
|  6 | Cat 1.2    |      1 |       1 | 1,2         |
+----+------------+--------+---------+-------------+