Handling Circular Data Structures in Postgres

Let’s say we’re setting up a data structure for security, with group membership:

create table groups (
  id int unique,
  name varchar unique,
  member_groups int[]
);

insert into groups values (1, 'admins', '{}');

-- automatically include any admins as users
insert into groups values (2, 'users', '{1}'); 

-- grant admins access to write
insert into groups values (3, 'writers', '{1}'); 

-- grant writers access to read
insert into groups values (4, 'readers', '{3}'); 

Now, we want to flatten this structure to find all transitive memberships. To do this, we take each group’s children, find their children, and concatenate the result recursively.

Note that the column order in each part of the recursive sql must be the same, as well as in the column definition on the first line.

WITH RECURSIVE all_groups(id, member_groups) AS (
    select
      id,
      member_groups
    from groups
  UNION ALL
    SELECT groups.id,
           all_groups.member_groups || groups.member_groups
    FROM all_groups
    JOIN groups
      ON all_groups.id = ANY (groups.member_groups)
  )
SELECT id, uniq(flatMap(member_groups))
FROM all_groups
GROUP BY id;

Note that here, I’m using some custom functions I’ve introduced, to match some functionality of the Scala collections API.

And this is what we get, as expected:

4,{1,3}
1,
3,{1}
2,{1}

Lets say someone adds a circular reference to this array:

insert into groups values (5, 'cycle1', '{3}', '{3, 7}');
insert into groups values (6, 'cycle2', '{3}', '{5}');
insert into groups values (7, 'cycle3', '{3}', '{6}');

We can easily fix this, by adding an additional array to track what we’ve seen already in the recursion:


WITH RECURSIVE all_groups(found, id, member_groups) AS (
    select
      array[id] found,
      id,
      member_groups
    from groups
  UNION ALL
    SELECT array[groups.id] || all_groups.found found,
          groups.id,
          all_groups.member_groups || groups.member_groups
    FROM all_groups
    JOIN groups
      ON all_groups.id = ANY (groups.member_groups)
      AND NOT (groups.id = ANY (found))
  )
SELECT id, uniq(flatMap(member_groups))
FROM all_groups
GROUP BY id;

And we’re done! Since a cycle just indicates an equivalence class of groups, they should all get the same group members, and we can detect them by seeing that they contain their own IDs:

8,{5,8}
4,{1,3}
1,
5,{5,8}
3,{1}
9,{5,8}
6,{5,8}
2,{1}
7,{5,6,8}