Examples of recursive queries for PostgreSQL tree structures
- 2020-05-06 11:55:48
- OfStack
background
For hierarchies dealing with depth of uncertainty, such as organizations, a common design is to keep ID and Parent_ID in a single table, and to construct a tree by self-joining. This approach is friendly to the process of writing data, but the query process becomes relatively complex. Without introducing the MPTT model, a recursive algorithm must be used to query a node and its children.
Oracle provides connect by extension syntax, simple and easy to use. But the other RDBMS are not so personal (or I don't know). I recently used PostgreSQL in my project to query tree data.
constructs sample data
drop table if exists demo.tree_data;
create table demo.tree_data (
id integer,
code text,
pid integer,
sort integer
);
insert into demo.tree_data values(1, ' China ', null, 1);
insert into demo.tree_data values(2, ' sichuan ', 1, 1);
insert into demo.tree_data values(3, ' yunnan ', 1, 2);
insert into demo.tree_data values(4, ' chengdu ', 2, 1);
insert into demo.tree_data values(5, ' mianyang ', 2, 2);
insert into demo.tree_data values(6, ' Wuhou district ', 4, 1);
insert into demo.tree_data values(7, ' kunming ', 3, 1);
connectby function
If you have the tablefunc extension installed, you can use the PG version of the connectby function. This isn't as powerful as Oracle, but it meets the basic requirements.
-- API The following
connectby(text relname, -- The name of the table
text keyid_fld, -- id field
text parent_keyid_fld -- The father id field
[, text orderby_fld ], -- Sort field
text start_with, -- Starting line id value
int max_depth -- Depth of the tree, 0 Said the infinite
[, text branch_delim ]) -- Path separator
-- The basic usage is as follows and must be passed AS The clause defines the returned field name and type
select *
from connectby('demo.tree_data', 'id', 'pid', 'sort', '1', 0, '~')
as (id int, pid int, lvl int, branch text, sort int);
-- The query results
id | pid | lvl | branch | sort
----+-----+-----+---------+------
1 | | 0 | 1 | 1
2 | 1 | 1 | 1~2 | 2
4 | 2 | 2 | 1~2~4 | 3
6 | 4 | 3 | 1~2~4~6 | 4
5 | 2 | 2 | 1~2~5 | 5
3 | 1 | 1 | 1~3 | 6
7 | 3 | 2 | 1~3~7 | 7
(7 rows)
-- Using only basic usage, you can only query out id Relevant information if you want to query code And so on other fields that need to go through extra join Operation to implement.
select
t.id, n.code, t.pid, p.code as pcode, lvl, branch
from (
select * from connectby('demo.tree_data', 'id', 'pid', 'sort', '1', 0, '~')
as (id int, pid int, lvl int, branch text, sort int)
) as t
left join demo.tree_data as n on (t.id = n.id)
left join demo.tree_data as p on (t.pid = p.id)
order by t.sort ;
id | code | pid | pcode | lvl | branch
----+--------+-----+-------+-----+---------
1 | China | | | 0 | 1
2 | sichuan | 1 | China | 1 | 1~2
4 | chengdu | 2 | sichuan | 2 | 1~2~4
6 | Wuhou district | 4 | chengdu | 3 | 1~2~4~6
5 | mianyang | 2 | sichuan | 2 | 1~2~5
3 | yunnan | 1 | China | 1 | 1~3
7 | kunming | 3 | yunnan | 2 | 1~3~7
(7 rows)
PS: although the code of a node can be queried through join, the branch part cannot be directly converted to the corresponding code, which is still not very convenient to use.
CTE syntax
Using CTE syntax, the tree data is recursively queried by with recursive. This approach is not as straightforward as connectby, but it is more flexible and displays better.
--
with recursive cte as
(
-- First, the query root node
select
id, code, pid, '' as pcode,
code as branch
from demo.tree_data where id = 1
union all
-- through cte Recursive query root The immediate child of a node
select
origin.id, origin.code, cte.id as pid, cte.code as pcode,
cte.branch || '~' || origin.code
from cte
join demo.tree_data as origin on origin.pid = cte.id
)
select
id,code, pid, pcode, branch,
-- The depth of the tree is calculated by calculating the number of separators
(length(branch)-length(replace(branch, '~', ''))) as lvl
from cte;
--
id | code | pid | pcode | branch | lvl
----+--------+-----+-------+-----------------------+-----
1 | China | | | China | 0
2 | sichuan | 1 | China | China ~ sichuan | 1
3 | yunnan | 1 | China | China ~ yunnan | 1
4 | chengdu | 2 | sichuan | China ~ sichuan ~ chengdu | 2
5 | mianyang | 2 | sichuan | China ~ sichuan ~ mianyang | 2
7 | kunming | 3 | yunnan | China ~ yunnan ~ kunming | 2
6 | Wuhou district | 4 | chengdu | China ~ sichuan ~ chengdu ~ Wuhou district | 3
(7 rows)
execution process description
As you can see from the above example, the WITH RECURSIVE statement contains two parts
Es115en-recursive term (non-recursive part), union all, recursive term (recursive part), union all,The steps are as follows:
Execute non-recursive term. (if union is used instead of union all, the result will be deduplicated.) the result is a reference to result in recursive term, and this part of the result is put into temporary working table, repeat the following steps until working table is empty: Replace the recursive self-reference with the content of working table, perform recursive term, (remove duplicate data if union is used instead of union all), and replace working table with the result (if union is used instead of union union, the result of de-duplication)Taking query above as an example, let's look at the specific process
Perform non - recursive query
-- step 1 perform
select
id, code, pid, '' as pcode,
code as branch
from demo.tree_data where id = 1
-- The result set and working table for
id | code | pid | pcode | branch
----+------+-----+-------+--------
1 | China | | | China
Perform recursive query
-- step 2 Execute the recursion, at which point the self - reference cte The data in step 1 The results of the
select
origin.id, origin.code, cte.id as pid, cte.code as pcode,
cte.branch || '~' || origin.code
from cte
join demo.tree_data as origin on origin.pid = cte.id
-- The result set and working table for
id | code | pid | pcode | branch
----+--------+-----+-------+---------------------
2 | sichuan | 1 | China | China ~ sichuan
3 | yunnan | 1 | China | China ~ yunnan
3. Continue to execute recursive query until the result set and working table are empty
4. End the recursion, and combine the result set of the first three steps to get the final WITH RECURSIVE result set.
Strictly speaking, the implementation is iterative rather than recursive, but the RECURSIVE keyword was determined by the SQL standards committee, so PostgreSQL also USES RECURSIVE.
Summary