Wednesday, May 27, 2015

sql recursive functions (linked lists in sql)

In modern programming languages it makes sense to work with lists, stacks, dictionaries and other collections. This can be crucial when you want to persist the data into the database, because some considerations have to be made about sorting and stuff like that.

Now implementing a linked list in a table is easy:

table: mylist

  • ID: int (not null)
  • data: nvarchar(50)
  • nextID: int (null)
... more tricky is to walk through the list and find e.g.: the leaf of a corresponding root (seeing a list as a special sort of tree). 

the solution is described at: 

http://stackoverflow.com/questions/16749095/sql-recursive-function-that-gets-all-ancestors-of-an-item

...but the other way round using parent_id instead of a next_id

data_table:
ID       parent_id   name
---------------------
1        2            first 
2        4            second
3        3            third
4        5            fourth
5        -            fifth

the solution looks as follows:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
with parents as 
(
  select ID, parent_ID
  from data_table t
  where parent_ID is not null
  union all 
  select p.ID, t.parent_ID
  from parents p
    inner join data_table t on p.parent_ID = t.ID
      and t.parent_ID is not null
      and t.ID <> t.parent_ID
)
select *
  , parents = '(' + stuff
    (
      (
        select ', ' + cast(p.parent_ID as varchar(100))
        from parents p 
        where t.ID = p.ID
        for xml path('')
      ), 1, 2, ''
    ) + ')'
from data_table t
order by ID

  • line 1 gives the inner statement a name to call it e.g.: more often like in
    with y as (select * from x where id > 12) select * from y, y
    ... here you have a smaller set of data to cross join
  • line 3 to 5 makes a select as a starting point for the recursion (the original records)
  • union all in line 6 adds to the starting point a data set defined below
  • line 7 to 11 was at first sight simply magical to me
    • line 7 selects the columns corresponding to line 2 (required by the union and makes sense with the recursive algorithm) ... the own id and the parent of parent
    • line 8 requests data from the data table itself (the recursion using the name), but
    • line 9 joins parent data to the original data
  • line 13 to 24 uses the "expanded" data evaluated and shows it nicely in an own column using "for xml path".
the expanded result is:
| ID | parent_ID |
|----|-----------|
|  1 |         2 |
|  2 |         4 |
|  3 |         3 |
|  4 |         5 |
|  2 |         5 |
|  1 |         4 |
|  1 |         5 |

this will be transformed to

| ID | parent_id |   name |   parents |
|----|-----------|--------|-----------|
|  1 |         2 |  first | (2, 4, 5) |
|  2 |         4 | second |    (4, 5) |
|  3 |         3 |  third |       (3) |
|  4 |         5 | fourth |       (5) |
|  5 |    (null) |  fifth |    (null) |

...

There is an addition for select statements to overwrite the max recursion depth (in sql server it is default set to 100) to work with "deep" lists or trees.

    OPTION (MAXRECURSION 0)

kind regards,
Daniel

No comments: