ASA SQL User's Guide
Common Table Expressions
You can use recursive common table expressions to find desirable paths on a directed graph. Each row in a database table represents a directed edge. Each row specifies an origin, a destination, and a cost of traveling from the origin to the destination. Depending on the problem, the cost may represent distance, travel time, or some other measure. Recursion permits you to explore possible routes through this graph. From the set of possible routes, you can then select the ones that interest you.
For example, consider the problem of finding desirable way to drive between the cities of Kitchener and Pembroke. There are quite a few possible routes, each of which takes you through a different set of intermediate cities. The goal is to find the shortest routes, and to compare them to reasonable alternatives.
First, define a table to represent the edges of this graph and insert one row for each edge. Since all the edges of this graph happen to be bi-directional, the edges that represent the reverse directions must be inserted also. This is done by selecting the initial set of rows, but interchanging the origin and destination. For example, one row must represent the trip from Kitchener to Toronto, and another row the trip from Toronto back to Kitchener.
CREATE TABLE travel ( origin AS VARCHAR(10), destination AS VARCHAR(10), distance AS INT, PRIMARY KEY (origin, destination) );
INSERT INTO travel SELECT 'Kitchener', 'Toronto', 105 UNION SELECT 'Kitchener', 'Barrie', 155 UNION SELECT 'North Bay', 'Pembroke', 220 UNION SELECT 'Pembroke', 'Ottawa', 150 UNION SELECT 'Barrie', 'Toronto', 90 UNION SELECT 'Toronto', 'Belleville', 190 UNION SELECT 'Belleville', 'Ottawa', 230 UNION SELECT 'Belleville', 'Pembroke', 230 UNION SELECT 'Barrie', 'Huntsville', 125 UNION SELECT 'Huntsville', 'North Bay', 130 UNION SELECT 'Huntsville', 'Pembroke', 245;
INSERT INTO travel -- Insert the return trips SELECT destination, origin, distance FROM travel;
The next task is to write the recursive common table expression. Since the trip will start in Kitchener, the initial subquery begins by selecting all the possible paths out of Kitchener, along with the distance of each.
The recursive subquery extends the paths. For each path, it adds segments that continue along from the destinations of the previous segments, adding the length of the new segments so as to maintain a running total cost of each route. For efficiency, routes are terminated if they meet either of the following conditions:
The path returns to the starting location.
The path returns to the previous location.
The path reaches the desired destination.
In the current example, no path should return to Kitchener and all paths should be terminated if they reach Pembroke.
It is particularly important to guarantee that recursive queries will terminate properly when using them to explore cyclic graphs. In this case, the above conditions are insufficient, as a route may include an arbitrarily large number of trips back and forth between two intermediate cities. The recursive query below guarantees termination by limiting the maximum number of segments in any given route to seven.
Since the point of the example query is to select a practical route, the main query selects only those routes that are less than 50 percent longer than the shortest route.
WITH RECURSIVE trip (route, destination, previous, distance, segments) AS ( SELECT CAST(origin || ', ' || destination AS VARCHAR(256)), destination, origin, distance, 1 FROM travel WHERE origin = 'Kitchener' UNION ALL SELECT route || ', ' || v.destination, v.destination, -- current endpoint v.origin, -- previous endpoint t.distance + v.distance, -- total distance segments + 1 -- total number of segments FROM trip t JOIN travel v ON t.destination = v.origin WHERE v.destination <> 'Kitchener' -- Don't return to start AND v.destination <> t.previous -- Prevent backtracking AND v.origin <> 'Pembroke' -- Stop at the end AND segments -- TERMINATE RECURSION! < ( SELECT count(*)/2 FROM travel ) ) SELECT route, distance, segments FROM trip WHERE destination = 'Pembroke' AND distance < 1.5 * ( SELECT min(distance) FROM trip WHERE destination = 'Pembroke' ) ORDER BY distance, segments, route;
When run with against the above data set, this statement yields the following results.
route | distance | segments |
---|---|---|
Kitchener, Barrie, Huntsville, Pembroke | 525 | 3 |
Kitchener, Toronto, Belleville, Pembroke | 525 | 3 |
Kitchener, Toronto, Barrie, Huntsville, Pembroke | 565 | 4 |
Kitchener, Barrie, Huntsville, North Bay, Pembroke | 630 | 4 |
Kitchener, Barrie, Toronto, Belleville, Pembroke | 665 | 4 |
Kitchener, Toronto, Barrie, Huntsville, North Bay, Pembroke | 670 | 5 |
Kitchener, Toronto, Belleville, Ottawa, Pembroke | 675 | 4 |