## Dijkstra's Algorithm

[Up]

### Graph

Graph is a connection of nodes with edges. Node is sometimes called as vertex, and edges is sometimes called branch.

An example graph is shown in Figure 1. In the figure, nodes are represened by letters and circles, and edges are represented by lines coneecting nodes. The number near the edge is the weight of the edge.

 Figure 1: Example of Graph

The graph of Figure 1 can be represented by the adjacency matrix in Table 1. Here, each element of the matrix is its weight if there is an edge between the nodes, or "-" if there is none. We also assume that the weight between the same nodes is 0.

abcdefgh
a0172----
b10--24--
c7-0--23-
d2--0--5-
e-2--01--
f-42-10-6
g--35--02
h-----620

### Example: Find the shortest path

A number of cities and the road distances connecting them are given. Write a program that finds the shortest route from the starting city to the destination city.

#### Input Format

$N$   $R$
$A_1$   $B_1$   $L_1$
$A_2$   $B_2$   $L_2$
$\quad\cdots$
$A_R$   $B_R$   $L_R$
$S$   $D$

The first line contains 2 integers $N$ and $R$ which represent the number of cities, and the number of roads connecting those cities, respectively.

$1 \leqq N \leqq 100$

Each of the $R$ lines after the second line contains 3 integers, $A_i$, $B_i$, and $L_i$ $~~$ ($1 \leqq i \leqq R$).
$A_i$ and $B_i$ represent both ends of the $i$-th road, and $L_i$ represents the distance of the road.

$0 \leqq A_i \leqq N-1$
$0 \leqq B_i \leqq N-1$

The last line contains 2 integers, $S$, $D$ which represent the starting city and the destination city, respectively.

$0 \leqq S \leqq N-1$
$0 \leqq D \leqq N-1$

#### Output Format

If the shortest path is found, print it as follows.

On the first line, print an integer representing the distance of the shortest path. And on the next line, print the cities along the shortest path from the starting city to the destination city, with a space between them.

If there are multiple shortest path, one of them should be the answere.

If there is no shortest path, print "no route".

## Dijkstra's Algorithm

Dijkstra's is an algorithm that finds the shortest path to each node one bye one from the vicinity of the starting node, and gradually expands the range. Can be used when all edge weights are non-negative ($\geqq 0$).

1. At first, "the shortest distance" of each node is set to "unfixed", and assume it as $\infty$ (infinity).
2. Assume the distance of the starting node as 0.
3. Among nodes whose shortest distance are "unfixed", select the node whose value is smallest. Then you fix the selected node's "shortest distance" as its current value.
4. Calculate the distance of node "directly connected" from the most recently selected node, and its "shortest distance" is "unfixed". If the result is less than current value, update it.
5. Once shortest distance of all the nodes are "fixed", you are finished. Otherwise goto 3.
 Let's find the shortest path from node a to node h. Assume that the shortest distance of each node is "unfixed" and its value is infinity ($\infty$). Then, assume the shortest distance of the starting node (a) is 0. Select the shortest node (a marked with ☆) among the "unfixed" nodes. The shortest distance of the selected node (a marked with ☆) is "fixed" to the current value (0). The "fixed" node is separated from "unfixed" nodes by a red boundary line (frontier). Calculate the distance of each node (b, c, d ), if and only if      the node is "directly connected" from the most recently fixed node (a with ☆), and      its "shortest distance" is "unfixed". If the result is less than current value, update it (b → 1, c → 7, d → 2). Edges that update the shortest distance of unfixed nodes across the boundary are expressed by light blue arrows. Select the shortest node (b marked with ☆) among the "unfixed" nodes. The shortest distance of the selected node (b marked with ☆) is "fixed" to the current value (1). If there are multiple nodes with the shortest distance value, it doesn't matter which one you choose. Calculate the distance of each node (e, f), if and only if      the node is "directly connected" from the most recently fixed node (b with ☆), and      its "shortest distance" is "unfixed". If the result is less than current value, update it (e → 3, f → 5). Select the shortest node (d marked with ☆) among the "unfixed" nodes. The shortest distance of the selected node (d marked with ☆) is "fixed" to the current value (2). Calculate the distance of node (g), if and only if      the node is "directly connected" from the most recently fixed node (d with ☆), and      its "shortest distance" is "unfixed". If the result is less than current value, update it (g → 7). Select the shortest node (e marked with ☆) among the "unfixed" nodes. The shortest distance of the selected node (e marked with ☆) is "fixed" to the current value (3). Calculate the distance of node (f), if and only if      the node is "directly connected" from the most recently fixed node (e with ☆), and      its "shortest distance" is "unfixed". If the result is less than current value, update it (f → 4). Select the shortest node (f marked with ☆) among the "unfixed" nodes. The shortest distance of the selected node (f marked with ☆) is "fixed" to the current value (4). Calculate the distance of each node (c, h), if and only if      the node is "directly connected" from the most recently fixed node (f with ☆), and      its "shortest distance" is "unfixed". If the result is less than current value, update it (c → 6, h → 10). Select the shortest node (c marked with ☆) among the "unfixed" nodes. The shortest distance of the selected node (c marked with ☆) is "fixed" to the current value (6). Calculate the distance of each node (g), if and only if      the node is "directly connected" from the most recently fixed node (g with ☆), and      its "shortest distance" is "unfixed". The shortest distance of g is not updated, becase the computed result(9 is not less than the current value (7). Select the shortest node (g marked with ☆) among the "unfixed" nodes. The shortest distance of the selected node (g marked with ☆) is "fixed" to the current value (7). Calculate the distance of each node (h), if and only if      the node is "directly connected" from the most recently fixed node (g with ☆), and      its "shortest distance" is "unfixed". If the result is less than current value, update it (h → 9). Select the shortest node (h marked with ☆) among the "unfixed" nodes. The shortest distance of the selected node (h marked with ☆) is "fixed" to the current value (9). The shortest distance (9) from a to h and shortest path (a → d → g → h) are found.
Data , Execution Log

### Why Dijkstra's Algorithm Works Correctly?

Consider the example below to explain why Dijkstra's algorithm works correctly.

The above figure is immediately after "node b is fixed and the estimated distance of nodes e and f are updated".

There are 3 types of node status.

• the shortest distance of nodes a and b are "fixed",
• the shortest distance of some nodes (c, d, e, f) are "unfixed but estimated" via "fixed" nodes (a, b),
• the shortest distance of other nodes (g, h) remain to be "unfixed and infinity ($\infty$)".
The nodes are divided into "fixed" nodes on the left side of the boundary line (frontier) and "unfixed" nodes on the right side.

Dijkstra's algorithm asserts that "the shortest node (d) among the unfixed nodes can be fixed with the current value (2)". And it is clear from the fact that "all edge weights are non-negative" in the graph Dijkstra's targets.

Consider the "unfixed" nodes on the right side of the boundary line. The "unfixed but estimated" nodes (c, d, e, f) hold the shortest distance via the current "fixed" nodes (a, b). Then, the shortest node (d) among the "unfixed but estimated" nodes (c, d, e, f) has the shortest distance even including "unfixed and infinity ($\infty$)" nodes (g, h). Because all edge weights are non-negative, going to d through the current "unfixed" nodes is far.

In other words, the shortest distance from a to d is currently estimated correctly and can be fixed at the current value.

Thinking like this, we can see that Dijkstra's algorithm claims something very obvious.

#### [Point to Understand]

It is important to focus on the boundary line (frontier) between the "fixed" nodes and "unfixed" nodes. The "unfixed but estimated" nodes are connected directly with some "fixed" nodes. The shortest "unfixed but estimated" node (d) cannot be shorter via other unfixed nodes. Therefore, its true minimum value has already been found.

### Computational complexity of Dijkstra's

The shortest distance of one node is fixed in one loop. Since this is repeated for $n$ nodes, the complexity of this iterative operation is $O(n)$.

Once the shortest distance to a node is fixed, update the distance estimates for each node connected to that node. The maximum number nodes connected to a node is $n-1$, so the complexity of this operation is $O(n)$.

So the total complexity is $O(n) \times O(n) = O(n^2)$.

### Excercises 1

In the following graph, answer the distance of the shortest route from node a to node i.

a: Start node
i: Destination node

Data , Execution Log

### Excercised 2

Run dijkstraPath.py on the following data and measure the computation time.

### Program of Dijkstra's (Pyton 3)

#### Find the shortest distance. dijkstra.py

 dijkstra.py #!/usr/bin/env python import queue G = [{}] def setNode(n): global G G = [{} for i in range(n)] def addEdge(s,t,w): global G G[s][t]=w def solve(s,t): global G fixed = {} q = queue.PriorityQueue() q.put((0,s)) while q.empty() == False: w,x = q.get() if x in fixed: continue; fixed[x] = w if x == t: return w for y in G[x]: if (y in fixed) == False: q.put((w + G[x][y], y)) return None n,m = map(int,input().split()) setNode(n) for i in range(m): s,t,w = map(int,input().split()) addEdge(s,t,w) addEdge(t,s,w) src, dst = map(int,input().split()) print(solve(src,dst)) 
 Execution Log of dijkstra.py $./dijkstra.py < route03.data 213  The execution log for othe data is here. #### Find the shortest distance and shortest path. dijkstraPath.py When updating the distance to a node, record the previous node in via[]. Once the shortest distance to the destination node is fixed, the shortest path can be found by tracing the transit nodes in reverse order.  dijkstraPath.py #!/usr/bin/env python import queue G = [{}] def setNode(n): global G G = [{} for i in range(n)] def addEdge(s,t,w): global G G[s][t]=w def getRoute(via,s,t): r = [t] y = t while y != s: y = via[y] r.append(y) r.reverse() return r def solve(s,t): global G fixed = {} via = {} q = queue.PriorityQueue() q.put((0,s,None)) while q.empty() == False: w,x,prev = q.get() if x in fixed: continue; fixed[x] = w via[x] = prev if x == t: return w, getRoute(via,s,t) for y in G[x]: if (y in fixed) == False: q.put((w + G[x][y],y,x)) return None, None n,m = map(int,input().split()) setNode(n) for i in range(m): s,t,w = map(int,input().split()) addEdge(s,t,w) addEdge(t,s,w) src, dst = map(int,input().split()) d,r = solve(src,dst) print(d) print(r)   Execution log of dijkstraPath.py $ ./dijkstraPath.py < route03.data 213 [0, 4, 5, 10, 15, 20, 24, 25, 29] 
Execution log for other data is here.

### Program of Dijkstra's (java, naive and slow version)

The slow Java program that simply implements Dijkstra's algorithm is as follows.

 Dijkstra.java import java.util.*; public class Dijkstra { public static void main(String[] args) { Scanner sc = new Scanner(System.in); // read from Standard Input int n = sc.nextInt(); // number of nodes int m = sc.nextInt(); // number of edges int[][] mat = new int[n][n]; // Adjacency matrix of node connections for (int i=0; i0 && !fixed[y]) { // if y is directly connected with x and y is unfixed int d2 = distance[x]+mat[x][y];// calculate distance of y via x // if shorter, update the shortest distance of y if (d2 < distance[y]) distance[y] = d2; } } } } // returns index of minimum distance[] whose fixed[] is false static int minIndex(int[] distance,boolean[] fixed) { int idx = -1; for (int i=0; i distance[i]) idx = i; // index of shortest node } return idx; } } 
 Execution log of Dijkstra.java $javac Dijkstra.java$ java Dijkstra < route03.data 213 
Execution log for other data is here.

#### Out-of-Memory Exception during Program Execution

In "Dijkstra.java", node connections are kept in memory as a 2-dimensional array, so large data may cause an Out-of-Memory Exception and terminate program execution. In that case, expand the heap memory of the java virtual machine with the -Xms option of java command.

 route13.data(part) 62500 124906 0 1 15 0 250 58 1 2 79 1 251 1 2 3 144 ...(omitted)... 

## Priority First Search

In Dijkstra's algorithm, when dealing with sparse graphs (that is, graphs with fewer edges than the number of nodes), it is expected that the calculation speed will be increased by the following measures. Although the complexity order does not change.

• Use a Priority Queue (heap) to hold unfixed nodes.
• Use an adjacency list instead of an adjacency matrix as a deta structure representing node connectivity.
Although the algorithm itself is Dijkstra's, this implementation is sometimes called "Priority First Search".

The computational complexity of Priority First Seach is:
$\quad$ "for each of $n$ nodes"
$\quad$$\quad "to retrieve the shortest node from the heap" \quad$$\quad$ and
$\quad$$\quad$ "to update the distance of nodes directly connected by the most recently fixed node",
where
"for each of $n$ nodes" :      $O(n)$
"to retrieve the shortest node from the heap":     $O(\log n)$
"to update the distance of nodes directly connected by most recently fixed node":      $O(n)$   ← can be regarded as a constant for sparse graphs.
The practical execution time is expected to be
$O(n) \times O( \log n + \mbox{Constant}) = O(n \log n)$ .

Priority Queue is a data structure called a "heap" that allows access to specific data in $O(\log n)$.

### Program of Priority First Search (Java)

The Java program of Priority First Search written using the PriorityQueue class is as follows.

 PFSPath.java import java.util.*; public class PFSPath { static ArrayList graph = new ArrayList<>(); public static void main(String[] args) { Scanner sc = new Scanner(System.in); // read from Standard Input int n = sc.nextInt(); // nubmer of nodes int m = sc.nextInt(); // nubmer of edges for (int i=0; i fixed = new HashSet<>(); // fixed node src.d = 0; // distance of start node is 0 var queue = new PriorityQueue(); queue.add(src); while (queue.size() > 0) { Node x = queue.poll(); // get minimum node from queue assert !fixed.contains(x); if (x == dst) return x.d; // find the answer fixed.add(x); // set node x as fixed for (Map.Entry entry: x.edges.entrySet()) { // each edge from node x Node y = graph.get(entry.getKey()); // edge's endpoint int w = entry.getValue(); // edge's weights int d2 = x.d + w; // distance to y via x if (fixed.contains(y) || d2 >= y.d) continue;　// do nothing if fixed or further if (queue.contains(y)) queue.remove(y); // remove y from queue y.d = d2; // update distance to y y.via = x; // set the previous node of y is x queue.add(y); // add y into queue } } return Integer.MAX_VALUE; // unconnected } static ArrayList getRoute(Node src, Node dst) { // shortest path from src to dst ArrayList r = new ArrayList<>(); r.add(dst); // start from dst Node x = dst; while (x != src) { // trace back to src x = x.via; r.add(x); } Collections.reverse(r); // reverse the list (src->dst) return r; } static class Node implements Comparable { // node int id; // node identity int d; // distance Node via; // previous node HashMap edges; // edge: (edge's endpoint, weights) public Node(int id, int d) { this.id = id; this.d = d; edges = new HashMap<>(); } void addEdge(int t, int w) { edges.put(t, w); } public int compareTo(Node x) { return d - x.d; } } } 
 Execution log of PFSPath.java $java PFSPath < route03.data 213 0 4 5 10 15 20 24 25 29  Execution log for other data is here. Let's run "PFSPath.java" against data representing a sparse graph with many nodes, say "route13.data" Unlike the case of "DijkstraPath.java", the answer can be obtained with "small memory consumption" and "fast".  Execution time of PFSPath.java for route13.dataCPU:Ryzen 5900HX, Memory: DDR4 32GB $ time (java PFSPath < route13.data >/dev/null) real 0m0.322s user 0m0.802s sys 0m0.399s 

### Notice

Dijkstra's algorithm cannot solve when there are edges with negative weights in the graph. In such cases, use the Bellman-Ford algorithm to solve.

Yoshihisa Nitta

http://nw.tsuda.ac.jp/