25 Mar 2021

Graph revision

Basics

Graph consists of

nodes/ vertices (V)
edges (E)

Types of graph

hypergraph: each edge connects >= 2 nodes, each edge is unique. Contains at least one node.
multigraph: Contains at least one node, each edge connects two nodes, two nodes may be connected by more than one edge

Graph G = <V,E>

Terminology

Degree of a node : number of adjacent edges.

Degree of a graph : max number of adjacent edges.

Diameter: max distance between two ndoes, following the shortest path

Special graph

	Star	Clique/Complete Graph	Line/Path	Cycle	Bipartite Graph
Diameter	2	1	n-1	n/2 or n/2-1	Max: n-1
Degree	n-1	n-1	2	2	2

Bipartite Graph

Nodes divided into two sets, with no edges between nodes in the same set.

Max diameter: n-1 In cycle, count no of nodes -> even yes, odd not a bipartite

Representing a graph

Adjacency List

Graph consists of nodes + edges

Nodes: stored in an array
Edges: linked list per node

Adjacency Matrix

Nodes
Edges: pairs of nodes

A[v][w] = 1 iff (v,w) subset of E A^2 = length 2 paths

Adj List VS Adj Matrix

Using Adj List:

for a graph G = (V,E)

array of size V
linked list of size E
total: O(V + E)
cycle: O(V)
clique: O(V^2)

Using Adj Matrix:

array of size V^2
total: O(V^2)
cycle: O(V^2)
clique: O(V^2)

Rule: If the graph is dense, using adjacent matrix or else use adjacent list

dense: E = theta(V^2)

What is the worst time complexity of checking node u and v are directly connected by an edge in a graph G = (V, E) when adjacency list is used:
- O(min(V, E))

	Adjacency Matrix	Adjacency List
fast query	are v,w neighbours	find me any neighbours && enumerate all neighbours
slow query	find me any neighbour of v && enumerate all neighbours	are v,w neighbours

Searhcing a graph

Breadth-first search

Concept

Explore level by level
Frontier: current level
Initially: {S}
Advance Frontier
Dont backtrack
Find sahortest path

Code implementation

Build levels
Calculate level[i] from level[i-1]
Skip visited nodes

BFS(Node[] nodeList, int startId) {
    boolean[] visited = new boolean[nodeList.length];
    Arrays.fill(visited, false);
    int[] parent = new int[nodelist.length];
    Arrays.fill(parent, -1);

    for (int start = 0; start < nodeList.length; start++) {
        if (!visited[start]){
            // Main code goes here!
            Collection<Integer> frontier = new Collection<Integer>;
            frontier.add(startId);

            while (!frontier.isEmpty()){
                Collection<Integer> nextFrontier = new … ;
                for (Integer v : frontier) {
                    for (Integer w : nodeList[v].nbrList) {
                        if (!visited[w]) {
                            visited[w] = true;
                            parent[w] = v;
                            nextFrontier.add(w);
                        }
                    }
                }
                frontier = nextFrontier;
            }
        }
    }
}

Analysis

When does BFS failed to visit every node?

when the graph has two components

Running time using adjacency List: O(V+E)

Vertex v = “start” once ==> O(V)

Vertex v add to nextFrontier once ==> O(V)

after added never re-added

Each v.nbrList is enumerated once. ==> O(E)

when v is removed from frontier

BFS always run in O(V^2) on a complete graph as O(V+E) –> O(V^2) in a clique

using an adjacency matrix: O(V^2)

Shortest Path

Parent pointers store the shortest path
Shortest path is a tree
(Possibly high degrees; possibly high diameter)

Applications

Source of applications

Depth-First Search

Concept

Follow path until get struck
Backtrack until find a new node that is not struck
recursively explore it
Dont repeat a vertex

Code

DFS-visit(Node[] nodeList, boolean[] visited, int startId){
    for (Integer v : nodeList[startId].nbrList) {
        if (!visited[v]) {
            visited[v] = true;
            DFS-visit(nodeList, visited, v);
        }
    }
}

DFS(Node[] nodeList){
    boolean[] visited = new boolean[nodeList.length];
    Arrays.fill(visited, false);
    for (start = i; start<nodeList.length; start++) {
        if (!visited[start]){
            visited[start] = true;
            DFS-visit(nodeList, visited, start);
        }
    }
}

Analysis

DFS parent graph is a Tree. DFS does not contain the shortest path.

Running time of DFS is O(V+E) using Adj List.

DFS visit called once per node ==> O(V)
In one DFS visit, each neighbour is enumerated ==> O(E) in total

Running time of DFS is O(V^2) using Adj Matrix.

DFS visit called once per node ==> O(V)
In one DFS visit, each neighbour is enumerated ==> O(V) per node

BFS + DFS

using same algorithm

BFS -> queue

Add start node to queue
Repeat until queue is empty
- Remove node v from front of queue
- visit v
- explore v
- add unvisited neighbours to queue

DFS -> stack

Add start node to stack
Repeat until stack is empty
- Pop node v from front of the stack
- visit v
- explore v
- push all unvisited neighbours of v to front of stack

DFS + BFS: visited every node and edge but not every path!

Shortest Paths for Direted Graph

Bellman-Ford

Code

n = V.length;
for (i=0; i<n; i++)
    for (Edge e : graph)
        relax(e)

Analysis

When to stop early

When an entire sequencce of E relax operations have no effect.

Running Time:

O(EV) Negative weights:

No issues, but negative weight cycle cannot Detect negative weights cycles:

Run bell-man ford for v+1 iterations, if last iteration has changes -> negative weight cycle if all same weight -> Use bfs

Dijkstra

Concept

Maintain distance estimate for every node
begin with shortest-path-tree
relax all outgoing edges

Code

public Dijkstra{
    private Graph G;
    private IPriorityQueue pq = new PriQueue();
    private double[] distTo;
    searchPath(int start) {
        pq.insert(start, 0.0);
        distTo = new double[G.size()];
        Arrays.fill(distTo, INFTY);
        distTo[start] = 0;
        while (!pq.isEmpty()) {
            int w = pq.deleteMin();
            for (Edge e : G[w].nbrList)
                relax(e);
        }
    }

    relax(Edge e) {
        int v = e.from();
        int w = e.to();
        double weight = e.weight();
        if (distTo[w] > distTo[v] + weight) {
            distTo[w] = distTo[v] + weight;
            parent[w] = v;
        if (pq.contains(w))
            pq.decreaseKey(w, distTo[w]);
        else
            pq.insert(w, distTo[w]);
        }
    }
}

Analysis

AVL tree for priority queue

insert, deletemin, decreasekey, contain

logn , logn, logn, 1 Running time:

O(E log v)

edge is relaxed once, node is added to pq once

O((V+E)log V) = O(E log V)

Can we stop as soon as we dequeue the dest

yes. A vertex is finished when it is dequeued. No negative weights and No reweight!

Directed Graph

Nodes
- at least one
Edges
- each edge connects two nodes
- each edge is unique
- each edge is directed

Topological order

Sequential total ordering of all nodes
Edges only point forward
Not every directed graph has such ordering

MUST BE ACYCLIC

DAG Topological Order using DFS

Use DFS (post-order) to find
process each node when it is last visited
Analysis

Running time
O(V+E)

Unique?
No

Using Kahn’s Algo

Concept

Repeat:
S = nodes in G that have no incoming edge
Add nodes in S to topo-order
remove all edges adj to nodes in S
remove nodes in S from graph

Implementation

Maintain all nodes in pq
keys are incoming edges
remove min-degree node u
for each outgoing edge(u,v), decrease key of v by 1

Analysis

Running time:

O(E log V)

Shortest Path for DAG

topological sort
relax in such order

Running time
O(E)

Longest path?
negate the weight and find shortest

Shortest Path for Tree

dfs / bfs
relax each edge 1st time u see it
O(V) time as there are only V edges in a tree.

Minimum spanning tree

working on weighted, undirected graph

Definition: a spanning tree is an acyclic subset of the edges that connects all nodes
A mimimum spanning tree is a spanning tree with minimum weights
!= shortest path
cannot used to find shortest path
a cut of a graph G=(V,E) is a partition of the vertices V into two disjoint subsets.
an edge crosses a cut if it has one vertex in each of the two sets.

Properties

No cycle
If you cut a mst, the two copies are both msts
(Cycle property) For every cycle, the maximum weight edge is not in mst 3.5. For every cycle, the minimum may/ may not in mst
(Cut property) For every partition of the nodes, the minimum weight edge across the cut is in the MST.

For every vertex, the minimum outgoing edge is always part of the MST

Algorithm to find

Red rule: If C is a cycle with no red arcs, then color the max-weight edge in C red.
Blue rule: If D is a cut with no blue arcs, then color the min-weight edge in D blue

General algo:

Greedy algo -> repeat: Apply red rule and blue rule

Bad algo:

If the number of vertices is 1, then return.
Divide the nodes into two sets.
Recursively calculate the MST of each set.
Find the lightest edge the connects the two sets and add it to the MST.
Return.

Prim’s algo

Basic idea:

S : set of nodes connected by blue edges.
Initially: S = {A}
Repeat:
- Identify cut: {S, V–S}
- Find minimum weight edge on cut.
- Add new node to S.

// Initialize priority queue
PriorityQueue pq = new PriorityQueue();
for (Node v : G.V()) {
    pq.insert(v, INFTY);
}
pq.decreaseKey(start, 0);

// Initialize set S
HashSet<Node> S = new HashSet<Node>();
S.put(start);

// Initialize parent hash table
HashMap<Node,Node> parent = new HashMap<Node,Node>();
parent.put(start, null);

while (!pq.isEmpty()){
    Node v = pq.deleteMin();
    S.put(v);
    for each (Edge e : v.edgeList()){
        Node w = e.otherNode(v);
        if (!S.get(w)) {
            pq.decreaseKey(w, e.getWeight());
            parent.put(w, v);
        }
    }
}

Analysis

O(E log V) using a binary heap (similar to dijkstra)

Kruskal’s Algorithm

Basic idea:

Sort edges by weight from smallest to biggest.
Consider edges in ascending order:
- If both endpoints are in the same blue tree, then color the edge red.
- Otherwise, color the edge blue.

Data structure:

Union-Find
Connect two nodes if they are in the same blue tree.

// Sort edges and initialize
Edge[] sortedEdges = sort(G.E());
ArrayList<Edge> mstEdges = new ArrayList<Edge>();
UnionFind uf = new UnionFind(G.V());
// Iterate through all the edges, in order
for (int i=0; i<sortedEdges.length; i++) {
    Edge e = sortedEdges[i]; // get edge
    Node v = e.one(); // get node endpoints
    Node w = e.two();
    if (!uf.find(v,w)) { // in the same tree?
        mstEdges.add(e); // save edge
        uf.union(v,w); // combine trees
    }
}

Analysis

Performance:

Sorting: O(E log E) = O(E log V)
For E edges:
- Find: O(α(n)) or O(log V)
- Union: O(α(n)) or O(log V)

Boruvka’s Algorithm

Initially:

Create n connected components, one for each node in the graph. => For each node: O(V) store a component identifier.

One “Boruvka” Step:

For each connected component, search for the minimum weight outgoing edge. => DFS or BFS: O(V + E) Check if edge connects two components. Remember minimum cost edge connected to each component.
Add selected edges.
Merge connected components. => Scan every node: O(V) Compute new component ids. Update component ids. Mark added edges.

Analysis

Initially:

n components

At each step:

k components => k/2 components.

Termination:

1 component

Conclusion:

At most O(log V) Boruvka steps.

Total time:

O((E+V)log V) = O(E log V)

variant

what if all edges have same weight => O(E) => DFS/BFS

A MST contains exactly (V-1) edges

Every spanning tree contains (V-1) edges

thus any spanning tree find by DFS/BFS is MST

What if all the edges have weights from {1..10}?

Idea: Use an array of size 10

Putting edges in array of linked lists: O(E)
Iterating over all edges in ascending order: O(E)
For each edge:
- Checking whether to add an edge: O(a)
- Union two components: O(a) Total: O(aE)

What if all the edges have weights from {1..10}?

Implement Priority Queue:

Use an array of size 10 to implement
Insert: put node in correct list O(V)
Remove: lookup node (e.g., in hash table) and remove from liked list. O(V)
ExtractMin: Remove from the minimum bucket.
DecreaseKey: lookup node (e.g., in hash table) and move to correct liked list. O(E) Total: O(E+V) = O(E)

Directed MST

only those with one root:

add minimum incoming edge
O(E)
Maximum S T
Ok to re-weight
OK to negative

Steiner Tree

NP hard

Algorithm SteinerMST:

For every pair of required vertices (v,w), calculate the shortest path from (v to w).
- Use Dijkstra V times.
- Or All-Pairs-Shortest-Paths
Construct new graph on required nodes.
- V = required nodes
- E = shortest path distances
Run MST on new graph.
- Use Prim’s or Kruskal’s or Boruvka’s
- MST gives edges on new graph
Map new edges back to original graph.
- Use shortest path discovered in Step 1.
- Add these edges to Steiner MST.
- Remove duplicates.

O(T) < 2* OPT(G)

Heap and Disjoint set

Priority queue

Operation	Unsorted Array	Sorted Array	AVL tree
insert	O(1)	O(n)	O(logn)
extractMax	O(n)	O(1)	O(logn)

Using a Heap

Concept

Heap ordering
- priority[parent] >= priority[child]
Complete binary tree
- Every level is full, except the last
- All nodes are as far left as possible
  
  Max height - Math.floor(log n)

implementation

insert:
- add new leave
- bubble up
increase / decrease key:
- update priority
- bubble up / down ( left first when bubble down)
delete:
- swap(key, last)
- remove(last)
- bubble down
removeMax:
- delete(root)

Analysis

same O(cost) as avl
faster real cost: no constant
simpler: no rotations
better concurrency

Store in array

root at slot 0
left(x) = 2x + 1
right(x) = 2x + 2
parent(x) = floor((x-1)/2)

Heap sort

Unsorted list -> heap
heap -> sorted list

STEP 1

insert n times => O(n log n)
Or by recursion => O(n)
- start from complete tree
- Base: each leaf is a heap
- left + right are heap (using bubble up/down)
- Cost: <= 2 * O(n)
  STEP 2
extract max for n times
O(n log n)
Analysis
O(n log n)
In-place
fast than merge sort
always complete in n log n
unstable

Disjoint set

Quick-find

Array: componentID
two objects are connected if they have the same componentID
Integer[] for int

Hashtable + open addressing for non-int

Code

{
find(int p, int q)
    return(componentId[p] == componentId[q]);

union(int p, int q)
    updateComponent = componentId[q]
    for (int i=0; i<componentId.length; i++)
        if (componentId[i] == updateComponent)
            componentId[i] = componentId[p];
}

Analysis

find:O(1)
union: O(n)
Flat trees

Quick-union

Array: Int[] parent
Parent is its own if it is the root
Two objects are connected if they are part of the same tree

Code

{
  find(int p, int q)
      while (parent[p] != p) p = parent[p];
      while (parent[q] != q) q = parent[q];
      return (p == q);

  union(int p, int q)
      while (parent[p] != p) p = parent[p];
      while (parent[q] != q) q= parent[q];
      parent[p] = q;
}

Analysis

find: O(n)
union: O(n)

Optimisation

Weighted Union

union(1,4) => which tree to make as the root? the one with larger weight

{
  union(int p, int q)
      while (parent[p] !=p) p = parent[p];
      while (parent[q] !=q) q = parent[q];
      if (size[p] > size[q] {
          parent[q] = p; // Link q to p
          size[p] = size[p] + size[q];
      }
      else {
          parent[p] = q; // Link p to q
          size[q] = size[p] + size[q];
      }
}

Analysis

Maximum depth of tree

O(logn)

when T1 is merged with T2, depth increased only when size(T1+T2) > 2* size(T1)

Running time

find: O(logn)

Union: O(logn)

Path compression

After finding the root: set the parent of each traversed node to the root.

findRoot(int p) {
    root = p;
    while (parent[root] != root) root = parent[root];
    while (parent[p] != p) {
        temp = parent[p];
        parent[p] = root;
        p = temp;
    }
    return root;
}

Analysis

find: O(logn)
union: O(logn)

Weighted-union + path compression

Theorem: Starting from empty, any sequence of m union/find operations on n objects takes: O(n +mα(m,n)time.

find : α(m,n)
union: α(m,n)

CS2040 Graph Summary Notes

Graph revision

Basics

Types of graph

Terminology

Special graph

Bipartite Graph

Representing a graph

Adjacency List

Adjacency Matrix

Adj List VS Adj Matrix

Searhcing a graph

Breadth-first search

Concept

Code implementation

Analysis

Shortest Path

Applications

Depth-First Search

Concept

Code

Analysis

BFS + DFS

Shortest Paths for Direted Graph

Bellman-Ford

Code

Analysis

Dijkstra

Concept

Code

Analysis

Directed Graph

Topological order

DAG Topological Order using DFS

Analysis

Using Kahn’s Algo

Concept

Implementation

Analysis

Shortest Path for DAG

Shortest Path for Tree

Minimum spanning tree

Properties

Algorithm to find

Prim’s algo

Analysis

Kruskal’s Algorithm

Analysis

Boruvka’s Algorithm

Analysis

variant

Directed MST

Maximum S T

Steiner Tree

Heap and Disjoint set

Priority queue

Using a Heap

Concept

implementation

Analysis

Store in array

Heap sort

STEP 1

STEP 2

Analysis

Disjoint set

Quick-find

Code

Analysis

Quick-union

Code

Analysis

Optimisation

Weighted Union

Analysis

Path compression

Analysis

Weighted-union + path compression