Abstract Data Types and Graphs
Motivation
When we describe an algorithm we want to focus on what it does to its data, not how the data is laid out in memory. An abstract data type (ADT) specifies a collection of values together with the operations that may be performed on them, while leaving the representation unspecified. This separation lets us swap implementations (linked list vs. array, adjacency list vs. adjacency matrix) without rewriting any algorithm built on the ADT, and it lets us reason about cost in terms of the operations the algorithm performs rather than the bytes it touches (Kleinberg and Tardos 2005; Cormen et al. 2009).
Graphs in particular admit several reasonable representations with very different cost profiles, so understanding the ADT view is essential before discussing graph algorithms like BFS, DFS, or Dijkstra’s algorithm.
Common abstract data types
| ADT | Operations | Typical implementation |
|---|---|---|
| Stack | push, pop, top | array or linked list, \(O(1)\) each |
| Queue | enqueue, dequeue | circular array or linked list, \(O(1)\) each |
| List | insert, delete, access | linked list (\(O(1)\) insert, \(O(n)\) access) or dynamic array (\(O(n)\) insert at front, \(O(1)\) access) |
| Priority queue | insert, extract-min, decrease-key | binary heap (\(O(\log n)\)); Fibonacci heap (\(O(1)\) amortized decrease-key) |
| Dictionary / Map | insert, delete, lookup | balanced BST (\(O(\log n)\)) or hash table (\(O(1)\) expected) |
| Disjoint sets / Union-find | union, find | union-by-rank with path compression, \(O(\alpha(n))\) amortized |
The same operations can have very different costs depending on representation, which is why algorithm analysis must specify which ADT operations are used and what they cost.
Graphs
A graph is a pair \(G = (V, E)\) where \(V\) is a finite set of vertices and \(E\) is a set of edges. Edges may be:
- Directed (an edge \((u, v)\) goes from \(u\) to \(v\)) or undirected (an edge \(\{u, v\}\) has no orientation).
- Weighted (each edge has a real-valued weight \(w(e)\)) or unweighted.
- Simple (no self-loops, no parallel edges) or multi-graphs.
A path is a sequence of vertices \(v_0, v_1, \dots, v_k\) with \((v_{i-1}, v_i) \in E\) for each \(i\). A path is simple if no vertex repeats. A cycle is a path with \(v_0 = v_k\) and \(k \ge 1\).
Let \(n = |V|\) and \(m = |E|\). Most graph algorithms have running times of the form \(O(n + m)\) or \(O(m \log n)\), and the constants depend on the representation.
Graph representations
Adjacency matrix
A boolean (or weighted) \(n \times n\) matrix \(A\) with \(A[u][v] = 1\) iff \((u, v) \in E\).
- Edge query \(u \to v\): \(O(1)\).
- List neighbours of \(u\): \(O(n)\), even if \(u\) has few neighbours.
- Space: \(\Theta(n^2)\) regardless of \(|E|\).
Useful on dense graphs (\(m = \Theta(n^2)\)) and when many edge-existence queries are made.
Adjacency list
For each vertex \(u\), a list of its neighbours (and edge weights, if any).
- Edge query \(u \to v\): \(O(\deg(u))\) in the worst case.
- List neighbours of \(u\): \(O(\deg(u))\), optimal.
- Space: \(\Theta(n + m)\).
Useful on sparse graphs (\(m = O(n)\) or \(O(n \log n)\)), which is the common case in practice.
Edge list
A simple list of edges. Compact but inefficient for most queries; mainly used as an input or interchange format.
Choosing a representation
Adjacency lists are the default for nearly all graph algorithms in this course because the typical graphs of interest (web graph, social graph, road network) are sparse. The choice matters: BFS runs in \(O(n + m)\) on an adjacency list but \(O(n^2)\) on an adjacency matrix.