References

- https://leetcode.com/problems/connecting-cities-with-minimum-cost/
- https://www.youtube.com/watch?v=GFsPyTdCS1I

##### Problem Statement

There are `n`

cities numbered from 1 to `n`

.

You are given `connections`

, where each `connections[i] = [city1, city2, cost]`

represents the cost to connect `city1`

and `city2`

together. (A *connection* is bidirectional: connecting `city1`

and `city2`

is the same as connecting `city2`

and `city1`

.)

Return the minimum cost so that for every pair of cities, there exists a path of connections (possibly of length 1) that connects those two cities together. The cost is the sum of the connection costs used. If the task is impossible, return -1.

**Example 1:**

Input:n = 3, connections = [[1,2,5],[1,3,6],[2,3,1]]Output:6Explanation:Choosing any 2 edges will connect all cities so we choose the minimum 2.

**Example 2:**

Input:n = 4, connections = [[1,2,3],[3,4,4]]Output:-1Explanation:There is no way to connect all cities even if all edges are used.

**Note:**

`1 <= n <= 10000`

`1 <= connections.length <= 10000`

`1 <= connections[i][0], connections[i][1] <= n`

`0 <= connections[i][2] <= 10`

^{5}`connections[i][0] != connections[i][1]`

##### Solution

###### Approach 1: Minimum Spanning Tree (Using Kruskal’s algorithm)

**Intuition**

If we model the cities and connections as a graph, each connection is an edge (undirected) and each city is a node of the graph. We need to find a subset of edges which connects all the nodes of the graph with the minimum possible total weight. This is by definition the Minimum Spanning Tree or MST of a graph.

**Algorithm**

There are a variety of algorithms that we can use to obtain the MST of a graph. We will use Kruskal’s algorithm here, which is implemented using the Disjoint set Union-Find data structure.

In order to obtain the MST using Kruskal’s algorithm, we first sort all the connections (edges) present in the graph based on their weights (in increasing order) and will iterate over them one by one. The objective here is to greedily pick edges that will help us to connect more nodes in the graph. Each time we find a new edge which *does not* result in a cycle with the edges selected so far, we add that edge in the final MST. We keep doing this till we have obtained the MST which connects all the nodes in the graph (i.e. connects all the cities using the connections).

Disjoint-set union find can be implemented in a couple of ways. A plain union find is shown below which keeps the track of the parent of each node (initially parent of `i`

is set to itself, i.e. `i`

) and performs the union and find using a helper method `getRoot`

.

''' Vanilla Disjoint-Set Union Find ''' class UnionFind: def __init__(self, N): self.parents = [i for i in range (N + 1)] def union(self, x, y): root1, root2 = self.find(x), self.find(y) if root1 == root2: return self.parents[root2] = root1 def isInSameGroup(self, x, y): return self.find(x) == self.find(y) def find(city): # Traverse all the way to the top (root) going through the parent nodes while city != parents[city]: city = parents[city] return parents[city]

The above implementation can be made faster by incorporating **Path compression**. Here, while obtaining the root, we compress the path by assigning the grandparent of the node as the parent (thereby skipping connections and moving nodes closer to the root). We modify the `Find`

method to implement path compression.

def find(city): while city != parents[city]: # Path compression # a's grandparent is now a's parent parents[city] = parents[parents[city]] return parents[city]

This can be made even faster using a technique known as **Weighted Union**. In this technique, in addition to the parent nodes, we also keep the weights of each of the nodes. Every time we take union, the root node with more weight (i.e. having more elements in the corresponding set) is used as the parent node of the other node. We initialize the weight corresponding to each node as 1 initially, as each element belongs to it’s own set in the beginning. Below is the implementation of this idea (we modify `Union`

method).

class UnionFind: def __init__(self, N): # Set the initial parent node to itself self.parents = [i for i in range (N + 1)] # Initialize weight for each node to be 1 # Used to store weights of each nodes self.weights = [i for i in range (N + 1)] # Modify Union method to incorporate weighted compression def union(self, a, b): rootA, rootB = self.find(a), self.find(b) # If both a and b have same root, # i.e. they both belong to the same set, hence we don't need to take union if root1 == root2: return # Weighted union self.parents[root2] = root1 if self.weights[rootA] > self.weights[rootB]: # In case rootA is having more weight # 1. Make rootA as the parent of rootB # 2. Increment the weight of rootA by rootB's weight self.parents[rootB] = rootA self.weights[rootA] += self.weights[rootB] else: # Otherwise # 1. Make rootB as the parent of rootA # 2. Increment the weight of rootB by rootA's weight self.parents[rootA] = rootB self.weights[rootB] += self.weights[rootA] def isInSameGroup(self, x, y): return self.find(x) == self.find(y) def find(city): # Traverse all the way to the top (root) going through the parent nodes while city != parents[city]: city = parents[city] return parents[city] class Solution: def minimumCost(self, n: int, connections: List[List[int]]) -> int: unionfind = UnionFind(n) # Sort connections based on their weights (in increasing order) connections.sort(key=lambda x: x[2]) # Keep track of the total cost of adding all those edges total = 0 for city1, city2, cost in connections: if not isInSameGroup(city1, city2): union(city1, city2) total += cost # Check that all cities are connected. root = find(n) return total if all(root == find(city) for city in range(1, n+1)) else -1

If we combine both **Path compression** and **Weighted Union**, it takes \log^{\ast} Nlog∗*N* for the union and find operations in case of Disjoint-set union link. Here \log^{\ast} Nlog∗*N* is an extremely slow-growing inverse Ackermann function a.k.a Iterated logarithm and practically does not exceed 5. Hence it can be treated as a constant for implementation purposes.

We can combine all the concepts we have seen above in order to implement Kruskal’s algorithm for obtaining MST of a graph. Below is the implementation of this.

class Solution: def minimumCost(self, N: int, connections: List[List[int]]) -> int: if (L := len(connections)) < N-1: return -1 parent = list(range(0, N+1)) rank = [1] * (N+1) # by Kruskal def find(x): # nonlocal parent if parent[x] != x: parent[x] = find(parent[x]) return parent[x] connections.sort(key = lambda x: x[2]) # we have to find N-1 edges i = j = cost = 0 while i < N-1: u, v, w = connections[j] if find(u) != find(v): cost += w u, v = parent[u], parent[v] if rank[u] < rank[v]: parent[u] = parent[v] rank[v] += rank[u] else: parent[v] = parent[u] rank[u] += rank[v] i += 1 j += 1 return cost

**Complexity Analysis**

- Time complexity: Assuming $N$ to be the total number of nodes (cities) and $M$ to be the total number of edges (connections). Sorting all the $M$ connections will take $O(M \cdot \log M)
*$*. Performing union find each time will take $\log^{\ast} N$ (Iterated logarithm). Hence for $M$ edges, it’s $O(M \cdot \log^{\ast} N)*$*which is practically $O(M)$ as the value of iterated logarithm, $\log^{\ast} N$ never exceeds 5. - Space complexity: $O(N)$, space required by
`parents`

and`weights`

.