Research

Leetcode 139. Word Break

Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words.

Note that the same word in the dictionary may be reused multiple times in the segmentation.

Example 1:

Input: s = "leetcode", wordDict = ["leet","code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example 2:

Input: s = "applepenapple", wordDict = ["apple","pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
Note that you are allowed to reuse a dictionary word.

Example 3:

Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output: false

Constraints:

• 1 <= s.length <= 300
• 1 <= wordDict.length <= 1000
• 1 <= wordDict[i].length <= 20
• s and wordDict[i] consist of only lowercase English letters.
• All the strings of wordDict are unique.

Solution

Approach 1: Brute Force

Algorithm

The naive approach to solve this problem is to use recursion and backtracking. For finding the solution, we check every possible prefix of that string in the dictionary of words, if it is found in the dictionary, then the recursive function is called for the remaining portion of that string. And, if in some function call it is found that the complete string is in dictionary, then it will return true.

class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
def wordBreakRecur(s: str, word_dict: Set[str], start: int):
if start == len(s):
return True
for end in range(start + 1, len(s) + 1):
if s[start:end] in word_dict and wordBreakRecur(s, word_dict, end):
return True
return False

return wordBreakRecur(s, set(wordDict), 0)

Complexity Analysis is the length of the input string.

• Time complexity : . Given a string of length , there are ways to split it into two parts. At each step, we have a choice: to split or not to split. In the worse case, when all choices are to be checked, that results in .
• Space complexity : . The depth of the recursion tree can go up to .

Approach 2: Recursion with memoization

Algorithm

In the previous approach we can see that many subproblems were redundant, i.e we were calling the recursive function multiple times for a particular string. To avoid this we can use memoization method, where an array is used to store the result of the subproblems. Now, when the function is called again for a particular string, value will be fetched and returned using the array, if its value has been already evaluated.

With memoization many redundant subproblems are avoided and recursion tree is pruned and thus it reduces the time complexity by a large factor.

class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
@lru_cache
def wordBreakMemo(s: str, word_dict: FrozenSet[str], start: int):
if start == len(s):
return True
for end in range(start + 1, len(s) + 1):
if s[start:end] in word_dict and wordBreakMemo(s, word_dict, end):
return True
return False

return wordBreakMemo(s, frozenset(wordDict), 0)

Complexity Analysis

nn is the length of the input string.

• Time complexity : . Size of recursion tree can go up to .
• Space complexity : . The depth of recursion tree can go up to .

Algorithm

Another approach is to use Breadth-First-Search. Visualize the string as a tree where each node represents the prefix upto index endend. Two nodes are connected only if the substring between the indices linked with those nodes is also a valid string which is present in the dictionary. In order to form such a tree, we start with the first character of the given string (say ) which acts as the root of the tree being formed and find every possible substring starting with that character which is a part of the dictionary. Further, the ending index (say ) of every such substring is pushed at the back of a queue which will be used for Breadth First Search. Now, we pop an element out from the front of the queue and perform the same process considering the string to be the original string and the popped node as the root of the tree this time. This process is continued, for all the nodes appended in the queue during the course of the process. If we are able to obtain the last element of the given string as a node (leaf) of the tree, this implies that the given string can be partitioned into substrings which are all a part of the given dictionary.

The formation of the tree can be better understood with this example:

class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
word_set = set(wordDict)
q = deque()
visited = set()

q.append(0)
while q:
start = q.popleft()
if start in visited:
continue
for end in range(start + 1, len(s) + 1):
if s[start:end] in word_set:
q.append(end)
if end == len(s):
return True
return False

Complexity Analysis

nn is the length of the input string.

• Time complexity : . For every starting index, the search can continue till the end of the given string.
• Space complexity : . Queue of at most size is needed.

Approach 4: Using Dynamic Programming

Algorithm

The intuition behind this approach is that the given problem (ss) can be divided into subproblems and . If these subproblems individually satisfy the required conditions, the complete problem, also satisfies the same. e.g. “catsanddog” can be split into two substrings “catsand”, “dog”. The subproblem “catsand” can be further divided into “cats”,”and”, which individually are a part of the dictionary making “catsand” satisfy the condition. Going further backwards, “\text{catsand}catsand”, “\text{dog}dog” also satisfy the required criteria individually leading to the complete string “\text{catsanddog}catsanddog” also to satisfy the criteria.

Now, we’ll move onto the process of \text{dp}dp array formation. We make use of \text{dp}dp array of size n+1n+1, where nn is the length of the given string. We also use two index pointers ii and jj, where ii refers to the length of the substring (s’s′) considered currently starting from the beginning, and jj refers to the index partitioning the current substring (s’s′) into smaller substrings s'(0,j)s′(0,j) and s'(j+1,i)s′(j+1,i). To fill in the \text{dp}dp array, we initialize the element \text{dp}dp as \text{true}true, since the null string is always present in the dictionary, and the rest of the elements of \text{dp}dp as \text{false}false. We consider substrings of all possible lengths starting from the beginning by making use of index ii. For every such substring, we partition the string into two further substrings s1′s1′ and s2′s2′ in all possible ways using the index jj (Note that the ii now refers to the ending index of s2′s2′). Now, to fill in the entry \text{dp}[i]dp[i], we check if the \text{dp}[j]dp[j] contains \text{true}true, i.e. if the substring s1′s1′ fulfills the required criteria. If so, we further check if s2′s2′ is present in the dictionary. If both the strings fulfill the criteria, we make \text{dp}[i]dp[i] as \text{true}true, otherwise as \text{false}false.

class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
word_set = set(wordDict)
dp = [False] * (len(s) + 1)
dp = True

for i in range(1, len(s) + 1):
for j in range(i):
if dp[j] and s[j:i] in word_set:
dp[i] = True
break
return dp[len(s)]

Complexity Analysis

nn is the length of the input string.

• Time complexity : O(n^3)O(n3). There are two nested loops, and substring computation at each iteration. Overall that results in O(n^3)O(n3) time complexity.
• Space complexity : O(n)O(n). Length of pp array is n+1n+1.
0122
Previous Post

Next article