AmazonData StructuresHashmapMedium DifficultyProgramming InterviewsProgramming LanguagesPythonQueue

# [LeetCode 1429] First Unique Number

You have a queue of integers, you need to retrieve the first unique integer in the queue.

Implement the FirstUnique class:

• FirstUnique(int[] nums) Initializes the object with the numbers in the queue.
• int showFirstUnique() returns the value of the first unique integer of the queue, and returns -1 if there is no such integer.
• void add(int value) insert value to the queue.

Example 1:

Input:
[[[2,3,5]],[],,[],,[],,[]]
Output:
[null,2,null,2,null,3,null,-1]
Explanation:
FirstUnique firstUnique = new FirstUnique([2,3,5]);
firstUnique.showFirstUnique(); // return 2
firstUnique.add(5);            // the queue is now [2,3,5,5] firstUnique.showFirstUnique(); // return 2
firstUnique.add(2);            // the queue is now [2,3,5,5,2] firstUnique.showFirstUnique(); // return 3
firstUnique.add(3);            // the queue is now [2,3,5,5,2,3] firstUnique.showFirstUnique(); // return -1

Example 2:

Input:
[[[7,7,7,7,7,7]],[],,,,,,[]]
Output:
[null,-1,null,null,null,null,null,17]
Explanation:
FirstUnique firstUnique = new FirstUnique([7,7,7,7,7,7]); firstUnique.showFirstUnique(); // return -1
firstUnique.add(7);            // the queue is now [7,7,7,7,7,7,7] firstUnique.add(3);            // the queue is now [7,7,7,7,7,7,7,3] firstUnique.add(3);            // the queue is now [7,7,7,7,7,7,7,3,3] firstUnique.add(7);            // the queue is now [7,7,7,7,7,7,7,3,3,7] firstUnique.add(17);           // the queue is now [7,7,7,7,7,7,7,3,3,7,17] firstUnique.showFirstUnique(); // return 17

Example 3:

Input:
[[],[],,[]]
Output:
[null,809,null,-1]
Explanation:
FirstUnique firstUnique = new FirstUnique();
firstUnique.showFirstUnique(); // return 809
firstUnique.add(809);          // the queue is now [809,809] firstUnique.showFirstUnique(); // return -1

Constraints:

• 1 <= nums.length <= 10^5
• 1 <= nums[i] <= 10^8
• 1 <= value <= 10^8
• At most 50000 calls will be made to showFirstUnique and add.

## Solution

#### Approach 1: Brute Force

Intuition

The simplest solution for this problem is to create a queue as normal, and then search through it to find the first unique number.

We do this by looping through the items in the queue (starting from the oldest). For each item, we loop through the queue again, counting how many times it appears. If it only appears once, we stop and return it. If there are no items that appear only once, then we return -1.

Algorithm

We don’t need to implement counting ourselves; we can use Collections.frequency(...) in Java, and .count(...) in Python.

from collections import deque

class FirstUnique:
def __init__(self, nums: List[int]):
self._queue = deque(nums)

def showFirstUnique(self):
for item in self._queue:
if self._queue.count(item) == 1:
return item
return -1

self._queue.append(value)

Complexity Analysis

Let be the length of the initial array passed into the constructor. Let be the total number of items added onto the queue so far (including those from the constructor).

• Time complexity :
• constructor . We create a new copy of the passed in numbers; this has a cost of time to create.
• add() .We perform a single append to a queue; this has a cost of .
• showFirstUnique() . Counting how many times a given item appears in the queue has a cost of . This is true even for the library functions we’re using.In the worst case, we search through the entire queue without finding a unique number. At a cost of each time, this gives us a cost of .
• Space complexity : .The memory used is simply the total number of items currently in the queue.

#### Approach 2: Queue and HashMap of Unique-Status

Intuition

When given a data-structure-design question like this one, a good strategy is to start simple and identify where the inefficiencies are.

In Approach 1, we performed numerous count operations; each call to showFirstUnique() performed, in the worst case, count operations. At a cost of per count, this was expensive! These count-ups are also repetitive, and so should be our focus for optimization.

When deciding whether or not to return a particular number, all we need to know is whether or not that number is unique. In other words, we only want to know whether it has occurred “once”, or “more than once”. Seeing as numbers cannot be removed from the FirstUnique data structure, we know that once a particular number is added for the second time, that number will never again be unique.

So, instead of counting how many times a number occurred in the queue, we could instead keep a HashMap of numbers to booleans, where for each number that has been added, we’re storing the answer to the question is this number unique?. We’ll call it isUnique. Then, when FirstUnique.add(number) is called, one of three cases will apply:

1. This particular number has never been seen before now. Add it to isUnique with a value of true. Also, add it to the queue.
2. This particular number has already been seen by isUnique, with a value of true. This means that the number was previously unique (and is currently in the queue), but with this new addition, it no longer is. Update its value to false. Do not add it to the queue.
3. This particular number has already been seen by isUnique, with a value of false. This means that it has already been seen twice before. We don’t need to do anything, and shouldn’t add it to the queue.

Then, instead of needing to perform “count” operations, the showFirstUnique() can simply look in isUnique for the information it needs.

Here is an animation showing how this works so far.

+++

This is a good start. However, you might have noticed another inefficiency while watching the animation: the 7 at the front needs to be stepped passed on every call to showFirstUnique(). As soon as the second 7 was added, though, this 7 was no longer unique, and so would never be the answer to a call to showFirstUnique() again (remember, this queue has no deletion operation). Therefore, there is no reason to leave it in the queue—we should remove it so that we don’t keep looking at it.

But, where should we actually do these removals from the queue? In showFirstUnique(), or in add()?

• If we do a removal in the add() method, we’ll need to do an search of the queue to find the number, and then potentially another to remove it from the queue, (if you’re chosen programming language even supports deletions from the middle of a queue!).
• If we do the removal in the showFirstUnique() method, we might sometimes need to do lots of removals before we find a unique number to return. However, it will be faster across all calls to the data structure, because we didn’t need to do a search to find the non-unique number like we would have with add(). We’ll talk more about this in the time complexity section.

So, our best option is to do the removals in the showFirstUnique() method.

We should leave the number’s value as false in isUnique, though. Like we said above, a number can never “become” unique again.

Here is an animation showing the full algorithm.

+++

from collections import deque

class FirstUnique:

def __init__(self, nums: List[int]):
self._queue = deque(nums)
self._is_unique = {}
for num in nums:
# Notice that we're calling the "add" method of FirstUnique; not of the queue.

def showFirstUnique(self) -> int:
# We need to start by "cleaning" the queue of any non-uniques at the start.
# Note that we know that if a value is in the queue, then it is also in
# is_unique, as the implementation of add() guarantees this.
while self._queue and not self._is_unique[self._queue]:
self._queue.popleft()
# Check if there is still a value left in the queue. There might be no uniques.
if self._queue:
return self._queue # We don't want to actually *remove* the value.
return -1

def add(self, value: int) -> None:
# Case 1: We need to add the number to the queue and mark it as unique.
if value not in self._is_unique:
self._is_unique[value] = True
self._queue.append(value)
# Case 2 and 3: We need to mark the number as no longer unique.
else:
self._is_unique[value] = False

Complexity Analysis

Let be the length of the initial array passed into the constructor.

Let be the total number of items added onto the queue so far (including those from the constructor).

• Time complexity :
• constructor .For each of the numbers passed into the constructor, we’re making a call to add(). As we will determine below, each call to add() has a cost of . Therefore, for the numbers added in the constructor, we get a total cost of .
• add() .We check the status of the number in a HashMap, which is an operation. We also optionally modify its status in the HashMap, which, again, is an operation. Depending on the status of the number, we add it to the queue, which is again, an operation. Therefore, we get an time complexity for the add() method.
• showFirstUnique() (amortized). For this implementation, the showFirstUnique() method needs to iterate down the queue until it finds a unique number. For each unique number it encounters along the way, it removes it. Removing an item from a queue has a cost of . The total number of these removals we need to carry out is proportional to the number of calls to add(), because each add() corresponds to at most one removal that will ultimately have to happen. Then when we find a unique number, it is an operation to return it. Because the number of removals is proportional to the number of calls to add(), we say that the time complexity amortizes across all calls to add() and showFirstUnique(), giving an overall time complexity of (amortized).If you’re not familiar with amortized time complexity, check out the Wikipedia page.
• Space complexity : .Each number is stored up to once in the queue, and up to once in the HashMap. Therefore, we get an overall space complexity of .

#### Approach 3: LinkedHashSet for Queue, and HashMap of Unique-Statuses

Intuition

While an amortized time of will always be better than a non-amortized time of a higher complexity class, say, , non-amortized would be even better. The downside of amortized time complexities is that while the average time per method call is “good”, some calls might be really slow. Imagine if every one-millionth Google search took 1,000,000 ms, and all the others took 1 ms. The amortized time would be 2 ms, which, in theory, sounds great! However, the one-in-a-million person waiting 16 minutes for their search results won’t be very happy with Google! (There would probably be less bad press if every search operation just took 5 ms…).

Is there a way we could remove the amortization from the data structure we designed in Approach 2?

For this to happen, we’d need to have each “removal” happen with its corresponding add(); not during some arbitrary showFirstUnique() call in the future. This is the only way we could avoid having lots of “waiting” removal operations. How does this work, though? Didn’t we already decide it was too difficult?

• The trouble with doing the removal in the add() method was that it leads to a worst-case search of the queue, and potentially a worst-case removal from the middle of a queue—if it’s even possible (Java doesn’t allow removals from the middle of a queue).
• Making the actual remove an operation isn’t too difficult; we simply need to use a LinkedList-based rather than Array-based queue implementation.
• Searching a LinkedList for a value is still , though. The only data structures that supports search in time are hash-sets and hash-maps. But these don’t maintain the input order; aren’t we just going around in circles now?

There is another, not so well known, data structure we can use here: LinkedHashSet. Note that in Python, we can use an OrderedDict to achieve the same functionality. If you have never heard of it, have a look at its specs before reading any further. Can you see how it solves our problem?

LinkedHashSet is a HashSetLinkedList hybrid. Like a HashSet, items can be found, updated, added, and removed in time. In addition, it puts links between the entries to keep track of the order they were added. Whenever an item is removed from the LinkedHashSet, the links are updated to point to the previous and next, just like they are in an ordinary LinkedList.

In essence, a LinkedHashSet is a type of queue that supports removals from the middle! This is exactly what we need to solve the problem. We can now do the removal in the add() method, and know that it is .

Algorithm

# In Python, we have to make do with the OrderedDict class. We can use it as a Set by setting
# the values to None.

from collections import OrderedDict

class FirstUnique:

def __init__(self, nums: List[int]):
self._queue = OrderedDict()
self._is_unique = {}
for num in nums:
# Notice that we're calling the "add" method of FirstUnique; not of the queue.

def showFirstUnique(self) -> int:
# Check if there is still a value left in the queue. There might be no uniques.
if self._queue:
# We don't want to actually *remove* the value.
# Seeing as OrderedDict has no "get first" method, the way that we can get
# the first value is to create an iterator, and then get the "next" value
# from that. Note that this is O(1).
return next(iter(self._queue))
return -1

def add(self, value: int) -> None:
# Case 1: We need to add the number to the queue and mark it as unique.
if value not in self._is_unique:
self._is_unique[value] = True
self._queue[value] = None
# Case 2: We need to mark the value as no longer unique and then
# remove it from the queue.
elif self._is_unique[value]:
self._is_unique[value] = False
self._queue.pop(value)
# Case 3: We don't need to do anything; the number was removed from the queue
# the second time it occurred.

Complexity Analysis

Let be the length of the initial array passed into the constructor.

Let be the total number of items added onto the queue so far (including those from the constructor).

• Time complexity :
• constructor . For each of the numbers passed into the constructor, we’re making a call to add(). As we will determine below, each call to add() has a cost of . Therefore, for the numbers added in the constructor, we get a total cost of .
• add() . Like before, we’re performing a series of operations when add() is called. Additionally, we’re also removing the number from the queue if it had been unique, but now no longer is. Seeing as the queue is implemented as a LinkedHashSet, the cost of this removal is . Therefore, we get an time complexity for add().
• showFirstUnique() . This time around, the implementation for showFirstUnique() is straightforward. It simply checks whether or not the queue contains any items, and if it does, it returns the first one (without removing it). This has a cost of .
• Space complexity : . Each number is stored up to once in the LinkedHashSet, and up to once in the HashMap. Therefore, we get an overall space complexity of .
Previous Post

Next article