| 1 | +--- |
| 2 | +id: kth-largest-element-in-a-Stream |
| 3 | +title: Kth Largest Element in a Stream |
| 4 | +sidebar_label: 0703 - Kth Largest Element in a Stream |
| 5 | +tags: |
| 6 | + - Heap |
| 7 | + - Design |
| 8 | + - Data Stream |
| 9 | +description: "This is a solution to the Kth Largest Element in a Stream problem on LeetCode." |
| 10 | +--- |
| 11 | + |
| 12 | +## Problem Description |
| 13 | + |
| 14 | +Design a class to find the $k^{th}$ largest element in a stream. Note that it is the kth largest element in the sorted order, not the $k^{th}$ distinct element. |
| 15 | + |
| 16 | +Implement the `KthLargest` class: |
| 17 | + |
| 18 | +- `KthLargest(int k, int[] nums)` Initializes the object with the integer `k` and the stream of integers `nums`. |
| 19 | +- `int add(int val)` Appends the integer `val` to the stream and returns the element representing the $k^{th}$ largest element in the stream. |
| 20 | + |
| 21 | +### Examples |
| 22 | + |
| 23 | +**Example 1:** |
| 24 | + |
| 25 | +``` |
| 26 | +Input |
| 27 | +["KthLargest", "add", "add", "add", "add", "add"] |
| 28 | +[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]] |
| 29 | +Output |
| 30 | +[null, 4, 5, 5, 8, 8] |
| 31 | +
|
| 32 | +Explanation |
| 33 | +KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]); |
| 34 | +kthLargest.add(3); // return 4 |
| 35 | +kthLargest.add(5); // return 5 |
| 36 | +kthLargest.add(10); // return 5 |
| 37 | +kthLargest.add(9); // return 8 |
| 38 | +kthLargest.add(4); // return 8 |
| 39 | +``` |
| 40 | + |
| 41 | +### Constraints |
| 42 | + |
| 43 | +- $1 \leq k \leq 10^4$ |
| 44 | +- $0 \leq nums.length \leq 10^4$ |
| 45 | +- $-10^4 \leq nums[i] \leq 10^4$ |
| 46 | +- $-10^4 \leq val \leq 10^4$ |
| 47 | +- At most $10^4$ calls will be made to `add`. |
| 48 | +- It is guaranteed that there will be at least `k` elements in the array when you search for the $k^{th}$ element. |
| 49 | + |
| 50 | +## Solution for Kth Largest Element in a Stream |
| 51 | + |
| 52 | +### Approach 1: Heap |
| 53 | + |
| 54 | +A heap is a data structure that returns the smallest (or largest) element, by some ordering, in constant time, and that supports adding an element and removing the smallest (or largest) element in logarithmic time. Consider replicating this functionality naively with an array. To find the smallest element in constant time, we could keep the array sorted so that the last element is always the smallest (descending order) or the largest (ascending order). Removing that element takes $O(1)$ time because we pop from the end of the array. Adding a new element, however, means finding where it belongs and then shifting the elements after it, which takes $O(n)$ time. There are possible improvements, such as binary searching for the insertion point or using a deque for removals and insertions, but the point is that a heap makes it so we don't need to worry about any of that. |
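To make the comparison concrete, here is a minimal Python sketch that maintains the same values in a sorted list and in a heap. The helper names `sorted_list_demo` and `heap_demo` are illustrative only, and the sorted list is kept in ascending order, so its smallest element sits at index 0.

```python
import bisect
import heapq


def sorted_list_demo(values):
    """Keep a plain list sorted in ascending order as values arrive."""
    data = []
    for v in values:
        # Binary search finds the slot in O(log n), but inserting
        # still shifts later elements, so each insert is O(n).
        bisect.insort(data, v)
    return data


def heap_demo(values):
    """Push the same values onto a min-heap."""
    heap = []
    for v in values:
        heapq.heappush(heap, v)  # O(log n) per push
    return heap


values = [6, 2, 3, 1, 7]
sorted_data = sorted_list_demo(values)
heap = heap_demo(values)

# Both structures expose the smallest element at index 0.
print(sorted_data[0], heap[0])
```

Both structures answer "what is the smallest element?" immediately; the difference is the cost of each insertion.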
| 55 | + |
| 56 | +In summary, a heap: |
| 57 | + |
| 58 | +- Stores elements, and can find the smallest (min-heap) or largest (max-heap) element stored in $O(1)$. |
| 59 | +- Can add elements and remove the smallest (min-heap) or largest (max-heap) element in $O(log(n))$. |
| 60 | +- Can perform insertions and removals while always maintaining the first property. |
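As a quick illustration of these properties, the sketch below uses Python's standard `heapq` module, which implements a min-heap on top of a list. The max-heap at the end is simulated by negating values, a common workaround rather than a dedicated API.

```python
import heapq

heap = [5, 1, 4]
heapq.heapify(heap)        # rearrange the list into a min-heap in place

smallest = heap[0]         # peek at the minimum in O(1)

heapq.heappush(heap, 0)    # insert in O(log n)
popped = heapq.heappop(heap)  # remove and return the minimum in O(log n)

# heapq only provides a min-heap, so store negated values
# to get max-heap behaviour.
max_heap = [-x for x in [5, 1, 4]]
heapq.heapify(max_heap)
largest = -max_heap[0]

print(smallest, popped, heap[0], largest)
```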
| 61 | + |
| 62 | +The capability to remove and insert elements in $O(log(n))$ time makes heaps extremely useful. For example, many problems that can be naively solved in $O(n^2)$ time can be solved in $O(n⋅log(n))$ time by using a heap. To put this in perspective, for an input size of $n = 10^5$ elements, $n⋅log(n)$ (taking the logarithm base 2) is over **6000** times smaller than $n^2$. |
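The ratio quoted above can be checked with a few lines of arithmetic. This sketch assumes a base-2 logarithm; with a different base the ratio changes by a constant factor.

```python
import math

n = 10**5

quadratic = n**2
linearithmic = n * math.log2(n)  # assumes log base 2

ratio = quadratic / linearithmic
print(f"n^2 is about {ratio:.0f} times larger than n*log2(n)")
```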
| 63 | + |
| 64 | +So now that we know what a heap does, how does it help solve this problem? Let's say we have some stream of numbers, `nums = [6, 2, 3, 1, 7]`, and `k = 3`. Because the input is small, we can clearly see that the kth largest element is `3`. However, we said earlier that a heap can only find an element in $O(1)$ time if it is the minimum or maximum (depending on the choice of implementation). A heap can also remove its smallest element quickly, so what if we keep removing the smallest element from `nums` until `nums.length == k`? We would be left with the elements `3`, `6`, and `7`, whose minimum is the kth largest element, and a heap can give us that minimum in $O(1)$ time. |
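The trimming step described here can be sketched directly with `heapq`, using the same `nums` and `k` as in the paragraph:

```python
import heapq

nums = [6, 2, 3, 1, 7]
k = 3

heap = list(nums)      # work on a copy so nums stays untouched
heapq.heapify(heap)

# Drop the smallest values until only the k largest remain.
while len(heap) > k:
    heapq.heappop(heap)

kth_largest = heap[0]  # minimum of the k largest values
print(sorted(heap), kth_largest)
```

Note that the heap's internal list order is not fully sorted; only `heap[0]` is guaranteed to be the minimum, which is why the sketch sorts the heap before printing it.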
| 65 | + |
| 66 | +That's the key to solving this problem - use a min-heap (a min-heap finds and removes the smallest element; a max-heap does the same for the largest) and keep the heap at size `k`. That way, the smallest element in the heap (the one we can access in $O(1)$) is always the kth largest element. When adding a number with `add()`, we push it onto the heap in $O(log(k))$ time, since the heap never holds more than `k + 1` elements. If the heap then exceeds size `k`, we remove its smallest element, which is just as fast. The smallest element left in the heap is the answer. |
| 67 | + |
| 68 | +#### Algorithm |
| 69 | + |
| 70 | +1. In the constructor, create a min heap using the elements from `nums`. Then, pop from the heap while `heap.length > k`, since `nums` may hold fewer than `k` elements. |
| 71 | +2. For every call to `add()`: |
| 72 | + - First, push `val` into `heap`. |
| 73 | + - Next, check if `heap.length > k`. If so, pop from the heap. |
| 74 | + - Finally, return the smallest value from the heap, which we can get in $O(1)$ time. |
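To see the steps above in action, the following sketch replays Example 1 and prints the heap contents after each call. The function name `trace_example` is hypothetical and exists only for this walkthrough.

```python
import heapq


def trace_example(k, nums, vals):
    """Replay the algorithm and collect the result of each add()."""
    # Step 1: build a min-heap and trim it to at most k elements.
    heap = list(nums)
    heapq.heapify(heap)
    while len(heap) > k:
        heapq.heappop(heap)

    results = []
    for val in vals:
        # Step 2: push, trim back to size k, then read the minimum.
        heapq.heappush(heap, val)
        if len(heap) > k:
            heapq.heappop(heap)
        results.append(heap[0])
        print(f"add({val}) -> {heap[0]}, heap holds {sorted(heap)}")
    return results


results = trace_example(3, [4, 5, 8, 2], [3, 5, 10, 9, 4])
print(results)  # [4, 5, 5, 8, 8], matching Example 1
```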
| 75 | + |
| 76 | +## Code in Different Languages |
| 77 | + |
| 78 | +<Tabs> |
| 79 | +<TabItem value="cpp" label="C++"> |
| 80 | + <SolutionAuthor name="@Shreyash3087"/> |
| 81 | + |
| 82 | +```cpp |
| 83 | +#include <queue> |
|  | +#include <functional>  // std::greater |
|  | +#include <vector> |
| 84 | + |
| 85 | +class KthLargest { |
| 86 | +private: |
| 87 | +    int k; |
| 88 | +    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;  // min-heap |
| 89 | + |
| 90 | +public: |
| 91 | +    KthLargest(int k, const std::vector<int>& nums) { |
| 92 | +        this->k = k; |
| 93 | +        for (int num : nums) { |
| 94 | +            heap.push(num); |
| 95 | +            if (static_cast<int>(heap.size()) > k) {  // keep only the k largest |
| 96 | +                heap.pop(); |
| 97 | +            } |
| 98 | +        } |
| 99 | +    } |
| 100 | + |
| 101 | +    int add(int val) { |
| 102 | +        heap.push(val); |
| 103 | +        if (static_cast<int>(heap.size()) > k) { |
| 104 | +            heap.pop(); |
| 105 | +        } |
| 106 | +        return heap.top();  // smallest of the k largest values |
| 107 | +    } |
| 108 | +}; |
| 109 | + |
| 110 | +``` |
| 111 | +</TabItem> |
| 112 | +<TabItem value="java" label="Java"> |
| 113 | + <SolutionAuthor name="@Shreyash3087"/> |
| 114 | + |
| 115 | +```java |
|  | +import java.util.PriorityQueue; |
|  | + |
| 116 | +class KthLargest { |
| 117 | +    private final int k;  // per instance; a static field would be shared by every object |
| 118 | +    private final PriorityQueue<Integer> heap;  // min-heap by natural ordering |
| 119 | + |
| 120 | +    public KthLargest(int k, int[] nums) { |
| 121 | +        this.k = k; |
| 122 | +        heap = new PriorityQueue<>(); |
| 123 | + |
| 124 | +        for (int num : nums) { |
| 125 | +            heap.offer(num); |
| 126 | +        } |
| 127 | + |
| 128 | +        while (heap.size() > k) {  // keep only the k largest |
| 129 | +            heap.poll(); |
| 130 | +        } |
| 131 | +    } |
| 132 | + |
| 133 | +    public int add(int val) { |
| 134 | +        heap.offer(val); |
| 135 | +        if (heap.size() > k) { |
| 136 | +            heap.poll(); |
| 137 | +        } |
| 138 | + |
| 139 | +        return heap.peek();  // smallest of the k largest values |
| 140 | +    } |
| 141 | +} |
| 142 | +``` |
| 143 | + |
| 144 | +</TabItem> |
| 145 | +<TabItem value="python" label="Python"> |
| 146 | + <SolutionAuthor name="@Shreyash3087"/> |
| 147 | + |
| 148 | +```python |
|  | +import heapq |
|  | +from typing import List |
|  | + |
| 149 | +class KthLargest: |
| 150 | +    def __init__(self, k: int, nums: List[int]): |
| 151 | +        self.k = k |
| 152 | +        self.heap = list(nums)  # copy so the caller's list is not mutated |
| 153 | +        heapq.heapify(self.heap)  # O(N) |
| 154 | + |
| 155 | +        while len(self.heap) > k:  # keep only the k largest |
| 156 | +            heapq.heappop(self.heap) |
| 157 | + |
| 158 | +    def add(self, val: int) -> int: |
| 159 | +        heapq.heappush(self.heap, val) |
| 160 | +        if len(self.heap) > self.k: |
| 161 | +            heapq.heappop(self.heap) |
| 162 | +        return self.heap[0]  # smallest of the k largest values |
| 163 | +``` |
| 164 | +</TabItem> |
| 165 | +</Tabs> |
| 166 | + |
| 167 | +## Complexity Analysis |
| 168 | + |
| 169 | +### Time Complexity: $O(N⋅log(N)+M⋅log(k))$ |
| 170 | + |
| 171 | +> **Reason**: The time complexity is split into two parts. First, the constructor needs to turn `nums` into a heap of size `k`. In Python, `heapq.heapify()` can turn `nums` into a heap in $O(N)$ time. Then, we need to remove from the heap until there are only `k` elements in it, which means removing `N - k` elements. Since `k` can be as small as 1, this is $O(N)$ removals in the worst case, each costing $O(log(N))$. Therefore, the constructor costs $O(N + N⋅log(N))=O(N⋅log(N))$. |
| 172 | +> |
| 173 | +> Next, every call to `add()` involves adding an element to `heap` and potentially removing an element from `heap`. Since our heap is of size `k`, every call to `add()` costs at worst $O(2 \times log(k))=O(log(k))$. That means `M` calls to `add()` cost $O(M⋅log(k))$. |
| 174 | +
|
| 175 | +### Space Complexity: $O(N)$ |
| 176 | + |
| 177 | +> **Reason**: The only extra space we use is the `heap`. While during `add()` calls we limit the size of the heap to `k`, in the constructor we start by converting `nums` into a heap, which means the heap will initially be of size `N`. |
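The analysis above describes the heapify-then-trim constructor. The C++ solution instead trims while inserting, so the heap never grows past `k + 1` elements. Below is a minimal Python sketch of that variant; the class name `BoundedKthLargest` is hypothetical. Under this assumption the constructor performs `N` pushes on a heap of at most `k + 1` elements, which suggests $O(N⋅log(k))$ time and $O(k)$ extra space.

```python
import heapq
from typing import List


class BoundedKthLargest:
    """Variant that never lets the heap exceed k elements between calls."""

    def __init__(self, k: int, nums: List[int]):
        self.k = k
        self.heap: List[int] = []
        # Reuse add() so the size limit is enforced from the start.
        for num in nums:
            self.add(num)

    def add(self, val: int) -> int:
        heapq.heappush(self.heap, val)
        if len(self.heap) > self.k:
            heapq.heappop(self.heap)  # discard the smallest value
        return self.heap[0]


kth = BoundedKthLargest(3, [4, 5, 8, 2])
outputs = [kth.add(v) for v in [3, 5, 10, 9, 4]]
print(outputs)  # [4, 5, 5, 8, 8], matching Example 1
```

The trade-off is that this version gives up the $O(N)$ `heapify` call, so it pays off mainly when `k` is much smaller than `N`.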
| 178 | +
|
| 179 | +## References |
| 180 | + |
| 181 | +- **LeetCode Problem**: [Kth Largest Element in a Stream](https://leetcode.com/problems/kth-largest-element-in-a-stream/description/) |
| 182 | + |
| 183 | +- **Solution Link**: [Kth Largest Element in a Stream](https://leetcode.com/problems/kth-largest-element-in-a-stream/solutions/) |