Top K Frequent Elements - Leetcode Solution


Step-by-Step Thought Process

Brute Force

  1. Understand the problem: Find the k most frequent elements in an array of integers.
  2. Use a Counter to count the frequency of each number in nums.
  3. Initialize an empty min-heap heap to store at most k (frequency, number) pairs.
  4. Iterate through each key (number) and val (frequency) in the Counter.
  5. If the heap has fewer than k elements, push (val, key) to the heap.
  6. Otherwise, push (val, key) to the heap and pop the smallest element to maintain k elements.
  7. Extract the second element (number) from each tuple in the heap and return them as the result.

Code Solution (Brute Force)


                

                

                

                

Why the Brute-Force Solution is Inefficient

The brute-force solution uses a heap to maintain the k most frequent elements, leading to:

Optimal Solution

The optimal solution uses bucket sort to achieve O(n) time complexity:

  1. Get the length n of the input array nums.
  2. Use a Counter to count the frequency of each number in nums.
  3. Initialize a buckets array of size n+1, where buckets[i] stores a list of numbers with frequency i.
  4. Iterate through the Counter, placing each number in buckets[freq]; if buckets[freq] is empty, initialize it as a list with the number, otherwise append the number.
  5. Initialize an empty list ret to store the result.
  6. Iterate through buckets from index n down to 0; if buckets[i] is not empty, extend ret with buckets[i].
  7. Stop when ret contains k elements and return ret.

Code Solution (Optimal)


                

                

                

                

Detailed Explanation

Understanding the Problem: Top K Frequent Elements

In this problem, we're given an unsorted array of integers and asked to return the k elements that appear most frequently. This problem is a classic example of frequency analysis and appears frequently in scenarios such as log data analysis, recommendation systems, and natural language processing where you need to extract high-occurrence entities.

Initial Approach: Min Heap

A common brute-force approach involves counting the frequency of each element and using a min heap to keep track of the top k most frequent numbers. We build a frequency map using a hash table (or Python’s Counter) and then push each (frequency, number) pair into the heap. When the heap size exceeds k, we remove the smallest frequency element. This ensures the heap always contains the k most frequent elements seen so far.

Once all elements are processed, the numbers in the heap are our result. The heap allows for efficient retrieval of the most frequent items in O(n log k) time.

Optimized Strategy: Bucket Sort

While the heap-based approach is efficient, we can do better. The maximum frequency any number can have is equal to the length of the input array. So instead of sorting or using a heap, we can use a bucket sort idea. We create an array of n + 1 buckets where index i represents all numbers that appear exactly i times.

After populating the buckets, we iterate from the highest possible frequency down to 1, collecting numbers until we have gathered k of them. This approach skips sorting altogether and guarantees a linear time solution in O(n).

Time and Space Complexity

The heap solution has a time complexity of O(n log k), where n is the number of unique elements and k is the number of top elements to retrieve. It uses O(k) space for the heap and O(n) for the frequency map.

The bucket sort solution improves the time complexity to O(n) in the average case by eliminating the log factor introduced by the heap. The space complexity remains O(n) due to the use of the frequency map and buckets.

Conclusion

The “Top K Frequent Elements” problem showcases the trade-offs between general-purpose data structures like heaps and specialized techniques like bucket sort. The heap approach is versatile and works well in many scenarios, but if the data’s frequency bounds are known or small, bucket sort offers superior performance. Understanding both strategies enhances your problem-solving toolkit for real-world frequency analysis problems.

Get Personalized Support at AlgoMap Bootcamp 💡