Find Duplicates in Array
Learn how to find duplicates in an array using simple and efficient techniques with examples, pseudocode, and real-world applications.
π§ Find Duplicates in Array
This is one of the most popular array problems from the Blind 75 list.
It checks how well you can handle repeated values and understand time-space trade-offs.
π Problem Statement
Given an array of integers nums, return all numbers that appear more than once.
Example
Input: nums = [4,3,2,7,8,2,3,1]
Output: [2,3]π‘ Approach 1 β Using a Set (Simple and Clear)
We can use a set to track which numbers have been seen before.
Steps
- Create an empty set
seen - Create an empty list
duplicates - Loop through each number in
nums - If itβs already in
seen, add it toduplicates - Otherwise, add it to
seen
Pseudocode
create empty set seen
create empty list duplicates
for num in nums:
if num in seen:
add num to duplicates
else:
add num to seen
return duplicatesPython Code
def find_duplicates(nums):
seen = set()
duplicates = []for num in nums: if num in seen: duplicates.append(num) else: seen.add(num)
return duplicates
print(find_duplicates([4,3,2,7,8,2,3,1])) # Output: [2, 3]βοΈ Approach 2 β In-Place Method (O(1) Space)
If modifying the array is allowed,
we can mark visited numbers by flipping their index value negative.
Steps
- Loop through each number
- Use its absolute value to find the index
- If that index value is already negative β itβs a duplicate
- Otherwise, flip it to negative to mark as visited
Python Code
def find_duplicates(nums):
duplicates = []for num in nums: index = abs(num) - 1 if nums[index] < 0: duplicates.append(abs(num)) else: nums[index] = -nums[index]
return duplicates
print(find_duplicates([4,3,2,7,8,2,3,1])) # Output: [2, 3]π Real-World Example
Imagine a task tracking app where users upload a list of task IDs.
You need to check if any task ID appears more than once.
task_ids = [101, 204, 305, 204, 509, 101]
duplicates = find_duplicates(task_ids)
print("Duplicate Task IDs:", duplicates)
# Output: Duplicate Task IDs: [204, 101]This avoids duplicate entries and keeps your data clean.
π§ Key Takeaways
- β Use Set-based approach for readability
- β Use In-place approach for memory efficiency
- π§ Common real-world use cases:
- Duplicate user sign-ups
- Duplicate product listings
- Repeated transaction IDs
π Challenge for You
Try solving this problem without using extra space
and explain why the negative marking trick works.
π Related Topics
- Hashing in Python
- Array manipulation
- Time and space complexity optimization
---