My App
Dsa

Find Duplicates in Array

Learn how to find duplicates in an array using simple and efficient techniques with examples, pseudocode, and real-world applications.

🧠 Find Duplicates in Array

This is one of the most popular array problems from the Blind 75 list.
It checks how well you can handle repeated values and understand time-space trade-offs.


πŸ” Problem Statement

Given an array of integers nums, return all numbers that appear more than once.

Example


Input: nums = [4,3,2,7,8,2,3,1]
Output: [2,3]

πŸ’‘ Approach 1 β€” Using a Set (Simple and Clear)

We can use a set to track which numbers have been seen before.

Steps

  1. Create an empty set seen
  2. Create an empty list duplicates
  3. Loop through each number in nums
  4. If it’s already in seen, add it to duplicates
  5. Otherwise, add it to seen

Pseudocode


create empty set seen
create empty list duplicates

for num in nums:
if num in seen:
add num to duplicates
else:
add num to seen

return duplicates

Python Code


def find_duplicates(nums):
seen = set()
duplicates = []

for num in nums: if num in seen: duplicates.append(num) else: seen.add(num)

return duplicates


print(find_duplicates([4,3,2,7,8,2,3,1]))  # Output: [2, 3]

βš™οΈ Approach 2 β€” In-Place Method (O(1) Space)

If modifying the array is allowed,
we can mark visited numbers by flipping their index value negative.

Steps

  1. Loop through each number
  2. Use its absolute value to find the index
  3. If that index value is already negative β†’ it’s a duplicate
  4. Otherwise, flip it to negative to mark as visited

Python Code


def find_duplicates(nums):
duplicates = []

for num in nums: index = abs(num) - 1 if nums[index] < 0: duplicates.append(abs(num)) else: nums[index] = -nums[index]

return duplicates


print(find_duplicates([4,3,2,7,8,2,3,1]))  # Output: [2, 3]

🌍 Real-World Example

Imagine a task tracking app where users upload a list of task IDs.
You need to check if any task ID appears more than once.


task_ids = [101, 204, 305, 204, 509, 101]
duplicates = find_duplicates(task_ids)
print("Duplicate Task IDs:", duplicates)

# Output: Duplicate Task IDs: [204, 101]

This avoids duplicate entries and keeps your data clean.


🧭 Key Takeaways

  • βœ… Use Set-based approach for readability
  • βœ… Use In-place approach for memory efficiency
  • 🧠 Common real-world use cases:
    • Duplicate user sign-ups
    • Duplicate product listings
    • Repeated transaction IDs

πŸš€ Challenge for You

Try solving this problem without using extra space
and explain why the negative marking trick works.


  • Hashing in Python
  • Array manipulation
  • Time and space complexity optimization

---