5 Unique Ways to Find Duplicates from an Array

When working with arrays, whether in JavaScript, Python, or any other programming language, you’ll inevitably need to identify duplicates. Perhaps you’re cleaning up data, validating input, or speeding up a search feature. Whatever the reason, it helps to know several techniques for spotting and handling duplicates efficiently. In this article, we’ll explore five distinct strategies for detecting duplicates in an array, ranging from classic loops to modern language features and data structures.

1. Using a Loop and a Tracking Structure

Best suited for: Beginners who want full control and transparency over the process.

How it works:
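A straightforward approach is to use a loop, such as a for loop or forEach, and keep track of elements as you encounter them. For example, in JavaScript, you could maintain a seen object (or Set) that records each item. If you come across an item that’s already in seen, it’s a duplicate. Keep in mind that a plain object converts its keys to strings, so the number 1 and the string '1' are treated as the same item; a Set or Map keeps them distinct.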

Example (JavaScript):

const arr = [1, 2, 3, 2, 4, 5, 3];
const seen = Object.create(null); // no inherited keys such as "constructor" or "toString"
const duplicates = [];

arr.forEach(item => {
  if (seen[item]) {
    duplicates.push(item);
  } else {
    seen[item] = true;
  }
});

console.log(duplicates); // [2, 3]

Why it’s useful:

  • Offers complete control and is easy to understand.
  • Works in all languages with basic data structures.
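
The example above records a value every time it repeats, so an item that appears three times is listed twice. As a minimal sketch of one way around that, the hypothetical findDuplicates helper below (the name is an assumption, not part of the original example) tracks items in a Map, which also avoids the string-key coercion of plain objects.

```javascript
// Hypothetical helper: returns each duplicated value once, in the order
// in which its second occurrence is encountered.
function findDuplicates(items) {
  const seen = new Map(); // value -> true once it has been reported as a duplicate
  const duplicates = [];

  for (const item of items) {
    if (!seen.has(item)) {
      // First time we meet this value: remember it, not yet reported.
      seen.set(item, false);
    } else if (seen.get(item) === false) {
      // Second occurrence: report it and mark it so later repeats are skipped.
      duplicates.push(item);
      seen.set(item, true);
    }
  }
  return duplicates;
}

console.log(findDuplicates([1, 2, 3, 2, 4, 5, 3, 2])); // [2, 3]
console.log(findDuplicates([1, '1', 2]));              // [] (1 and '1' stay distinct)
```

Whether repeats should be listed once or once per extra occurrence depends on the use case, so treat this as one option rather than the required behavior.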

2. Using a Set Data Structure

Best suited for: When efficiency and simplicity matter, and you need a quick, readable solution.

How it works:
A Set is a data structure that stores unique values. By iterating over the array and trying to add each element to a Set, you can quickly identify duplicates: if the Set already contains the element, it’s a duplicate.

Example (JavaScript):

const arr = ['apple', 'banana', 'orange', 'apple', 'kiwi'];
const set = new Set();
const duplicates = [];

for (const fruit of arr) {
  if (set.has(fruit)) {
    duplicates.push(fruit);
  } else {
    set.add(fruit);
  }
}

console.log(duplicates); // ['apple']

Why it’s useful:

  • Concise, with constant-time lookups on average.
  • Readily available in many languages with similar data structures.
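
Two small variations on the Set idea may be handy. The sketch below assumes two hypothetical helpers, hasDuplicates and uniqueDuplicates (both names are illustrative): the first only answers whether any duplicate exists, and the second collects duplicates in a second Set so each one is reported once.

```javascript
// Quick yes/no check: a Set drops repeated values, so a smaller size
// means at least one value occurred more than once.
function hasDuplicates(items) {
  return new Set(items).size !== items.length;
}

// Returns each duplicated value once by collecting repeats in a second Set.
function uniqueDuplicates(items) {
  const seen = new Set();
  const duplicates = new Set();
  for (const item of items) {
    if (seen.has(item)) {
      duplicates.add(item);
    } else {
      seen.add(item);
    }
  }
  return [...duplicates]; // Sets iterate in insertion order
}

console.log(hasDuplicates(['apple', 'banana', 'apple']));           // true
console.log(uniqueDuplicates(['apple', 'kiwi', 'apple', 'apple'])); // ['apple']
```

The size comparison is the shortest option when you only need a boolean, while the two-Set version costs a little extra memory in exchange for a deduplicated result.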

3. Sorting and Comparing Neighbors

Best suited for: Large arrays where memory is limited, or when you want an in-place solution.

How it works:
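If you sort the array first, all duplicates will cluster together. You can then make a single pass to identify consecutive elements that are identical. Note that sort() reorders the original array in place, so copy the array first if its order matters, and that the sort itself takes O(n log n) time.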

Example (JavaScript):

const arr = [10, 5, 3, 5, 1, 10, 2];
arr.sort((a, b) => a - b); // [1, 2, 3, 5, 5, 10, 10]

const duplicates = [];
for (let i = 0; i < arr.length - 1; i++) {
  if (arr[i] === arr[i + 1]) {
    duplicates.push(arr[i]);
  }
}

console.log(duplicates); // [5, 10]

Why it’s useful:

  • Works well when you want to avoid an auxiliary structure such as a Set or Map.
  • Sorting-based solutions are common in algorithmic challenges.
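
If the caller’s array must keep its original order, the sort can run on a copy instead. The following is a sketch under that assumption; findDuplicatesSorted is a hypothetical name, and the copy gives up the memory advantage of sorting in place.

```javascript
// Hypothetical helper for numeric arrays: sorts a copy and reports each
// duplicated value once, in ascending order.
function findDuplicatesSorted(numbers) {
  // Copy first: Array.prototype.sort reorders the array it is called on.
  const sorted = numbers.slice().sort((a, b) => a - b);
  const duplicates = [];

  for (let i = 1; i < sorted.length; i++) {
    const isRepeat = sorted[i] === sorted[i - 1];
    // Equal values are adjacent, so checking the last recorded value
    // is enough to avoid listing the same duplicate twice.
    const alreadyRecorded = duplicates[duplicates.length - 1] === sorted[i];
    if (isRepeat && !alreadyRecorded) {
      duplicates.push(sorted[i]);
    }
  }
  return duplicates;
}

const input = [10, 5, 3, 5, 1, 10, 2, 5];
console.log(findDuplicatesSorted(input)); // [5, 10]
console.log(input);                       // [10, 5, 3, 5, 1, 10, 2, 5] (unchanged)
```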

4. Using the filter() Method (Functional Approach)

Best suited for: Developers who love functional programming constructs and neat, chainable solutions.

How it works:
In languages that support higher-order functions like filter(), you can find duplicates by checking whether an item’s current index differs from the index of its first occurrence. If indexOf() (or a similar function) returns an earlier position, the item has appeared before. Because indexOf() rescans the array for every element, this approach takes quadratic time and is best kept to small arrays.

Example (JavaScript):

const arr = ['cat', 'dog', 'cat', 'mouse', 'dog'];
const duplicates = arr.filter((item, index) => arr.indexOf(item) !== index);
console.log(duplicates); // ['cat', 'dog']

Why it’s useful:

  • Very concise and expressive.
  • Easily integrates into a chain of transformations and filters.
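
One thing to watch with the filter() approach is that an item appearing three times shows up twice in the result. A possible fix, sketched below with a hypothetical duplicatesOnce helper, is to pass the filtered result through a Set. Also note that indexOf() uses strict equality, so it never finds NaN, and any NaN in the array would be reported as a duplicate.

```javascript
// Hypothetical helper: filter() keeps every repeat occurrence,
// then a Set collapses the repeats so each value is listed once.
function duplicatesOnce(items) {
  const repeats = items.filter((item, index) => items.indexOf(item) !== index);
  return [...new Set(repeats)];
}

console.log(duplicatesOnce(['cat', 'dog', 'cat', 'mouse', 'dog', 'cat'])); // ['cat', 'dog']
```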

5. Using Frequency Counting (Map or Object)

Best suited for: Cases when you need both duplicates and their frequency counts.

How it works:
Create a frequency object or Map that counts how many times each element appears in the array. At the end of one pass, any element with a count greater than one is a duplicate.

Example (JavaScript):

const arr = [4, 4, 6, 7, 6, 8, 4];
const frequency = {};
arr.forEach(item => {
  frequency[item] = (frequency[item] || 0) + 1;
});

const duplicates = Object.keys(frequency).filter(key => frequency[key] > 1);
console.log(duplicates); // ['4', '6'] (object keys are always strings)

Why it’s useful:

  • Provides not just which elements are duplicates but also how many times they appear.
  • Helpful in data analysis and reporting scenarios.
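
Because object keys are strings, the example above returns '4' and '6' rather than the original numbers. As a rough sketch of an alternative, a Map keeps the original values as keys; countOccurrences and duplicatesWithCounts are hypothetical helper names used only for illustration.

```javascript
// Builds a Map from each value to the number of times it occurs.
function countOccurrences(items) {
  const frequency = new Map();
  for (const item of items) {
    frequency.set(item, (frequency.get(item) || 0) + 1);
  }
  return frequency;
}

// Returns [value, count] pairs for values that occur more than once,
// in the order the values were first seen.
function duplicatesWithCounts(items) {
  return [...countOccurrences(items)].filter(([, count]) => count > 1);
}

console.log(duplicatesWithCounts([4, 4, 6, 7, 6, 8, 4])); // [[4, 3], [6, 2]]
```

Returning pairs keeps the counts next to the values, which suits the reporting scenarios mentioned above.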

Which Method Should You Choose?

Your choice depends largely on your specific context:

  • For readability and quick checks: Use a Set, or the filter() approach when the array is small.
  • For large datasets where memory matters: Sorting in place might be preferable, at the cost of O(n log n) time and a reordered array.
  • For counting occurrences: Frequency counting is ideal.
  • For basic understanding and maximum control: The classic loop and object tracking method is perfect.

Conclusion

Finding duplicates in an array is a common programming task, and there’s no one-size-fits-all solution. By exploring different methods (tracking with loops, sets, sorting, filtering, or frequency counting), you can build a toolkit of approaches. Choose the one that best fits your performance needs, coding style, and the nature of your data. Over time, understanding these strategies will help you write cleaner, more efficient code.