3 Ways to Find Duplicate Elements in an Array - Java

There are multiple ways to find duplicate elements in an array in Java and we will see three of them in this program. Solution and logic shown in this article are generic and applies to an array of any type e.g. String array or integer array or array of any object. One of the most common way to find duplicates is by using brute force method, which compares each element of the array to every other element. This solution has the time complexity of O(n^2) and only exists for the academic purpose. You shouldn't be using this solution in the real world. The standard way to find duplicate elements from an array is by using HashSet data structure. If you remember, Set abstract data type doesn't allow duplicates. You can take advantage of this property to filter duplicate elements. This solution has a time complexity of O(n), as you only need to iterate over array once, but also has space complexity of O(n) as you need to store unique elements in the array.


Our third solution to find duplicate elements in an array is actually similar to our second solution but instead of using Set data structure we will use hash table data structure. This is a pretty good solution because you can extend it to found count of duplicates as well. In this solution, we iterate over the array and build the map which stores array elements and their count. Once the table is build, you can iterate over a hash table and print out all the elements, who has a count greater than one, those are your duplicates.



How to find duplicates in Java array?

In the first paragraph, I have given you a brief overview of three ways to find duplicate elements from Java array. Now, let's understand the logic behind each of those solutions in little more detail.

Solution 1 :

Our first solution is very simple. All we are doing here is to loop over an array and comparing each element to every other element. For doing this, we are using two loops, inner loop, and outer loop. We are also making sure that we are ignoring comparing of elements to itself by checking for i != j before printing duplicates. Since we are comparing every element to every other element, this solution has quadratic time complexity i.e. O(n^2). This solution has the worst complexity in all three solutions.
 for (int i = 0; i < names.length; i++) {
     for (int j = i + 1 ; j < names.length; j++) {
          if (names[i].equals(names[j])) {
                   // got the duplicate element
          }
     }
 }


This question is also very popular on programming interviews and if you are preparing for them, I also suggest you to solves problems from Cracking the Coding Interview: 150 Programming Questions and Solutions. One of the best book to prepare for software developer interviews.

How to find duplicates in array in Java

Solution 2 :

Second solution is even simpler than this. All you need to know is that Set doesn't allow duplicates in Java. Which means if you have added an element into Set and trying to insert duplicate element again, it will not be allowed. In Java, you can use HashSet class to solve this problem. Just loop over array elements, insert them into HashSet using add() method and check return value. If add() returns false it means that element is not allowed in the Set and that is your duplicate. Here is the code sample to do this :
 for (String name : names) {
     if (set.add(name) == false) {
        // your duplicate element
     }
}
Complexity of this solution is O(n) because you are only going through array one time, but it also has space complexity of O(n) because of HashSet data structure, which contains your unique elements. So if an array contains 1 million elements, in worst case you would need an HashSet to store those 1 million elements.


Solution 3 :

Our third solution takes advantage of another useful data structure, hash table. All you need to do is loop through the array using enhanced for loop and insert each element and its count into hash table. You can use HashMap class of JDK to solve this problem. It is the general purpose hash table implementation in Java. In order to build table, you check if hash table contains the elements or not, if it is then increment the count otherwise insert element with count 1. Once you have this table ready, you can iterate over hashtable and print all those keys which has values greater than one. These are your duplicate elements. This is in fact a very good solution because you can extend it to found count of duplicates as well. If you remember, I have used this approach to find duplicate characters in String earlier. Here is how you code will look like :
// build hash table with count
        for (String name : names) {
            Integer count = nameAndCount.get(name);
            if (count == null) {
                nameAndCount.put(name, 1);
            } else {
                nameAndCount.put(name, ++count);
            }
        }

        // Print duplicate elements from array in Java
        Set<Entry<String, Integer>> entrySet = nameAndCount.entrySet();
        for (Entry<String, Integer> entry : entrySet) {
            if (entry.getValue() > 1) {
                System.out.printf("duplicate element '%s' and count '%d' :", entry.getKey(), entry.getValue());
            }
        }
Time complexity of this solution is O(2n) because we are iterating over array twice and space complexity is same as previous solution O(n). In worst case you would need a hash table with size of array itself.

How to find duplicate elements in array in Java



Java Program to find duplicate elements in array

Here is our three solutions packed into a Java program to find duplicate elements in array. You can run this example from command line or Eclipse IDE, whatever suits you. Just make sure that name of your Java source file should be same as your public class e.g. "DuplicatesInArray". I have left bit of exercise for you, of course if you would like to do. Can you refactor this code into methods, you can do that easily by using extract method feature of IDE like Eclipse and Netbans and write unit test to check the logic of each approach. This would give you some practice in refactoring code and writing JUnit tests, two important attributes of a professional programmer.
package dto;

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

/**
 * Java Program to find duplicate elements in an array. There are two straight
 * forward solution of this problem first, brute force way and second by using
 * HashSet data structure. A third solution, similar to second one is by using
 * hash table data structure e.g. HashMap to store count of each element and
 * print element with count 1.
 * 
 * @author java67
 */

public class DuplicatesInArray{

    public static void main(String args[]) {

        String[] names = { "Java", "JavaScript", "Python", "C", "Ruby", "Java" };

        // First solution : finding duplicates using brute force method
        System.out.println("Finding duplicate elements in array using brute force method");
        for (int i = 0; i < names.length; i++) {
            for (int j = i + 1; j < names.length; j++) {
                if (names[i].equals(names[j]) ) {
                   // got the duplicate element
                }
            }
        }

        // Second solution : use HashSet data structure to find duplicates
        System.out.println("Duplicate elements from array using HashSet data structure");
        Set<String> store = new HashSet<>();

        for (String name : names) {
            if (store.add(name) == false) {
                System.out.println("found a duplicate element in array : "
                        + name);
            }
        }

        // Third solution : using Hash table data structure to find duplicates
        System.out.println("Duplicate elements from array using hash table");
        Map<String, Integer> nameAndCount = new HashMap<>();

        // build hash table with count
        for (String name : names) {
            Integer count = nameAndCount.get(name);
            if (count == null) {
                nameAndCount.put(name, 1);
            } else {
                nameAndCount.put(name, ++count);
            }
        }

        // Print duplicate elements from array in Java
        Set<Entry<String, Integer>> entrySet = nameAndCount.entrySet();
        for (Entry<String, Integer> entry : entrySet) {

            if (entry.getValue() > 1) {
                System.out.println("Duplicate element from array : "
                        + entry.getKey());
            }
        }
    }
}
Output :
Finding duplicate elements in array using brute force method
Duplicate elements from array using HashSet data structure
found a duplicate element in array : Java
Duplicate elements from array using hash table
Duplicate element from array : Java
From the output, you can see that the only duplicate element from our String array, which is "Java" has been found by all of our three solutions.


That's all about how to find duplicate elements in an array in Java. In this tutorial, you have learned three ways to solve this problem. The brute force way require you to compare each element from array to another, hence has quadratic time complexity. You can optimize performance by using HashSet data structure, which doesn't allow duplicates. So a duplicate element is the one for which add() method of HashSet return false. Our third solution uses hash table data structure to make a table of elements and their count. Once you build that table, iterate over it and print elements whose count is greater than one. This is a very good coding problem and frequently asked in Java Interview. It also shows how use of a right data structure can improve performance of algorithm significantly.

If you have like this coding problem, you may also like to solve following coding problems from Java Interviews :
  • How to find all pairs on integer array whose sum is equal to given number? [solution]
  • How to remove duplicates from an array in Java? [solution]
  • How to sort an array in place using QuickSort algorithm? [solution]
  • Write a program to find top two numbers from an integer array? [solution]
  • How do you remove duplicates from array in place? [solution]
  • How do you reverse array in place in Java? [solution]
  • Write a program to find missing number in integer array of 1 to 100? [solution]
  • How to check if array contains a number in Java? [solution]
  • How to find maximum and minimum number in unsorted array? [solution]


Recommended books to Prepare for Software Engineer Interviews

If you are solving these coding problems to prepare for software engineer job interviews, you can also take a look at following books. They contain wealth of knowledge and several frequently asked coding problems from Java and C++ interviews :
  • Programming Interviews Exposed: Secrets to Landing Your Next Job (book)
  • Coding Puzzles: Thinking in code By codingtmd (book)
  • Cracking the Coding Interview: 150 Programming Questions and Solutions (book)

11 comments :

Olivier Chorier said...

First solution : you should start j from i+1 instead of 0.
Second/Third solution : Take care about memory consumption (we have two collections in memory)

Luca Aliberti said...

How do you guarantee that two different values stored in the list do not have the same hash ?

Bradley Ross said...

It doesn't matter if two different values have the same hash if you are using a class such as HashSet or HashMap. The value of the hash indicates a set into which the entry will be placed, not the actual location of the entry in the set. At least, that is how I read the JavaDoc for the classes.

Javin Paul said...

@Olivier, valuable suggestions, you must be a good code reviewer :)

Javin Paul said...

@Luca, seems Bradley Ross has already answered your question, thanks Ross. If you are interested upon how those Hash classes work in Java, I suggest you to take a look at my another post How HashSet internally works in Java

sapan jain said...

what is the best way using Java8?

Javin Paul said...

@sapan, you mean best way to learn Java8? or you mean using it on your real project?

In case of first try solving programming problems like this using Java 8, for the second use Java 8 with collections and streams.

Arun Gupta said...

I am a bit confused here with the first solution.

How does it guarantee to return duplicate values without first sorting the array?

Vivek kumar said...

Any idea without using extra space ?

Javin Paul said...

Hello @Vivek, the first solution is without additional space but time complexity is O(n^2), I don't think there is a solution with O(n) without any additional space.

Anonymous said...

HI, For the first 2 solutions , logic is not may be correct for the below inputs,
String[] names = { "Java","Java", "JavaScript", "Python", "C", "Ruby", "Java" };

Post a Comment