Thursday, April 13, 2023

3 Ways to Find Duplicate Elements in an Array - Java

There are multiple ways to find duplicate elements in an array in Java and we will see three of them in this program. The solution and logic shown in this article are generic and apply to an array of any type e.g. String array or integer array or array of any object. One of the most common ways to find duplicates is by using the brute force method, which compares each element of the array to every other element. This solution has the time complexity of O(n^2) and only exists for academic purposes. You shouldn't be using this solution in the real world.


The standard way to find duplicate elements from an array is by using the HashSet data structure. If you remember, Set abstract data type doesn't allow duplicates. You can take advantage of this property to filter duplicate elements.

This solution has a time complexity of O(n), as you only need to iterate over array once, but also has a space complexity of O(n) as you need to store unique elements in the array.

Our third solution to find duplicate elements in an array is actually similar to our second solution but instead of using a Set data structure, we will use the hash table data structure. This is a pretty good solution because you can extend it to the found count of duplicates as well. In this solution, we iterate over the array and build the map which stores array elements and their count.

Once the table is built, you can iterate over a hash table and print out all the elements, who have a count greater than one, those are your duplicates.

Also, basic knowledge of essential data structure is also very important and that's why I suggest all Java programmers join a comprehensive Data Structure and Algorithms course like Data Structures and Algorithms: Deep Dive Using Java on Udemy to fill the gaps in your understanding.




How to find duplicates in Java array?

In the first paragraph, I have given you a brief overview of three ways to find duplicate elements from Java array. Now, let's understand the logic behind each of those solutions in little more detail.

Solution 1 :

Our first solution is very simple. All we are doing here is to loop over an array and comparing each element to every other element. For doing this, we are using two loops, inner loop, and outer loop. We are also making sure that we are ignoring comparing of elements to itself by checking for i != j before printing duplicates. Since we are comparing every element to every other element, this solution has quadratic time complexity i.e. O(n^2). This solution has the worst complexity in all three solutions.
 for (int i = 0; i < names.length; i++) {
     for (int j = i + 1 ; j < names.length; j++) {
          if (names[i].equals(names[j])) {
                   // got the duplicate element
          }
     }
 }


This question is also very popular in programming interviews and if you are preparing for them, I also suggest you solve problems from Cracking the Coding Interview: 150 Programming Questions and Solutions. One of the best books to prepare for software developer interviews.

How to find duplicates in array in Java

Solution 2 :

The second solution is even simpler than this. All you need to know is that Set doesn't allow duplicates in Java. Which means if you have added an element into Set and trying to insert duplicate element again, it will not be allowed. In Java, you can use the HashSet class to solve this problem. Just loop over array elements, insert them into HashSet using add() method, and check the return value.

If add() returns false it means that element is not allowed in the Set and that is your duplicate. Here is the code sample to do this :

 for (String name : names) {
     if (set.add(name) == false) {
        // your duplicate element
     }
}

The complexity of this solution is O(n) because you are only going through the array one time, but it also has a space complexity of O(n) because of the HashSet data structure, which contains your unique elements. So if an array contains 1 million elements, in the worst case you would need a HashSet to store those 1 million elements.


Solution 3 :

Our third solution takes advantage of another useful data structure, hash table. All you need to do is loop through the array using enhanced for loop and insert each element and its count into hash table. You can use HashMap class of JDK to solve this problem. It is the general purpose hash table implementation in Java. In order to build table, you check if hash table contains the elements or not, if it is then increment the count otherwise insert element with count 1. Once you have this table ready, you can iterate over hashtable and print all those keys which has values greater than one. These are your duplicate elements. This is in fact a very good solution because you can extend it to found count of duplicates as well. If you remember, I have used this approach to find duplicate characters in String earlier. Here is how you code will look like :
// build hash table with count
        for (String name : names) {
            Integer count = nameAndCount.get(name);
            if (count == null) {
                nameAndCount.put(name, 1);
            } else {
                nameAndCount.put(name, ++count);
            }
        }

        // Print duplicate elements from array in Java
        Set<Entry<String, Integer>> entrySet = nameAndCount.entrySet();
        for (Entry<String, Integer> entry : entrySet) {
            if (entry.getValue() > 1) {
                System.out.printf("duplicate element '%s' and count '%d' :", entry.getKey(), entry.getValue());
            }
        }
Time complexity of this solution is O(2n) because we are iterating over array twice and space complexity is same as previous solution O(n). In worst case you would need a hash table with size of array itself.

How to find duplicate elements in array in Java



Java Program to find duplicate elements in array

Here is our three solutions packed into a Java program to find duplicate elements in array. You can run this example from command line or Eclipse IDE, whatever suits you. Just make sure that name of your Java source file should be same as your public class e.g. "DuplicatesInArray". I have left bit of exercise for you, of course if you would like to do. Can you refactor this code into methods, you can do that easily by using extract method feature of IDE like Eclipse and Netbans and write unit test to check the logic of each approach. This would give you some practice in refactoring code and writing JUnit tests, two important attributes of a professional programmer.
package dto;

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

/**
 * Java Program to find duplicate elements in an array. There are two straight
 * forward solution of this problem first, brute force way and second by using
 * HashSet data structure. A third solution, similar to second one is by using
 * hash table data structure e.g. HashMap to store count of each element and
 * print element with count 1.
 * 
 * @author java67
 */

public class DuplicatesInArray{

    public static void main(String args[]) {

        String[] names = { "Java", "JavaScript", "Python", "C", "Ruby", "Java" };

        // First solution : finding duplicates using brute force method
        System.out.println("Finding duplicate elements in array using brute force method");
        for (int i = 0; i < names.length; i++) {
            for (int j = i + 1; j < names.length; j++) {
                if (names[i].equals(names[j]) ) {
                   // got the duplicate element
                }
            }
        }

        // Second solution : use HashSet data structure to find duplicates
        System.out.println("Duplicate elements from array using HashSet data structure");
        Set<String> store = new HashSet<>();

        for (String name : names) {
            if (store.add(name) == false) {
                System.out.println("found a duplicate element in array : "
                        + name);
            }
        }

        // Third solution : using Hash table data structure to find duplicates
        System.out.println("Duplicate elements from array using hash table");
        Map<String, Integer> nameAndCount = new HashMap<>();

        // build hash table with count
        for (String name : names) {
            Integer count = nameAndCount.get(name);
            if (count == null) {
                nameAndCount.put(name, 1);
            } else {
                nameAndCount.put(name, ++count);
            }
        }

        // Print duplicate elements from array in Java
        Set<Entry<String, Integer>> entrySet = nameAndCount.entrySet();
        for (Entry<String, Integer> entry : entrySet) {

            if (entry.getValue() > 1) {
                System.out.println("Duplicate element from array : "
                        + entry.getKey());
            }
        }
    }
}
Output :
Finding duplicate elements in array using brute force method
Duplicate elements from array using HashSet data structure
found a duplicate element in array : Java
Duplicate elements from array using hash table
Duplicate element from array : Java
From the output, you can see that the only duplicate element from our String array, which is "Java" has been found by all of our three solutions.


That's all about how to find duplicate elements in an array in Java. In this tutorial, you have learned three ways to solve this problem. The brute force way require you to compare each element from array to another, hence has quadratic time complexity. You can optimize performance by using HashSet data structure, which doesn't allow duplicates. So a duplicate element is the one for which add() method of HashSet return false. Our third solution uses hash table data structure to make a table of elements and their count. Once you build that table, iterate over it and print elements whose count is greater than one. This is a very good coding problem and frequently asked in Java Interview. It also shows how use of a right data structure can improve performance of algorithm significantly.

If you have like this coding problem, you may also like to solve following coding problems from Java Interviews :
  • How to find all pairs on integer array whose sum is equal to given number? [solution]
  • How to remove duplicates from an array in Java? [solution]
  • How to sort an array in place using QuickSort algorithm? [solution]
  • Write a program to find top two numbers from an integer array? [solution]
  • How do you remove duplicates from array in place? [solution]
  • How do you reverse array in place in Java? [solution]
  • Write a program to find missing number in integer array of 1 to 100? [solution]
  • How to check if array contains a number in Java? [solution]
  • How to find maximum and minimum number in unsorted array? [solution]

Further Learning
The Coding Interview Bootcamp: Algorithms + Data Structures
Data Structures and Algorithms: Deep Dive Using Java
Algorithms and Data Structures - Part 1 and 2

Recommended books to Prepare for Software Engineer Interviews

If you are solving these coding problems to prepare for software engineer job interviews, you can also take a look at following books. They contain wealth of knowledge and several frequently asked coding problems from Java and C++ interviews :
  • Programming Interviews Exposed: Secrets to Landing Your Next Job (book)
  • Coding Puzzles: Thinking in code By codingtmd (book)
  • Cracking the Coding Interview: 150 Programming Questions and Solutions (book)

21 comments :

Olivier Chorier said...

First solution : you should start j from i+1 instead of 0.
Second/Third solution : Take care about memory consumption (we have two collections in memory)

Unknown said...

How do you guarantee that two different values stored in the list do not have the same hash ?

Unknown said...

It doesn't matter if two different values have the same hash if you are using a class such as HashSet or HashMap. The value of the hash indicates a set into which the entry will be placed, not the actual location of the entry in the set. At least, that is how I read the JavaDoc for the classes.

javin paul said...

@Olivier, valuable suggestions, you must be a good code reviewer :)

javin paul said...

@Luca, seems Bradley Ross has already answered your question, thanks Ross. If you are interested upon how those Hash classes work in Java, I suggest you to take a look at my another post How HashSet internally works in Java

sapan said...

what is the best way using Java8?

javin paul said...

@sapan, you mean best way to learn Java8? or you mean using it on your real project?

In case of first try solving programming problems like this using Java 8, for the second use Java 8 with collections and streams.

Arun Gupta said...

I am a bit confused here with the first solution.

How does it guarantee to return duplicate values without first sorting the array?

Unknown said...

Any idea without using extra space ?

javin paul said...

Hello @Vivek, the first solution is without additional space but time complexity is O(n^2), I don't think there is a solution with O(n) without any additional space.

Anonymous said...

HI, For the first 2 solutions , logic is not may be correct for the below inputs,
String[] names = { "Java","Java", "JavaScript", "Python", "C", "Ruby", "Java" };

Anonymous said...

public class App {
public static void main(String[] args) {
String[] names = { "Java", "JavaScript", "Python", "C", "Ruby", "Java" };
List listStr = Arrays.asList(names);
Integer dupCount = 0;
StringBuilder dupvalues = new StringBuilder();

Map map = new HashMap<>();
for (String value : listStr) {
int times = Collections.frequency(listStr, value);
if(map.containsKey(value)){
dupCount++;
map.put(value, String.valueOf(times));
dupvalues.append(value).append(",");
}else{
map.put(value, String.valueOf(times));
}
}
System.out.println(map);
}
}

Ram said...

How to find duplicate number from 1 to N.
ex.input is 1 to 40
output-11,22,33

Anonymous said...

List iList = new ArrayList<>();
for (String s : sarr){
iList.add(s);
}
for (String elem : iList){
if (Collections.frequency(iList, elem) > 1) {
System.out.println("Got one : " + elem);
}
}

time and space complexity both O(n)

javin paul said...

@Anonymous good solution. didn't know about the Collections.frequency() method, is that newly added?

Unknown said...

it will give us wrong answer if we have a element more than two times.
for example - if java is three times in array then in answer it will print two times

Unknown said...

public List getDuplicateArray(String[] in){
Map indexMap=null;
List indexArray=null;
indexMap=new HashMap<>();
for(String n:in) {
if(indexMap.containsKey(n)) {
indexMap.put(n,2);
}else {
indexMap.put(n,1);
}
indexArray=new ArrayList();
for(Integer key:indexMap.keySet()) {
if(indexMap.get(key)>1) {
indexArray.add(key);
}

}return indexArray;

Satya said...

public List getDuplicateArray(String[] in){
Map indexMap=null;
List indexArray=null;
indexMap=new HashMap<>();
for(String n:in) {
if(indexMap.containsKey(n)) {
indexMap.put(n,2);
}else {
indexMap.put(n,1);
}
indexArray=new ArrayList();
for(Integer key:indexMap.keySet()) {
if(indexMap.get(key)>1) {
indexArray.add(key);
}

}return indexArray;

eran_ariel said...

thank you :)

Unknown said...

public static void main(String[] args) {
int arr[]={1,1,1,5,3,3,6,4,3,4,7,3,2,9,3,2,4,1};


for (int i = 0;i < arr.length; i++)
{
for(int j = i + 1; j < arr.length; j++)
{
if(arr [i]==arr [j])

System.out.println("duplicate value sorting="+arr[j]);
}
}
}
}





i can't find how to resolve it & can't seprate the value of duplicate

Anonymous said...

Arr[i]

Post a Comment