Friday, January 3, 2014

How to Remove Duplicates from Array without using Java Collection API

This is a coding questions recently asked to one of my reader in a Java Technical interview. Question was to remove duplicates from an integer array without using any collection API classes like Set or LinkedHashSet, which can make this task trivial. In general, if you need to do this for any project work, I suggest better using Set interface, particularly LinkedHashSet, because that also keep the order on which elements are inserted into Set. Only for technical interview perspective, you need to do this using either loops or recursion,  depending upon what is your strongest area. In this article, I am sharing a naive solution, which has lots of limitation to be considered as production quality code, It's not the best solution but still a solution. Main problem, while dealing with array is not finding duplicates, it's about removing them. Since array is a static, fixed length data structure, you can not change its length. This means, deleting an element from an array requires creating a new array and copying content into that array. If your input array contains lots of duplicates then this may result in lots of temporary array. It also increase cost of copying contents, which can be very bad. Given this restriction, you need to come out with a strategy to minimize both memory and CPU requirements.



Java Program to remove duplicates from integer array without Collection

Java Program to remove duplicates from Integer array without CollectionIn this program, we have not used any collection class to remove duplicates, earlier, I had shown you a way to remove duplicates from ArrayList, which was using LinkedHashSet. You can still use that solution if interviewer doesn't mention without Collection specifically. All you need to do is to convert your array into ArrayList first then subsequently create a LinkedHashSet from that ArrayList. In this example, we are removing duplicates from array by not copying them into result array, this solution is not actually deleting duplicates instead it replacing it from default value i.e. zero.

import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Java program to remove duplicates from this array. You don't
 * need to physically delete duplicate elements, replacing with null, or
 * empty or default value is ok.
 *
 * @author http://javarevisited.blogspot.com
 */
public class TechnicalInterviewTest {

    private static final Logger logger = LoggerFactory.getLogger(TechnicalInterviewTest.class);

    public static void main(String args[]) {

        int[][] test = new int[][]{
            {1, 1, 2, 2, 3, 4, 5},
            {1, 1, 1, 1, 1, 1, 1},
            {1, 2, 3, 4, 5, 6, 7},
            {1, 2, 1, 1, 1, 1, 1},};

        for (int[] input : test) {
            System.out.println("Array with Duplicates       : " + Arrays.toString(input));
            System.out.println("After removing duplicates   : " + Arrays.toString(removeDuplicates(input)));
        }
    }

    /*
     * Method to remove duplicates from array in Java, without using
     * Collection classes e.g. Set or ArrayList. Algorithm for this
     * method is simple, it first sort the array and then compare adjacent
     * objects, leaving out duplicates, which is already in result.
     */
    public static int[] removeDuplicates(int[] numbersWithDuplicates) {

        // Sorting array to bring duplicates together      
        Arrays.sort(numbersWithDuplicates);     
      
        int[] result = new int[numbersWithDuplicates.length];
        int previous = numbersWithDuplicates[0];
        result[0] = previous;

        for (int i = 1; i < numbersWithDuplicates.length; i++) {
            int ch = numbersWithDuplicates[i];

            if (previous != ch) {
                result[i] = ch;
            }
            previous = ch;
        }
        return result;

    }
}

Output :
Array with Duplicates       : [1, 1, 2, 2, 3, 4, 5]
After removing duplicates   : [1, 0, 2, 0, 3, 4, 5]
Array with Duplicates       : [1, 1, 1, 1, 1, 1, 1]
After removing duplicates   : [1, 0, 0, 0, 0, 0, 0]
Array with Duplicates       : [1, 2, 3, 4, 5, 6, 7]
After removing duplicates   : [1, 2, 3, 4, 5, 6, 7]
Array with Duplicates       : [1, 2, 1, 1, 1, 1, 1]
After removing duplicates   : [1, 0, 0, 0, 0, 0, 2]


That's it about how to remove duplicates from array in Java without using Collection class. As I said before, this solution is not perfect and has some serious limitation, which is an exercise for you to find out. One hint I can give is that, array itself can contain default value as duplicates e.g. 0 for int, even if you use any Magic number e.g. Integer.MAX_VALUE, you can  not be certain that they will not be part of input. Regarding removing duplicate permanently from result array, one approach could be to count number of duplicates and then create an array of right size i.e. length - duplicates, and then copying content from intermediate result array to final array, leaving out elements which are marked duplicate.


8 comments :

Vivek Singh said...

import java.util.Arrays;

public class Removeduplicates {

/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub


int[] a = { 3, 1, 1, 4, 1, 4, 5 };

Arrays.sort(a);
int count = 0;

long start = System.currentTimeMillis();

for (int i = 0; i < a.length; i++) {
if (i + 1 < a.length && a[i] == a[i + 1]) {
count++;
}

}

int[] b = new int[a.length - count];
int c = 0;
for (int j = 0; j < a.length; j++) {
if (j + 1 < a.length && a[j] == a[j + 1]) {

} else {
b[c] = a[j];
c++;
}
}

a = b;
System.out.println(Arrays.toString(a) + "took"
+ (System.currentTimeMillis() - start));

}
}

Anonymous said...

why do u need to sort the the array first ?

Anonymous said...

If the array is not sorted you can have duplicates again after the procedure. So sorting is crucial.

Anonymous said...

int[] in = { 3, 1, 1, 4, 1, 4, 5 };
BitSet filter = new BitSet(in.length);
for (int i = 0; i < in.length; i++) {
if (filter.get(in[i])) {
in[i] = 0;
} else {
filter.set(in[i]);
}
}

Anonymous said...

@Anonymous : Why BitSet here? Why not another Array?

陶智 said...

I think u can compare the element while u r sorting.It could be more convenient.

Anonymous said...

int [] num={25,52,25,65,8,96,8,25};
String s="";
for (int i = 0; i < num.length; i++) {
if(!s.contains(String.valueOf(num[i])))
s+=num[i]+",";
}
String stringArray[]=s.split(",");
int [] uniqnum=new int[stringArray.length];
for (int i = 0; i < stringArray.length; i++)
uniqnum[i]=Integer.parseInt(stringArray[i]);
for (int i = 0; i < uniqnum.length; i++) {
System.out.println(uniqnum[i]);
}

Anonymous said...

import java.util.*;
class rmduplicates
{

public static void main(String args[])
{
String[] s={"sasi","sai","nag","sasi","nag","babu","babu","sai"};
int k=0;
/*String[] s=new String[5];
Scanner sc=new Scanner(System.in);
for(int i=0;i<5;i++)
{
s[i]=sc.next();
}*/
Arrays.sort(s);
k=duplicates(s);
System.out.println("total duplicates:"+k);
for(int i=0;i<s.length;i++)
{
for(int j=i+1;j<s.length;j++)
{
if(s[i]==s[j])
{
copy(s,i,j);
}
}
}
for(int i=0;i<k;i++)
{
System.out.println(s[i]);
}
}
public static void copy(String[] s,int i,int j)
{
for(int k=j;k<s.length-1;k++)
{
s[k]=s[k+1];

}

}
public static int duplicates(String[] s)
{
int k=0;
for(int i=0;i<s.length;i++)
{
for(int j=i+1;j<s.length;j++)
{
if(s[i]==s[j])
{
k++;
}
}
}
return k;
}
}

Post a Comment