There are two ways to search an element in a List class, by using contains() method or by using Collections.binarySearch() method. There are two versions of binarySearch() method, one which takes a List and Comparator and other which takes a List and Comparable. This method searches the specified list for the specified object using the binary search algorithm. The list must be sorted into ascending order according to the natural ordering of its elements (as by the sort(List) method) prior to making this call. If List is not sorted, then results are undefined. If the List contains multiple elements equal to the specified object, there is no guarantee which one will be returned. This method runs in log(n) time for a "random access" list (which provides near-constant-time positional access).

If the specified list does not implement the RandomAccess interface and is large, this method will do an iterator-based binary search that performs O(n) link traversals and O(log n) element comparisons. In the end this method returns the index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1).

The insertion point is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key, or list.size() if all elements in the list are less than the specified key.

This means that return value will be >= 0 if and only if the key is found. Since common implementation of List interface e.g. ArrayList, Vector, CopyOnWriteArrayList and Stack implements RandomAccess interface, they can be used for performing binary search, but there are other implementations like LinkedList, which doesn't implement java.util.RandomAccess, hence not suitable for binary search operation.

Since binary search can only be performed in sorted list, you also need to sort your collection before doing search, which may potentially impact performance, especially if your List is big and not in sorted order already.

###

Here is our program to find object using binary search in Java List. We have a list of Integer with 1M records and uses both contains() and binarySearch() method to search an element.

From the output, you can see that contains() method is almost 10 times faster than binary search, which means it make sense to use contains() for searching objects in List, especially for those which implements RandomAccess interface e.g. ArrayList.

If the specified list does not implement the RandomAccess interface and is large, this method will do an iterator-based binary search that performs O(n) link traversals and O(log n) element comparisons. In the end this method returns the index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1).

The insertion point is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key, or list.size() if all elements in the list are less than the specified key.

This means that return value will be >= 0 if and only if the key is found. Since common implementation of List interface e.g. ArrayList, Vector, CopyOnWriteArrayList and Stack implements RandomAccess interface, they can be used for performing binary search, but there are other implementations like LinkedList, which doesn't implement java.util.RandomAccess, hence not suitable for binary search operation.

Since binary search can only be performed in sorted list, you also need to sort your collection before doing search, which may potentially impact performance, especially if your List is big and not in sorted order already.

###
__Java Code for contains() Vs binarySearch()__

import java.util.ArrayList; import java.util.Collections; import java.util.List; import org.slf4j.Logger; import org.slf4j.LoggerFactory; /** * Java program to perform binary search in Java collection e.g List, Set * @author Javin Paul */ public class BinarySearchTest { public static final Logger logger = LoggerFactory.getLogger(BinarySearchTest.class); public static void main(String args[]) { //creating List List<Integer> numbers = new ArrayList<Integer>(1000000); //List of 1M records //initializing List for(int i =0; i<numbers.size(); i++){ numbers.add(new Integer(i)); } //performing contains search long startTime = System.nanoTime(); boolean isExist = numbers.contains(new Integer(1000000)); long totalTime = System.nanoTime() - startTime; logger.info("Time to search 1Mth Record using contains() is {} nano seconds", totalTime); //performing binary search startTime = System.nanoTime(); Collections.sort(numbers); // List needs to be sorted for Binary Search Integer number = Collections.binarySearch(numbers, new Integer(1000000)); totalTime = System.nanoTime() - startTime; logger.info("Time to search 1Mth Record using binary search is {} nano seconds", totalTime); } } Ouput: 2013-06-04 23:23:17,834 0 [main] INFO test.BinarySearchTest - Time to search 1Mth Record using contains() is 51404 nano seconds 2013-06-04 23:23:17,849 15 [main] INFO test.BinarySearchTest - Time to search 1Mth Record using binary search is 554261 nano seconds

From the output, you can see that contains() method is almost 10 times faster than binary search, which means it make sense to use contains() for searching objects in List, especially for those which implements RandomAccess interface e.g. ArrayList.

## 7 comments :

I would do the sorting prior start time to be more justified.

But yes "contains ' is faster then "binarySearch" for ArrayList,

Below my output :

Time to search 1Mth Record using contains() is {} nano seconds 84585

Time to search 1Mth Record using binary search is {} nano seconds 1283703

for searching in a list., we can indexOf() also, This gives better timing than contains()..

correct me if I am wrong...

@Anonymous: if you check the source code of ArrayList, contains method internally call indexOf method only. So time for both of them will same. I think. (other than a extra method call.)

This is a good tutorial.

Additional info: time complexity of contains and binarySerach depends on the given list. If it is already ordered list, i will go for BinarySearch(which will take O(logn) time). If the input list is not ordered, then do contains, which will do a liner search(take time O(n)).

Well, I think that this is not fare relatively to binarySearch() as you measure it's time in conjunction with sorting invocation which takes additional O(NlogN) time. It would be more convenient if you test these methods on a sorted list or take a timestamp not before sorting a list but just before invoking of binarySearch(). You will get antipodal results.

I don't agree "contains() method is almost 10 times faster than binary search" infact the binary search is always many times faster than contains()

Reason : contains uses traditional linear search algorithm so the complexity is O(n) whereas the Binary search has complexity of O(logn).

Your code is giving incorrect result because the list (numbers) you are filling is not actually filled.

for(int i =0; i numbers = new ArrayList(capacity);

// List of 1M records //initializing List

for (int i = 0; i < capacity; i++) {

numbers.add(new Integer(i));

}

long startTime = System.nanoTime();

Collections.sort(numbers);

long totalTime = System.nanoTime() - startTime;

logger.info("Time to sort 1 Million Records is {} nano seconds", totalTime);

// performing contains search

startTime = System.nanoTime();

boolean isExist = numbers.contains(new Integer(1000000));

totalTime = System.nanoTime() - startTime;

logger.info("Time to search 1Mth Record using contains() is {} nano seconds", totalTime);

// performing binary search

startTime = System.nanoTime();

// List needs to be sorted for Binary Search

Integer number = Collections.binarySearch(numbers, new Integer(1000000));

totalTime = System.nanoTime() - startTime;

logger.info("Time to search 1Mth Record using binary search is {} nano seconds", totalTime);

}

}

The output of the above code is :

[main] INFO BinarySearchTest - Time to sort 1 Million Records is 19167309 nano seconds

[main] INFO BinarySearchTest - Time to search 1Mth Record using contains() is 4889023 nano seconds

[main] INFO BinarySearchTest - Time to search 1Mth Record using binary search is 38720 nano seconds

So you can clearly see the BinarySearch is more than 100 times faster than linear search in this case. Actually it depends upon the size of Array you are searching on.

Unfortunately this is a misleading article. You can not compare contains and binarySearch with just few records. Test both of them in 100k or million records then only you will see the actual difference.

Smit Jain is correct, there is no record in the list actually because list size is zero, therefore it never populates the list.

But here is catch, record time only for Collections.binarySearch(numbers, new Integer(1000000)); then only you will see the actual difference. The sorting is taking time and therefore you see different result from binary search.

Binary search is good only when you already have sorted list else it will take almost same amount of time what linear search is taking.

## Post a Comment