There are two ways to search an element in a List class, by using contains() method or by using Collections.binarySearch() method. There are two versions of binarySearch() method, one which takes a List and Comparator and other which takes a List and Comparable. This method searches the specified list for the specified object using the binary search algorithm. The list must be sorted into ascending order according to the natural ordering of its elements (as by the sort(List) method) prior to making this call. If List is not sorted, then results are undefined. If the List contains multiple elements equal to the specified object, there is no guarantee which one will be returned. This method runs in log(n) time for a "random access" list (which provides near-constant-time positional access).
If the specified list does not implement the RandomAccess interface and is large, this method will do an iterator-based binary search that performs O(n) link traversals and O(log n) element comparisons. In the end this method returns the index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1).
The insertion point is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key, or list.size() if all elements in the list are less than the specified key.
This means that return value will be >= 0 if and only if the key is found. Since common implementation of List interface e.g. ArrayList, Vector, CopyOnWriteArrayList and Stack implements RandomAccess interface, they can be used for performing binary search, but there are other implementations like LinkedList, which doesn't implement java.util.RandomAccess, hence not suitable for binary search operation.
Since binary search can only be performed in sorted list, you also need to sort your collection before doing search, which may potentially impact performance, especially if your List is big and not in sorted order already.
Here is our program to find objects using binary search in Java List. We have a list of Integer with 1M records and uses both contains() and binarySearch() method to search an element.
From the output, you can see that contains() method is almost 10 times faster than the binary search, which means it makes sense to use contains() for searching objects in List, especially for those which implement the RandomAccess interface like ArrayList.
If the specified list does not implement the RandomAccess interface and is large, this method will do an iterator-based binary search that performs O(n) link traversals and O(log n) element comparisons. In the end this method returns the index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1).
The insertion point is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key, or list.size() if all elements in the list are less than the specified key.
This means that return value will be >= 0 if and only if the key is found. Since common implementation of List interface e.g. ArrayList, Vector, CopyOnWriteArrayList and Stack implements RandomAccess interface, they can be used for performing binary search, but there are other implementations like LinkedList, which doesn't implement java.util.RandomAccess, hence not suitable for binary search operation.
Since binary search can only be performed in sorted list, you also need to sort your collection before doing search, which may potentially impact performance, especially if your List is big and not in sorted order already.
Java Code for contains() Vs binarySearch()
import java.util.ArrayList; import java.util.Collections; import java.util.List; import org.slf4j.Logger; import org.slf4j.LoggerFactory; /** * Java program to perform binary search in Java collection e.g List, Set * @author Javin Paul */ public class BinarySearchTest { public static final Logger logger = LoggerFactory.getLogger(BinarySearchTest.class); public static void main(String args[]) { //creating List List<Integer> numbers = new ArrayList<Integer>(1000000); //List of 1M records //initializing List for(int i =0; i<numbers.size(); i++){ numbers.add(new Integer(i)); } //performing contains search long startTime = System.nanoTime(); boolean isExist = numbers.contains(new Integer(1000000)); long totalTime = System.nanoTime() - startTime; logger.info("Time to search 1Mth Record using contains() is {} nano seconds", totalTime); //performing binary search startTime = System.nanoTime(); Collections.sort(numbers); // List needs to be sorted for Binary Search Integer number = Collections.binarySearch(numbers, new Integer(1000000)); totalTime = System.nanoTime() - startTime; logger.info("Time to search 1Mth Record using binary search is {} nano seconds", totalTime); } } Ouput: 2013-06-04 23:23:17,834 0 [main] INFO test.BinarySearchTest - Time to search 1Mth Record using contains() is 51404 nano seconds 2013-06-04 23:23:17,849 15 [main] INFO test.BinarySearchTest - Time to search 1Mth Record using binary search is 554261 nano seconds
From the output, you can see that contains() method is almost 10 times faster than the binary search, which means it makes sense to use contains() for searching objects in List, especially for those which implement the RandomAccess interface like ArrayList.
I would do the sorting prior start time to be more justified.
ReplyDeleteBut yes "contains ' is faster then "binarySearch" for ArrayList,
Below my output :
Time to search 1Mth Record using contains() is {} nano seconds 84585
Time to search 1Mth Record using binary search is {} nano seconds 1283703
for searching in a list., we can indexOf() also, This gives better timing than contains()..
ReplyDeletecorrect me if I am wrong...
@Anonymous: if you check the source code of ArrayList, contains method internally call indexOf method only. So time for both of them will same. I think. (other than a extra method call.)
ReplyDeleteThis is a good tutorial.
Additional info: time complexity of contains and binarySerach depends on the given list. If it is already ordered list, i will go for BinarySearch(which will take O(logn) time). If the input list is not ordered, then do contains, which will do a liner search(take time O(n)).
Well, I think that this is not fare relatively to binarySearch() as you measure it's time in conjunction with sorting invocation which takes additional O(NlogN) time. It would be more convenient if you test these methods on a sorted list or take a timestamp not before sorting a list but just before invoking of binarySearch(). You will get antipodal results.
ReplyDeleteI don't agree "contains() method is almost 10 times faster than binary search" infact the binary search is always many times faster than contains()
ReplyDeleteReason : contains uses traditional linear search algorithm so the complexity is O(n) whereas the Binary search has complexity of O(logn).
Your code is giving incorrect result because the list (numbers) you are filling is not actually filled.
for(int i =0; i numbers = new ArrayList(capacity);
// List of 1M records //initializing List
for (int i = 0; i < capacity; i++) {
numbers.add(new Integer(i));
}
long startTime = System.nanoTime();
Collections.sort(numbers);
long totalTime = System.nanoTime() - startTime;
logger.info("Time to sort 1 Million Records is {} nano seconds", totalTime);
// performing contains search
startTime = System.nanoTime();
boolean isExist = numbers.contains(new Integer(1000000));
totalTime = System.nanoTime() - startTime;
logger.info("Time to search 1Mth Record using contains() is {} nano seconds", totalTime);
// performing binary search
startTime = System.nanoTime();
// List needs to be sorted for Binary Search
Integer number = Collections.binarySearch(numbers, new Integer(1000000));
totalTime = System.nanoTime() - startTime;
logger.info("Time to search 1Mth Record using binary search is {} nano seconds", totalTime);
}
}
The output of the above code is :
[main] INFO BinarySearchTest - Time to sort 1 Million Records is 19167309 nano seconds
[main] INFO BinarySearchTest - Time to search 1Mth Record using contains() is 4889023 nano seconds
[main] INFO BinarySearchTest - Time to search 1Mth Record using binary search is 38720 nano seconds
So you can clearly see the BinarySearch is more than 100 times faster than linear search in this case. Actually it depends upon the size of Array you are searching on.
Unfortunately this is a misleading article. You can not compare contains and binarySearch with just few records. Test both of them in 100k or million records then only you will see the actual difference.
ReplyDeleteSmit Jain is correct, there is no record in the list actually because list size is zero, therefore it never populates the list.
ReplyDeleteBut here is catch, record time only for Collections.binarySearch(numbers, new Integer(1000000)); then only you will see the actual difference. The sorting is taking time and therefore you see different result from binary search.
Binary search is good only when you already have sorted list else it will take almost same amount of time what linear search is taking.
That is not correct at all, when the list is sorted binary search is much faster than contain
ReplyDeletewhen you need to sort it first, then contain method is faster.
Just check below //initializing List
ReplyDeleteThis idiot is not even adding any value to the list.
And moreover binary search is preferred where you got to search many times for something in same list, and that's why you first sort it and search many times and then overall you will find difference here.