Wednesday, May 10, 2023

Difference between PriorityQueue and TreeSet in Java? Example

The PriorityQueue and TreeSet collection classes have a lot of similarities e.g. both provide O(log(N)) time complexity for adding, removing, and searching elements, both are non-synchronized and you can get elements from both PriorityQueue and TreeSet in sorted order, but there is a fundamental difference between them, TreeSet is a Set and doesn't allow a duplicate element, while PriorityQueue is a queue and doesn't have such restriction. It can contain multiple elements with equal values and in that case head of the queue will be arbitrarily chosen from them.

Another key difference between TreeSet and PriorityQueue is iteration order, though you can access elements from the head in sorted order like head always give you lowest or highest priority element depending upon your Comparable or Comparator implementation iterator returned by PriorityQueue doesn't provide any ordering guarantee.

Only guarantee PriorityQueue gives that head will always be the smallest or largest element. On the other hand, TreeSet keeps all elements in sorted order, and the iterator returned by TreeSet will allow you to access all elements in that sorted order.

This is one of the frequently asked Collection interview questions and what makes it interesting is the subtle difference between a lot of similarities between PriorityQueue and TreeSet.  You can use one in place of another in some scenarios but not in all scenarios and that's what the interviewer is looking for when he asked this question to you on Interview.




What are similarities between PriorityQueue and TreeSet

Before looking at the difference between Priority Queue and TreeSet, let's first understand the similarities between them. This will help you to think of scenarios where you can use a PriorityQueue in place of TreeSet or vice-versa.

1. ThreadSafety

The first similarity between PriorityQueue and TreeSet is that both are not thread-safe, which means you cannot share them between multiple threads. If multiple threads are required to modify the TreeSet at the same time, then you must synchronize their access externally.

2. Ordering

The third similarity between PriorityQueue and TreeSet is that both can be used to access elements in a particular order. TreeSet is a SortedSet, hence it always keeps the elements in the order defined by their Comparator or natural order if there is no Comparator defined, while PriorityQueue will always make sure that head contains the lowest or highest value depending upon the order imposed by Comparator or Comparable.




3. Eligibility 

When I say eligibility, which means which objects can be stored in PrioritySet and TreeSet? Is there any restriction or all objects are allowed? Well, there is, you can only store objects which implement Comparable or Comparator in both PriorityQueue and TreeSet because the collection classes are responsible for keeping their commitment i.e. PriorityQueue must adjust after every insertion or deletion to keep the lowest or highest element in head position. Similarly, TreeSet must re-arrange elements so that they remain the sorted order specified by Comparator or natural order imposed by Comparable.


4. Performance

This is one point where you will see both similarities and differences between PriorityQueue and TreeSet in Java, like both, provide O(log(N)) complexity for adding, removing, and searching elements, but when you want to remove the highest or lowest priority element then PriorityQueue gives O(1) performance because it always keeps the element in the head, much like a heap data structure i.e. minimum heap where the root is always the minimum value. 

If you are not familiar with the heap data structure, I suggest you join these Data Structure Algorithm courses or read a good book on data structure like Algorithms 4th Edition by Robert Sedgewick.

TreeSet vs PriorityQueue in Java


Difference between PriorityQueue and TreeSet

Now, that you know and understand the similarities between TreeSet and PrioritySet, let's see how different they are? and why you cannot use PrioritySet in place of TreeSet in Java?


1. Underlying Data Structure

This first and foremost difference is the underlying data structure. PriorityQueue is a Queue and that's why it provides the functionality of the FIFO data structure, while TreeSet is a Set and doesn't provide the Queue API.


2. Duplicate Elements

The second difference between them can be derived from the first difference i.e. properties of their underlying data structure. Since TreeSet is a Set it doesn't allow duplicate elements but PriorityQueue may contain duplicates. In the case of ties, the head of the priority queue is chosen arbitrarily.



3. Performance

The third difference between TreeSet and PrirityQueue comes from their relative performance. The PriorityQueue provides the largest or smallest element in O(1) time, which is not possible by TreeSet. Since TreeSet is backed by a red-black tree, the search operation will take O(logN) time.

This is why if you are developing an application where priority matters e.g. a job scheduling algorithm where you always want to execute the job with the highest priority, you should use a PriorityQueue data structure.


4. Availability

The 5th and last difference between PriorityQueue and TreeSet class are that the former was added in JDK 1.5 while TreeSet was available from JDK 1.4 itself. This is not a very significant difference in the age of Java 8, but if you are working with legacy systems still running on Java 1.5, a point worth remembering.


5. Ordering

The fourth difference, which is more subtle than you think because in similarities I have told that both are responsible for keeping some order. The key point is that in TreeSet all elements remain in the sorted order, while in priority queue apart from the root, which is guaranteed to be smallest or largest depending upon Comparing logic, rest of the element may or may not follow any order.

This means if you store the same elements in the TreeSet and PriorityQueue and iterate over them, then their order will be different. TreeSet will print them in sorted order but PriorityQueue will not until you are always getting the element from the head.  You can also read a good core Java book like Java: The Complete Reference, Ninth Edition to learn more about PriorityQueue implementation in Java.

Difference between PriorityQueue and TreeSet in Java



Using PriorityQueue and TreeSet in Java Program

Now, let's some code to highlight the difference between PriorityQueue and TreeSet in Java. If you remember the most subtle difference I mention in the previous paragraph was that TreeSet keeps all elements in the sorted order, while PriorityQueue only keeps the root or head into order.

I mean the lowest element will always be at root in PriorityQueue. If you use poll() which consumes the head and removes it, you will always retrieve the elements in increasing order from PriorityQueue, as shown in the following Java program.

import java.util.Collections;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

/**
 * Java Program to demonstrate difference between PriorityQueue
 * and TreeSet. 
 * 
 * @author WINDOWS 8
 *
 */
public class App {  

  public static void main(String args[]) {

      Set setOfNumbers = new TreeSet<>();
      Queue queueOfNumbers = new PriorityQueue<>();
      
      // inserting elements into TreeSet and PriorityQueue
      setOfNumbers.add(202);
      setOfNumbers.add(102);
      setOfNumbers.add(503);
      setOfNumbers.add(33);
      
      queueOfNumbers.add(202);
      queueOfNumbers.add(102);
      queueOfNumbers.add(503);
      queueOfNumbers.add(33);
      
      // Iterating over TreeSet
      System.out.println("Iterating over TreeSet in Java");
      Iterator itr = setOfNumbers.iterator();
      while(itr.hasNext()){
        System.out.println(itr.next());
      }
      
      // Iterating over PriorityQueue
      System.out.println("Iterating over PriorityQueue in Java");
      itr = queueOfNumbers.iterator();
      while(itr.hasNext()){
        System.out.println(itr.next());
      }
      
      // Consuming numbers using from head in PriorityQueue
      System.out.println("polling a PriorityQueue in Java");
      while(!queueOfNumbers.isEmpty()){
        System.out.println(queueOfNumbers.poll());
      }
  }
     

}

Output:
Iterating over TreeSet in Java
33
102
202
503
Iterating over PriorityQueue in Java
33
102
503
202
polling a PriorityQueue in Java
33
102
202
503

You can see that the order is different from the Iterator returned by PriorityQueue and HeadSet (highlighted by red) but when you use the poll() method to consume all elements in PriorityQueue, you have got them in the same order they were present in TreeSet i.e. lowest to highest.

This proves the point that TreeSet keeps all elements in sorted order while PriorityQueue only cares about the root or head position.  This kind of detail is very important from the Java certification perspective as well, you will find a lot of questions based upon such subtle details in mock exams as well as on the real exam.


Summary

Finally, here is the summary of all the differences between TreeSet and PriorityQueue in Java on the nice tabular format for easy reference:

Difference between PriorityQueue vsTreeSet in Java


That's all about the difference between TreeSet and PrirorityQueue in Java. Though both provide some sort of sorting capability there is a huge difference between the behavior of these two collection classes. 

The first and most important difference is one is Set and the other is Queue, which means one doesn't allow duplicate while the other is FIFO data structure without any restriction on duplication.  If you want to learn more about advanced data structure and collection classes in Java, I suggest joining these Java Collections and Stream API courses which cover Java Collection Framework in depth. 



Other Java Collection Interview Questions you may like
  • Difference between ArrayList and HashSet in Java? (answer)
  • Difference between Hashtable and HashMap in Java? (answer)
  • Difference between HashMap and LinkedHashMap in Java? (answer)
  • Difference between ConcurrentHashMap and HashMap in Java? (answer)
  • Difference between ArrayList and Vector in Java? (answer)
  • Difference between ArrayList and LinkedList in Java? (answer)
  • Difference between TreeMap, HashMap, and LinkedHashMap in Java? (answer)
  • Difference between fail-fast and fail-safe Iterator in Java? (answer)
  • Difference between HashMap, Hashtable, and ConcurrentHashMap in Java? (answer)
  • Difference between an Array and ArrayList in Java? (answer)
  • Difference between synchronized ArrayList and CopyOnWriteArrayList in Java? (answer)

Thanks for reading this article so far, if you like this article then please share it with your friends and colleagues. If you have any questions or feedback then please drop a comment.

16 comments :

Robin Richtsfeld said...

The PriorityQueue should have complexity Of(log(n)), but the Java implementation is actually O(n).
This was very disturbing when I found out, so I had to rewrite it completely.

javin paul said...

Hello Robin, for which operation you saw O(n) time complexity? could you please explain little more? Thanks

Robin Richtsfeld said...

Hello Javon Paul, the problem is the 'indexOf' method using a plain old for loop and which is used subsequently by 'remove' and 'contains', you get the idea...

javin paul said...

Hello Robin,
Indeed, this is seriously weired and it hasn't been rectified yet, even on Java 8. I checked the PriorityQueue code for both Java 1.7 and Java 8 and I still see same code:

public boolean contains(Object o) {
return indexOf(o) != -1;
}

private int indexOf(Object o) {
if (o != null) {
for (int i = 0; i < size; i++)
if (o.equals(queue[i]))
return i;
}
return -1;
}

Did you raise this to Oracle Java Forum?

Robin Richtsfeld said...

Hello Javin Paul,
I even tried to submit an alternative implementation but didn't cope with how the JDK development was organized. But now that you mention it, Oracle Java Forum seems to be a good starting point. May I reference this blog post and comments from there?

javin paul said...

Hello Robin, sure, you can. You are helping the community, thank you.

Rey said...

I know that PriorityQueue is implemented by a heap behind the sense, and searching an element in a heap take O(N) time.
So, there are any way to improve the contains() or remove() function in Java?

Rey said...

Another problem is you said: "but when you want to remove the highest or lowest priority element then PriorityQueue gives O(1) performance because it always keep the element in head, much like a heap data structure".
Why it take O(1), heap take O(logn) to remove the root and then siftDown() the heap?
So it should take O(logn)??

javin paul said...

@Duc, if you remove it just take O(1) time but yes if you take re-arranging the heap them it takes O(logN) time.

Robin Richtsfeld said...

The whole point of a PriorityQueue is that its elements are ordered. This allows for O(logN) for most operations.
Removing the first element takes so long because it doesn't make use of this and instead moves one element at a time, hence it has O(n).

Rey said...

Hi Robin,
You said that 'remove' and 'contains' of PriorityQueue in Java is bad (It take O(n)). So, how can we make it faster?
I still not get your ideal.
Thanks,

Robin Richtsfeld said...

@Duc When removing from head, just increment a pointer and make the array cyclic -> O(1). When removing from middle, use the improved indexOf -> O(logN) and System.arraycopy and contains using improved indexOf -> O(logN).
Currently indexOf is O(N) but it could be O(logN) when implemented correctly.

Anonymous said...

The PriorityQueue in Java is implemented as Heap but the PriorityQueue class is derived from AbstractQueue. It's still a heap. A priority queue is an abstract data structure, while a heap is a non-abstract data structure satisfying all requirements of a priority queue.

Anonymous said...

You CAN'T REMOVE the highest/lowest in O(1).
You can actually only search for it in O(1) but removal will take O(logn) to complete

Here's the code for the PriorityQueue#pool method:

public E poll() {
if (size == 0)
return null;
int s = --size;
modCount++;
E result = (E) queue[0];
E x = (E) queue[s];
queue[s] = null;
if (s != 0)
siftDown(0, x);
return result;
}

The siftDown subroutine runs an internal while loop, which executes in O(logn)

http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/PriorityQueue.java

Anonymous said...

* `remove` uses `indexOf` and `siftDown`
* `indexOf` is O(n), could be O(log(n))
* `siftDown` is O(log(n))
* ergo -> `remove` is O(n), could be O(log(n))

Rajesh said...

The article is good. It needs one correction: it takes logN to remove the top element rather than (1) since it needs to re-adjust the elements also. Also, I think it is not mentioned that you cannot insert null into PriorityQueue

Post a Comment