Difference between row_number(), rank() and dense_rank() in SQL Server, Oracle.

Though all three are ranking functions in SQL, also known as window function in Microsoft SQL Server, the difference between rank(), dense_rank(), and row_number() comes when you have ties on ranking i.e. duplicate records. For example, if you are ranking employees by their salaries then what would be the rank of two employees of same salaries? It depends on upon which ranking function you are using e.g. row_number, rank, or dense_rank. The row_number() function always generates a unique ranking even with duplicate records i.e. if the ORDER BY clause cannot distinguish between two rows, it will still give them different rankings, though which record will come earlier or later is decided randomly e.g. in our example two employees Shane and Rick have the same salary and has row number 4 and 5, this is random, if you run again, Shane might come 5th.

The rank() and dense_rank() will give the same ranking to rows that cannot be distinguished by the order by clause, but dense_rank will always generate a contiguous sequence of ranks e.g. (1,2,3,...), whereas rank() will leave gaps after two or more rows with the same rank (think "Olympic Games": if two athletes win the gold medal, there is no second place, only third).  You can also see T-SQL fundamentals or Querying Microsoft SQL Server to learn more about how rank and dense_rank break ties.

Surprisingly all these functions behave similarly in Microsoft SQL Server and Oracle, at least at the high level, so if you have used them in MSSQL, you can also use it on Oracle 11g or other versions.




SQL to build schema
Here is the SQL to create table and insert some data into it for demonstration purpose:

IF OBJECT_ID( 'tempdb..#Employee' ) IS NOT NULL
DROP TABLE #Employee;

CREATE TABLE #Employee (name varchar(10), salary int);

INSERT INTO #Employee VALUES ('Rick', 3000);
INSERT INTO #Employee VALUES ('John', 4000);
INSERT INTO #Employee VALUES ('Shane', 3000);
INSERT INTO #Employee VALUES ('Peter', 5000);
INSERT INTO #Employee VALUES ('Jackob', 7000);
INSERT INTO #Employee VALUES ('Sid', 1000);

You can see that we have included two employees with same salaries i.e. Shane and Rick, just to demonstrate the difference between row_number, rank, and dense_rank window function in SQL server, which is obvious when there are ties in ranking.


ROW_NUMBER() Example
It always generates a unique value for each row, even if they are same and ORDER BY clause cannot distinguish between them. That's why it is used to solve problems like second highest salary or nth highest salary, we have seen earlier.

In the following example, we have two employees with the same salary and even though we have generated row numbers over salary column it produces different row number for those two employees with the same salary.

select e.*, row_number() over (order by salary desc) row_number from #Employee e
result:
name    salary  row_number
Jackob  7000    1
Peter   5000    2
John    4000    3
Shane   3000    4
Rick    3000    5
Sid     1000    6

You can see in this example that we have ranked employees based upon their salaries and each of them have unique rank even if their salaries are same e.g. Shane and Rick have same salary 3000 but they got the unique rank 4th and 5th. It's worth knowing that in the case of a tie, ranks are assigned on a random basis, see Querying Microsoft SQL Server to learn more about when to use the row_number() function in SQL Server.

Difference between row_number(), rank() and dense_rank() in SQL Server



RANK() Example
The rank() function will assign the same rank to same values i.e. which are not distinguishable by ORDER BY. Also, the next different rank will not start from immediately next number but there will be gap i.e. if 4th and 5th employee have the same salary then they will have same rank 4, and 6th employee which has different salary will have new rank 6.

Here is the example to clarify the point:

select e.*, rank() over (order by salary desc) rank from #Employee e
result:
name    salary  rank
Jackob  7000    1
Peter   5000    2
John    4000    3
Shane   3000    4
Rick    3000    4
Sid     1000    6

You can see that both Shane and Rick has got the same rank 4th, but the Sid got the rank 6th, instead of 5 because it keep original ordering.



DENSE_RANK() Example
The dense_rank function is similar to rank() window function i.e. same values will be assigned the same rank, but the next different value will have rank which is just one more than the previous rank, i.e. if 4th and 5th employee has the same salary then they will have same rank but 6th employee, which has different salary will have rank 5, unlike rank 6 as is the case with rank() function. There will be no gap on ranking in case of dense_rank() as shown in the following example:

select e.*, dense_rank() over (order by salary desc) dense_rank from #Employee e
name    salary  dense_rank
Jackob  7000    1
Peter   5000    2
John    4000    3
Shane   3000    4
Rick    3000    4
Sid     1000    5

You can see that both Shane and Rick has same ranking 4th, but Sid now has 5th rank which is differnt than 6th in earlier example when we used the rank() function. Btw, if you are serious to master SQL, I stronly sugest to read Joe Celko's SQL for Smarties, one of the more advanced book in SQL.

rank vs dense_rank vs row_number in SQL Server and Oracle



Difference between row_number vs rank vs dense_rank

As I told, the difference between rank, row_number, and dense_rank is visible when there are duplicate records. Since in all our example we are ranking records on salary, if two records will have the same salary then you will notice the difference between these three ranking functions.

The row_number gives continuous numbers, while rank and dense_rank give the same rank for duplicates, but the next number in rank is as per continuous order so you will see a jump but in dense_rank doesn't have any gap in rankings.

-- difference between row_number(), rank(), and dense_rank()
-- will only visible when there were duplicates.
-- row_number gives consecutive ranking even with duplicate
-- rank and dense_rank give the same ranking but rank has a jump
-- while dense_rank doesn't have jump

select e.*,
row_number() over (order by salary desc) row_number, 
rank() over (order by salary desc) rank,
dense_rank() over (order by salary desc) as dense_rank 
from #Employee e

and here is the output which clearly shows the difference in the ranking generated by rank() and dense_rank() function:

Difference between rank and dense_rank and row_number in Java

You can see the employees Shane and Rick have same salary 3000 hence their ranking is same when you use the rank() and dense_rank() but the next ranking is 6 which is as per continuous ranking using rank() and 5 when you use dense_rank(). The row_number() doesn't break ties and always give a unique number to each record.

Btw, I ran all three SQL queries on Oracle 11g R2 and it gave me the same result. So, it seems both Oracle and SQL Server support these function and they behave identically.


That's all about the difference between ROW_NUMBER(), RANK() and DENSE_RANK() function in SQL SERVER. As I told, the difference boils down to the fact when ties happen. In the case of the tie, ROW_NUMBER() will give unique row numbers, the rank will give the same rank, but the next different rank will not be in sequence, there will be a gap. In the case of dense_rank, both rows in the tie will have the same rank and there will be no gap. The next different rank will be in sequence.

Other SQL tutorials you may find interesting
  • 21 Frequently asked SQL Queries from Interviews (read here)
  • The difference between WHERE and HAVING clause in SQL? (answer)
  • What is the difference between UNION and UNION ALL in SQL? (answer)
  • What is the difference between LEFT and RIGHT OUTER JOIN in SQL? (answer)
  • What is the difference between View and Materialized View in Oracle? (answer)
  • What is the difference between self and equijoin in SQL? (answer)
  • The difference between TRUNCATE and DELETE in SQL? (answer)
  • What is the difference between Primary and Foreign key in a table? (answer)
  • How to join three tables in one SQL query? (solution)
  • 5 websites to learn SQL online for Free (list)
  • How to find all customers who have never ordered? (solution)
  • How to remove duplicate rows from a table in SQL? (solution)
  • What is the difference between close and deallocate a cursor? (answer)

No comments :

Post a Comment