Thursday, August 1, 2013

Difference between WHERE vs HAVING clause in SQL - GROUP BY Comparison with Example

What is difference between WHERE and  HAVING clause in SQL is one of the most popular question asked on SQL and database interviews, especially to beginners. Since programming jobs, required more than one skill, it’s quite common to see couple of SQL Interview questions in Java and .NET interviews. By the way unlike any other question, not many Java programmers or dot net developers, who is supposed to have knowledge of basic SQL, fail to answer this question. Though almost half of the programmer says that WHERE is used in any SELECT query, while HAVING clause is only used in SELECT queries, which contains aggregate function or group by clause, which is correct. Though both WHERE and HAVING clause is used to specify filtering condition in SQL, there is subtle difference between them. Real twist comes into interview, when they are asked to explain result of a SELECT query, which contains both WHERE and HAVING clause, I have seen many people getting confused there. Key point, which is also main difference between WHERE and HAVING clause in SQL is that, condition specified in WHERE clause is used while fetching data (rows) from table, and data which doesn't pass the condition will not be fetched into result set, on the other hand HAVING clause is later used to filter summarized data or grouped data. In short if both WHERE and HAVING clause is used in a SELECT query with aggregate function or GROUP BY clause, it will execute before HAVING clause. This will be more clear, when we will see an example of WHERE, HAVING, JOIN and GROUP BY clause together.


WHERE vs HAVING Clause Example in SQL

Difference betwen WHERE and HAVING clause in SQL - Interview QuestionIn this example of WHERE and HAVING clause, we have two tables Employee and Department. Employee contains details of employees e.g. id, name, age, salary and department id, while Department contains id and department name. In order to show, which employee works for which department we need to join two tables on DEPT_ID to get the the department name. Our requirement is to find how many employees are working in each department and average salary of department. In order to use WHERE clause, we will only include employees who are earning  more than 5000. Before executing our query which contains WHERE, HAVING, and GROUP BY clause, let see data from Employee and Department table:

SELECT * FROM Employee;
EMP_ID
EMP_NAME
EMP_AGE
EMP_SALARY
DEPT_ID
1
Virat
23
10000
1
2
Rohit
24
7000
2
3
Suresh
25
8000
3
4
Shikhar
27
6000
1
5
Vijay
28
5000
2


SELECT * FROM Department;
DEPT_ID
DEPT_NAME
1
Accounting
2
Marketing
3
Sales


SELECT d.DEPT_NAME, count(e.EMP_NAME) as NUM_EMPLOYEE, avg(e.EMP_SALARY) as AVG_SALARY FROM Employee e,
Department d WHERE e.DEPT_ID=d.DEPT_ID AND EMP_SALARY > 5000 GROUP BY d.DEPT_NAME;
DEPT_NAME
NUM_EMPLOYEE
AVG_SALARY
Accounting
1
8000
Marketing
1
7000
Sales
2
8000


From the number of employee (NUM_EMPLOYEE) column you can see that only Vijay who work for Marketing department is not included in result set because his earning 5000. This example shows that, condition in WHERE clause is used to filter rows before you aggregate them and then HAVING clause comes in picture for final filtering, which is clear from following query, now Marketing department is excluded because it doesn't pass condition in HAVING clause i..e AVG_SALARY > 7000

SELECT d.DEPT_NAME, count(e.EMP_NAME) as NUM_EMPLOYEE, avg(e.EMP_SALARY) as AVG_SALARY FROM Employee e,
Department d WHERE e.DEPT_ID=d.DEPT_ID AND EMP_SALARY > 5000 GROUP BY d.DEPT_NAME HAVING AVG_SALARY > 7000;
DEPT_NAME
NUM_EMPLOYEE
AVG_SALARY
Accounting
1
8000
Sales
2
8000

Difference between WHERE and HAVING in SQL

Apart from this key difference we have seen in this article, here are few more differences between WHERE and HAVING clause, which is worth remembering and can be used to compare both of them :

1) Apart from SELECT queries, you can use WHERE clause with UPDATE and DELETE clause but HAVING clause can only be used with SELECT query. For example following query, which involve WHERE clause will work but other which uses HAVING clause will not work :

update DEPARTMENT set DEPT_NAME="NewSales" WHERE DEPT_ID=1 ;  // works fine

update DEPARTMENT set DEPT_NAME="NewSales" HAVING DEPT_ID=1 ; // error
Incorrect syntax near the keyword 'HAVING'.: update DEPARTMENT set DEPT_NAME='NewSales' HAVING DEPT_ID=1

2) WHERE clause is used for filtering rows and it applies on each and every row, while HAVING clause is used to filter groups in SQL.

3) One syntax level difference between WHERE and HAVING clause is that, former is used before GROUP BY clause, while later is used after GROUP BY clause.

4) When WHERE and HAVING clause are used together in a SELECT query with aggregate function,  WHERE clause is applied first on individual rows and only rows which pass the condition is included for creating groups. Once group is created, HAVING clause is used to filter groups based upon condition specified.

That's all on difference between WHERE and HAVING clause in SQL. As I said this is very popular question and you can't afford not to prepare it. Always remember key difference between WHERE and HAVING clause in SQL, if WHERE and HAVING clause is used together, first WHERE clause is applied to filter rows and only after grouping HAVING clause is applied.

11 comments :

Michał Niwiński said...

nice explanation:)and summarization: "...this is very popular question and you can't afford not to prepare it."

David Karr said...

Your facts about the difference between "where" and "having" seem good, but the facts in your example are all messed up. The second table has the same data as the third table (and the second table has the highlighting that should have been in the third table), and the query it's associated with does not reflect the contents of the table, and the critical query that's supposed to demonstrate the "having" clause doesn't even have the "having" clause.

Javin @ Refrential integrity in database said...

Hello David, welcome to Javareivisted and Thanks for pointing out that mess, my bad. I have updated post now. Once again thanks for your keen observation and suggestion, much appreciated.

Javin

Lakshmi said...

that was nice explanation, but i think in the third table the Accounting dept has 2 emp where as sales has 1 employee which is displayed in the other way in your explanation

ARUN said...

Nice explanation!. Thank you verymuch.

" WHERE clause is used for filtering rows and it applies on each and every row, while HAVING clause is used to filter groups in SQL."

Randall Goya said...

shouldn't the last query be HAVING AVG_SALARY >= 7000; ?
otherwise, it should return only 2 results, excluding the one that equals 7000 but is not greater

Anonymous said...

It's extremely important to note that WHERE will use an index and HAVING will not.

both of the following have the same result set however "where" will use the id index and having will do a table scan

select * from table where id = 1
select * from table having id = 1

Javin Paul said...

@Anonymous, That's a very good point, thanks for bringing it.

Anonymous said...

Couple of points :
HAVING clause can also be used without GROUP BY Clause.
Because the HAVING clause is processed after the rows have been grouped, you can refer to aggregate functions in the logical expression. For example, in following query it will display only departments which has 10 employees.

SELECT dept_id, COUNT(dept_id) as number_of_employee FROM Employee GROUP BY dept_id HAVING COUNT(dept_id) = 10

Tomáš Gašpar said...

I don't get it.
Department Accounting with id = 1 has 2 employees but in your table from that select:

SELECT d.DEPT_NAME, count(e.EMP_NAME) as NUM_EMPLOYEE, avg(e.EMP_SALARY) as AVG_SALARY FROM Employee e,
Department d WHERE e.DEPT_ID=d.DEPT_ID AND EMP_SALARY > 5000 GROUP BY d.DEPT_NAME;

is value NUM_EMPLOYEE = 1. Shouldn't be it 2 and Department Sales should has 2 employees, not 1.

Javin Paul said...

@Tomas, You are absolutely right, it is also obvious from average salary which 8000, which is only possible in case of two employee. Did you try running that query in your own table? I think it's typo or error while copying.

Post a Comment