Tuesday, April 18, 2023

5 Examples of GROUP BY Clause in SQL with Aggregate Functions

There is no doubt that SQL is an essential skill and every programmer, developer, DevOps, and Business analyst should know SQL. If you want to learn SQL from scratch then you have come to the right place. Earlier, I have shared many SQL interview questions and the best SQL courses for beginners, and today, I am going to share some GROPU By examples in SQL to write aggregation queries. THE GROUP BY clause in SQL is another important command to master for any programmer. We often use the GROUP BY command along with a select clause for reporting purposes, since the GROUP BY clause is mainly used to group related data together it's one of the most important SQL commands for reporting purposes. 

Many queries that require the use of aggregate function like sum(), avg(), or count() requires the grouping of data using the GROUP BY clause. SQL queries which involve GROUP BY and HAVING clauses are also a bit confusing for many programmers who don't have hands-on experience in SQL and are often used as SQL interview questions to filter.

In this article, we will see some examples of the GROUP BY clause in SQL which help you to understand where to use group by clause and how to use GROUP BY along with the SELECT clause. You will also learn some SQL rules related to the GROUP BY clause which is available in some databases particularly in MySQL as a group by extensions.



5 SQL GROUP BY and HAVING Examples for Beginners

Now it's time to see the GROUP BY clause in action. The following are some examples of how you can use GROUP BY to aggregate data and then apply filtering on aggregated or grouped data by using the HAVING clause.

1. Group By clause Example 1 - Finding duplicate

One of the popular use of the GROUP BY clause is finding duplicate records in the database. Following SQL query will list employees which are duplicate in terms of salary

mysql> select emp_name, count(emp_name) 
       from employee group by emp_name having count(emp_name)>1;
+----------+-----------------+
| emp_name | count(emp_name) |
+----------+-----------------+
| James    |               2 |
+----------+-----------------+
1 row in set (0.00 sec)

This was a rather simple example of finding duplicate records in the database. If you need to decide whether an employee is duplicate or not based upon more than one field then it's important to include all those in the group by clause, otherwise, you will get an incorrect count. You can further see my article about how to find duplicate records in the database for more details.


2. Group By clause Example 2 - Calculating Sum

Another popular example of the group by clause is using an aggregate function like sum() and avg().  If you know, the GROUP BY clause in SQL allows you to perform queries like finding how much each department is paying to employees i.e. total salaries per department. 

In order to write an SQL query to find the total salary per department, we need to group by the department and use sum(salary) in the select list as shown in the following SQL query :

mysql> select dept_id, sum(salary) as total_salary from employee group by dept_id;
+---------+--------------+
| dept_id | total_salary |
+---------+--------------+
|       1 |         3200 |
|       2 |         2850 |
|       3 |         2200 |
|       4 |         2250 |
+---------+--------------+
4 rows in set (0.00 sec)

Then you can further filter records by using having clauses to perform queries like finding all departments whose total salary expenditure is more than 3000. Here is an SQL query for that :

mysql> select dept_id, sum(salary) as total_salary
       from employee group by dept_id having sum(salary) > 3000;
+---------+--------------+
| dept_id | total_salary |
+---------+--------------+
|       1 |         3200 |
+---------+--------------+
1 row in set (0.01 sec)

This is your most expensive department in terms of salary. It's stood to know the useful aggregate function like count and sum, if you are not familiar with them then I suggest you join an Introductory Guide to SQL course on Educative, an interactive learning platform that allows you to practice SQL in the browser. 

COUNT, SUM, AVG, MIN, MAX Aggregate Function Examples




3. How to calculate average using group by clause

Similar to the previous group by clause example, instead of sum() we can also use avg() to perform queries like finding out average salary of employees per department. Once again we need to group by the department and this time instead of sum() aggregate function we need to use the avg() function as shown in the below query :

mysql> select dept_id, avg(salary) as average_salary from employee group by dept_id;
+---------+----------------+
| dept_id | average_salary |
+---------+----------------+
|       1 |      1066.6667 |
|       2 |      1425.0000 |
|       3 |      1100.0000 |
|       4 |       750.0000 |
+---------+----------------+
4 rows in set (0.00 sec)

Similarly, you can use having clause to further filter down this result set like finding a department whose average salary is below 1000. Here is an SQL query for that :

mysql> select dept_id, avg(salary) as average_salary 
       from employee group by dept_id having avg(salary) < 1000;
+---------+----------------+
| dept_id | average_salary |
+---------+----------------+
|       4 |       750.0000 |
+---------+----------------+
1 row in set (0.00 sec)

This is your most poorly paid department. There is no point going into that department and exploring further. 



4. Group By example 4 - Counting records

Similar to aggregate function sum() and avg(), another kind of aggregate query is very popular like counting records. One example of this kind of group by the query is how to find the number of employees per department?

In this case, we need to group by the department and need to use the count() aggregate function for counting employees as shown in the below SQL query :

mysql>  select dept_id, count(emp_id) as number_of_employees 
         from employee group by dept_id;
+---------+---------------------+
| dept_id | number_of_employees |
+---------+---------------------+
|       1 |                   3 |
|       2 |                   2 |
|       3 |                   2 |
|       4 |                   3 |
+---------+---------------------+
4 rows in set (0.00 sec)

While using the count() function in SQL, it's worth noting the difference between count() and count(field) which could give different counts based upon which column on which you are counting and whether that column contains NULL or not. 

Just remember count() also counts NULL values in a column. NULLs are very special in SQL and you should have a good knowledge of how to use Null and how to compare them.




5. How to use Group By clause with more than one column

In many practical cases, we use a group by clause with more than one column. While using two or three-column in the group by clause order is very important. The column which comes first on the group by clause will be grouped first and then the second column will be used to do grouping on that group. For example in the following SQL query result set is first grouped by dept_id and then each group is again grouped by emp_id.

mysql> select dept_id, emp_id from employee group by dept_id, emp_id;
+---------+--------+
| dept_id | emp_id |
+---------+--------+
|       1 |    101 |
|       1 |    102 |
|       1 |    110 |
|       2 |    103 |
|       2 |    104 |
|       3 |    105 |
|       3 |    108 |
|       4 |    106 |
|       4 |    107 |
|       4 |    109 |
+---------+--------+
10 rows in set (0.00 sec)

Another example of using multiple columns in group by clause is finding duplicate records in the table, where you must use all columns which are required to be the same for a record to be called duplicates. If you want to practice more of such advanced SQL concepts then I highly recommend the Complete SQL + Databases Bootcamp: Zero to Mastery course by Andrei Negaoie on ZTM Academy.  It's one of the most comprehensive and up-to-date courses to learn SQL. 

best course to learn SQL for beginners




Important points about Group By clause in SQL

Now it's time to revise and revisit some of the important things about the GROUP BY clause in SQL statements.

1. You can not use a non-aggregated column name in the select list if it is not used in Group By clause. For example following SQL query is illegal and will not run because it has used a non-aggregate column emp_id in the select list which is not named in Group By clause, in this SQL query only dept_id is used in group by clause.

mysql> select emp_id, dept_id, max(salary) 
       from employee 
       group by dept_id;

ERROR 1055 (42000): 'test.employee.emp_id' isn't in GROUP BY

but MySQL database permits it by its group by extension functionality which is disabled in the above scenario by enabling ONLY_FULL_GROUP_BY SQL mode.

2. In standard SQL you can not use a non-aggregated column name in the Having clause which is not used in group by clause, MySQL database also allows it similar to the previous group by extension. Following SQL query is invalid because we are using max_salary in having a clause that is not used in group by clause.

mysql> select  dept_id, max(salary) as max_salary 
       from employee 
       group by dept_id having max_salary > 1000;

ERROR 1463 (42000): Non-grouping field 'max_salary' is used in HAVING clause

3. Another rule of using group by clause in standard SQL is that you can not use an alias in the HAVING clause, the previous SQL query is also an example of this group by rule. This is also allowed in the MySQL database. 

You can disable all MySQL group by extension features by using ONLY_FULL_GROUP_BY SQL mode in MySQL. In order to change SQL mode from the MySQL command line you can use the following MySQL commands :

mysql> SET GLOBAL sql_mode='STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,
           NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT @@GLOBAL.sql_mode;
+-----------------------------------------------------------------------------------+
| @@GLOBAL.sql_mode                                                                 |
+-----------------------------------------------------------------------------------+
| ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION |
+-----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

You can also set MySQL SQL mode for a single client session by using the word SESSION instead of GLOBAL as shown below :

mysql> SET SESSION sql_mode='STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,
             NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT @@GLOBAL.sql_mode;
+-----------------------------------------------------------------------------------+
| @@GLOBAL.sql_mode                                                                 |
+-----------------------------------------------------------------------------------+
| ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION |
+-----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Also just changing MySQL SQL mode to Global will not take effect in the current session until you restart a new session. In order to enable ONLY_FULL_GROUP_BY in the current MySQL session using the above query.

Here is the summary of all the GROUP BY and HAVING examples we have seen in this article:

SQL Group BY and HAVING Examples for Beginners


That's all on the GROUP BY clause example in SQL queries. We have seen where to use the GROUP BY command and How to use the GROUP BY clause to group data in SQL. As I said GROUP BY and HAVING clause is one of the must-know for any programmers as it's quite common in the SQL world and particularly important for reporting purposes.


Other SQL and Database Articles you may like
  • Top 5 Websites to learn SQL for FREE (websites)
  • My favorite courses to learn SQL and Database (courses)
  • Top 5 Courses to learn PostgreSQL for Beginners (courses)
  • 10 SQL queries from Interviews (queries)
  • Top 5 Courses to learn Microsoft SQL Server (Courses)
  • 5 Free Courses to learn Database and SQL (courses)
  • 5 Books to Learn SQL Better (books)
  • How to join more than two tables in a single query (article)
  • 10 Free Courses to learn Oracle and SQL Server (courses)
  • Top 5 Courses to learn MySQL for Beginners (courses)
  • Difference between WHERE and HAVING clause (answer)
  • Top 5 SQL books for Advanced Programmers (books)
  • Difference between SQL, T-SQL, and PL/SQL? (answer)
  • 5 Free Courses to Learn MySQL database (courses)
  • Top 5 Online Courses to Learn SQL and Database (courses)

Thanks for reading this article, if you find this SQL GROUP By examples useful then please share them with your friends and colleagues. If you have any questions or feedback then please drop a note. 

P. S. - If you are new to the SQL world and looking for a free SQL and database course to start learning basics then I suggest you go through Introduction to Databases and SQL Querying course on Udemy. It's completely free, all you have is to create is a Udemy account and you can access the whole course.

No comments :

Post a Comment