Saturday, July 7, 2012

SubQuery Example in SQL – Correlated vs Noncorrelated

SubQuery in SQL is a query inside another query. Some time to get a particular information from database you may need to fire two separate sql queries, subQuery is a way to combine or join them in single query. SQL query which is on inner part of main query is called inner query while outer part of main query is called outer query. for example in below sql query

SELECT name FROM City WHERE pincode IN (SELECT pincode FROM pin WHERE zone='west')

section not highlighted is OUTER query while section highlighted with grey is INNER query. In this SQL tutorial we will see both Correlated and non correlated sub-query and there examples, some differences between correlated and noncorrelated subqueries and finally subquery vs join which is classic debatable topic in SQL. By the way this SQL tutorial is next in series of SQL and database articles in Javarevisited like truncate vs delete and 10 examples of  SELECT queries. If you are new here then you may find those examples interesting.

SubQuery Rules in SQL
Like any other concept in SQL, subquery also has some rules and you can only embed one query inside another by following rules :
1. subquery can be used in insert statement.
2. subquery can be used in select statement as column.
3. subquery should always return either a scaler value if used with where clause or value from a column if used with IN or NOT IN clause.

Before going to understand non-correlated  and correlated subquery, let’s see the table and data which we are going to use in this example. Until you have an understanding of how table look like and what kind of data it stores its little difficult to understand queries. In this subquery example we will use two table Stock and Market. Stock holds different stocks and Market holds all stock exchanges in the world.

mysql> select * from stock;
+---------+-------------------------+--------------------+
| RIC     | COMPANY                 | LISTED_ON_EXCHANGE |
+---------+-------------------------+--------------------+
| 6758.T  | Sony                    | T                  |
| GOOG.O  | Google Inc              | O                  |
| GS.N    | Goldman Sachs Group Inc | N                  |
| INDIGO  | INDIGO Airlines         | NULL               |
| INFY.BO | InfoSys                 | BO                 |
| VOD.L   | Vodafone Group PLC      | L                  |
+---------+-------------------------+--------------------+
6 rows in set (0.00 sec)

mysql> select  from Market;
+------+-------------------------+---------------+
| RIC  | NAME                    | COUNTRY       |
+------+-------------------------+---------------+
| T    | Tokyo Stock Exchange    | Japan         |
| O    | NASDAQ                  | United States |
| N    | New York Stock Exchange | United States |
| BO   | Bombay Stock Exchange   | India         |
+------+-------------------------+---------------+
4 rows in set (0.00 sec)


Noncorrelated subquery in SQL

There are two kind of subquery in SQL one is called non-correlated and other is called correlated subquery. In non correlated subquery, inner query doesn't depend on outer query and can run as stand alone query.Subquery used along-with IN or NOT IN sql clause is good examples of Noncorrelated subquery in SQL. Let's a noncorrelated subquery example to understand it better.

NonCorrelated Subquery Example:
Difference between correlated and noncorrelated suqueryLet’s see the query  “Find all stocks from Japan”, If we analyze this query we know that stock names are stored in Stock table while Country name is stored in Market table, so we need to fire two query first to get RIC for Japanese market and than all stocks which is listed on that Market. we can combine these two queries into one sql query by using subquery as shown in below example:

mysql> SELECT COMPANY FROM Stock WHERE LISTED_ON_EXCHANGE = (SELECT RIC FROM Market WHERE COUNTRY='Japan');
+---------+
| COMPANY |
+---------+
| Sony    |
+---------+
1 row IN SET (0.02 sec)

Here part which is inside bracket is called inner query or subquery. As you see in this example of subquery, inner query can run alone and its not depended on outer query and that's why its called NonCorrelated query.

NonCorrelated Subquery Example with IN Clause SQL
NonCorrelated subquery are used along-with IN and NOT IN clause. here is an example of subquery with IN clause in SQL.
SQL query: Find all stocks from United States and India

mysql> SELECT COMPANY FROM Stock WHERE LISTED_ON_EXCHANGE IN (SELECT RIC FROM Market WHERE COUNTRY='United States' OR COUNTRY= 'INDIA');
+-------------------------+
| COMPANY                 |
+-------------------------+
| Google Inc              |
| Goldman Sachs GROUP Inc |
| InfoSys                 |
+-------------------------+

When Subquery is used along-with IN or NOT IN Clause it returns result from one column instead of Scaler value.

Correlated SubQuery in SQL

Correlated subqueries are the one in which inner query or subquery reference outer query. Outer query needs to be executed before inner query. One of the most common example of correlated subquery is using keywords exits and not exits. An important point to note is that correlated subqueries are slower queries and one should avoid it as much as possible.

Example of Correlated Subquery in SQL
Here is an example of Correlated subquery “Return all markets which has at least one stock listed on it.”

mysql> SELECT m.NAME FROM Market m WHERE m.RIC = (SELECT s.LISTED_ON_EXCHANGE FROM Stock s WHERE s.LISTED_ON_EXCHANGE=m.RIC);

+-------------------------+
| NAME                    |
+-------------------------+
| Tokyo Stock Exchange    |
| NASDAQ                  |
| New York Stock Exchange |
| Bombay Stock Exchange   |
+-------------------------+
4 rows IN SET (0.00 sec)

Here inner query will execute for every Market as RIC will be changed for every market.

Difference between Correlated and NonCorrelated Subquery

Now we have seen correlated and noncorrelated subqueries and there example its much easier to understand difference between correlated vs noncorrelated queries. By the way this is also one of the popular sql interview question and its good to know few differences:

1.In case of correlated subquery inner query depends on outer query while in case of noncorrelated query inner query or subquery doesn't depends on outer query and run by its own.
2.In case of correlated subquery, outer query executed before inner query or subquery while in case of NonCorrelated subquery inner query executes before outer query.
3.Correlated Sub-queries are slower than non correlated subquery and should be avoided in favor of sql joins.
4.Common example of correlated subquery is using exits and not exists keyword while non correlated query mostly use IN or NOT IN keywords.

SubQuery vs Join in SQL

Any information which you retrieve from database using subquery can be retrieved by using different types os joins also. Since SQL is flexible and it provides different way of doing same thing. Some people find SQL Joins confusing and subquery specially noncorrelated more intuitive but in terms of performance SQL Joins are more efficient than subqueries.

Important points about SubQuery in DBMS
1.Almost whatever you want to do with subquery can also be done using join, it just matter of choice
subquery seems more intuitive to many user.
2.Subquery normally return an scaler value as result or result from one column if used along with
IN Clause.
3.You can use subqueries in four places: subquery as a column in select clause,
4.In case of correlated subquery outer query gets processed before inner query.


Other SQL and Database related articles from Javarevisited :

2 comments :

Naresh Jagatap said...

hi paul, here in the "Example of Correlated Subquery in SQL" , you have not used exists or not exists keyword. is it a correlated query?

Anonymous said...

Sorry but it's not correct. EXISTS clauses on statements are faster than the noncorrelates that you are refering to.

Post a Comment