Saturday, May 25, 2024

12 Intermediate Database Administrator and Developer Interview Questions on Indexing

A good understanding of Index is very important for any programmer working with database and SQL. It doesn't matter whether you are working as DBA or Application developer, working in Oracle, MySQL, or Microsoft SQL Server or any other database, you must have good understanding of how index works in general. You should know different types of index and their pros and cons, how your database or Query engine chooses indexes and the impact of table scan, index scan, and index seek. You should also be able to build upon your knowledge to optimize your SQL queries as well as troubleshoot a production issue involving long running queries or poor performing queries. This is also a great skill for experience developers with 5 to 6 years of experience under his belt. 

Since many programmers just have shallow understating of index, they began to show gaps when they face index based question on SQL Job interviews

I have taken many interviews and I am surprise that many programmer doesn't even know what is index seek and index scan and which one is faster? They are not even able to tell whether order of columns matter in a composite index or not etc?

These are very common question and if you have work in SQL or database, you might have faced those situations already. In this article, I have compiled such question to bring them together and give them a nice overview. 

To keep the article short and simple, I haven't gone into too much detail but I have pointed resource to learn more if you are curious about it. 


12 SQL Indexing Interview Questions for Programmers and Database Administrators

Here is my collection of some of the most common, frequently asked questions on database index. These questions will help you to refresh your knowledge of how index work in general, barring the database differences. 

Btw, this list is by no means complete and if you have any good index based question then feel free to share with us. 

1. What is index in database?
An index is a object in database which help you to retrieve data faster. Similar to index pages of book, index are stored in separate place than data and they point to the data. Though its not mandatory to have an index in a table, you often need index for faster retrieval of data using SELECT queries

You can create index on a column or a set of columns in a given table. The SQL clause CREATE INDEX is used to create index in a table. 


2. What are different types of index in Oracle or SQL Server?
There are mainly two types of indices in any database, clustered and non-clustered, but if you want to divide on number of columns then you can also say that you have a index which is based upon just one column and then you have a composite index which is based upon a set of columns. 


8. What is difference between Clustered and Non-Clustered index in a table? (answer)
There are two types of indexes in a table, clustered and non-clustered. The Clustered index specifies the actual physical ordering of record in disk e.g. if you have a Book table then you can arrange them using title or ISBN number, if you create clustered index on title then they will be arranged and stored in disk in that order. 

On the other hand, non-clustered index create an alternative index tree to facilitate faster retrieval of data. Since Clustered index define physical ordering, there can only be one clustered index in a table but you can have as many non-clustered index as you want. 

Here is a nice diagram which clearly shows the difference between clustered and non-clustered or secondary indices in SQL:

What is difference between Clustered and Non-Clustered index in a table?


4. How does index work in database? What happens when run SQL query?
When you create an index, there are arranged in a tree structure, so that you can navigate them in logN time. A data structure like B-Tree is used to store index but that may vary depending upon the database. 

Each node in index tree reference to other node and nodes at the leaf contains pointer to actual data. When you fire a SELECT query to retrieve some data, SQL query engine uses this tree to retrieve selective data. Whenever you add or remove data, this index tree is re-arranged. 

How does index work?



5. What is index scan in database? When does Query Engine trigger Index scan?
When you try to retrieve data from a table which has index but you didn't provide any condition using WHERE clause then it uses index scan to search for rows in index pages. For example, if you want all employees from an employee table e.g. 

select * from Organization.dbo.Employee;

Then it can use index-scan if you have a clustered index on Employee e.g. on EmployeeId. It's generally faster than table scan but slower than index seek.

what is index scan?



6. What is index seek in database? When does index seek is used to retrieve data?
The index seek is faster than index-scan because instead of scanning through all index, you directly retrieve data using pointer stored in index. This is actually the fastest way to retrieve data in a large table. 

This works when you have to retrieve 10 to 20% of data e.g. by specifying conditions in WHERE clause. For example, following query will use index seek, if you have an index on EmployeeId

select * from Organization.dbo.Empoloyee where EmployeeId=2

You can even see the actual Execution Plan in Microsoft SQL Server Management Studio by pressing Ctrl + A and then running the queries as shown below:





7. What is difference between index scan and index seek in database? (answer)
The key difference between index scan and index seek is that seek is faster than index scan. Former is generally used when you retrieve 90% to 100% data e.g. queries without WHERE clause, while index seek is used when you selectively retrieve data e.g. by specifying conditions using WHERE or HAVING clause. 

8. What is difference between table scan and index scan? (anwer)
There are two main ways to retrieve data from a table e.g. by using table scan or by using index. If your table is small then table scan is probably the fastest way but its in-efficient for a large table because you have to perform a lot of IO. 

In a table scan, SQL query engine, searches all records to find the matching rows, while in index-scan it searches through index pages. If you don't have index in your table then table scan is used, but if you have clustered index then it is used even when you don't specify any condition in WHERE clause

For example, select * from Books will use table scan if there is clustered index in table but will use index-scan if there is a clustered index in the table. 

What is difference between table scan and index scan?





9. I have a table which has clustered index on column 1, can I create another clustered index on column 2?
No, you can only have one clustered index per table, so you cannot create second one, it's not allowed. 


10. What is composite index in SQL?
An index may contain one or multiple columns. If a index contains multiple columns then it is known as composite index. For example, 

-- Create a non-clustered index on a table or view 
CREATE INDEX index1 ON table1 (col1);

is normal index and this one is a composite index:

-- Create a non-clustered index with a unique constraint 
-- on 3 columns and specify the sort order for each column 
CREATE UNIQUE INDEX i1 ON t1 (col1 DESC, col2 ASC, col3 DESC);

This index is based upon three column col1, col2, and col3. 


11. Does the order of columns matter in a composite index? 
Yes, the order of columns matters in a composite index. For example, if you have both firstname and lastname column in table and you create index on (firstname, lastname) then it can also be used in queries when you specify just one column, for example:

select * from Employee where firstname = 'John';

This query will use your composite index but following query will not use it becuase the mandatory column firstname is not available:

select * from Employee where lastname = 'kohli'

Hence, order of columns matters in composite index. 


12. What is the disadvantage of having many indices in a table?
Even though, index make data retrieval fast, especially in large table, you should be careful with so many index. Since index are re-arrange on every insert, update, and delete operation, there is a cost associated with them. 

If you execute more insert, update, and delete query than select then they will be slower because the time it will take to re-arrange all those index tree. 

Index data structure also take space in your database, so if you have multiple index tree then obviously more space will be needed. 


That's all about some of the frequently asked advanced SQL and database interview questions based upon Index. As I said, it's very important for every programmer working with database and SQL to have a good understanding of index because it directly affects the performance of your application. 

A person with good knowledge of index not only write faster SQL queries but also quick to diagnose any performance issue with their SQL queries. 

We can't depend upon Database administrator or DBAs for everything, hence I advice every application developer to learn more about how index working their respective database and fill the gaps in their knowledge. 

You can also checkout following resources to improve you understanding of indexes in major databases like Oracle, MySQL, and Microsoft SQL Server. 

Other related SQL queries, Interview questions, and articles:
  • How to find the second highest salary in a table? (solution)
  • 30 MySQL Interview Questions with answers (MySQL questions)
  • 5 Courses to learn Database and SQL Better (courses)
  • Write a SQL query to copy or backup a table in MySQL (solution)
  • 5 Courses to learn Oracle and Microsoft SQL Server database (courses)
  • How to migrate SQL queries from Oracle to SQL Server 2008? (answer)
  • Top 5 Websites to learn SQL online for FREE? (resource)
  • What is the difference between UNION and UNION ALL in SQL? (answer)
  • Top 5 Courses to learn MySQL Database for Beginners (Courses)
  • Difference between clustered and non-clustered indexes in SQL? (answer)
  • 5 Free Courses to learn T-SQL and SQL Server for Beginners (Courses)
  • Difference between the Unique and Primary keys in the table? (answer)
  • 5 Best PostgreSQL Courses for Beginners (online courses)
  • Top 5 Books to learn Advanced SQL and Database Design (books)
  • Difference between View and Materialized View in Database? (answer)
  • 10 Free SQL and Database Courses for Beginners (free courses)

Thanks for reading this article so far. If you like these intermediate index based SQL interview questions for database administrator and developers then please share with your friends and colleagues. If you have any question or feedback then please drop a note.

P.S. - If you are new to SQL and Database and interested in learning Database and SQL and looking for some free resources to start your journey then you can also take a look at this list of Free SQL Courses for Beginners to kick-start your learning.

1 comment :

Anonymous said...

Good questions, I was also asked about impact of data type like varchar(max) on the index? I couldn't answer, may be you can help?

Post a Comment