Sunday, June 21, 2015

Concept of INDEX in SQL Server

A table without Primary key(Index) or Foreign key(Index) is called Heap Table.
Here data s are store in no particular order.

First we will know the storage architecture of Normal column and INDEX column in a table.





Now will see the Scanning style of 2 different case of  Normal column and INDEX column.

In Normal column it will always do full scan of table.When the search item is match with existing data then it will scan till last for search any other similar value. But, in INDEX column it search the data by Depth first search(DFS).




For example we have a table Student_Details (Name,Stream,Semester). Now i am inserting some data into it. Its look like below picture.

Now if want to search particular data from  Student_Details table..then it will full scan the entire table and give the output. (If Search data is matching with Existing data, then also it will full scan all the rows in  Student_Details table)



In above picture It get search item in 1st attempt but it goes for full scan.





So if this table having more records then it will take more time to process, which is inefficient.

To solve such kind of performance issue the concept INDEX came. INDEX find the data without touching all rows in a table (without full scan of table). 

There are 2 types of Index.

1. Cluster Index (Physical)
2. Non Cluster Index (Logical)

Cluster Index: It sorts the data with respect to the  raw data column from base table, that's why we can create only single Cluster Index in a table. The leaf nodes of a clustered index contain the data pages.

So, when we search any data from table, then data request will go to Cluster Index pages 
and directly goes to base data table. It is more faster than other Data scan.



For example, Now we Search all the data for a name called "Anurag Das". Like below picture without full table scan it will search "Anurag Das" by 2 steps. Compare to full scan (11 times),it is more efficient.



Why only one clustered index in a table?

Because a clustered index keep the same order in which the rows will be stored on disk, having more than one clustered index on one table is impossible. 


Now will go for another INDEX called Non Cluster.


Non Cluster INDEX:  Here before raw data layer, one more data layer called "Pointer Data layer/Reference Data layer" is created. So, when we search any data from table then data request will go to Non Cluster index pages after that goes to Reference (Pointer) data layer and from there it goes to base data table.



It is slower than Cluster Data scan because it is having a extra layer.


It holds actually a pointer to the data in rows so we can have as many non-clustered indexes in a table. Now highest range For Non Cluster index 999 on SSMS 2012 version and 249 for SSMS 2005.

The way of Non Cluster Index scan is as picture given below.


1 comment: