"Database mein Index wahi kaam karta hai jo library mein catalog karta hai — bina uske, har cheez dhundhne ke liye poori library khangalni padegi."
1. Introduction — Index Kya Hai aur Kyun Zaroori Hai?
Real-Life Example se Samjho
Socho tumhare paas ek 1000 page ki dictionary hai. Tumhe "Ephemeral" word dhundhna hai.
Without Index: Page 1 se start karo, ek ek page palto — 1000 pages scan.
With Index: Dictionary ke peeche alphabetical index dekho → directly "E" section pe jao → 2 seconds mein mila!
SQL Server mein bhi exactly yahi hota hai.
Jab tum query likhte ho:
SELECT * FROM Employees WHERE EmployeeID = 10045;
Without Index: SQL Server poori Employees table ko ek ek row scan karta hai (isko Full Table Scan kehte hain). Agar table mein 10 lakh rows hain — 10 lakh rows check hongi!
With Index: SQL Server seedha ek ordered data structure mein jump karta hai aur milliseconds mein result deta hai.
Index Kyun Important Hai?
Without Index | With Index |
|---|---|
Full Table Scan — poori table padhi | Seekback — seedha data location pe jump |
Slow queries | Fast queries |
High CPU & I/O usage | Low resource consumption |
Poor user experience | Smooth, responsive apps |
Server overload on large data | Scales well with millions of rows |
2. Index Types — Sab Types Detail Mein
2.1 Clustered Index
Definition: Clustered Index table ka physical data order define karta hai. Iska matlab — table ki rows actual disk pe is index ki key order mein sorted hoti hain.
Key Points:
Ek table mein sirf EK Clustered Index ho sakta hai
Primary Key by default Clustered Index banta hai
Table itself is the index — no separate structure
Visual:
Clustered Index (EmployeeID ASC)
----------------------------------------------
Leaf Level = Actual Table Data
| EmployeeID | Name | Dept | Salary |
|------------|---------|-------|---------|
| 1001 | Rahul | IT | 55000 | ← Physically sorted
| 1002 | Priya | HR | 48000 |
| 1003 | Amit | FIN | 62000 |
| 1004 | Sneha | IT | 58000 |
Kab use karein:
Primary Key columns
Jo column sabse zyada range queries mein use ho (e.g.,
OrderDate,TransactionDate)High-selectivity, always-used column
Example:
CREATE TABLE Employees (
EmployeeID INT NOT NULL,
EmployeeName NVARCHAR(100) NOT NULL,
DeptID INT,
Salary DECIMAL(10,2),
JoiningDate DATE,
CONSTRAINT PK_Employees PRIMARY KEY CLUSTERED (EmployeeID ASC)
);
2.2 Non-Clustered Index
Definition: Non-Clustered Index ek alag structure hai jo table data ke upar bana hota hai. Isme leaf nodes mein actual rows nahi, balki row pointers hote hain.
Key Points:
Ek table mein maximum 999 Non-Clustered Indexes ban sakte hain (SQL Server 2008+)
Read queries speed up karta hai
Write operations (INSERT/UPDATE/DELETE) thoda slow karta hai kyunki index bhi update hota hai
Visual:
Non-Clustered Index (DeptID)
-----------------------------
Leaf Level = Key + Pointer to actual data
| DeptID | → Pointer (Clustered Key ya RID) |
|--------|-----------------------------------|
| FIN | → EmployeeID: 1003 |
| HR | → EmployeeID: 1002 |
| IT | → EmployeeID: 1001 |
| IT | → EmployeeID: 1004 |
DeptID = 'IT' query mein:
Index seek → DeptID = 'IT' leaf nodes mile
Pointer follow karo → Clustered Index se actual row fetch karo (Key Lookup)
Example:
CREATE NONCLUSTERED INDEX IX_Employees_DeptID
ON Employees (DeptID ASC);
2.3 Covering Index (with INCLUDE)
Definition: Covering Index woh Non-Clustered Index hota hai jo query ke saare required columns ko apne andar rakhta hai — na Clustered Index tak jaana padta hai, na Key Lookup hoti hai.
Kab zaroori hai:
-- Agar yeh query frequently run hoti hai:
SELECT EmployeeName, Salary
FROM Employees
WHERE DeptID = 'IT';
Agar sirf DeptID pe index hai toh:
Index seek → DeptID = 'IT' rows mile
Key Lookup → Har row ke liye Clustered Index se
EmployeeName,Salaryfetch karo
10,000 IT employees hain toh 10,000 Key Lookups! ❌
Solution — Covering Index:
CREATE NONCLUSTERED INDEX IX_Employees_DeptID_Covering
ON Employees (DeptID)
INCLUDE (EmployeeName, Salary);
Ab:
Index seek → DeptID = 'IT'
Leaf node mein hi
EmployeeNameaurSalarymilenge (INCLUDE columns hain)No Key Lookup!
💡 Rule: Key columns mein woh columns daalo jo WHERE/JOIN/ORDER BY mein use ho. INCLUDE mein woh columns daalo jo SELECT list mein hain.
3.4 Composite Index (Multi-Column Index)
Definition: Ek index jisme multiple key columns hote hain. Column order bahut important hota hai.
CREATE NONCLUSTERED INDEX IX_Orders_CustDate
ON Orders (CustomerID ASC, OrderDate DESC);
Column Order ka Rule — "Left-Most Prefix Rule":
Yeh index in queries mein kaam karega:
✅
WHERE CustomerID = 5✅
WHERE CustomerID = 5 AND OrderDate > '2024-01-01'✅
ORDER BY CustomerID, OrderDate DESC
Yeh queries mein index use nahi hoga:
❌
WHERE OrderDate > '2024-01-01'(CustomerID skip kiya — leading column missing)❌
WHERE OrderDate > '2024-01-01' AND CustomerID = 5— SQL Server optimize kar sakta hai, but generally avoid karo
Real-world example:
-- HRMS: Employee ki attendance by date range
CREATE NONCLUSTERED INDEX IX_Attendance_EmpDate
ON TBL_Attendance (EmployeeID ASC, AttendanceDate DESC)
INCLUDE (InTime, OutTime, Status);
3.5 🔴 Filtered Index
Definition: Ek Non-Clustered Index jo WHERE clause ke saath define hota hai. Sirf un rows ka index banta hai jo condition satisfy karein.
Kab use karein:
Table mein bahut saari NULL rows hain
Specific status (e.g., Active records only)
Sparse data scenarios
-- Sirf Active employees ka index
CREATE NONCLUSTERED INDEX IX_Employees_Active
ON Employees (DeptID, Salary)
WHERE IsActive = 1;
-- Sirf pending orders ka index
CREATE NONCLUSTERED INDEX IX_Orders_Pending
ON Orders (OrderDate, CustomerID)
WHERE Status = 'PENDING';
Benefits:
Smaller index size (kam pages)
Faster maintenance (kam rows update hone par)
More selective — better query plans
3.6 🟣 Columnstore Index (OLAP/Analytics ke liye)
Definition: Columnstore Index data ko row-wise nahi balki column-wise store karta hai. Yeh analytics aur reporting queries ke liye designed hai.
Row Store vs Column Store:
Row Store (Traditional):
Row 1: [1001, Rahul, IT, 55000]
Row 2: [1002, Priya, HR, 48000]
Row 3: [1003, Amit, FIN, 62000]
Column Store:
EmployeeID column: [1001, 1002, 1003, ...]
Name column: [Rahul, Priya, Amit, ...]
DeptID column: [IT, HR, FIN, ...]
Salary column: [55000, 48000, 62000, ...]
Query: SELECT SUM(Salary) FROM Employees WHERE DeptID = 'IT'
Row Store: Har row read karo, DeptID check karo, Salary add karo
Column Store: Sirf
DeptIDaurSalarycolumns read karo — baaki columns skip!
Types:
Type | Table Type | DML Operations | Best For |
|---|---|---|---|
Clustered Columnstore | Columnstore table | Supported (Delta Store) | Data Warehouse |
Non-Clustered Columnstore | Regular table | Read-only | Hybrid HTAP |
-- Clustered Columnstore — Pure analytics table
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesData
ON FactSalesData;
-- Non-Clustered Columnstore — Regular table pe analytics
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Employees_Analytics
ON Employees (DeptID, Salary, JoiningDate);
3.7 🔵 Unique Index
Definition: Ensures ki index ke key columns mein duplicate values na aayein.
CREATE UNIQUE NONCLUSTERED INDEX UIX_Employees_Email
ON Employees (EmailAddress);
-- Primary Key automatically ek Unique Clustered Index hai
-- Unique Constraint bhi internally Unique Index banaata hai
3.8 Full-Text Index
Definition: Large text columns (VARCHAR/NVARCHAR) mein word-level searching ke liye. LIKE '%word%' se bahut fast.
-- Full-Text Catalog banana
CREATE FULLTEXT CATALOG FTC_Employees AS DEFAULT;
-- Full-Text Index banana
CREATE FULLTEXT INDEX ON Employees(JobDescription, Bio)
KEY INDEX PK_Employees;
-- Query
SELECT EmployeeID, EmployeeName
FROM Employees
WHERE CONTAINS(JobDescription, 'Python AND "data analysis"');
3.9 Spatial Index
Definition: Geographic/geometric data (GEOGRAPHY, GEOMETRY types) ke liye.
CREATE SPATIAL INDEX SIX_Locations_GeoData
ON OfficeLocations(GeoLocation)
USING GEOGRAPHY_GRID;
Index Types Summary Table
Index Type | Physical Storage | Max per Table | Best For |
|---|---|---|---|
Clustered | Data = Index | 1 | PK, Range queries |
Non-Clustered | Separate structure | 999 | Selective WHERE, JOIN |
Covering (INCLUDE) | Separate + Included cols | 999 | Eliminate Key Lookups |
Composite | Separate, multi-col key | 999 | Multi-column filters |
Filtered | Partial rows | 999 | Sparse/conditional data |
Columnstore (CCI) | Column-wise | 1 (clustered) | OLAP, Analytics |
Columnstore (NCCI) | Column-wise + row | 1 | Hybrid workloads |
Unique | Separate | 999 | Uniqueness enforcement |
Full-Text | Inverted index | Multiple | Text search |
Spatial | Grid-based | Multiple | Geo queries |
4. Index Create/Manage Karna — DDL Commands {#4-ddl-commands}
4.1 Index Create karna
-- Basic Syntax
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX index_name
ON table_name (column1 [ASC|DESC], column2 [ASC|DESC], ...)
[INCLUDE (col1, col2, ...)]
[WHERE filter_predicate]
[WITH (
FILLFACTOR = n,
PAD_INDEX = ON|OFF,
SORT_IN_TEMPDB = ON|OFF,
DROP_EXISTING = ON|OFF,
ONLINE = ON|OFF,
STATISTICS_NORECOMPUTE = ON|OFF,
DATA_COMPRESSION = NONE|ROW|PAGE
)]
[ON filegroup_name];
Practical Examples:
-- 1. Simple Non-Clustered Index
CREATE NONCLUSTERED INDEX IX_Emp_DeptID
ON dbo.Employees (DeptID);
-- 2. Composite Index with INCLUDE
CREATE NONCLUSTERED INDEX IX_Emp_Dept_Covering
ON dbo.Employees (DeptID ASC, JoiningDate DESC)
INCLUDE (EmployeeName, Salary, EmailAddress);
-- 3. Filtered Index — sirf active employees
CREATE NONCLUSTERED INDEX IX_Emp_Active_Email
ON dbo.Employees (EmailAddress)
WHERE IsActive = 1;
-- 4. Unique Index
CREATE UNIQUE NONCLUSTERED INDEX UIX_Emp_Email
ON dbo.Employees (EmailAddress)
WHERE EmailAddress IS NOT NULL;
-- 5. Index with Fill Factor
CREATE NONCLUSTERED INDEX IX_Orders_CustDate
ON dbo.Orders (CustomerID, OrderDate DESC)
WITH (FILLFACTOR = 80, ONLINE = ON);
4.2 Index Rebuild aur Reorganize
Fragmentation check karna:
-- Fragmentation check
SELECT
DB_NAME() AS DatabaseName,
OBJECT_NAME(ips.object_id) AS TableName,
i.name AS IndexName,
i.type_desc AS IndexType,
ips.index_type_desc,
ips.avg_fragmentation_in_percent AS FragPercent,
ips.page_count AS PageCount,
ips.record_count AS RowCount
FROM sys.dm_db_index_physical_stats(
DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i
ON ips.object_id = i.object_id
AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent > 5
AND ips.page_count > 100
ORDER BY ips.avg_fragmentation_in_percent DESC;
Fragmentation ke hisaab se action:
Fragmentation % | Action |
|---|---|
< 5% | Kuch nahi karna |
5% – 30% | REORGANIZE (online, minimal locking) |
> 30% | REBUILD (more resources, but full cleanup) |
-- REORGANIZE — online operation, low resource
ALTER INDEX IX_Emp_DeptID ON dbo.Employees REORGANIZE;
-- REBUILD — full index rebuild
ALTER INDEX IX_Emp_DeptID ON dbo.Employees
REBUILD WITH (FILLFACTOR = 80, ONLINE = ON);
-- Ek table ke sab indexes rebuild karo
ALTER INDEX ALL ON dbo.Employees
REBUILD WITH (FILLFACTOR = 80, ONLINE = ON);
💡 ONLINE = ON production pe use karo — table lock nahi lega. SQL Server Enterprise Edition needed.
4.3 Index Drop karna
-- Single index drop
DROP INDEX IX_Emp_DeptID ON dbo.Employees;
-- Check karke drop karo
IF EXISTS (
SELECT 1 FROM sys.indexes
WHERE name = 'IX_Emp_DeptID'
AND object_id = OBJECT_ID('dbo.Employees')
)
DROP INDEX IX_Emp_DeptID ON dbo.Employees;
4.4 Index Disable karna
-- Index disable karo (structure remain karta hai, data nahi)
ALTER INDEX IX_Emp_DeptID ON dbo.Employees DISABLE;
-- Re-enable karna ho toh rebuild karo
ALTER INDEX IX_Emp_DeptID ON dbo.Employees REBUILD;
4.5 Index Rename karna
EXEC sp_rename
'dbo.Employees.IX_Emp_DeptID',
'IX_Employees_DepartmentID',
'INDEX';
4.6 Index Information Queries
-- Table ke sab indexes dekho
SELECT
i.name AS IndexName,
i.type_desc AS IndexType,
i.is_unique,
i.is_primary_key,
i.is_unique_constraint,
i.filter_definition AS FilterCondition,
STRING_AGG(c.name, ', ')
WITHIN GROUP (ORDER BY ic.key_ordinal) AS KeyColumns
FROM sys.indexes i
INNER JOIN sys.index_columns ic
ON i.object_id = ic.object_id
AND i.index_id = ic.index_id
INNER JOIN sys.columns c
ON ic.object_id = c.object_id
AND ic.column_id = c.column_id
WHERE i.object_id = OBJECT_ID('dbo.Employees')
AND ic.is_included_column = 0
GROUP BY i.name, i.type_desc, i.is_unique,
i.is_primary_key, i.is_unique_constraint,
i.filter_definition
ORDER BY i.type_desc, i.name;
-- Index usage stats (kaunsa index kitna use ho raha hai)
SELECT
OBJECT_NAME(i.object_id) AS TableName,
i.name AS IndexName,
i.type_desc AS IndexType,
ius.user_seeks,
ius.user_scans,
ius.user_lookups,
ius.user_updates,
ius.last_user_seek,
ius.last_user_scan
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats ius
ON i.object_id = ius.object_id
AND i.index_id = ius.index_id
AND ius.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID('dbo.Employees')
ORDER BY (ISNULL(ius.user_seeks,0) + ISNULL(ius.user_scans,0)) DESC;
5. Intermediate Usage — Real Workflows aur Combinations {#5-intermediate-usage}
5.1 Execution Plan Read Karna — Index Use Ho Raha Hai Ya Nahi?
-- Execution Plan enable karo
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Query run karo
SELECT EmployeeID, EmployeeName, Salary
FROM dbo.Employees
WHERE DeptID = 'IT'
ORDER BY Salary DESC;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
SSMS mein:
Ctrl + M→ Actual Execution Plan enable karoQuery run karo
Plan mein icons dekho:
🗂️ Index Seek — Good! ✅
📄 Index Scan — Acceptable, but check karo
📋 Table Scan / Clustered Index Scan — Investigate! ⚠️
🔗 Key Lookup — Covering index ki zaroorat hai! ⚠️
5.2 Missing Index Suggestions
SQL Server automatically missing indexes suggest karta hai execution plan mein aur DMVs mein:
-- SQL Server ke missing index suggestions
SELECT TOP 20
mid.statement AS TableName,
migs.avg_total_user_cost * migs.avg_user_impact
* (migs.user_seeks + migs.user_scans) AS ImprovementScore,
migs.avg_user_impact AS AvgImpact,
migs.user_seeks,
migs.user_scans,
mid.equality_columns,
mid.inequality_columns,
mid.included_columns,
-- Auto-generate CREATE INDEX statement
'CREATE NONCLUSTERED INDEX IX_' +
REPLACE(REPLACE(mid.statement, '[', ''), ']', '') + '_' +
CAST(mig.index_group_id AS VARCHAR) +
' ON ' + mid.statement +
' (' + ISNULL(mid.equality_columns, '') +
CASE WHEN mid.equality_columns IS NOT NULL
AND mid.inequality_columns IS NOT NULL
THEN ', ' ELSE '' END +
ISNULL(mid.inequality_columns, '') + ')' +
ISNULL(' INCLUDE (' + mid.included_columns + ')', '')
AS SuggestedIndexSQL
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs
ON mig.index_group_id = migs.group_handle
INNER JOIN sys.dm_db_missing_index_details mid
ON mig.index_handle = mid.index_handle
WHERE mid.database_id = DB_ID()
ORDER BY ImprovementScore DESC;
⚠️ Warning: Missing index suggestions ko blindly implement mat karo! Evaluate karo — kai baar zyada indexes write performance hurt karte hain.
5.3 Unused Indexes Find Karna
-- Unused indexes — waste of space and write performance
SELECT
OBJECT_NAME(i.object_id) AS TableName,
i.name AS IndexName,
i.type_desc,
ISNULL(ius.user_seeks, 0) AS UserSeeks,
ISNULL(ius.user_scans, 0) AS UserScans,
ISNULL(ius.user_lookups, 0) AS UserLookups,
ISNULL(ius.user_updates, 0) AS UserUpdates,
ius.last_user_seek,
ius.last_user_scan,
-- Index size
ps.used_page_count * 8 / 1024.0 AS IndexSizeMB
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats ius
ON i.object_id = ius.object_id
AND i.index_id = ius.index_id
AND ius.database_id = DB_ID()
LEFT JOIN sys.dm_db_partition_stats ps
ON i.object_id = ps.object_id
AND i.index_id = ps.index_id
WHERE i.type_desc = 'NONCLUSTERED'
AND ISNULL(ius.user_seeks, 0) = 0
AND ISNULL(ius.user_scans, 0) = 0
AND ISNULL(ius.user_lookups, 0) = 0
AND ISNULL(ius.user_updates, 0) > 0 -- Sirf update ho raha hai, read nahi
AND i.object_id = OBJECT_ID('dbo.Employees')
ORDER BY ISNULL(ius.user_updates, 0) DESC;
5.4 Index Hints — Force Karna
Kabhi kabhi Query Optimizer galat plan choose karta hai. Tum manually index hint de sakte ho:
-- INDEX hint — specific index use karo
SELECT EmployeeID, EmployeeName
FROM dbo.Employees WITH (INDEX(IX_Emp_DeptID))
WHERE DeptID = 'IT';
-- NOLOCK hint (dirty reads) — reporting queries ke liye
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK)
WHERE OrderDate >= '2024-01-01';
-- FORCESEEK — seek force karo, scan nahi
SELECT EmployeeID FROM dbo.Employees WITH (FORCESEEK)
WHERE DeptID = 'IT';
-- FORCESCAN — scan force karo
SELECT EmployeeID FROM dbo.Employees WITH (FORCESCAN)
WHERE DeptID = 'IT';
⚠️ Index hints carefully use karo — data distribution change hone par plan outdated ho sakta hai.
5.5 Key Lookup Eliminate Karna — Step by Step
Problem Scenario:
-- Table structure
CREATE TABLE dbo.Orders (
OrderID INT PRIMARY KEY, -- Clustered Index
CustomerID INT NOT NULL,
OrderDate DATE NOT NULL,
TotalAmount DECIMAL(12,2),
Status VARCHAR(20),
ShipAddress NVARCHAR(200)
);
-- Existing index
CREATE NONCLUSTERED INDEX IX_Orders_Customer
ON dbo.Orders (CustomerID);
-- Frequent query
SELECT OrderID, OrderDate, TotalAmount
FROM dbo.Orders
WHERE CustomerID = 1001;
Execution Plan mein dikhega:
Index Seek (IX_Orders_Customer) → Key Lookup (Clustered) → Nested Loops
Fix — Covering Index:
-- DROP existing, better index create karo
DROP INDEX IX_Orders_Customer ON dbo.Orders;
CREATE NONCLUSTERED INDEX IX_Orders_Customer_Covering
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount);
Ab: Index Seek only — No Key Lookup ✅
6. Advanced Concepts — Internal Working, Fill Factor, Statistics {#6-advanced-concepts}
6.1 Fill Factor Deep Dive
Definition: Fill Factor index page ko kitna % fill kare — baacha space future inserts ke liye hota hai.
FILLFACTOR = 100 (default):
| DATA DATA DATA DATA DATA DATA DATA DATA | ← 100% full
Naya row aaya → Page Split! → 2 pages ban gaye → Fragmentation ❌
FILLFACTOR = 80:
| DATA DATA DATA DATA DATA DATA [20% empty] |
Naye rows easily fit ho jaate hain → Kam page splits → Kam fragmentation ✅
Fill Factor | Best For | Drawback |
|---|---|---|
100% | Read-only tables | Page splits on insert |
80-90% | Mixed OLTP | Some wasted space |
60-70% | High insert/update tables | More space consumed |
50% | Very high churn tables | Lot of wasted space |
-- Server-level default Fill Factor change karna
EXEC sp_configure 'fill factor', 80;
RECONFIGURE;
-- Index-level
CREATE NONCLUSTERED INDEX IX_Orders_Date
ON dbo.Orders (OrderDate)
WITH (FILLFACTOR = 75, PAD_INDEX = ON);
-- PAD_INDEX = ON ensures intermediate pages bhi same fill factor use karein
6.2 Statistics — Query Optimizer ka Brain
Definition: Statistics SQL Server mein column values ka distribution information hoti hai. Query Optimizer in statistics ke basis pe decide karta hai ki index use karna chahiye ya nahi, aur kaunsa join strategy best hai.
Statistics histogram for OrderDate column:
Range: Jan 2024 - Dec 2024
Jan: 5,000 rows
Feb: 4,800 rows
...
Dec: 12,000 rows ← Peak
Query: WHERE OrderDate BETWEEN '2024-11-01' AND '2024-12-31'
Optimizer decides: Estimated rows = 23,000 → Index Seek profitable
Statistics Update karna:
-- Manual update
UPDATE STATISTICS dbo.Orders;
UPDATE STATISTICS dbo.Orders IX_Orders_Date;
-- Full scan (accurate but slower)
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
-- Sample (faster, less accurate)
UPDATE STATISTICS dbo.Orders WITH SAMPLE 30 PERCENT;
-- Statistics info dekho
DBCC SHOW_STATISTICS('dbo.Orders', 'IX_Orders_Date');
Auto Statistics:
-- Auto create/update enabled rakho (default ON)
ALTER DATABASE YourDB SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE YourDB SET AUTO_UPDATE_STATISTICS ON;
ALTER DATABASE YourDB SET AUTO_UPDATE_STATISTICS_ASYNC ON; -- Production best practice
6.3 Index Internals — Page Splits in Detail
Page Split kya hota hai:
Before Split (Full Page):
Page 5: [Row1, Row2, Row3, Row4, Row5] ← 100% full
New Row4.5 insert karna hai (sort order mein Row4 aur Row5 ke beech):
After Split:
Page 5: [Row1, Row2, Row3] ← 50% full
Page 6 (new): [Row4, Row4.5, Row5] ← 50% full
Problem:
- New page allocate karna pada = I/O overhead
- Pages non-contiguous ho gayi = Fragmentation
- Frequent splits = Severe performance degradation
Prevention:
Monotonically increasing Clustered Key use karo (IDENTITY, SEQUENCE, NEWSEQUENTIALID())
Appropriate Fill Factor set karo
Regular index maintenance karo
6.4 Heap vs Clustered Table — Kaunsa Better?
-- Heap table (no clustered index)
CREATE TABLE dbo.LogEvents (
EventID INT IDENTITY,
EventMsg NVARCHAR(500),
CreatedAt DATETIME2 DEFAULT GETDATE()
);
-- Koi index nahi — pure heap
-- Clustered Index table
CREATE TABLE dbo.LogEvents (
EventID INT IDENTITY PRIMARY KEY, -- CL index
EventMsg NVARCHAR(500),
CreatedAt DATETIME2 DEFAULT GETDATE()
);
Feature | Heap | Clustered Index Table |
|---|---|---|
Data order | Random | Sorted by CL key |
Insert speed | Fast (no sort needed) | Slightly slower (maintain order) |
Range queries | Slow (full scan) | Fast (seeks on CL key) |
Delete performance | Forwarded records issue | Cleaner |
Fragmentation | Forwarded pointers problem | Page fragmentation |
Best for | Staging/temp tables, pure insert | Most production tables |
6.5 Columnstore Index — Internal Architecture
Columnstore Row Group Structure:
------------------------------------
Row Group 1 (1,048,576 rows max):
Column 1 Segment: [compressed data]
Column 2 Segment: [compressed data]
...
Row Group 2:
Column 1 Segment: [compressed data]
...
Delta Store (for new inserts — row format):
[Recent rows not yet compressed]
Deleted Bitmap:
[Marks deleted rows]
Batch Mode Processing:
Traditional Row Mode: Ek ek row process
Columnstore Batch Mode: 900 rows ek saath process
Result: 10x-100x faster analytics!
-- Columnstore compression ratio check
SELECT
i.name,
SUM(a.total_pages) * 8 / 1024.0 AS TotalSizeMB,
SUM(a.used_pages) * 8 / 1024.0 AS UsedSizeMB,
COUNT(DISTINCT rg.row_group_id) AS RowGroups,
SUM(rg.total_rows) AS TotalRows
FROM sys.indexes i
INNER JOIN sys.partitions p
ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a
ON p.partition_id = a.container_id
LEFT JOIN sys.column_store_row_groups rg
ON i.object_id = rg.object_id AND i.index_id = rg.index_id
WHERE i.type IN (5, 6) -- 5=CCI, 6=NCCI
GROUP BY i.name;
6.6 Parallel Index Rebuild
-- Multiple CPUs use karke index rebuild — faster on large tables
ALTER INDEX IX_Orders_Date ON dbo.Orders
REBUILD WITH (
MAXDOP = 4, -- 4 parallel threads use karo
ONLINE = ON, -- No table lock
FILLFACTOR = 80,
DATA_COMPRESSION = PAGE -- Page compression bhi apply karo
);
6.7 Partitioned Index
-- Partition function aur scheme
CREATE PARTITION FUNCTION PF_OrderYear (INT)
AS RANGE RIGHT FOR VALUES (2021, 2022, 2023, 2024);
CREATE PARTITION SCHEME PS_OrderYear
AS PARTITION PF_OrderYear
ALL TO ([PRIMARY]);
-- Partitioned Clustered Index
CREATE TABLE dbo.Orders_Partitioned (
OrderID INT,
OrderYear INT,
OrderDate DATE,
Amount DECIMAL(12,2)
) ON PS_OrderYear(OrderYear);
CREATE CLUSTERED INDEX CIX_Orders_Year
ON dbo.Orders_Partitioned (OrderYear, OrderID)
ON PS_OrderYear(OrderYear);
7. Real-World Use Cases {#7-real-world-use-cases}
7.1 HRMS System — Payroll aur Attendance Queries
-- Employee lookup by ID (Primary Key — auto clustered)
SELECT * FROM dbo.TBL_EmployeeMaster
WHERE EmployeeID = 10045;
-- Clustered Index Seek ✅
-- Department wise active employees
CREATE NONCLUSTERED INDEX IX_EmpMaster_DeptActive
ON dbo.TBL_EmployeeMaster (DepartmentID, IsActive)
INCLUDE (EmployeeName, Designation, JoiningDate, MobileNo)
WHERE IsActive = 1;
-- Monthly salary processing — employee + month filter
CREATE NONCLUSTERED INDEX IX_Salary_EmpMonth
ON dbo.TBL_SalaryDetails (EmployeeID, SalaryMonth DESC)
INCLUDE (BasicSalary, HRA, SpecialAllowance, GrossSalary, NetSalary);
-- Attendance report — date range
CREATE NONCLUSTERED INDEX IX_Attendance_DateEmp
ON dbo.TBL_AttendanceMaster (AttendanceDate DESC, EmployeeID)
INCLUDE (InTime, OutTime, Status, WorkingHours)
WHERE Status IN ('P', 'A', 'HD');
7.2 Travel Management System
-- Travel requests by employee
CREATE NONCLUSTERED INDEX IX_TravelReq_EmpStatus
ON dbo.TBL_TravelRequest (EmployeeID, RequestStatus)
INCLUDE (TravelDate, Destination, Purpose, RequestedOn);
-- Pending approvals for a manager
CREATE NONCLUSTERED INDEX IX_TravelApproval_ApproverPending
ON dbo.TBL_TravelApproval (ApproverID, ApprovalStatus, ApprovalLevel)
INCLUDE (RequestID, EmployeeID, RequestedDate)
WHERE ApprovalStatus = 'PENDING';
-- Date range travel cost report
CREATE NONCLUSTERED INDEX IX_TravelExpense_DateRange
ON dbo.TBL_TravelExpense (ExpenseDate DESC, CompanyID)
INCLUDE (EmployeeID, ExpenseAmount, ExpenseType, IsApproved);
7.3 E-Commerce System
-- Product search by category and price
CREATE NONCLUSTERED INDEX IX_Products_CatPrice
ON dbo.Products (CategoryID, IsActive)
INCLUDE (ProductName, Price, StockQty, ImageURL)
WHERE IsActive = 1;
-- Order history by customer
CREATE NONCLUSTERED INDEX IX_Orders_CustDate
ON dbo.Orders (CustomerID, OrderDate DESC)
INCLUDE (OrderStatus, TotalAmount, TrackingNumber);
-- Pending orders processing
CREATE NONCLUSTERED INDEX IX_Orders_StatusDate
ON dbo.Orders (OrderStatus, OrderDate ASC)
INCLUDE (CustomerID, TotalAmount, ShipAddress)
WHERE OrderStatus IN ('PLACED', 'CONFIRMED', 'SHIPPED');
7.4 Reporting/Analytics System
-- Monthly sales summary
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales_Analytics
ON dbo.FactSales (SaleDate, ProductID, RegionID, SalesAmount, Quantity);
-- Date-wise aggregation
SELECT
YEAR(SaleDate) AS SaleYear,
MONTH(SaleDate) AS SaleMonth,
SUM(SalesAmount) AS TotalRevenue,
COUNT(*) AS TotalOrders
FROM dbo.FactSales
WHERE SaleDate >= '2023-01-01'
GROUP BY YEAR(SaleDate), MONTH(SaleDate)
ORDER BY 1, 2;
-- Columnstore = Blazing fast for this aggregation ✅
8. Practical Examples — Code + Line-by-line Explanation {#8-practical-examples}
8.1 Complete Index Strategy for an Employee Table
-- ==========================================
-- Step 1: Table Create karo
-- ==========================================
CREATE TABLE dbo.TBL_Employees (
EmployeeID INT NOT NULL IDENTITY(1,1),
EmployeeCode VARCHAR(20) NOT NULL,
EmployeeName NVARCHAR(100) NOT NULL,
EmailAddress VARCHAR(100),
MobileNo VARCHAR(15),
DepartmentID INT NOT NULL,
DesignationID INT NOT NULL,
JoiningDate DATE NOT NULL,
BasicSalary DECIMAL(12,2) NOT NULL,
IsActive BIT NOT NULL DEFAULT 1,
CreatedDate DATETIME2 NOT NULL DEFAULT GETDATE(),
-- Clustered Primary Key
CONSTRAINT PK_Employees PRIMARY KEY CLUSTERED (EmployeeID ASC)
);
-- PK_Employees → Clustered Index on EmployeeID
-- Data rows physically EmployeeID order mein sorted hain
-- ==========================================
-- Step 2: Business-driven indexes
-- ==========================================
-- Index 1: Employee Code search (HR frequently use karta hai)
-- EmployeeCode unique hona chahiye + fast lookup
CREATE UNIQUE NONCLUSTERED INDEX UIX_Employees_EmpCode
ON dbo.TBL_Employees (EmployeeCode);
-- Unique index → duplicate employee codes nahi ban sakte
-- Non-Clustered → separate B-tree structure
-- Index 2: Department wise employee list (most common report)
CREATE NONCLUSTERED INDEX IX_Employees_Dept_Active
ON dbo.TBL_Employees (DepartmentID ASC, IsActive)
INCLUDE (EmployeeName, DesignationID, JoiningDate, BasicSalary);
-- DepartmentID → leading key (WHERE DeptID = X)
-- IsActive → second key (WHERE IsActive = 1)
-- INCLUDE columns → SELECT mein ye columns chahiye, Key Lookup avoid ho
-- Covering index for: SELECT Name, Desig, Joining, Salary FROM Emp WHERE DeptID=X AND IsActive=1
-- Index 3: Email login lookup
CREATE UNIQUE NONCLUSTERED INDEX UIX_Employees_Email
ON dbo.TBL_Employees (EmailAddress)
WHERE EmailAddress IS NOT NULL;
-- Filtered Unique Index → NULL emails allowed, but non-NULL must be unique
-- WHERE EmailAddress IS NOT NULL → NULLs index mein include nahi honge
-- Login queries fast hongi: WHERE EmailAddress = 'rahul@company.com'
-- Index 4: New joinee / attrition reports (date range queries)
CREATE NONCLUSTERED INDEX IX_Employees_JoiningDate
ON dbo.TBL_Employees (JoiningDate DESC)
INCLUDE (EmployeeName, DepartmentID, IsActive)
WITH (FILLFACTOR = 90);
-- JoiningDate DESC → newest joinee first
-- FILLFACTOR = 90 → 10% free space because new employees keep joining
-- Date range queries fast hongi
-- ==========================================
-- Step 3: Query test karo
-- ==========================================
-- Query 1: Department wise active employees
-- Uses: IX_Employees_Dept_Active → Index Seek + No Key Lookup ✅
SELECT EmployeeName, DesignationID, JoiningDate, BasicSalary
FROM dbo.TBL_Employees
WHERE DepartmentID = 5 AND IsActive = 1;
-- Query 2: Employee by email login
-- Uses: UIX_Employees_Email → Unique Index Seek ✅
SELECT EmployeeID, EmployeeName, DepartmentID
FROM dbo.TBL_Employees
WHERE EmailAddress = 'rahul.sharma@company.com';
-- Query 3: This month joiners
-- Uses: IX_Employees_JoiningDate → Index Seek + Key Lookup (EmployeeCode nahi INCLUDE mein) ⚠️
SELECT EmployeeCode, EmployeeName, DepartmentID
FROM dbo.TBL_Employees
WHERE JoiningDate >= DATEADD(DAY, 1, EOMONTH(GETDATE(), -1))
AND JoiningDate <= EOMONTH(GETDATE());
-- EmployeeCode INCLUDE mein nahi hai → Key Lookup hogi
-- Fix: EmployeeCode bhi INCLUDE mein add karo
-- ==========================================
-- Step 4: Index stats check karo
-- ==========================================
SELECT
i.name AS IndexName,
i.type_desc,
ISNULL(s.user_seeks, 0) AS Seeks,
ISNULL(s.user_scans, 0) AS Scans,
ISNULL(s.user_lookups, 0) AS Lookups,
ISNULL(s.user_updates, 0) AS Updates,
CAST(ps.used_page_count * 8.0 / 1024 AS DECIMAL(10,2)) AS SizeMB
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s
ON i.object_id = s.object_id
AND i.index_id = s.index_id
AND s.database_id = DB_ID()
LEFT JOIN sys.dm_db_partition_stats ps
ON i.object_id = ps.object_id
AND i.index_id = ps.index_id
WHERE i.object_id = OBJECT_ID('dbo.TBL_Employees')
ORDER BY ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0) DESC;
8.2 Smart Index Maintenance Stored Procedure
CREATE OR ALTER PROCEDURE dbo.usp_SmartIndexMaintenance
@TableName SYSNAME = NULL, -- NULL = all tables
@DryRun BIT = 1, -- 1 = only print, 0 = execute
@MinPageCount INT = 100, -- Ignore small indexes
@ReorganizeThreshold DECIMAL(5,2) = 10.0,
@RebuildThreshold DECIMAL(5,2) = 30.0
AS
BEGIN
SET NOCOUNT ON;
-- Temp table for fragmentation data
CREATE TABLE #IndexFrag (
TableName SYSNAME,
IndexName SYSNAME,
IndexType VARCHAR(50),
FragPercent DECIMAL(5,2),
PageCount BIGINT,
Action VARCHAR(20)
);
-- Fragmentation data collect karo
INSERT INTO #IndexFrag
SELECT
OBJECT_NAME(ips.object_id),
i.name,
ips.index_type_desc,
CAST(ips.avg_fragmentation_in_percent AS DECIMAL(5,2)),
ips.page_count,
CASE
WHEN ips.avg_fragmentation_in_percent >= @RebuildThreshold THEN 'REBUILD'
WHEN ips.avg_fragmentation_in_percent >= @ReorganizeThreshold THEN 'REORGANIZE'
ELSE 'SKIP'
END
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
INNER JOIN sys.indexes i
ON ips.object_id = i.object_id AND ips.index_id = i.index_id
WHERE ips.page_count >= @MinPageCount
AND i.type > 0 -- Heap nahi
AND (@TableName IS NULL OR OBJECT_NAME(ips.object_id) = @TableName);
-- Summary print karo
PRINT REPLICATE('=', 60);
PRINT 'Index Maintenance Report — ' + CONVERT(VARCHAR, GETDATE(), 120);
PRINT REPLICATE('=', 60);
DECLARE @tbl SYSNAME, @idx SYSNAME, @frag DECIMAL(5,2),
@action VARCHAR(20), @sql NVARCHAR(500);
DECLARE cur CURSOR FOR
SELECT TableName, IndexName, FragPercent, Action
FROM #IndexFrag
WHERE Action <> 'SKIP'
ORDER BY FragPercent DESC;
OPEN cur;
FETCH NEXT FROM cur INTO @tbl, @idx, @frag, @action;
WHILE @@FETCH_STATUS = 0
BEGIN
IF @action = 'REBUILD'
SET @sql = 'ALTER INDEX [' + @idx + '] ON dbo.[' + @tbl + '] REBUILD WITH (FILLFACTOR=80, ONLINE=ON)';
ELSE
SET @sql = 'ALTER INDEX [' + @idx + '] ON dbo.[' + @tbl + '] REORGANIZE';
PRINT '[' + @action + '] ' + @tbl + '.' + @idx +
' (Frag: ' + CAST(@frag AS VARCHAR) + '%)';
PRINT ' SQL: ' + @sql;
IF @DryRun = 0
BEGIN
BEGIN TRY
EXEC sp_executesql @sql;
PRINT ' [OK] Done.';
END TRY
BEGIN CATCH
PRINT ' [ERROR] ' + ERROR_MESSAGE();
END CATCH
END
FETCH NEXT FROM cur INTO @tbl, @idx, @frag, @action;
END
CLOSE cur; DEALLOCATE cur;
IF @DryRun = 1
PRINT CHAR(10) + '*** DRY RUN mode — set @DryRun=0 to execute ***';
DROP TABLE #IndexFrag;
END;
GO
-- Usage:
EXEC dbo.usp_SmartIndexMaintenance @DryRun = 1; -- Preview
EXEC dbo.usp_SmartIndexMaintenance @TableName = 'TBL_Employees', @DryRun = 0; -- Execute
9. Edge Cases aur Errors {#9-edge-cases-and-errors}
9.1 Common Mistakes aur Fixes
❌ Mistake 1: SELECT * pe Index banana
-- BAD: SELECT * use karna → koi bhi index covering nahi ban sakta
SELECT * FROM dbo.Orders WHERE CustomerID = 1001;
-- GOOD: Specific columns select karo → covering index banao
SELECT OrderID, OrderDate, TotalAmount
FROM dbo.Orders WHERE CustomerID = 1001;
CREATE NONCLUSTERED INDEX IX_Orders_Cust
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount);
❌ Mistake 2: Functions in WHERE Clause — Index Kill!
-- BAD: Function column pe apply ki → Index use nahi hoga!
SELECT * FROM dbo.Employees
WHERE YEAR(JoiningDate) = 2023;
-- → Clustered Index Scan ❌
-- BAD: Implicit conversion
SELECT * FROM dbo.Employees
WHERE EmployeeCode = 12345; -- EmployeeCode VARCHAR hai, int diya
-- → Implicit cast → Index scan ❌
-- GOOD: SARGable (Search ARGument able) condition
SELECT * FROM dbo.Employees
WHERE JoiningDate >= '2023-01-01'
AND JoiningDate < '2024-01-01';
-- → Index Seek ✅
-- GOOD: Explicit type
SELECT * FROM dbo.Employees
WHERE EmployeeCode = '12345';
-- → Index Seek ✅
SARGable vs Non-SARGable:
Non-SARGable ❌ | SARGable ✅ |
|---|---|
|
|
|
|
|
|
|
|
|
|
❌ Mistake 3: Too Many Indexes on a Write-Heavy Table
-- BAD: Transaction table pe 15 indexes → INSERT bahut slow!
-- Har INSERT ke saath 15 index structures update honge
-- GOOD: Rule of thumb:
-- OLTP tables: 3-5 indexes max
-- Reporting tables: 5-10 indexes ok
-- Keep only what's actually being used!
❌ Mistake 4: Wrong Column Order in Composite Index
-- BAD: Date pehle, CustomerID baad mein
CREATE INDEX IX_Wrong ON dbo.Orders (OrderDate, CustomerID);
-- Query: WHERE CustomerID = 1001 → Index NAHI use karega!
-- CustomerID leading column nahi hai
-- GOOD: Zyada selective / zyada filter wala column pehle
CREATE INDEX IX_Right ON dbo.Orders (CustomerID, OrderDate);
-- Query: WHERE CustomerID = 1001 → Index Seek ✅
-- Query: WHERE CustomerID = 1001 AND OrderDate > '2024-01-01' → Index Seek ✅
❌ Mistake 5: GUID as Clustered Index — Nightmare!
-- BAD: Random GUID as Clustered Index
CREATE TABLE dbo.Products (
ProductID UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY,
-- ...
);
-- NEWID() → completely random → massive page splits → severe fragmentation!
-- GOOD Option 1: Sequential GUID
ProductID UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
-- GOOD Option 2: INT IDENTITY as CL key, GUID as non-clustered
CREATE TABLE dbo.Products (
RowID INT IDENTITY PRIMARY KEY, -- Clustered
ProductID UNIQUEIDENTIFIER DEFAULT NEWID() -- Non-Clustered Unique
);
CREATE UNIQUE NONCLUSTERED INDEX UIX_Products_GUID
ON dbo.Products (ProductID);
❌ Mistake 6: Index on Low Cardinality Column
-- BAD: Gender column pe index (M/F — sirf 2 values)
CREATE INDEX IX_Emp_Gender ON dbo.Employees (Gender);
-- SQL Server yeh index kabhi use nahi karega
-- 50% rows M, 50% rows F → Full scan better hai!
-- Index cardinality rule:
-- High cardinality (many unique values) → Good index candidate
-- Low cardinality (few unique values) → Bad index candidate
-- EXCEPTION: Filtered index on low-cardinality works!
CREATE INDEX IX_Emp_VIP ON dbo.Employees (EmployeeID)
WHERE IsVIP = 1; -- Only 1% employees VIP hain → Filtered = OK ✅
9.2 Common Errors aur Solutions
Error 1: Cannot create index on view without SCHEMABINDING
-- Error: "Cannot create index on view unless view is schema bound"
-- Fix:
CREATE VIEW dbo.vw_ActiveEmployees
WITH SCHEMABINDING -- ← Yeh add karo
AS
SELECT EmployeeID, EmployeeName, DepartmentID
FROM dbo.TBL_Employees
WHERE IsActive = 1;
-- Ab Indexed View ban sakti hai
CREATE UNIQUE CLUSTERED INDEX CIX_vw_ActiveEmployees
ON dbo.vw_ActiveEmployees (EmployeeID);
Error 2: Duplicate key row detected when creating unique index
-- Error: "Cannot create unique index. Duplicate key values exist."
-- Find duplicates karo:
SELECT EmailAddress, COUNT(*) AS DupCount
FROM dbo.Employees
GROUP BY EmailAddress
HAVING COUNT(*) > 1;
-- Fix duplicates, phir index banao
-- Ya phir IGNORE_DUP_KEY use karo (inserts pe silent ignore):
CREATE UNIQUE INDEX UIX_Emp_Email
ON dbo.Employees (EmailAddress)
WITH (IGNORE_DUP_KEY = ON);
-- Warning: Duplicate insert silently ignored hogi — production mein sochke use karo
Error 3: Index rebuild fails with "online rebuild not supported"
-- Error: Online index operations not available on this edition
-- SQL Server Standard Edition mein ONLINE = ON nahi milta
-- Fix: OFFLINE rebuild karo (downtime mein)
ALTER INDEX IX_Orders_Date ON dbo.Orders
REBUILD WITH (FILLFACTOR = 80); -- ONLINE = ON remove karo
Error 4: Key Lookup causing performance issues
-- Identify Key Lookups in execution plan text:
SELECT TOP 10
qs.total_logical_reads / qs.execution_count AS AvgLogicalReads,
qs.execution_count,
SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)+1) AS QueryText
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY AvgLogicalReads DESC;
-- High logical reads wali queries → Key Lookup suspects
-- Fix: INCLUDE columns add karo in relevant index
10. Pro Developer Insights — Best Practices {#10-pro-developer-insights}
10.1 Index Design Golden Rules 🏆
✅ DO:
├── Primary Key pe hamesha Clustered Index rakho (INT IDENTITY preferred)
├── JOIN columns pe Non-Clustered Index banao
├── High-frequency WHERE columns pe index banao
├── Selective columns choose karo (high cardinality)
├── INCLUDE columns se Key Lookups eliminate karo
├── Filtered Index use karo sparse/partial data ke liye
├── Fill Factor 75-85% rakho write-heavy tables ke liye
├── Statistics regularly update karo (weekly ya schedule pe)
├── Index fragmentation monitor karo (monthly ya automated)
└── Index usage stats se unused indexes identify aur drop karo
❌ DON'T:
├── SELECT * pe index optimize mat karo
├── WHERE clause mein functions use mat karo indexed columns pe
├── GUID/NEWID() ko Clustered Index banana
├── Low-cardinality columns pe standalone index banana
├── 10+ indexes ek OLTP table pe banana
├── Missing index suggestions blindly implement karna
├── Index hints har jagah use karna
└── Index maintain karna bhoolna
10.2 Index Naming Convention
-- Naming Pattern: {Type}_{TableName}_{Columns}[_{Qualifier}]
-- Clustered Index
PK_TableName -- Primary Key (auto)
CIX_TableName_ColumnName -- Custom Clustered
-- Non-Clustered
IX_TableName_Column1_Column2 -- Regular
IX_TableName_Column1_Incl -- With INCLUDE
IX_TableName_Column1_Active -- Filtered
UIX_TableName_Column1 -- Unique
-- Examples:
PK_Employees -- Primary Key
CIX_Orders_OrderDate -- Clustered on OrderDate
IX_Employees_DeptID_Active -- Filtered non-clustered
IX_Orders_CustID_OrderDate_Incl -- With includes
UIX_Employees_Email -- Unique email
10.3 Index Monitoring Dashboard Query
-- Weekly index health report
SELECT
t.name AS TableName,
i.name AS IndexName,
i.type_desc,
CAST(ps.used_page_count * 8.0 / 1024 AS DECIMAL(10,1)) AS SizeMB,
ps.row_count AS RowCount,
CAST(ips.avg_fragmentation_in_percent AS DECIMAL(5,1)) AS FragPct,
ISNULL(us.user_seeks, 0) AS Seeks,
ISNULL(us.user_scans, 0) AS Scans,
ISNULL(us.user_lookups, 0) AS Lookups,
ISNULL(us.user_updates, 0) AS Updates,
CASE
WHEN ISNULL(us.user_seeks,0) + ISNULL(us.user_scans,0) = 0
AND ISNULL(us.user_updates,0) > 100
THEN '⚠️ UNUSED - Consider DROP'
WHEN ips.avg_fragmentation_in_percent > 30
THEN '🔴 HIGH FRAG - REBUILD needed'
WHEN ips.avg_fragmentation_in_percent > 10
THEN '🟡 MOD FRAG - REORGANIZE needed'
ELSE '✅ Healthy'
END AS HealthStatus
FROM sys.tables t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
LEFT JOIN sys.dm_db_index_usage_stats us
ON i.object_id = us.object_id AND i.index_id = us.index_id
AND us.database_id = DB_ID()
LEFT JOIN sys.dm_db_partition_stats ps
ON i.object_id = ps.object_id AND i.index_id = ps.index_id
LEFT JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE i.type > 0 -- No heap
AND ps.row_count > 1000 -- Small tables skip
ORDER BY ps.row_count DESC, FragPct DESC;
10.4 Security — Index aur Permissions
-- Index create karne ke liye ALTER TABLE ya ALTER ANY INDEX permission chahiye
GRANT ALTER ON dbo.Employees TO [AppUser]; -- Risky — production mein avoid
-- Better: Dedicated maintenance user
CREATE LOGIN IndexMaintenanceLogin WITH PASSWORD = 'StrongP@ss!';
CREATE USER IndexMaintenanceUser FOR LOGIN IndexMaintenanceLogin;
GRANT ALTER ANY INDEX TO IndexMaintenanceUser;
-- Application user ko sirf SELECT/INSERT/UPDATE/DELETE do
-- Index maintenance DBA ka kaam hai, application ka nahi
11. Comparison — Index vs Alternatives {#11-comparison}
11.1 Index Types Comparison Table
Feature | Clustered | Non-Clustered | Covering | Columnstore | Filtered |
|---|---|---|---|---|---|
Max per table | 1 | 999 | 999 | 1+1 | 999 |
Data storage | Table = Index | Separate | Separate + Included | Column-wise | Partial rows |
Best for | PK, ranges | Selective WHERE | Eliminate lookups | OLAP/Analytics | Sparse data |
Write overhead | Low | Medium | Medium-High | Low (batch) | Low |
Read speed | Very Fast | Fast | Very Fast | Extreme (analytics) | Fast |
Compression | Page/Row | Page/Row | Page/Row | High (columnar) | Page/Row |
SQL Server version | All | All | 2005+ | 2012+ | 2008+ |
11.2 Index vs Other Performance Techniques
Technique | When to Use | Tradeoff |
|---|---|---|
Index | Frequent queries on specific columns | Write overhead, storage |
Partitioning | Very large tables (100M+ rows), date-based archiving | Complex setup |
Materialized View (Indexed View) | Complex aggregation, pre-computed results | Tight schema dependency |
Query Hints | One-off fixes, forcing specific plan | Hard to maintain, brittle |
Statistics Update | Wrong cardinality estimates | Temporary fix |
Query Rewrite | Non-SARGable conditions | Developer effort |
Caching (Redis) | Read-heavy, rarely changing data | Data staleness |
Read Replicas | Reporting vs OLTP separation | Sync overhead |
11.3 SQL Server vs Other Databases — Index Concepts
Concept | SQL Server | MySQL (InnoDB) | PostgreSQL | Oracle |
|---|---|---|---|---|
Clustered Index | Yes (1 per table) | Yes (PK = CL) | No (CLUSTER cmd) | Index-Organized Tables |
Non-Clustered | 999 max | Secondary indexes | Heap + Index | Non-unique indexes |
Covering/Include | INCLUDE keyword | Covering index | INCLUDE (v11+) | Functional index |
Filtered | WHERE clause | Partial index (limited) | Partial index | Function-based |
Columnstore | Yes (2012+) | No (native) | No (native) | Limited |
Fill Factor | 1-100% | No direct equivalent | FILLFACTOR param | PCTFREE |
Online Rebuild | Enterprise only | Yes (5.6+) | Yes | Yes (Enterprise) |
12. Data Science Perspective {#12-data-science-perspective}
12.1 Data Pipeline Performance mein Index ka Role
Data Science projects mein aksar large tables se data extract karna hota hai:
-- Machine Learning Feature Extraction Query
-- 10M row FactSales table se monthly features nikaalte hain
-- WITHOUT proper index → 45 seconds
SELECT
CustomerID,
YEAR(SaleDate) AS Year,
MONTH(SaleDate) AS Month,
SUM(Amount) AS TotalSpend,
COUNT(*) AS OrderCount,
AVG(Amount) AS AvgOrderValue
FROM dbo.FactSales
WHERE SaleDate >= '2022-01-01'
GROUP BY CustomerID, YEAR(SaleDate), MONTH(SaleDate);
-- WITH Columnstore Index → 1.2 seconds!
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales_DS
ON dbo.FactSales (CustomerID, SaleDate, Amount);
-- Same query → 1.2 seconds ✅
12.2 ETL Pipelines mein Index Management
-- ETL Best Practice: Bulk load ke time indexes disable karo, baad mein rebuild karo
-- Step 1: Before bulk load — indexes disable
ALTER INDEX ALL ON dbo.Staging_SalesData DISABLE;
-- Step 2: Bulk load karo (fast — no index maintenance during load)
BULK INSERT dbo.Staging_SalesData
FROM 'C:\Data\sales_2024.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);
-- Step 3: After load — indexes rebuild karo
ALTER INDEX ALL ON dbo.Staging_SalesData REBUILD;
-- Result: 10x faster bulk load compared to loading with indexes active
12.3 Pandas + SQL Server Index — Python Developer ke liye
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('mssql+pyodbc://sa:pass@server/db?driver=ODBC+Driver+17+for+SQL+Server')
# BAD: Full table scan → Index pe query nahi gaya
df = pd.read_sql("SELECT * FROM dbo.FactSales", engine)
# GOOD: Indexed column pe WHERE → Index Seek
df = pd.read_sql("""
SELECT CustomerID, SaleDate, Amount
FROM dbo.FactSales
WHERE SaleDate >= '2024-01-01'
AND SaleDate < '2025-01-01'
""", engine)
# BEST: Chunksize use karo large data ke liye + indexed query
for chunk in pd.read_sql("""
SELECT CustomerID, SaleDate, Amount
FROM dbo.FactSales
WHERE SaleDate >= '2024-01-01'
""", engine, chunksize=10000):
# process each chunk
pass
12.4 SQL Server Index → Data Quality aur ML
-- Unique Index = Data Quality guarantee for ML training data
-- Duplicate records → Model training mein bias
CREATE UNIQUE NONCLUSTERED INDEX UIX_CustomerTransactions_Dedup
ON dbo.CustomerTransactions (CustomerID, TransactionDate, Amount)
WHERE IsValid = 1;
-- Filtered Index for labeled training data
CREATE NONCLUSTERED INDEX IX_TrainingData_Labeled
ON dbo.MLTrainingData (FeatureDate, CustomerSegment)
INCLUDE (Feature1, Feature2, Feature3, Label)
WHERE IsLabeled = 1 AND IsOutlier = 0;
-- Fast feature retrieval
SELECT Feature1, Feature2, Feature3, Label
FROM dbo.MLTrainingData
WHERE IsLabeled = 1 AND IsOutlier = 0
-- Filtered Index → Only labeled, clean data → Fast ✅
13. Interview Questions {#13-interview-questions}
🟢 Basic Level
Q1: Clustered aur Non-Clustered Index mein kya fark hai?
Clustered Index table data ko physically sort karta hai. Ek table mein sirf ek Clustered Index ho sakta hai. Non-Clustered Index ek separate structure hai jo data row ke pointers store karta hai. Ek table mein 999 Non-Clustered Indexes ho sakte hain.
Q2: Index Seek aur Index Scan mein kya fark hai?
Seek = direct jump to specific rows (fast ✅). Scan = poora index/table read karo (slow ❌). Seek tabhi hota hai jab query SARGable ho aur selective condition ho.
Q3: Heap kya hota hai?
Jab table mein koi Clustered Index nahi hota, rows unordered pages mein store hoti hain — ise Heap kehte hain.
Q4: Primary Key aur Clustered Index ek hi cheez hai?
Nahi, necessarily nahi. By default, Primary Key ek Clustered Unique Index create karta hai. Lekin tum Primary Key ko
NONCLUSTEREDspecify karke alag Clustered Index bhi bana sakte ho.
Q5: Fill Factor kya hai aur kyun important hai?
Fill Factor index page ko kitna percent fill kare — baacha space future inserts ke liye. Isse page splits avoid hote hain. Default 0 (100% equivalent). Write-heavy tables ke liye 70-85% recommended.
🟡 Intermediate Level
Q6: Covering Index kya hai aur Key Lookup se kaise bachata hai?
Covering Index wo index hai jisme SELECT query ke saare required columns include hote hain (INCLUDE clause ke through). Isse Clustered Index tak extra trip (Key Lookup) avoid hoti hai, jo significant performance improvement deta hai.
Q7: Composite Index mein column order kyun matter karta hai?
Left-Most Prefix Rule ke kaaran. Agar index (A, B, C) pe hai toh query A, A+B, ya A+B+C mein efficient hogi. Sirf B ya sirf C pe filter karne pe index use nahi hoga.
Q8: Filtered Index kab use karein?
Jab table mein bahut saari NULL rows ho ya sirf ek specific subset (like Active records, Pending orders) ke liye queries hon. Smaller, more selective index banata hai.
Q9: Statistics kya hoti hain aur kab update karni chahiye?
Statistics column values ka distribution information hoti hai jo Query Optimizer use karta hai plan banane ke liye. Auto Update Statistics ON rakho. Large data changes ke baad manually update karo. Production pe
AUTO_UPDATE_STATISTICS_ASYNC ONbest practice hai.
Q10: Index Fragmentation kya hai aur kaise handle karein?
Data changes (INSERT/UPDATE/DELETE) ke kaaran index pages out-of-order ya partially empty ho jaati hain. Under 10% → ignore, 10-30% → REORGANIZE, 30%+ → REBUILD.
🔴 Advanced Level
Q11: Columnstore Index Row Mode aur Batch Mode mein kya fark hai?
Row Mode: Traditional, ek ek row process. Batch Mode: 900 rows simultaneously process — SIMD instructions use karke. Columnstore indexes Batch Mode trigger karte hain jo analytics queries mein 10-100x faster hoti hain.
Q12: GUID ko Clustered Index banana kyun problem hai?
NEWID()completely random GUID generate karta hai. New rows random positions pe insert honge — har insert pe page split hogi. Result: Extreme fragmentation, massive I/O, slow performance. Fix:NEWSEQUENTIALID()use karo ya INT IDENTITY Clustered key rakho.
Q13: Index Hints kab use karein aur risks kya hain?
Jab Query Optimizer galat plan choose kare aur tum sure ho ki specific index better hai. Risk: Data distribution change hone par hint outdated ho jaata hai. Hard-coded hints maintenance burden hain. Prefer karo statistics update aur query rewrite.
Q14: Partitioned Table mein index alignment kya hoti hai?
Aligned partition: Index table ke partition scheme par bana ho. Non-aligned: Index alag filegroup pe. Aligned indexes partition elimination aur faster parallel queries enable karte hain. Production mein hamesha aligned indexes prefer karo.
Q15: Indexed View (Materialized View) kab use karein?
Complex aggregations (SUM, COUNT, JOIN) jo frequently re-computed hoti hain. View ke upar UNIQUE CLUSTERED INDEX banao — result pre-computed store hota hai. Requirement:
WITH SCHEMABINDING, deterministic functions only, certain SET options required.
🟣 Scenario-Based Questions
Q16: Ek query 5 minute le rahi hai. Index already hai. Tum kya check karoge?
Execution Plan dekho — Index Seek ho raha hai ya Scan? 2. Key Lookup hai? → INCLUDE columns add karo. 3. Index Fragmentation check karo. 4. Statistics outdated? → Update karo. 5. Parameter Sniffing issue? →
OPTION(RECOMPILE)test karo. 6. SARGable conditions check karo. 7. Missing Index suggestions dekho.
Q17: OLTP table pe INSERT bahut slow hai. Index se related cause kya ho sakta hai?
Zyada indexes → Har INSERT pe sab indexes update hone chahiye. Solution: Unused indexes drop karo, Fill Factor appropriately set karo, Write-heavy pe kam indexes rakho (3-5 max), Index consolidation karo — multiple narrow indexes ke bajaaye few covering indexes.
Q18: 500 GB table pe online index rebuild possible hai kya?
SQL Server Enterprise Edition mein
ONLINE = ONpossible hai — table accessible rehti hai. Standard Edition mein offline karna padega (downtime). Alternative: Partition-wise rebuild karo, ya Index creation peSORT_IN_TEMPDB = ONuse karo TempDB space ke saath.
14. Conclusion {#14-conclusion}
Key Learnings Summary 🎯
Index = Database ki backbone
───────────────────────────────────────────────────
📌 Clustered Index → Table ka physical order define karta hai
Ek table mein sirf ek
PK pe hona chahiye (INT IDENTITY)
📌 Non-Clustered → Alag B-Tree structure
999 max per table
High-frequency WHERE/JOIN columns pe
📌 Covering Index → INCLUDE se Key Lookups eliminate karo
Most impactful optimization
📌 Filtered Index → Partial data ke liye smart, small indexes
📌 Columnstore → Analytics ke liye game-changer
10-100x faster aggregations
📌 Fill Factor → Page splits control karo (70-85% for OLTP)
📌 Statistics → Query Optimizer ka brain — fresh rakhna zaroori
📌 Fragmentation → Regular maintenance se avoid karo
When to Use What — Final Decision Guide
Kaunsa index use karein?
┌─────────────────────────────────────────────────────────┐
│ Primary Key / Main lookup column? │
│ → Clustered Index (INT IDENTITY preferred) │
│ │
│ Frequent WHERE / JOIN column? │
│ → Non-Clustered Index │
│ │
│ SELECT list mein extra columns + Key Lookup problem? │
│ → Add INCLUDE columns → Covering Index │
│ │
│ Only subset of rows frequently queried? │
│ → Filtered Index (WHERE clause ke saath) │
│ │
│ Analytics / SUM / COUNT / GROUP BY on large data? │
│ → Columnstore Index │
│ │
│ Text search in large columns? │
│ → Full-Text Index │
│ │
│ Duplicate prevention on a column? │
│ → Unique Index │
└─────────────────────────────────────────────────────────┘
Final Advice 💡
"Indexing ek art hai — na zyada, na kum."
Beginner ho? Pehle Clustered aur Non-Clustered samjho, execution plans padhna seekho.
Intermediate ho? Covering indexes, filtered indexes aur fragmentation management seekho.
Advanced DBA banna hai? Statistics, columnstore internals, partition-aligned indexes, aur index maintenance automation master karo.
Index banao, monitor karo, tune karo — aur kabhi blindly mat banao. Har index ek responsibility hai!
📝 Blog by HelpInData | SQL Server Series | Hinglish mein seekho, practically use karo
🔗 helpindata.web.app | Tags: #SQLServer #Database #DBA #Indexing #Performance #HindiTech