SQL Server Index Complete Guide

SQL Server Index kya hota hai? Clustered, Non-Clustered, Covering, Filtered, Columnstore — sab kuch ek hi blog mein. Beginners se experts tak, Hinglish mein...

Mar 19, 2026 38 min read

"Database mein Index wahi kaam karta hai jo library mein catalog karta hai — bina uske, har cheez dhundhne ke liye poori library khangalni padegi."

1. Introduction — Index Kya Hai aur Kyun Zaroori Hai?

Real-Life Example se Samjho

Socho tumhare paas ek 1000 page ki dictionary hai. Tumhe "Ephemeral" word dhundhna hai.

Without Index: Page 1 se start karo, ek ek page palto — 1000 pages scan.
With Index: Dictionary ke peeche alphabetical index dekho → directly "E" section pe jao → 2 seconds mein mila!

SQL Server mein bhi exactly yahi hota hai.

Jab tum query likhte ho:

SELECT * FROM Employees WHERE EmployeeID = 10045;

Without Index: SQL Server poori Employees table ko ek ek row scan karta hai (isko Full Table Scan kehte hain). Agar table mein 10 lakh rows hain — 10 lakh rows check hongi!

With Index: SQL Server seedha ek ordered data structure mein jump karta hai aur milliseconds mein result deta hai.

Index Kyun Important Hai?

Without Index	With Index
Full Table Scan — poori table padhi	Seekback — seedha data location pe jump
Slow queries	Fast queries
High CPU & I/O usage	Low resource consumption
Poor user experience	Smooth, responsive apps
Server overload on large data	Scales well with millions of rows

2. Index Types — Sab Types Detail Mein

2.1 Clustered Index

Definition: Clustered Index table ka physical data order define karta hai. Iska matlab — table ki rows actual disk pe is index ki key order mein sorted hoti hain.

Key Points:

Ek table mein sirf EK Clustered Index ho sakta hai
Primary Key by default Clustered Index banta hai
Table itself is the index — no separate structure

Visual:

Clustered Index (EmployeeID ASC)
----------------------------------------------
Leaf Level = Actual Table Data

| EmployeeID | Name    | Dept  | Salary  |
|------------|---------|-------|---------|
| 1001       | Rahul   | IT    | 55000   |  ← Physically sorted
| 1002       | Priya   | HR    | 48000   |
| 1003       | Amit    | FIN   | 62000   |
| 1004       | Sneha   | IT    | 58000   |

Kab use karein:

Primary Key columns
Jo column sabse zyada range queries mein use ho (e.g., OrderDate, TransactionDate)
High-selectivity, always-used column

Example:

CREATE TABLE Employees (
    EmployeeID   INT           NOT NULL,
    EmployeeName NVARCHAR(100) NOT NULL,
    DeptID       INT,
    Salary       DECIMAL(10,2),
    JoiningDate  DATE,
    CONSTRAINT PK_Employees PRIMARY KEY CLUSTERED (EmployeeID ASC)
);

2.2 Non-Clustered Index

Definition: Non-Clustered Index ek alag structure hai jo table data ke upar bana hota hai. Isme leaf nodes mein actual rows nahi, balki row pointers hote hain.

Key Points:

Ek table mein maximum 999 Non-Clustered Indexes ban sakte hain (SQL Server 2008+)
Read queries speed up karta hai
Write operations (INSERT/UPDATE/DELETE) thoda slow karta hai kyunki index bhi update hota hai

Visual:

Non-Clustered Index (DeptID)
-----------------------------
Leaf Level = Key + Pointer to actual data

| DeptID | → Pointer (Clustered Key ya RID) |
|--------|-----------------------------------|
| FIN    | → EmployeeID: 1003               |
| HR     | → EmployeeID: 1002               |
| IT     | → EmployeeID: 1001               |
| IT     | → EmployeeID: 1004               |

DeptID = 'IT' query mein:

Index seek → DeptID = 'IT' leaf nodes mile
Pointer follow karo → Clustered Index se actual row fetch karo (Key Lookup)

Example:

CREATE NONCLUSTERED INDEX IX_Employees_DeptID
ON Employees (DeptID ASC);

2.3 Covering Index (with INCLUDE)

Definition: Covering Index woh Non-Clustered Index hota hai jo query ke saare required columns ko apne andar rakhta hai — na Clustered Index tak jaana padta hai, na Key Lookup hoti hai.

Kab zaroori hai:

-- Agar yeh query frequently run hoti hai:
SELECT EmployeeName, Salary
FROM Employees
WHERE DeptID = 'IT';

Agar sirf DeptID pe index hai toh:

Index seek → DeptID = 'IT' rows mile
Key Lookup → Har row ke liye Clustered Index se EmployeeName, Salary fetch karo

10,000 IT employees hain toh 10,000 Key Lookups! ❌

Solution — Covering Index:

CREATE NONCLUSTERED INDEX IX_Employees_DeptID_Covering
ON Employees (DeptID)
INCLUDE (EmployeeName, Salary);

Ab:

Index seek → DeptID = 'IT'
Leaf node mein hi EmployeeName aur Salary milenge (INCLUDE columns hain)
No Key Lookup!

💡 Rule: Key columns mein woh columns daalo jo WHERE/JOIN/ORDER BY mein use ho. INCLUDE mein woh columns daalo jo SELECT list mein hain.

3.4 Composite Index (Multi-Column Index)

Definition: Ek index jisme multiple key columns hote hain. Column order bahut important hota hai.

CREATE NONCLUSTERED INDEX IX_Orders_CustDate
ON Orders (CustomerID ASC, OrderDate DESC);

Column Order ka Rule — "Left-Most Prefix Rule":

Yeh index in queries mein kaam karega:

✅ WHERE CustomerID = 5
✅ WHERE CustomerID = 5 AND OrderDate > '2024-01-01'
✅ ORDER BY CustomerID, OrderDate DESC

Yeh queries mein index use nahi hoga:

❌ WHERE OrderDate > '2024-01-01' (CustomerID skip kiya — leading column missing)
❌ WHERE OrderDate > '2024-01-01' AND CustomerID = 5 — SQL Server optimize kar sakta hai, but generally avoid karo

Real-world example:

-- HRMS: Employee ki attendance by date range
CREATE NONCLUSTERED INDEX IX_Attendance_EmpDate
ON TBL_Attendance (EmployeeID ASC, AttendanceDate DESC)
INCLUDE (InTime, OutTime, Status);

3.5 🔴 Filtered Index

Definition: Ek Non-Clustered Index jo WHERE clause ke saath define hota hai. Sirf un rows ka index banta hai jo condition satisfy karein.

Kab use karein:

Table mein bahut saari NULL rows hain
Specific status (e.g., Active records only)
Sparse data scenarios

-- Sirf Active employees ka index
CREATE NONCLUSTERED INDEX IX_Employees_Active
ON Employees (DeptID, Salary)
WHERE IsActive = 1;

-- Sirf pending orders ka index
CREATE NONCLUSTERED INDEX IX_Orders_Pending
ON Orders (OrderDate, CustomerID)
WHERE Status = 'PENDING';

Benefits:

Smaller index size (kam pages)
Faster maintenance (kam rows update hone par)
More selective — better query plans

3.6 🟣 Columnstore Index (OLAP/Analytics ke liye)

Definition: Columnstore Index data ko row-wise nahi balki column-wise store karta hai. Yeh analytics aur reporting queries ke liye designed hai.

Row Store vs Column Store:

Row Store (Traditional):
Row 1: [1001, Rahul, IT, 55000]
Row 2: [1002, Priya, HR, 48000]
Row 3: [1003, Amit, FIN, 62000]

Column Store:
EmployeeID column: [1001, 1002, 1003, ...]
Name column:       [Rahul, Priya, Amit, ...]
DeptID column:     [IT, HR, FIN, ...]
Salary column:     [55000, 48000, 62000, ...]

Query: SELECT SUM(Salary) FROM Employees WHERE DeptID = 'IT'

Row Store: Har row read karo, DeptID check karo, Salary add karo
Column Store: Sirf DeptID aur Salary columns read karo — baaki columns skip!

Types:

Type	Table Type	DML Operations	Best For
Clustered Columnstore	Columnstore table	Supported (Delta Store)	Data Warehouse
Non-Clustered Columnstore	Regular table	Read-only	Hybrid HTAP

-- Clustered Columnstore — Pure analytics table
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesData
ON FactSalesData;

-- Non-Clustered Columnstore — Regular table pe analytics
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Employees_Analytics
ON Employees (DeptID, Salary, JoiningDate);

3.7 🔵 Unique Index

Definition: Ensures ki index ke key columns mein duplicate values na aayein.

CREATE UNIQUE NONCLUSTERED INDEX UIX_Employees_Email
ON Employees (EmailAddress);

-- Primary Key automatically ek Unique Clustered Index hai
-- Unique Constraint bhi internally Unique Index banaata hai

3.8 Full-Text Index

Definition: Large text columns (VARCHAR/NVARCHAR) mein word-level searching ke liye. LIKE '%word%' se bahut fast.

-- Full-Text Catalog banana
CREATE FULLTEXT CATALOG FTC_Employees AS DEFAULT;

-- Full-Text Index banana
CREATE FULLTEXT INDEX ON Employees(JobDescription, Bio)
KEY INDEX PK_Employees;

-- Query
SELECT EmployeeID, EmployeeName
FROM Employees
WHERE CONTAINS(JobDescription, 'Python AND "data analysis"');

3.9 Spatial Index

Definition: Geographic/geometric data (GEOGRAPHY, GEOMETRY types) ke liye.

CREATE SPATIAL INDEX SIX_Locations_GeoData
ON OfficeLocations(GeoLocation)
USING GEOGRAPHY_GRID;

Index Types Summary Table

Index Type	Physical Storage	Max per Table	Best For
Clustered	Data = Index	1	PK, Range queries
Non-Clustered	Separate structure	999	Selective WHERE, JOIN
Covering (INCLUDE)	Separate + Included cols	999	Eliminate Key Lookups
Composite	Separate, multi-col key	999	Multi-column filters
Filtered	Partial rows	999	Sparse/conditional data
Columnstore (CCI)	Column-wise	1 (clustered)	OLAP, Analytics
Columnstore (NCCI)	Column-wise + row	1	Hybrid workloads
Unique	Separate	999	Uniqueness enforcement
Full-Text	Inverted index	Multiple	Text search
Spatial	Grid-based	Multiple	Geo queries

4. Index Create/Manage Karna — DDL Commands {#4-ddl-commands}

4.1 Index Create karna

-- Basic Syntax
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX index_name
ON table_name (column1 [ASC|DESC], column2 [ASC|DESC], ...)
[INCLUDE (col1, col2, ...)]
[WHERE filter_predicate]
[WITH (
    FILLFACTOR = n,
    PAD_INDEX = ON|OFF,
    SORT_IN_TEMPDB = ON|OFF,
    DROP_EXISTING = ON|OFF,
    ONLINE = ON|OFF,
    STATISTICS_NORECOMPUTE = ON|OFF,
    DATA_COMPRESSION = NONE|ROW|PAGE
)]
[ON filegroup_name];

Practical Examples:

-- 1. Simple Non-Clustered Index
CREATE NONCLUSTERED INDEX IX_Emp_DeptID
ON dbo.Employees (DeptID);

-- 2. Composite Index with INCLUDE
CREATE NONCLUSTERED INDEX IX_Emp_Dept_Covering
ON dbo.Employees (DeptID ASC, JoiningDate DESC)
INCLUDE (EmployeeName, Salary, EmailAddress);

-- 3. Filtered Index — sirf active employees
CREATE NONCLUSTERED INDEX IX_Emp_Active_Email
ON dbo.Employees (EmailAddress)
WHERE IsActive = 1;

-- 4. Unique Index
CREATE UNIQUE NONCLUSTERED INDEX UIX_Emp_Email
ON dbo.Employees (EmailAddress)
WHERE EmailAddress IS NOT NULL;

-- 5. Index with Fill Factor
CREATE NONCLUSTERED INDEX IX_Orders_CustDate
ON dbo.Orders (CustomerID, OrderDate DESC)
WITH (FILLFACTOR = 80, ONLINE = ON);

4.2 Index Rebuild aur Reorganize

Fragmentation check karna:

-- Fragmentation check
SELECT
    DB_NAME()                                AS DatabaseName,
    OBJECT_NAME(ips.object_id)              AS TableName,
    i.name                                  AS IndexName,
    i.type_desc                             AS IndexType,
    ips.index_type_desc,
    ips.avg_fragmentation_in_percent        AS FragPercent,
    ips.page_count                          AS PageCount,
    ips.record_count                        AS RowCount
FROM sys.dm_db_index_physical_stats(
        DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i
    ON ips.object_id = i.object_id
    AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent > 5
  AND ips.page_count > 100
ORDER BY ips.avg_fragmentation_in_percent DESC;

Fragmentation ke hisaab se action:

Fragmentation %	Action
< 5%	Kuch nahi karna
5% – 30%	REORGANIZE (online, minimal locking)
> 30%	REBUILD (more resources, but full cleanup)

-- REORGANIZE — online operation, low resource
ALTER INDEX IX_Emp_DeptID ON dbo.Employees REORGANIZE;

-- REBUILD — full index rebuild
ALTER INDEX IX_Emp_DeptID ON dbo.Employees
REBUILD WITH (FILLFACTOR = 80, ONLINE = ON);

-- Ek table ke sab indexes rebuild karo
ALTER INDEX ALL ON dbo.Employees
REBUILD WITH (FILLFACTOR = 80, ONLINE = ON);

💡 ONLINE = ON production pe use karo — table lock nahi lega. SQL Server Enterprise Edition needed.

4.3 Index Drop karna

-- Single index drop
DROP INDEX IX_Emp_DeptID ON dbo.Employees;

-- Check karke drop karo
IF EXISTS (
    SELECT 1 FROM sys.indexes
    WHERE name = 'IX_Emp_DeptID'
    AND object_id = OBJECT_ID('dbo.Employees')
)
    DROP INDEX IX_Emp_DeptID ON dbo.Employees;

4.4 Index Disable karna

-- Index disable karo (structure remain karta hai, data nahi)
ALTER INDEX IX_Emp_DeptID ON dbo.Employees DISABLE;

-- Re-enable karna ho toh rebuild karo
ALTER INDEX IX_Emp_DeptID ON dbo.Employees REBUILD;

4.5 Index Rename karna

EXEC sp_rename
    'dbo.Employees.IX_Emp_DeptID',
    'IX_Employees_DepartmentID',
    'INDEX';

4.6 Index Information Queries

-- Table ke sab indexes dekho
SELECT
    i.name              AS IndexName,
    i.type_desc         AS IndexType,
    i.is_unique,
    i.is_primary_key,
    i.is_unique_constraint,
    i.filter_definition AS FilterCondition,
    STRING_AGG(c.name, ', ')
        WITHIN GROUP (ORDER BY ic.key_ordinal)  AS KeyColumns
FROM sys.indexes i
INNER JOIN sys.index_columns ic
    ON i.object_id = ic.object_id
    AND i.index_id  = ic.index_id
INNER JOIN sys.columns c
    ON ic.object_id = c.object_id
    AND ic.column_id = c.column_id
WHERE i.object_id = OBJECT_ID('dbo.Employees')
  AND ic.is_included_column = 0
GROUP BY i.name, i.type_desc, i.is_unique,
         i.is_primary_key, i.is_unique_constraint,
         i.filter_definition
ORDER BY i.type_desc, i.name;

-- Index usage stats (kaunsa index kitna use ho raha hai)
SELECT
    OBJECT_NAME(i.object_id)    AS TableName,
    i.name                      AS IndexName,
    i.type_desc                 AS IndexType,
    ius.user_seeks,
    ius.user_scans,
    ius.user_lookups,
    ius.user_updates,
    ius.last_user_seek,
    ius.last_user_scan
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats ius
    ON i.object_id = ius.object_id
    AND i.index_id  = ius.index_id
    AND ius.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID('dbo.Employees')
ORDER BY (ISNULL(ius.user_seeks,0) + ISNULL(ius.user_scans,0)) DESC;

5. Intermediate Usage — Real Workflows aur Combinations {#5-intermediate-usage}

5.1 Execution Plan Read Karna — Index Use Ho Raha Hai Ya Nahi?

-- Execution Plan enable karo
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Query run karo
SELECT EmployeeID, EmployeeName, Salary
FROM dbo.Employees
WHERE DeptID = 'IT'
ORDER BY Salary DESC;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

SSMS mein:

Ctrl + M → Actual Execution Plan enable karo
Query run karo
Plan mein icons dekho:
- 🗂️ Index Seek — Good! ✅
- 📄 Index Scan — Acceptable, but check karo
- 📋 Table Scan / Clustered Index Scan — Investigate! ⚠️
- 🔗 Key Lookup — Covering index ki zaroorat hai! ⚠️

5.2 Missing Index Suggestions

SQL Server automatically missing indexes suggest karta hai execution plan mein aur DMVs mein:

-- SQL Server ke missing index suggestions
SELECT TOP 20
    mid.statement                               AS TableName,
    migs.avg_total_user_cost * migs.avg_user_impact
        * (migs.user_seeks + migs.user_scans)   AS ImprovementScore,
    migs.avg_user_impact                        AS AvgImpact,
    migs.user_seeks,
    migs.user_scans,
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    -- Auto-generate CREATE INDEX statement
    'CREATE NONCLUSTERED INDEX IX_' +
        REPLACE(REPLACE(mid.statement, '[', ''), ']', '') + '_' +
        CAST(mig.index_group_id AS VARCHAR) +
    ' ON ' + mid.statement +
    ' (' + ISNULL(mid.equality_columns, '') +
    CASE WHEN mid.equality_columns IS NOT NULL
              AND mid.inequality_columns IS NOT NULL
         THEN ', ' ELSE '' END +
    ISNULL(mid.inequality_columns, '') + ')' +
    ISNULL(' INCLUDE (' + mid.included_columns + ')', '')
                                                AS SuggestedIndexSQL
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs
    ON mig.index_group_id = migs.group_handle
INNER JOIN sys.dm_db_missing_index_details mid
    ON mig.index_handle = mid.index_handle
WHERE mid.database_id = DB_ID()
ORDER BY ImprovementScore DESC;

⚠️ Warning: Missing index suggestions ko blindly implement mat karo! Evaluate karo — kai baar zyada indexes write performance hurt karte hain.

5.3 Unused Indexes Find Karna

-- Unused indexes — waste of space and write performance
SELECT
    OBJECT_NAME(i.object_id)        AS TableName,
    i.name                          AS IndexName,
    i.type_desc,
    ISNULL(ius.user_seeks, 0)       AS UserSeeks,
    ISNULL(ius.user_scans, 0)       AS UserScans,
    ISNULL(ius.user_lookups, 0)     AS UserLookups,
    ISNULL(ius.user_updates, 0)     AS UserUpdates,
    ius.last_user_seek,
    ius.last_user_scan,
    -- Index size
    ps.used_page_count * 8 / 1024.0 AS IndexSizeMB
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats ius
    ON i.object_id = ius.object_id
    AND i.index_id  = ius.index_id
    AND ius.database_id = DB_ID()
LEFT JOIN sys.dm_db_partition_stats ps
    ON i.object_id = ps.object_id
    AND i.index_id  = ps.index_id
WHERE i.type_desc = 'NONCLUSTERED'
  AND ISNULL(ius.user_seeks, 0) = 0
  AND ISNULL(ius.user_scans, 0) = 0
  AND ISNULL(ius.user_lookups, 0) = 0
  AND ISNULL(ius.user_updates, 0) > 0   -- Sirf update ho raha hai, read nahi
  AND i.object_id = OBJECT_ID('dbo.Employees')
ORDER BY ISNULL(ius.user_updates, 0) DESC;

5.4 Index Hints — Force Karna

Kabhi kabhi Query Optimizer galat plan choose karta hai. Tum manually index hint de sakte ho:

-- INDEX hint — specific index use karo
SELECT EmployeeID, EmployeeName
FROM dbo.Employees WITH (INDEX(IX_Emp_DeptID))
WHERE DeptID = 'IT';

-- NOLOCK hint (dirty reads) — reporting queries ke liye
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK)
WHERE OrderDate >= '2024-01-01';

-- FORCESEEK — seek force karo, scan nahi
SELECT EmployeeID FROM dbo.Employees WITH (FORCESEEK)
WHERE DeptID = 'IT';

-- FORCESCAN — scan force karo
SELECT EmployeeID FROM dbo.Employees WITH (FORCESCAN)
WHERE DeptID = 'IT';

⚠️ Index hints carefully use karo — data distribution change hone par plan outdated ho sakta hai.

5.5 Key Lookup Eliminate Karna — Step by Step

Problem Scenario:

-- Table structure
CREATE TABLE dbo.Orders (
    OrderID     INT         PRIMARY KEY,  -- Clustered Index
    CustomerID  INT         NOT NULL,
    OrderDate   DATE        NOT NULL,
    TotalAmount DECIMAL(12,2),
    Status      VARCHAR(20),
    ShipAddress NVARCHAR(200)
);

-- Existing index
CREATE NONCLUSTERED INDEX IX_Orders_Customer
ON dbo.Orders (CustomerID);

-- Frequent query
SELECT OrderID, OrderDate, TotalAmount
FROM dbo.Orders
WHERE CustomerID = 1001;

Execution Plan mein dikhega:

Index Seek (IX_Orders_Customer) → Key Lookup (Clustered) → Nested Loops

Fix — Covering Index:

-- DROP existing, better index create karo
DROP INDEX IX_Orders_Customer ON dbo.Orders;

CREATE NONCLUSTERED INDEX IX_Orders_Customer_Covering
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount);

Ab: Index Seek only — No Key Lookup ✅

6. Advanced Concepts — Internal Working, Fill Factor, Statistics {#6-advanced-concepts}

6.1 Fill Factor Deep Dive

Definition: Fill Factor index page ko kitna % fill kare — baacha space future inserts ke liye hota hai.

FILLFACTOR = 100 (default):
| DATA DATA DATA DATA DATA DATA DATA DATA | ← 100% full
Naya row aaya → Page Split! → 2 pages ban gaye → Fragmentation ❌

FILLFACTOR = 80:
| DATA DATA DATA DATA DATA DATA [20% empty] |
Naye rows easily fit ho jaate hain → Kam page splits → Kam fragmentation ✅

Fill Factor	Best For	Drawback
100%	Read-only tables	Page splits on insert
80-90%	Mixed OLTP	Some wasted space
60-70%	High insert/update tables	More space consumed
50%	Very high churn tables	Lot of wasted space

-- Server-level default Fill Factor change karna
EXEC sp_configure 'fill factor', 80;
RECONFIGURE;

-- Index-level
CREATE NONCLUSTERED INDEX IX_Orders_Date
ON dbo.Orders (OrderDate)
WITH (FILLFACTOR = 75, PAD_INDEX = ON);
-- PAD_INDEX = ON ensures intermediate pages bhi same fill factor use karein

6.2 Statistics — Query Optimizer ka Brain

Definition: Statistics SQL Server mein column values ka distribution information hoti hai. Query Optimizer in statistics ke basis pe decide karta hai ki index use karna chahiye ya nahi, aur kaunsa join strategy best hai.

Statistics histogram for OrderDate column:
Range: Jan 2024 - Dec 2024
Jan: 5,000 rows
Feb: 4,800 rows
...
Dec: 12,000 rows  ← Peak

Query: WHERE OrderDate BETWEEN '2024-11-01' AND '2024-12-31'
Optimizer decides: Estimated rows = 23,000 → Index Seek profitable

Statistics Update karna:

-- Manual update
UPDATE STATISTICS dbo.Orders;
UPDATE STATISTICS dbo.Orders IX_Orders_Date;

-- Full scan (accurate but slower)
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Sample (faster, less accurate)
UPDATE STATISTICS dbo.Orders WITH SAMPLE 30 PERCENT;

-- Statistics info dekho
DBCC SHOW_STATISTICS('dbo.Orders', 'IX_Orders_Date');

Auto Statistics:

-- Auto create/update enabled rakho (default ON)
ALTER DATABASE YourDB SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE YourDB SET AUTO_UPDATE_STATISTICS ON;
ALTER DATABASE YourDB SET AUTO_UPDATE_STATISTICS_ASYNC ON; -- Production best practice

6.3 Index Internals — Page Splits in Detail

Page Split kya hota hai:

Before Split (Full Page):
Page 5: [Row1, Row2, Row3, Row4, Row5] ← 100% full

New Row4.5 insert karna hai (sort order mein Row4 aur Row5 ke beech):

After Split:
Page 5: [Row1, Row2, Row3]         ← 50% full
Page 6 (new): [Row4, Row4.5, Row5] ← 50% full

Problem:
- New page allocate karna pada = I/O overhead
- Pages non-contiguous ho gayi = Fragmentation
- Frequent splits = Severe performance degradation

Prevention:

Monotonically increasing Clustered Key use karo (IDENTITY, SEQUENCE, NEWSEQUENTIALID())
Appropriate Fill Factor set karo
Regular index maintenance karo

6.4 Heap vs Clustered Table — Kaunsa Better?

-- Heap table (no clustered index)
CREATE TABLE dbo.LogEvents (
    EventID     INT IDENTITY,
    EventMsg    NVARCHAR(500),
    CreatedAt   DATETIME2 DEFAULT GETDATE()
);
-- Koi index nahi — pure heap

-- Clustered Index table
CREATE TABLE dbo.LogEvents (
    EventID     INT IDENTITY PRIMARY KEY,  -- CL index
    EventMsg    NVARCHAR(500),
    CreatedAt   DATETIME2 DEFAULT GETDATE()
);

Feature	Heap	Clustered Index Table
Data order	Random	Sorted by CL key
Insert speed	Fast (no sort needed)	Slightly slower (maintain order)
Range queries	Slow (full scan)	Fast (seeks on CL key)
Delete performance	Forwarded records issue	Cleaner
Fragmentation	Forwarded pointers problem	Page fragmentation
Best for	Staging/temp tables, pure insert	Most production tables

6.5 Columnstore Index — Internal Architecture

Columnstore Row Group Structure:
------------------------------------
Row Group 1 (1,048,576 rows max):
  Column 1 Segment: [compressed data]
  Column 2 Segment: [compressed data]
  ...

Row Group 2:
  Column 1 Segment: [compressed data]
  ...

Delta Store (for new inserts — row format):
  [Recent rows not yet compressed]
  
Deleted Bitmap:
  [Marks deleted rows]

Batch Mode Processing:

Traditional Row Mode: Ek ek row process
Columnstore Batch Mode: 900 rows ek saath process
Result: 10x-100x faster analytics!

-- Columnstore compression ratio check
SELECT
    i.name,
    SUM(a.total_pages) * 8 / 1024.0   AS TotalSizeMB,
    SUM(a.used_pages) * 8 / 1024.0    AS UsedSizeMB,
    COUNT(DISTINCT rg.row_group_id)    AS RowGroups,
    SUM(rg.total_rows)                 AS TotalRows
FROM sys.indexes i
INNER JOIN sys.partitions p
    ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a
    ON p.partition_id = a.container_id
LEFT JOIN sys.column_store_row_groups rg
    ON i.object_id = rg.object_id AND i.index_id = rg.index_id
WHERE i.type IN (5, 6)   -- 5=CCI, 6=NCCI
GROUP BY i.name;

6.6 Parallel Index Rebuild

-- Multiple CPUs use karke index rebuild — faster on large tables
ALTER INDEX IX_Orders_Date ON dbo.Orders
REBUILD WITH (
    MAXDOP = 4,          -- 4 parallel threads use karo
    ONLINE = ON,         -- No table lock
    FILLFACTOR = 80,
    DATA_COMPRESSION = PAGE  -- Page compression bhi apply karo
);

6.7 Partitioned Index

-- Partition function aur scheme
CREATE PARTITION FUNCTION PF_OrderYear (INT)
AS RANGE RIGHT FOR VALUES (2021, 2022, 2023, 2024);

CREATE PARTITION SCHEME PS_OrderYear
AS PARTITION PF_OrderYear
ALL TO ([PRIMARY]);

-- Partitioned Clustered Index
CREATE TABLE dbo.Orders_Partitioned (
    OrderID   INT,
    OrderYear INT,
    OrderDate DATE,
    Amount    DECIMAL(12,2)
) ON PS_OrderYear(OrderYear);

CREATE CLUSTERED INDEX CIX_Orders_Year
ON dbo.Orders_Partitioned (OrderYear, OrderID)
ON PS_OrderYear(OrderYear);

7. Real-World Use Cases {#7-real-world-use-cases}

7.1 HRMS System — Payroll aur Attendance Queries

-- Employee lookup by ID (Primary Key — auto clustered)
SELECT * FROM dbo.TBL_EmployeeMaster
WHERE EmployeeID = 10045;
-- Clustered Index Seek ✅

-- Department wise active employees
CREATE NONCLUSTERED INDEX IX_EmpMaster_DeptActive
ON dbo.TBL_EmployeeMaster (DepartmentID, IsActive)
INCLUDE (EmployeeName, Designation, JoiningDate, MobileNo)
WHERE IsActive = 1;

-- Monthly salary processing — employee + month filter
CREATE NONCLUSTERED INDEX IX_Salary_EmpMonth
ON dbo.TBL_SalaryDetails (EmployeeID, SalaryMonth DESC)
INCLUDE (BasicSalary, HRA, SpecialAllowance, GrossSalary, NetSalary);

-- Attendance report — date range
CREATE NONCLUSTERED INDEX IX_Attendance_DateEmp
ON dbo.TBL_AttendanceMaster (AttendanceDate DESC, EmployeeID)
INCLUDE (InTime, OutTime, Status, WorkingHours)
WHERE Status IN ('P', 'A', 'HD');

7.2 Travel Management System

-- Travel requests by employee
CREATE NONCLUSTERED INDEX IX_TravelReq_EmpStatus
ON dbo.TBL_TravelRequest (EmployeeID, RequestStatus)
INCLUDE (TravelDate, Destination, Purpose, RequestedOn);

-- Pending approvals for a manager
CREATE NONCLUSTERED INDEX IX_TravelApproval_ApproverPending
ON dbo.TBL_TravelApproval (ApproverID, ApprovalStatus, ApprovalLevel)
INCLUDE (RequestID, EmployeeID, RequestedDate)
WHERE ApprovalStatus = 'PENDING';

-- Date range travel cost report
CREATE NONCLUSTERED INDEX IX_TravelExpense_DateRange
ON dbo.TBL_TravelExpense (ExpenseDate DESC, CompanyID)
INCLUDE (EmployeeID, ExpenseAmount, ExpenseType, IsApproved);

7.3 E-Commerce System

-- Product search by category and price
CREATE NONCLUSTERED INDEX IX_Products_CatPrice
ON dbo.Products (CategoryID, IsActive)
INCLUDE (ProductName, Price, StockQty, ImageURL)
WHERE IsActive = 1;

-- Order history by customer
CREATE NONCLUSTERED INDEX IX_Orders_CustDate
ON dbo.Orders (CustomerID, OrderDate DESC)
INCLUDE (OrderStatus, TotalAmount, TrackingNumber);

-- Pending orders processing
CREATE NONCLUSTERED INDEX IX_Orders_StatusDate
ON dbo.Orders (OrderStatus, OrderDate ASC)
INCLUDE (CustomerID, TotalAmount, ShipAddress)
WHERE OrderStatus IN ('PLACED', 'CONFIRMED', 'SHIPPED');

7.4 Reporting/Analytics System

-- Monthly sales summary
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales_Analytics
ON dbo.FactSales (SaleDate, ProductID, RegionID, SalesAmount, Quantity);

-- Date-wise aggregation
SELECT
    YEAR(SaleDate)  AS SaleYear,
    MONTH(SaleDate) AS SaleMonth,
    SUM(SalesAmount) AS TotalRevenue,
    COUNT(*)         AS TotalOrders
FROM dbo.FactSales
WHERE SaleDate >= '2023-01-01'
GROUP BY YEAR(SaleDate), MONTH(SaleDate)
ORDER BY 1, 2;
-- Columnstore = Blazing fast for this aggregation ✅

8. Practical Examples — Code + Line-by-line Explanation {#8-practical-examples}

8.1 Complete Index Strategy for an Employee Table

-- ==========================================
-- Step 1: Table Create karo
-- ==========================================
CREATE TABLE dbo.TBL_Employees (
    EmployeeID      INT            NOT NULL IDENTITY(1,1),
    EmployeeCode    VARCHAR(20)    NOT NULL,
    EmployeeName    NVARCHAR(100)  NOT NULL,
    EmailAddress    VARCHAR(100),
    MobileNo        VARCHAR(15),
    DepartmentID    INT            NOT NULL,
    DesignationID   INT            NOT NULL,
    JoiningDate     DATE           NOT NULL,
    BasicSalary     DECIMAL(12,2)  NOT NULL,
    IsActive        BIT            NOT NULL DEFAULT 1,
    CreatedDate     DATETIME2      NOT NULL DEFAULT GETDATE(),

    -- Clustered Primary Key
    CONSTRAINT PK_Employees PRIMARY KEY CLUSTERED (EmployeeID ASC)
);
-- PK_Employees → Clustered Index on EmployeeID
-- Data rows physically EmployeeID order mein sorted hain

-- ==========================================
-- Step 2: Business-driven indexes
-- ==========================================

-- Index 1: Employee Code search (HR frequently use karta hai)
-- EmployeeCode unique hona chahiye + fast lookup
CREATE UNIQUE NONCLUSTERED INDEX UIX_Employees_EmpCode
ON dbo.TBL_Employees (EmployeeCode);
-- Unique index → duplicate employee codes nahi ban sakte
-- Non-Clustered → separate B-tree structure

-- Index 2: Department wise employee list (most common report)
CREATE NONCLUSTERED INDEX IX_Employees_Dept_Active
ON dbo.TBL_Employees (DepartmentID ASC, IsActive)
INCLUDE (EmployeeName, DesignationID, JoiningDate, BasicSalary);
-- DepartmentID → leading key (WHERE DeptID = X)
-- IsActive → second key (WHERE IsActive = 1)
-- INCLUDE columns → SELECT mein ye columns chahiye, Key Lookup avoid ho
-- Covering index for: SELECT Name, Desig, Joining, Salary FROM Emp WHERE DeptID=X AND IsActive=1

-- Index 3: Email login lookup
CREATE UNIQUE NONCLUSTERED INDEX UIX_Employees_Email
ON dbo.TBL_Employees (EmailAddress)
WHERE EmailAddress IS NOT NULL;
-- Filtered Unique Index → NULL emails allowed, but non-NULL must be unique
-- WHERE EmailAddress IS NOT NULL → NULLs index mein include nahi honge
-- Login queries fast hongi: WHERE EmailAddress = 'rahul@company.com'

-- Index 4: New joinee / attrition reports (date range queries)
CREATE NONCLUSTERED INDEX IX_Employees_JoiningDate
ON dbo.TBL_Employees (JoiningDate DESC)
INCLUDE (EmployeeName, DepartmentID, IsActive)
WITH (FILLFACTOR = 90);
-- JoiningDate DESC → newest joinee first
-- FILLFACTOR = 90 → 10% free space because new employees keep joining
-- Date range queries fast hongi

-- ==========================================
-- Step 3: Query test karo
-- ==========================================

-- Query 1: Department wise active employees
-- Uses: IX_Employees_Dept_Active → Index Seek + No Key Lookup ✅
SELECT EmployeeName, DesignationID, JoiningDate, BasicSalary
FROM dbo.TBL_Employees
WHERE DepartmentID = 5 AND IsActive = 1;

-- Query 2: Employee by email login
-- Uses: UIX_Employees_Email → Unique Index Seek ✅
SELECT EmployeeID, EmployeeName, DepartmentID
FROM dbo.TBL_Employees
WHERE EmailAddress = 'rahul.sharma@company.com';

-- Query 3: This month joiners
-- Uses: IX_Employees_JoiningDate → Index Seek + Key Lookup (EmployeeCode nahi INCLUDE mein) ⚠️
SELECT EmployeeCode, EmployeeName, DepartmentID
FROM dbo.TBL_Employees
WHERE JoiningDate >= DATEADD(DAY, 1, EOMONTH(GETDATE(), -1))
  AND JoiningDate <= EOMONTH(GETDATE());
-- EmployeeCode INCLUDE mein nahi hai → Key Lookup hogi
-- Fix: EmployeeCode bhi INCLUDE mein add karo

-- ==========================================
-- Step 4: Index stats check karo
-- ==========================================
SELECT
    i.name                        AS IndexName,
    i.type_desc,
    ISNULL(s.user_seeks, 0)      AS Seeks,
    ISNULL(s.user_scans, 0)      AS Scans,
    ISNULL(s.user_lookups, 0)    AS Lookups,
    ISNULL(s.user_updates, 0)    AS Updates,
    CAST(ps.used_page_count * 8.0 / 1024 AS DECIMAL(10,2)) AS SizeMB
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s
    ON i.object_id = s.object_id
    AND i.index_id  = s.index_id
    AND s.database_id = DB_ID()
LEFT JOIN sys.dm_db_partition_stats ps
    ON i.object_id = ps.object_id
    AND i.index_id  = ps.index_id
WHERE i.object_id = OBJECT_ID('dbo.TBL_Employees')
ORDER BY ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0) DESC;

8.2 Smart Index Maintenance Stored Procedure

CREATE OR ALTER PROCEDURE dbo.usp_SmartIndexMaintenance
    @TableName   SYSNAME   = NULL,   -- NULL = all tables
    @DryRun      BIT       = 1,      -- 1 = only print, 0 = execute
    @MinPageCount INT       = 100,   -- Ignore small indexes
    @ReorganizeThreshold DECIMAL(5,2) = 10.0,
    @RebuildThreshold    DECIMAL(5,2) = 30.0
AS
BEGIN
    SET NOCOUNT ON;

    -- Temp table for fragmentation data
    CREATE TABLE #IndexFrag (
        TableName    SYSNAME,
        IndexName    SYSNAME,
        IndexType    VARCHAR(50),
        FragPercent  DECIMAL(5,2),
        PageCount    BIGINT,
        Action       VARCHAR(20)
    );

    -- Fragmentation data collect karo
    INSERT INTO #IndexFrag
    SELECT
        OBJECT_NAME(ips.object_id),
        i.name,
        ips.index_type_desc,
        CAST(ips.avg_fragmentation_in_percent AS DECIMAL(5,2)),
        ips.page_count,
        CASE
            WHEN ips.avg_fragmentation_in_percent >= @RebuildThreshold    THEN 'REBUILD'
            WHEN ips.avg_fragmentation_in_percent >= @ReorganizeThreshold THEN 'REORGANIZE'
            ELSE 'SKIP'
        END
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
    INNER JOIN sys.indexes i
        ON ips.object_id = i.object_id AND ips.index_id = i.index_id
    WHERE ips.page_count >= @MinPageCount
      AND i.type > 0   -- Heap nahi
      AND (@TableName IS NULL OR OBJECT_NAME(ips.object_id) = @TableName);

    -- Summary print karo
    PRINT REPLICATE('=', 60);
    PRINT 'Index Maintenance Report — ' + CONVERT(VARCHAR, GETDATE(), 120);
    PRINT REPLICATE('=', 60);

    DECLARE @tbl SYSNAME, @idx SYSNAME, @frag DECIMAL(5,2),
            @action VARCHAR(20), @sql NVARCHAR(500);

    DECLARE cur CURSOR FOR
        SELECT TableName, IndexName, FragPercent, Action
        FROM #IndexFrag
        WHERE Action <> 'SKIP'
        ORDER BY FragPercent DESC;

    OPEN cur;
    FETCH NEXT FROM cur INTO @tbl, @idx, @frag, @action;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @action = 'REBUILD'
            SET @sql = 'ALTER INDEX [' + @idx + '] ON dbo.[' + @tbl + '] REBUILD WITH (FILLFACTOR=80, ONLINE=ON)';
        ELSE
            SET @sql = 'ALTER INDEX [' + @idx + '] ON dbo.[' + @tbl + '] REORGANIZE';

        PRINT '[' + @action + '] ' + @tbl + '.' + @idx +
              ' (Frag: ' + CAST(@frag AS VARCHAR) + '%)';
        PRINT '  SQL: ' + @sql;

        IF @DryRun = 0
        BEGIN
            BEGIN TRY
                EXEC sp_executesql @sql;
                PRINT '  [OK] Done.';
            END TRY
            BEGIN CATCH
                PRINT '  [ERROR] ' + ERROR_MESSAGE();
            END CATCH
        END

        FETCH NEXT FROM cur INTO @tbl, @idx, @frag, @action;
    END

    CLOSE cur; DEALLOCATE cur;

    IF @DryRun = 1
        PRINT CHAR(10) + '*** DRY RUN mode — set @DryRun=0 to execute ***';

    DROP TABLE #IndexFrag;
END;
GO

-- Usage:
EXEC dbo.usp_SmartIndexMaintenance @DryRun = 1;               -- Preview
EXEC dbo.usp_SmartIndexMaintenance @TableName = 'TBL_Employees', @DryRun = 0;  -- Execute

9. Edge Cases aur Errors {#9-edge-cases-and-errors}

9.1 Common Mistakes aur Fixes

❌ Mistake 1: SELECT * pe Index banana

-- BAD: SELECT * use karna → koi bhi index covering nahi ban sakta
SELECT * FROM dbo.Orders WHERE CustomerID = 1001;

-- GOOD: Specific columns select karo → covering index banao
SELECT OrderID, OrderDate, TotalAmount
FROM dbo.Orders WHERE CustomerID = 1001;

CREATE NONCLUSTERED INDEX IX_Orders_Cust
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount);

❌ Mistake 2: Functions in WHERE Clause — Index Kill!

-- BAD: Function column pe apply ki → Index use nahi hoga!
SELECT * FROM dbo.Employees
WHERE YEAR(JoiningDate) = 2023;
-- → Clustered Index Scan ❌

-- BAD: Implicit conversion
SELECT * FROM dbo.Employees
WHERE EmployeeCode = 12345;  -- EmployeeCode VARCHAR hai, int diya
-- → Implicit cast → Index scan ❌

-- GOOD: SARGable (Search ARGument able) condition
SELECT * FROM dbo.Employees
WHERE JoiningDate >= '2023-01-01'
  AND JoiningDate <  '2024-01-01';
-- → Index Seek ✅

-- GOOD: Explicit type
SELECT * FROM dbo.Employees
WHERE EmployeeCode = '12345';
-- → Index Seek ✅

SARGable vs Non-SARGable:

Non-SARGable ❌	SARGable ✅
`WHERE YEAR(col) = 2023`	`WHERE col >= '2023-01-01' AND col < '2024-01-01'`
`WHERE UPPER(name) = 'RAHUL'`	`WHERE name = 'Rahul'` (case-insensitive collation)
`WHERE col + 5 = 10`	`WHERE col = 5`
`WHERE CONVERT(VARCHAR, id) = '100'`	`WHERE id = 100`
`WHERE col LIKE '%rahul%'`	`WHERE col LIKE 'rahul%'` (leading wildcard avoid karo)

❌ Mistake 3: Too Many Indexes on a Write-Heavy Table

-- BAD: Transaction table pe 15 indexes → INSERT bahut slow!
-- Har INSERT ke saath 15 index structures update honge

-- GOOD: Rule of thumb:
-- OLTP tables: 3-5 indexes max
-- Reporting tables: 5-10 indexes ok
-- Keep only what's actually being used!

❌ Mistake 4: Wrong Column Order in Composite Index

-- BAD: Date pehle, CustomerID baad mein
CREATE INDEX IX_Wrong ON dbo.Orders (OrderDate, CustomerID);

-- Query: WHERE CustomerID = 1001 → Index NAHI use karega!
-- CustomerID leading column nahi hai

-- GOOD: Zyada selective / zyada filter wala column pehle
CREATE INDEX IX_Right ON dbo.Orders (CustomerID, OrderDate);
-- Query: WHERE CustomerID = 1001 → Index Seek ✅
-- Query: WHERE CustomerID = 1001 AND OrderDate > '2024-01-01' → Index Seek ✅

❌ Mistake 5: GUID as Clustered Index — Nightmare!

-- BAD: Random GUID as Clustered Index
CREATE TABLE dbo.Products (
    ProductID UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY,
    -- ...
);
-- NEWID() → completely random → massive page splits → severe fragmentation!

-- GOOD Option 1: Sequential GUID
ProductID UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID() PRIMARY KEY,

-- GOOD Option 2: INT IDENTITY as CL key, GUID as non-clustered
CREATE TABLE dbo.Products (
    RowID     INT IDENTITY PRIMARY KEY,        -- Clustered
    ProductID UNIQUEIDENTIFIER DEFAULT NEWID() -- Non-Clustered Unique
);
CREATE UNIQUE NONCLUSTERED INDEX UIX_Products_GUID
ON dbo.Products (ProductID);

❌ Mistake 6: Index on Low Cardinality Column

-- BAD: Gender column pe index (M/F — sirf 2 values)
CREATE INDEX IX_Emp_Gender ON dbo.Employees (Gender);
-- SQL Server yeh index kabhi use nahi karega
-- 50% rows M, 50% rows F → Full scan better hai!

-- Index cardinality rule:
-- High cardinality (many unique values) → Good index candidate
-- Low cardinality (few unique values) → Bad index candidate

-- EXCEPTION: Filtered index on low-cardinality works!
CREATE INDEX IX_Emp_VIP ON dbo.Employees (EmployeeID)
WHERE IsVIP = 1;  -- Only 1% employees VIP hain → Filtered = OK ✅

9.2 Common Errors aur Solutions

Error 1: Cannot create index on view without SCHEMABINDING

-- Error: "Cannot create index on view unless view is schema bound"
-- Fix:
CREATE VIEW dbo.vw_ActiveEmployees
WITH SCHEMABINDING  -- ← Yeh add karo
AS
SELECT EmployeeID, EmployeeName, DepartmentID
FROM dbo.TBL_Employees
WHERE IsActive = 1;

-- Ab Indexed View ban sakti hai
CREATE UNIQUE CLUSTERED INDEX CIX_vw_ActiveEmployees
ON dbo.vw_ActiveEmployees (EmployeeID);

Error 2: Duplicate key row detected when creating unique index

-- Error: "Cannot create unique index. Duplicate key values exist."
-- Find duplicates karo:
SELECT EmailAddress, COUNT(*) AS DupCount
FROM dbo.Employees
GROUP BY EmailAddress
HAVING COUNT(*) > 1;

-- Fix duplicates, phir index banao
-- Ya phir IGNORE_DUP_KEY use karo (inserts pe silent ignore):
CREATE UNIQUE INDEX UIX_Emp_Email
ON dbo.Employees (EmailAddress)
WITH (IGNORE_DUP_KEY = ON);
-- Warning: Duplicate insert silently ignored hogi — production mein sochke use karo

Error 3: Index rebuild fails with "online rebuild not supported"

-- Error: Online index operations not available on this edition
-- SQL Server Standard Edition mein ONLINE = ON nahi milta
-- Fix: OFFLINE rebuild karo (downtime mein)
ALTER INDEX IX_Orders_Date ON dbo.Orders
REBUILD WITH (FILLFACTOR = 80);  -- ONLINE = ON remove karo

Error 4: Key Lookup causing performance issues

-- Identify Key Lookups in execution plan text:
SELECT TOP 10
    qs.total_logical_reads / qs.execution_count AS AvgLogicalReads,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
          ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)+1) AS QueryText
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY AvgLogicalReads DESC;
-- High logical reads wali queries → Key Lookup suspects
-- Fix: INCLUDE columns add karo in relevant index

10. Pro Developer Insights — Best Practices {#10-pro-developer-insights}

10.1 Index Design Golden Rules 🏆

✅ DO:
├── Primary Key pe hamesha Clustered Index rakho (INT IDENTITY preferred)
├── JOIN columns pe Non-Clustered Index banao
├── High-frequency WHERE columns pe index banao
├── Selective columns choose karo (high cardinality)
├── INCLUDE columns se Key Lookups eliminate karo
├── Filtered Index use karo sparse/partial data ke liye
├── Fill Factor 75-85% rakho write-heavy tables ke liye
├── Statistics regularly update karo (weekly ya schedule pe)
├── Index fragmentation monitor karo (monthly ya automated)
└── Index usage stats se unused indexes identify aur drop karo

❌ DON'T:
├── SELECT * pe index optimize mat karo
├── WHERE clause mein functions use mat karo indexed columns pe
├── GUID/NEWID() ko Clustered Index banana
├── Low-cardinality columns pe standalone index banana
├── 10+ indexes ek OLTP table pe banana
├── Missing index suggestions blindly implement karna
├── Index hints har jagah use karna
└── Index maintain karna bhoolna

10.2 Index Naming Convention

-- Naming Pattern: {Type}_{TableName}_{Columns}[_{Qualifier}]

-- Clustered Index
PK_TableName                          -- Primary Key (auto)
CIX_TableName_ColumnName              -- Custom Clustered

-- Non-Clustered
IX_TableName_Column1_Column2          -- Regular
IX_TableName_Column1_Incl             -- With INCLUDE
IX_TableName_Column1_Active           -- Filtered
UIX_TableName_Column1                 -- Unique

-- Examples:
PK_Employees                          -- Primary Key
CIX_Orders_OrderDate                  -- Clustered on OrderDate
IX_Employees_DeptID_Active            -- Filtered non-clustered
IX_Orders_CustID_OrderDate_Incl       -- With includes
UIX_Employees_Email                   -- Unique email

10.3 Index Monitoring Dashboard Query

-- Weekly index health report
SELECT
    t.name                                          AS TableName,
    i.name                                          AS IndexName,
    i.type_desc,
    CAST(ps.used_page_count * 8.0 / 1024 AS DECIMAL(10,1)) AS SizeMB,
    ps.row_count                                    AS RowCount,
    CAST(ips.avg_fragmentation_in_percent AS DECIMAL(5,1)) AS FragPct,
    ISNULL(us.user_seeks, 0)                        AS Seeks,
    ISNULL(us.user_scans, 0)                        AS Scans,
    ISNULL(us.user_lookups, 0)                      AS Lookups,
    ISNULL(us.user_updates, 0)                      AS Updates,
    CASE
        WHEN ISNULL(us.user_seeks,0) + ISNULL(us.user_scans,0) = 0
             AND ISNULL(us.user_updates,0) > 100
        THEN '⚠️ UNUSED - Consider DROP'
        WHEN ips.avg_fragmentation_in_percent > 30
        THEN '🔴 HIGH FRAG - REBUILD needed'
        WHEN ips.avg_fragmentation_in_percent > 10
        THEN '🟡 MOD FRAG - REORGANIZE needed'
        ELSE '✅ Healthy'
    END                                             AS HealthStatus
FROM sys.tables t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
LEFT JOIN sys.dm_db_index_usage_stats us
    ON i.object_id = us.object_id AND i.index_id = us.index_id
    AND us.database_id = DB_ID()
LEFT JOIN sys.dm_db_partition_stats ps
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
LEFT JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE i.type > 0   -- No heap
  AND ps.row_count > 1000  -- Small tables skip
ORDER BY ps.row_count DESC, FragPct DESC;

10.4 Security — Index aur Permissions

-- Index create karne ke liye ALTER TABLE ya ALTER ANY INDEX permission chahiye
GRANT ALTER ON dbo.Employees TO [AppUser];  -- Risky — production mein avoid

-- Better: Dedicated maintenance user
CREATE LOGIN IndexMaintenanceLogin WITH PASSWORD = 'StrongP@ss!';
CREATE USER IndexMaintenanceUser FOR LOGIN IndexMaintenanceLogin;
GRANT ALTER ANY INDEX TO IndexMaintenanceUser;

-- Application user ko sirf SELECT/INSERT/UPDATE/DELETE do
-- Index maintenance DBA ka kaam hai, application ka nahi

11. Comparison — Index vs Alternatives {#11-comparison}

11.1 Index Types Comparison Table

Feature	Clustered	Non-Clustered	Covering	Columnstore	Filtered
Max per table	1	999	999	1+1	999
Data storage	Table = Index	Separate	Separate + Included	Column-wise	Partial rows
Best for	PK, ranges	Selective WHERE	Eliminate lookups	OLAP/Analytics	Sparse data
Write overhead	Low	Medium	Medium-High	Low (batch)	Low
Read speed	Very Fast	Fast	Very Fast	Extreme (analytics)	Fast
Compression	Page/Row	Page/Row	Page/Row	High (columnar)	Page/Row
SQL Server version	All	All	2005+	2012+	2008+

11.2 Index vs Other Performance Techniques

Technique	When to Use	Tradeoff
Index	Frequent queries on specific columns	Write overhead, storage
Partitioning	Very large tables (100M+ rows), date-based archiving	Complex setup
Materialized View (Indexed View)	Complex aggregation, pre-computed results	Tight schema dependency
Query Hints	One-off fixes, forcing specific plan	Hard to maintain, brittle
Statistics Update	Wrong cardinality estimates	Temporary fix
Query Rewrite	Non-SARGable conditions	Developer effort
Caching (Redis)	Read-heavy, rarely changing data	Data staleness
Read Replicas	Reporting vs OLTP separation	Sync overhead

11.3 SQL Server vs Other Databases — Index Concepts

Concept	SQL Server	MySQL (InnoDB)	PostgreSQL	Oracle
Clustered Index	Yes (1 per table)	Yes (PK = CL)	No (CLUSTER cmd)	Index-Organized Tables
Non-Clustered	999 max	Secondary indexes	Heap + Index	Non-unique indexes
Covering/Include	INCLUDE keyword	Covering index	INCLUDE (v11+)	Functional index
Filtered	WHERE clause	Partial index (limited)	Partial index	Function-based
Columnstore	Yes (2012+)	No (native)	No (native)	Limited
Fill Factor	1-100%	No direct equivalent	FILLFACTOR param	PCTFREE
Online Rebuild	Enterprise only	Yes (5.6+)	Yes	Yes (Enterprise)

12. Data Science Perspective {#12-data-science-perspective}

12.1 Data Pipeline Performance mein Index ka Role

Data Science projects mein aksar large tables se data extract karna hota hai:

-- Machine Learning Feature Extraction Query
-- 10M row FactSales table se monthly features nikaalte hain

-- WITHOUT proper index → 45 seconds
SELECT
    CustomerID,
    YEAR(SaleDate)  AS Year,
    MONTH(SaleDate) AS Month,
    SUM(Amount)     AS TotalSpend,
    COUNT(*)        AS OrderCount,
    AVG(Amount)     AS AvgOrderValue
FROM dbo.FactSales
WHERE SaleDate >= '2022-01-01'
GROUP BY CustomerID, YEAR(SaleDate), MONTH(SaleDate);

-- WITH Columnstore Index → 1.2 seconds!
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales_DS
ON dbo.FactSales (CustomerID, SaleDate, Amount);
-- Same query → 1.2 seconds ✅

12.2 ETL Pipelines mein Index Management

-- ETL Best Practice: Bulk load ke time indexes disable karo, baad mein rebuild karo

-- Step 1: Before bulk load — indexes disable
ALTER INDEX ALL ON dbo.Staging_SalesData DISABLE;

-- Step 2: Bulk load karo (fast — no index maintenance during load)
BULK INSERT dbo.Staging_SalesData
FROM 'C:\Data\sales_2024.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- Step 3: After load — indexes rebuild karo
ALTER INDEX ALL ON dbo.Staging_SalesData REBUILD;

-- Result: 10x faster bulk load compared to loading with indexes active

12.3 Pandas + SQL Server Index — Python Developer ke liye

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mssql+pyodbc://sa:pass@server/db?driver=ODBC+Driver+17+for+SQL+Server')

# BAD: Full table scan → Index pe query nahi gaya
df = pd.read_sql("SELECT * FROM dbo.FactSales", engine)

# GOOD: Indexed column pe WHERE → Index Seek
df = pd.read_sql("""
    SELECT CustomerID, SaleDate, Amount
    FROM dbo.FactSales
    WHERE SaleDate >= '2024-01-01'
      AND SaleDate < '2025-01-01'
""", engine)

# BEST: Chunksize use karo large data ke liye + indexed query
for chunk in pd.read_sql("""
    SELECT CustomerID, SaleDate, Amount
    FROM dbo.FactSales
    WHERE SaleDate >= '2024-01-01'
""", engine, chunksize=10000):
    # process each chunk
    pass

12.4 SQL Server Index → Data Quality aur ML

-- Unique Index = Data Quality guarantee for ML training data
-- Duplicate records → Model training mein bias
CREATE UNIQUE NONCLUSTERED INDEX UIX_CustomerTransactions_Dedup
ON dbo.CustomerTransactions (CustomerID, TransactionDate, Amount)
WHERE IsValid = 1;

-- Filtered Index for labeled training data
CREATE NONCLUSTERED INDEX IX_TrainingData_Labeled
ON dbo.MLTrainingData (FeatureDate, CustomerSegment)
INCLUDE (Feature1, Feature2, Feature3, Label)
WHERE IsLabeled = 1 AND IsOutlier = 0;

-- Fast feature retrieval
SELECT Feature1, Feature2, Feature3, Label
FROM dbo.MLTrainingData
WHERE IsLabeled = 1 AND IsOutlier = 0
-- Filtered Index → Only labeled, clean data → Fast ✅

13. Interview Questions {#13-interview-questions}

🟢 Basic Level

Q1: Clustered aur Non-Clustered Index mein kya fark hai?

Clustered Index table data ko physically sort karta hai. Ek table mein sirf ek Clustered Index ho sakta hai. Non-Clustered Index ek separate structure hai jo data row ke pointers store karta hai. Ek table mein 999 Non-Clustered Indexes ho sakte hain.

Q2: Index Seek aur Index Scan mein kya fark hai?

Seek = direct jump to specific rows (fast ✅). Scan = poora index/table read karo (slow ❌). Seek tabhi hota hai jab query SARGable ho aur selective condition ho.

Q3: Heap kya hota hai?

Jab table mein koi Clustered Index nahi hota, rows unordered pages mein store hoti hain — ise Heap kehte hain.

Q4: Primary Key aur Clustered Index ek hi cheez hai?

Nahi, necessarily nahi. By default, Primary Key ek Clustered Unique Index create karta hai. Lekin tum Primary Key ko NONCLUSTERED specify karke alag Clustered Index bhi bana sakte ho.

Q5: Fill Factor kya hai aur kyun important hai?

Fill Factor index page ko kitna percent fill kare — baacha space future inserts ke liye. Isse page splits avoid hote hain. Default 0 (100% equivalent). Write-heavy tables ke liye 70-85% recommended.

🟡 Intermediate Level

Q6: Covering Index kya hai aur Key Lookup se kaise bachata hai?

Covering Index wo index hai jisme SELECT query ke saare required columns include hote hain (INCLUDE clause ke through). Isse Clustered Index tak extra trip (Key Lookup) avoid hoti hai, jo significant performance improvement deta hai.

Q7: Composite Index mein column order kyun matter karta hai?

Left-Most Prefix Rule ke kaaran. Agar index (A, B, C) pe hai toh query A, A+B, ya A+B+C mein efficient hogi. Sirf B ya sirf C pe filter karne pe index use nahi hoga.

Q8: Filtered Index kab use karein?

Jab table mein bahut saari NULL rows ho ya sirf ek specific subset (like Active records, Pending orders) ke liye queries hon. Smaller, more selective index banata hai.

Q9: Statistics kya hoti hain aur kab update karni chahiye?

Statistics column values ka distribution information hoti hai jo Query Optimizer use karta hai plan banane ke liye. Auto Update Statistics ON rakho. Large data changes ke baad manually update karo. Production pe AUTO_UPDATE_STATISTICS_ASYNC ON best practice hai.

Q10: Index Fragmentation kya hai aur kaise handle karein?

Data changes (INSERT/UPDATE/DELETE) ke kaaran index pages out-of-order ya partially empty ho jaati hain. Under 10% → ignore, 10-30% → REORGANIZE, 30%+ → REBUILD.

🔴 Advanced Level

Q11: Columnstore Index Row Mode aur Batch Mode mein kya fark hai?

Row Mode: Traditional, ek ek row process. Batch Mode: 900 rows simultaneously process — SIMD instructions use karke. Columnstore indexes Batch Mode trigger karte hain jo analytics queries mein 10-100x faster hoti hain.

Q12: GUID ko Clustered Index banana kyun problem hai?

NEWID() completely random GUID generate karta hai. New rows random positions pe insert honge — har insert pe page split hogi. Result: Extreme fragmentation, massive I/O, slow performance. Fix: NEWSEQUENTIALID() use karo ya INT IDENTITY Clustered key rakho.

Q13: Index Hints kab use karein aur risks kya hain?

Jab Query Optimizer galat plan choose kare aur tum sure ho ki specific index better hai. Risk: Data distribution change hone par hint outdated ho jaata hai. Hard-coded hints maintenance burden hain. Prefer karo statistics update aur query rewrite.

Q14: Partitioned Table mein index alignment kya hoti hai?

Aligned partition: Index table ke partition scheme par bana ho. Non-aligned: Index alag filegroup pe. Aligned indexes partition elimination aur faster parallel queries enable karte hain. Production mein hamesha aligned indexes prefer karo.

Q15: Indexed View (Materialized View) kab use karein?

Complex aggregations (SUM, COUNT, JOIN) jo frequently re-computed hoti hain. View ke upar UNIQUE CLUSTERED INDEX banao — result pre-computed store hota hai. Requirement: WITH SCHEMABINDING, deterministic functions only, certain SET options required.

🟣 Scenario-Based Questions

Q16: Ek query 5 minute le rahi hai. Index already hai. Tum kya check karoge?

Execution Plan dekho — Index Seek ho raha hai ya Scan? 2. Key Lookup hai? → INCLUDE columns add karo. 3. Index Fragmentation check karo. 4. Statistics outdated? → Update karo. 5. Parameter Sniffing issue? → OPTION(RECOMPILE) test karo. 6. SARGable conditions check karo. 7. Missing Index suggestions dekho.

Q17: OLTP table pe INSERT bahut slow hai. Index se related cause kya ho sakta hai?

Zyada indexes → Har INSERT pe sab indexes update hone chahiye. Solution: Unused indexes drop karo, Fill Factor appropriately set karo, Write-heavy pe kam indexes rakho (3-5 max), Index consolidation karo — multiple narrow indexes ke bajaaye few covering indexes.

Q18: 500 GB table pe online index rebuild possible hai kya?

SQL Server Enterprise Edition mein ONLINE = ON possible hai — table accessible rehti hai. Standard Edition mein offline karna padega (downtime). Alternative: Partition-wise rebuild karo, ya Index creation pe SORT_IN_TEMPDB = ON use karo TempDB space ke saath.

14. Conclusion {#14-conclusion}

Key Learnings Summary 🎯

Index = Database ki backbone
───────────────────────────────────────────────────
📌 Clustered Index  → Table ka physical order define karta hai
                      Ek table mein sirf ek
                      PK pe hona chahiye (INT IDENTITY)

📌 Non-Clustered    → Alag B-Tree structure
                      999 max per table
                      High-frequency WHERE/JOIN columns pe

📌 Covering Index   → INCLUDE se Key Lookups eliminate karo
                      Most impactful optimization

📌 Filtered Index   → Partial data ke liye smart, small indexes

📌 Columnstore      → Analytics ke liye game-changer
                      10-100x faster aggregations

📌 Fill Factor      → Page splits control karo (70-85% for OLTP)

📌 Statistics       → Query Optimizer ka brain — fresh rakhna zaroori

📌 Fragmentation    → Regular maintenance se avoid karo

When to Use What — Final Decision Guide

Kaunsa index use karein?
┌─────────────────────────────────────────────────────────┐
│ Primary Key / Main lookup column?                        │
│ → Clustered Index (INT IDENTITY preferred)               │
│                                                          │
│ Frequent WHERE / JOIN column?                            │
│ → Non-Clustered Index                                    │
│                                                          │
│ SELECT list mein extra columns + Key Lookup problem?     │
│ → Add INCLUDE columns → Covering Index                   │
│                                                          │
│ Only subset of rows frequently queried?                  │
│ → Filtered Index (WHERE clause ke saath)                 │
│                                                          │
│ Analytics / SUM / COUNT / GROUP BY on large data?        │
│ → Columnstore Index                                      │
│                                                          │
│ Text search in large columns?                            │
│ → Full-Text Index                                        │
│                                                          │
│ Duplicate prevention on a column?                        │
│ → Unique Index                                           │
└─────────────────────────────────────────────────────────┘

Final Advice 💡

"Indexing ek art hai — na zyada, na kum."

Beginner ho? Pehle Clustered aur Non-Clustered samjho, execution plans padhna seekho.
Intermediate ho? Covering indexes, filtered indexes aur fragmentation management seekho.
Advanced DBA banna hai? Statistics, columnstore internals, partition-aligned indexes, aur index maintenance automation master karo.

Index banao, monitor karo, tune karo — aur kabhi blindly mat banao. Har index ek responsibility hai!

📝 Blog by HelpInData | SQL Server Series | Hinglish mein seekho, practically use karo

🔗 helpindata.web.app | Tags: #SQLServer #Database #DBA #Indexing #Performance #HindiTech