Data Warehouse me ek bahut important concept hota hai:
👉 Slowly Changing Dimensions (SCD)
Jab hum dimensional modeling karte hain, hum assume karte hain ki dimension tables mostly static hoti hain.
Lekin real world me:
- Customer address change hota hai
- Product category change hoti hai
- Employee department change hota hai
- Organization restructure hoti hai
Toh question ye hai:
In changes ko Data Warehouse me kaise handle karein?
Yahi solve karta hai SCD.
Why Slowly Changing Dimensions Important Hain?
Data Warehouse ka main purpose hota hai:
✔ Historical analysis
✔ Trend analysis
✔ Accurate reporting
Agar dimension change ko properly handle nahi kiya:
❌ History wrong ho sakti hai
❌ Reports inaccurate ho sakti hain
❌ Business trust lose ho sakta hai
Isliye har dimension attribute ke liye strategy decide karna zaruri hai.
SCD Type 0 – No Change Allowed
Ye sabse simple case hai.
Isme:
👉 Dimension attribute kabhi change nahi hota
👉 Ya change reflect karna required nahi
Example:
Date Dimension
Year, Month, Day usually change nahi hote.
Is case me:
No update required
No history tracking required
SCD Type 1 – Overwrite (No History)
Ye sabse simple aur commonly used approach hai.
Isme:
👉 Old value overwrite kar dete hain
👉 History maintain nahi hoti
Example
Before change:
| Product_SK | Product_Name | Category |
|---|---|---|
| 1 | Oatmeal Biscuits | Sweets |
After change (Type 1):
| Product_SK | Product_Name | Category |
|---|---|---|
| 1 | Oatmeal Biscuits | Biscuits |
Purani category replace ho gayi.
Kab Use Kare?
✔ Minor changes (spelling correction)
✔ Jab history important nahi ho
✔ Sirf current state chahiye
Example:
"Oatmeal Biscuits" → "Delicious Oatmeal Biscuits"
SCD Type 2 – Full History Maintained
Ye most powerful aur default recommended strategy hai.
Isme:
👉 Old row delete ya update nahi hoti
👉 New row insert hoti hai
👉 New surrogate key assign hoti hai
Example
Before change:
| Product_SK | Product_ID | Category |
|---|---|---|
| 1 | 1001 | Sweets |
After change (Type 2):
| Product_SK | Product_ID | Category |
|---|---|---|
| 1 | 1001 | Sweets |
| 4 | 1001 | Biscuits |
New row add hui.
Old row safe hai.
Effective & Expiration Dates (Very Important)
Type 2 properly manage karne ke liye add karte hain:
- Effective_Date
- Expiration_Date
- Current_Flag
Example:
| Product_SK | Category | Effective_Date | Expiration_Date | Current_Flag |
|---|---|---|---|---|
| 1 | Sweets | 2022-01-01 | 2023-05-31 | N |
| 4 | Biscuits | 2023-06-01 | 2099-12-31 | Y |
Important:
Expiration_Date me NULL mat use karo.
Future date use karo (e.g., 2099-12-31).
Surrogate Key Is Mandatory in Type 2
Type 2 me:
Natural key duplicate ho jati hai.
Isliye:
👉 Surrogate key required hai
👉 Primary key artificial integer hona chahiye
Without surrogate key, Type 2 possible nahi hai.
Fact Table Me Correct Foreign Key Kaise Use Kare?
ETL Process:
1️⃣ New dimension row insert karo
2️⃣ Natural key + date condition se lookup karo
3️⃣ Correct surrogate key fact table me use karo
Example condition:
Current_Date BETWEEN Effective_Date AND Expiration_Date
Isse correct version milega.
Mixing Type 1 and Type 2 in Same Dimension
Good news 😎
Ek hi dimension table me:
Some attributes → Type 1
Some attributes → Type 2
Example:
Product_Name → Type 1 (minor change)
Category → Type 2 (major change)
Important:
Decision technical nahi hota.
Business requirement par depend karta hai.
🟣 SCD Type 3 – Limited History
Type 3 Type 1 aur Type 2 ke beech ka solution hai.
Isme:
👉 New row nahi banate
👉 Extra column add karte hain
Example:
| Product_ID | Category_Current | Category_Previous |
|---|---|---|
| 1001 | Biscuits | Sweets |
Sirf current aur previous state store karte hain.
Kab Use Kare?
✔ One-time restructuring
✔ Predictable big change
✔ Business wants comparison
Example:
Region restructure:
Old → North, West
New → North, West, South
Manager compare karna chahta hai.
Type 3 useful hoga.
📊 Complete SCD Comparison
| Type | History | Implementation | Best Use Case |
|---|---|---|---|
| Type 0 | No change | Do nothing | Static attributes |
| Type 1 | No history | Overwrite | Minor corrections |
| Type 2 | Full history | New row + surrogate key | Important business changes |
| Type 3 | Limited history | Extra column | One-time restructuring |
🧠 Simple Memory Trick
Type 1 → Replace
Type 2 → Add new version
Type 3 → Keep current + previous
🏁 Final Conclusion
Slowly Changing Dimensions allow us to handle changes in dimension tables without breaking historical accuracy.
Data Warehouse me history important hoti hai.
Isliye SCD design carefully business requirement ke sath milkar decide karna chahiye.
Default strategy:
👉 Important attributes → Type 2
👉 Minor changes → Type 1
👉 Big restructuring → Type 3
Always use surrogate keys.
Always think about history.
Agar chaho to next main:
🔥 Complete Star Schema + SCD practical example
🔥 End-to-end ETL implementation example
🔥 Interview questions from entire dimensional modeling
Kya kare next Boss? 😎🚀