Before going deep into Star Schema and Fact Tables, pehle ek basic sawal:
Dimensional Modeling hota kya hai?
Dimensional Modeling ek technique hai jo Data Warehouse me data ko organize karne ke liye use hoti hai — specially reporting aur OLAP analysis ke liye.
Iska main goal hai:
👉 Fast data retrieval
👉 High performance
👉 High usability
Basic Idea: Data Organized into Facts & Dimensions
Dimensional model me saara data do categories me divide hota hai:
1️⃣ Facts
2️⃣ Dimensions
🔹 Fact Kya Hota Hai?
Fact wo cheez hoti hai jo measure hoti hai.
Example:
- Sales
- Profit
- Quantity
- Revenue
Ye numeric values hoti hain jo aggregate ho sakti hain (SUM, AVG, COUNT).
Example:
Profit = 10,000
Dimension Kya Hoti Hai?
Dimension context provide karti hai.
Ye batati hai:
“Ye profit kis cheez ka hai?”
Example:
- Time (Year, Month, Day)
- Product
- Customer
- Region
Example:
Profit by Year
Profit by Product
Profit by Region
Yaha “by” keyword usually dimension indicate karta hai.
Star Schema Kyu Bolte Hain?
Dimensional modeling me:
Fact table center me hoti hai
Dimensions uske around hoti hain
Visual structure star jaisa dikhta hai.
Isliye ise Star Schema bolte hain.
Structure kuch aisa hota hai:
Fact_Sales (center)
→ Dim_Product
→ Dim_Customer
→ Dim_Date
→ Dim_Region
Usability Ka Benefit
Dimensional model me:
Product related data → Product table
Date related data → Date table
Customer related data → Customer table
Ab user ko 100 columns wali table me search nahi karna padta.
Wo simply:
Profit by Month
Profit by Category
Profit by Customer
Easily analyze kar sakta hai.
What Is a Fact Table?
Star Schema me Fact Table center me hoti hai.
Ye Data Warehouse ki foundation hoti hai.
Fact table contain karti hai:
✔ Business measurements
✔ Numeric values
✔ Aggregatable data
🔹 Fact Kya Hota Hai?
Fact wo hota hai jo measure hota hai.
Examples:
- Sales
- Profit
- Revenue
- Units Sold
- Cost
- Budget
Ye sab numeric values hote hain jinko add, average, count kiya ja sakta hai.
Example:
Total Profit = 10,000
Total Units Sold = 500
Ye sab aggregate ho sakta hai.
Fact Table Structure
Fact table me generally hota hai:
1️⃣ Primary Key
2️⃣ Foreign Keys (Dimensions ke liye)
3️⃣ Facts (measures)
Example:
Fact_Sales Table:
| Date_ID | Product_ID | Customer_ID | Profit | Units |
|---|
Yaha:
- Date_ID → Dim_Date
- Product_ID → Dim_Product
- Customer_ID → Dim_Customer
- Profit & Units → Facts
What Is a Dimension Table?
Ab baat karte hain Dimension ki.
Dimension table fact ko context deti hai.
Agar sirf total profit dekhoge:
Profit = 10,000
Ye useful nahi hai.
But agar dekho:
Profit by Year
Profit by Product
Profit by Region
Ab insight milta hai.
Dimension Kya Hota Hai?
Dimension descriptive hoti hai.
Ye measure nahi hoti.
Examples:
- Product Name
- Product Category
- Customer Name
- Region
- Date
- Month
- Year
Dimension Table Structure
Example:
Dim_Product
| Product_ID | Product_Name | Category |
|---|
Dim_Customer
| Customer_ID | Customer_Name | City |
Dim_Date
| Date_ID | Date | Month | Year | Weekday |
Types of Fact Tables in Data Warehouse
In Data Warehousing, fact tables store measurable business data. There are mainly three core types, plus one special case.
1️⃣ Transaction Fact Table
This stores each individual business transaction at the lowest level of detail (granular level).
- Insert-only (generally not updated)
- Highly detailed
- Measures are usually fully additive
Example: Sales transactions, ATM withdrawals, online orders.
2️⃣ Periodic Snapshot Fact Table
This stores data captured at regular time intervals (daily, weekly, monthly).
- Time-based summary
- Often contains semi-additive facts
- Used for trend analysis
Example: Daily inventory level, monthly account balance.
3️⃣ Accumulating Snapshot Fact Table
This tracks a business process lifecycle from start to finish.
- Contains multiple date columns
- Updated as process moves through stages
- Used for process performance analysis
Example: Order-to-delivery process, loan approval workflow.
4️⃣ Factless Fact Table (Special Case)
Contains no numeric measures, only event tracking.
Example: Student attendance, employee training participation.
🔥 Quick Summary for Interview
- Transaction → One row per event
- Periodic Snapshot → One row per time period
- Accumulating Snapshot → One row per process lifecycle
- Factless → Event tracking without measures