What Is Azure Data Factory (ADF)?
Imagine you run a delivery company – like Swiggy/Zomato.
- Pick up food from the restaurant → pack it → check it along the way → deliver it to the customer's door.
That is exactly what Azure Data Factory does with data.
ADF is a cloud-based ETL tool.
- E → Extract (pull the data in – e.g. from SQL, Excel, an API, or Blob storage)
- T → Transform (clean, join, and format the data)
- L → Load (push the data to a destination – e.g. a Data Warehouse, a database, Power BI, etc.)
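The Extract → Transform → Load idea above can be sketched in plain Python. This is a toy illustration, not ADF code – the `extract`/`transform`/`load` function names and the in-memory CSV are made up here just to show the three stages:

```python
import csv
import io

def extract(raw_csv):
    """Extract: pull raw rows from a source (an in-memory CSV stands in for Blob storage)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(rows):
    """Transform: clean and reshape – trim whitespace, normalize city names, cast amounts."""
    return [
        {"city": r["city"].strip().title(), "amount": float(r["amount"])}
        for r in rows
    ]

def load(rows, destination):
    """Load: write to a destination (a plain list stands in for a warehouse table)."""
    destination.extend(rows)
    return destination

warehouse = []
raw = "city,amount\n delhi ,120.5\n mumbai ,99.0\n"
load(transform(extract(raw)), warehouse)
print(warehouse)
```

In ADF you never write these functions yourself – the pipeline's activities do the extracting, transforming, and loading for you – but the mental model is the same.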
🔹 Main Components of ADF
- Pipelines
  - Like a "workflow" or "plan of action".
  - Contains multiple steps (e.g. copy the data, transform it).
- Activities
  - Each individual step inside a pipeline.
  - Example: one activity → pull data from SQL, another activity → drop it into Blob storage.
- Datasets
  - Dataset = "a description of the data".
  - Example: one dataset could be "SQL table X", another could be "CSV file Y".
- Linked Services
  - Like hiring a delivery boy for your courier company.
  - A linked service = the connection details for a source/destination (SQL Server, Blob storage, etc.).
- Integration Runtime (IR)
  - The "engine" that actually moves the data.
  - There are three types:
    - Azure IR – runs entirely in the cloud.
    - Self-hosted IR – when your data lives on-premises (on your own server).
    - SSIS IR – when you want to run legacy SSIS packages.
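To see how these pieces fit together, here is the rough shape of ADF's JSON definitions, written as Python dicts. This is a simplified sketch – names like `SqlToBlobPipeline`, `SqlTableX`, and `CsvFileY` are invented examples, and real ADF JSON has more required fields (connection strings, typeProperties, etc.):

```python
# Linked services: connection details for a source and a destination (secrets omitted)
linked_services = {
    "MySqlServer":   {"type": "AzureSqlDatabase"},
    "MyBlobStorage": {"type": "AzureBlobStorage"},
}

# Datasets: descriptions of the data, each pointing at a linked service
datasets = {
    "SqlTableX": {"linkedServiceName": "MySqlServer",   "type": "AzureSqlTable"},
    "CsvFileY":  {"linkedServiceName": "MyBlobStorage", "type": "DelimitedText"},
}

# Pipeline: the workflow; this one contains a single Copy activity (one step)
pipeline = {
    "name": "SqlToBlobPipeline",
    "activities": [
        {
            "name": "CopySqlToBlob",
            "type": "Copy",
            "inputs":  [{"referenceName": "SqlTableX", "type": "DatasetReference"}],
            "outputs": [{"referenceName": "CsvFileY",  "type": "DatasetReference"}],
        }
    ],
}

# Sanity check: every dataset an activity touches should resolve to a known linked service
for activity in pipeline["activities"]:
    for ref in activity["inputs"] + activity["outputs"]:
        assert datasets[ref["referenceName"]]["linkedServiceName"] in linked_services
print("all references resolve")
```

Notice the chain: an activity references datasets, and each dataset references a linked service – which is exactly the pipeline → activity → dataset → linked service hierarchy described above. The Integration Runtime sits underneath all of this and does the actual data movement.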
🔹 Use Cases of ADF
- Pull data out of SQL and land it in a Data Lake.
- Automate loading CSV/Excel files into a Data Warehouse.
- Build batch (or near-real-time) data pipelines.
- Prepare the backend data that Power BI reports sit on.
🔹 An Example ADF Flow
Say:
- Source = SQL Database
- Destination = Azure Data Lake
- Process = copy the data every morning at 6 AM
In ADF you would create:
- Linked services → one each for SQL and the Data Lake.
- Datasets → the source table and the destination folder.
- Pipeline → a single copy activity that moves SQL → Data Lake.
- Trigger → daily at 6 AM.
And that's it! The data moves automatically every day.
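The daily 6 AM trigger in the flow above corresponds to an ADF schedule trigger. Below is a simplified sketch of its shape as a Python dict (real trigger JSON needs more fields, and `Daily6AMTrigger`/`SqlToBlobPipeline` are made-up names), plus a tiny hypothetical `next_runs` helper that simulates which run times such a schedule would produce:

```python
from datetime import datetime, timedelta

# Rough shape of an ADF ScheduleTrigger: run the pipeline once a day at 06:00
trigger = {
    "name": "Daily6AMTrigger",
    "type": "ScheduleTrigger",
    "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "schedule": {"hours": [6], "minutes": [0]},
    },
    "pipelines": [{"pipelineReference": {"referenceName": "SqlToBlobPipeline"}}],
}

def next_runs(start, count):
    """Simulate the next `count` run times this daily schedule would fire after `start`."""
    hour = trigger["recurrence"]["schedule"]["hours"][0]
    first = start.replace(hour=hour, minute=0, second=0, microsecond=0)
    if first <= start:  # today's 06:00 already passed, so start tomorrow
        first += timedelta(days=trigger["recurrence"]["interval"])
    return [first + timedelta(days=i) for i in range(count)]

runs = next_runs(datetime(2024, 1, 1, 9, 0), 3)
print(runs)  # three consecutive days, each at 06:00
```

In the real service you would attach this trigger to the pipeline in the ADF Studio UI (or deploy it as JSON), and ADF's own scheduler – not your code – computes these run times.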