
Azure Data Factory (ADF)


What Is Azure Data Factory (ADF)?

Imagine you run a delivery company – like Swiggy/Zomato.

  • Pick up food from the restaurant → pack it → check it along the way → deliver it to the customer's door.
    That is exactly what Azure Data Factory does with data.

ADF is a cloud-based ETL tool.

  • E → Extract (pull the data – e.g. from SQL, Excel, an API, or Blob Storage)
  • T → Transform (clean, join, and format the data)
  • L → Load (send the data to a destination – e.g. a Data Warehouse, a database, Power BI, etc.)
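The three stages can be seen in a tiny, self-contained sketch. This is plain Python on an in-memory "table", just to illustrate what E, T, and L mean – the names and rows are made up, and ADF does this at scale with visual pipelines instead of hand-written code.

```python
def extract():
    # E: pull raw rows from a source (imagine a SQL table or a CSV file)
    return [
        {"name": " Asha ", "city": "delhi", "amount": "120"},
        {"name": "Ravi", "city": "MUMBAI", "amount": "80"},
    ]

def transform(rows):
    # T: clean and format each row (trim names, fix casing, cast numbers)
    return [
        {
            "name": row["name"].strip(),
            "city": row["city"].title(),
            "amount": int(row["amount"]),
        }
        for row in rows
    ]

def load(rows, destination):
    # L: write the cleaned rows to a destination (here, just a list)
    destination.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse[0])  # {'name': 'Asha', 'city': 'Delhi', 'amount': 120}
```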

🔹 Main Components of ADF

  1. Pipelines
    • Like a “workflow” or “plan of action”.
    • A pipeline holds multiple steps (e.g. copying data, transforming it).
  2. Activities
    • A single step inside a pipeline.
    • Example: one activity → pull data from SQL, another activity → drop it into Blob Storage.
  3. Datasets
    • Dataset = "a description of the data".
    • Example: one dataset could be "SQL table X", another could be "CSV file Y".
  4. Linked Services
    • Like hiring a delivery boy for your courier business.
    • A linked service defines which source/destination to connect to (SQL Server, Blob Storage, etc.).
  5. Integration Runtime (IR)
    • This is the “engine” that actually moves the data.
    • It comes in three types:
      • Azure IR – runs inside the cloud.
      • Self-hosted IR – for data that lives on-premises (on your own servers).
      • SSIS IR – for running legacy SSIS packages.

🔹 Use Cases of ADF

  • Pull data out of SQL and land it in a Data Lake.
  • Automate loading CSV/Excel files into a Data Warehouse.
  • Build real-time or batch data pipelines.
  • Prepare the backend data behind Power BI reports.

🔹 An Example ADF Flow

Say:

  • Source = SQL Database
  • Destination = Azure Data Lake
  • Process = copy the data every morning at 6 AM

In ADF you would create:

  1. Linked services → one each for SQL and the Data Lake.
  2. Datasets → the source table and the destination folder.
  3. Pipeline → a single Copy activity that moves SQL → Data Lake.
  4. Trigger → daily at 6 AM.

And that's it! The data moves automatically every day.
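The "daily at 6 AM" part is a schedule trigger. Below is a sketch of the kind of JSON ADF stores for it, again as a Python dict – the names are hypothetical and the shape is simplified from the real ScheduleTrigger schema:

```python
# A schedule trigger: fire the referenced pipeline once a day at 06:00.
trigger = {
    "name": "TR_Daily6AM",                 # hypothetical trigger name
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,             # every 1 day
                "schedule": {"hours": [6], "minutes": [0]},  # at 06:00
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "PL_CopySalesData",  # hypothetical pipeline
                    "type": "PipelineReference",
                }
            }
        ],
    },
}

print(trigger["properties"]["typeProperties"]["recurrence"]["frequency"])  # Day
```

One trigger can reference several pipelines, which is handy when a whole group of loads should kick off on the same schedule.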
