What Is a Vector Database?

What Is a Vector Database?

A vector database is a specialized database that stores data as vector embeddings and retrieves it by similarity search.

A traditional database looks for exact matches:

SQL
SELECT * 
FROM Documents
WHERE Name = 'SQL Server';

A vector database, by contrast, searches by semantic meaning.


First, Understand Embeddings

Suppose the sentence is:

Plain Text
I love SQL

An embedding model converts it into a list of numbers:

Plain Text
[0.25, 0.78, -0.11, 0.92, ...]

These numbers represent the meaning of the sentence.

Similarly, the sentence:

Plain Text
I like SQL

might have the vector:

Plain Text
[0.27, 0.80, -0.09, 0.90, ...]

The two vectors are very close because the two sentences have similar meanings.
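To make "very close" concrete, here is a minimal sketch that measures the Euclidean distance between the visible components of the two example vectors. The article elides the remaining components, so only the first four are used, and the `unrelated` vector is invented purely for contrast.

```python
import math

# First four components of the example vectors from the text.
# The remaining components are elided in the article, so they are omitted here.
love_sql = [0.25, 0.78, -0.11, 0.92]
like_sql = [0.27, 0.80, -0.09, 0.90]

# Hypothetical vector for an unrelated sentence, invented for contrast.
unrelated = [-0.60, 0.10, 0.85, -0.30]

# Smaller distance means the vectors (and so the sentences) are closer.
near = math.dist(love_sql, like_sql)
far = math.dist(love_sql, unrelated)

print(f"love vs like:      {near:.2f}")
print(f"love vs unrelated: {far:.2f}")
```

Real embeddings usually have hundreds or thousands of dimensions, but the comparison works the same way.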


What Does a Vector Database Store?

Example:

| Document | Vector |
| --- | --- |
| SQL Tutorial | [0.25,0.78,...] |
| Python Tutorial | [0.91,0.12,...] |
| Azure Guide | [0.44,0.33,...] |

These vectors are what the database stores.


How Does Search Work?

The user asks:

Plain Text
How to learn SQL Server?

The embedding model converts the question into a vector:

Plain Text
[0.24,0.79,...]

The vector database then compares it with the stored vectors:

Plain Text
Query Vector
       ↓
Compare with Stored Vectors
       ↓
Most Similar Documents

The SQL Tutorial document is returned as the closest match.
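The lookup above can be sketched as a brute-force nearest-neighbour search. This is only an illustration under stated assumptions: the vectors use just the two components shown in the article, the `store` dictionary and `search` function are hypothetical names, and Euclidean distance stands in for whatever metric a real vector database is configured to use.

```python
import math

# Toy "vector database": document name -> vector.
# Only the two components shown in the article are used.
store = {
    "SQL Tutorial": [0.25, 0.78],
    "Python Tutorial": [0.91, 0.12],
    "Azure Guide": [0.44, 0.33],
}

def search(query_vector, store, top_k=1):
    """Return the names of the top_k documents closest to query_vector."""
    ranked = sorted(store, key=lambda name: math.dist(query_vector, store[name]))
    return ranked[:top_k]

# Query vector for "How to learn SQL Server?" as given in the article.
query = [0.24, 0.79]
results = search(query, store, top_k=3)
print(results)
```

Production systems avoid comparing the query against every stored vector and rely on approximate indexes instead, but the ranking idea is the same.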


How Is Similarity Measured?

The most common measure:

1. Cosine Similarity

It compares the angle between two vectors.

  • 1 = same direction (maximally similar)

  • 0 = orthogonal (unrelated)

  • -1 = opposite direction

Conceptually:

\cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|}

The closer the result is to 1, the greater the similarity.
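A minimal sketch of the formula in plain Python, checked against the three reference values listed above. The function name `cosine_similarity` is chosen here for illustration. The sketch assumes neither vector is all zeros, since that would make the denominator zero.

```python
import math

def cosine_similarity(a, b):
    """Return cos(theta) = (A . B) / (|A| |B|) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

same = cosine_similarity([1.0, 0.0], [2.0, 0.0])        # same direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])  # perpendicular
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])   # opposite direction

print(same, orthogonal, opposite)  # prints: 1.0 0.0 -1.0
```

Note that the first pair has different lengths but the same direction, so the score is still 1: cosine similarity ignores magnitude.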


The Role of a Vector Database in RAG

Suppose you have:

Plain Text
100 PDFs
10,000 Pages

Flow:

Plain Text
PDF
 ↓
Chunking
 ↓
Embeddings
 ↓
Vector Database
 ↓
User Question
 ↓
Question Embedding
 ↓
Similarity Search
 ↓
Top Relevant Chunks
 ↓
LLM
 ↓
Answer
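The chunking step in this flow can be sketched as a fixed-size splitter with overlap. This is one simple strategy among many, not the only way to chunk. The function name `chunk_text` and the default sizes are assumptions made for illustration.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a
    boundary appears in both neighbouring chunks.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks

# Hypothetical page text standing in for content extracted from a PDF.
page_text = "Vector databases store embeddings. " * 20
for i, chunk in enumerate(chunk_text(page_text)):
    print(i, len(chunk))
```

Each chunk would then be passed to the embedding model and stored with its vector.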

Popular Vector Databases

  • Chroma

  • Pinecone

  • Weaviate

  • Milvus

  • Qdrant

  • FAISS (a similarity-search library rather than a standalone database)


In Your PDF RAG Project

What you want to build:

Plain Text
PDF
 ↓
Chunks
 ↓
Embeddings
 ↓
Vector DB
 ↓
User Prompt
 ↓
Prompt Embedding
 ↓
Similarity Search
 ↓
Relevant Chunks
 ↓
LLM Answer

This whole process is what a practical RAG system looks like.
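The retrieval half of this pipeline can be sketched end to end in a few lines. Everything here is a toy under stated assumptions: `embed` is a word-count stand-in for a real embedding model (it captures word overlap, not meaning), the `chunks` are hypothetical, the `index` list plays the part of the vector database, and the LLM call is left out, so the sketch stops once the prompt is assembled.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(text, vocabulary):
    """Toy stand-in for an embedding model: word counts over a fixed vocabulary."""
    counts = Counter(tokenize(text))
    return [float(counts[word]) for word in vocabulary]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # treat an all-zero vector as unrelated

# Hypothetical chunks, as if already extracted from PDFs and split.
chunks = [
    "SQL Server stores data in tables and is queried with T-SQL.",
    "Python lists are ordered, mutable collections.",
    "Azure offers managed cloud services for hosting applications.",
]

vocabulary = sorted({word for chunk in chunks for word in tokenize(chunk)})
index = [(chunk, embed(chunk, vocabulary)) for chunk in chunks]  # the "vector DB"

def retrieve(question, top_k=1):
    """Embed the question and return the top_k most similar chunks."""
    q_vec = embed(question, vocabulary)
    ranked = sorted(index, key=lambda item: cosine_similarity(q_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

question = "How to learn SQL Server?"
context = retrieve(question, top_k=1)
prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
print(prompt)
```

In a real project the `embed` function would call an embedding model and `index` would be one of the vector databases listed earlier.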

Short Answer for Interviews

A vector database is a specialized database that stores embeddings (vectors) of text, images, or documents and retrieves the most relevant information through similarity search. In RAG systems, it provides semantic search over the documents so that the most relevant chunks can be passed to the LLM.

Think of it from a SQL developer's perspective:

  • SQL Database → Exact Match Search

  • Vector Database → Meaning-Based Search

That is the biggest difference.

Does a vector database store data as JSON?

Answer:
No. A vector database typically does not store data internally as JSON. The data may appear as JSON through the API, but inside the database the vectors are stored in a binary format alongside optimized index structures.

So where is JSON used?

Answer:
JSON is used to insert and retrieve data.

Example:

JSON
{
  "id": "1",
  "vector": [0.12, 0.45, -0.78],
  "metadata": {
    "file": "sql.pdf",
    "page": 5
  }
}

This is the format of the API request/response, not the actual storage format.
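The difference between the two representations can be illustrated with the standard library. This is only a sketch: actual on-disk layouts differ from product to product, and packing the vector as little-endian float32 values is an assumption chosen to show the idea.

```python
import json
import struct

record = {
    "id": "1",
    "vector": [0.12, 0.45, -0.78],
    "metadata": {"file": "sql.pdf", "page": 5},
}

# API level: the record travels as JSON text.
as_json = json.dumps(record)

# Storage level (illustrative): the vector packed as three little-endian float32 values.
packed = struct.pack("<3f", *record["vector"])

print(len(json.dumps(record["vector"])), "bytes for the vector as JSON text")
print(len(packed), "bytes for the vector as packed float32")

# Unpacking recovers the values to float32 precision.
restored = struct.unpack("<3f", packed)
print(restored)
```

A fixed-width binary layout is compact and lets the engine read component i of any vector at a known offset, which text formats such as JSON do not allow.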

One-Line Interview Answer

Answer:

A vector database accepts JSON at the API level, but internally it stores vectors in a binary format and in specialized index structures such as HNSW, IVF, and PQ so that similarity search is fast and scalable.

Does a vector database store only vectors?

Answer:
No. It also stores additional information alongside each vector:

  • ID

  • Original Text

  • Metadata

  • File Name

  • Page Number

  • Tags

Example:

JSON
{
  "id": "123",
  "vector": [0.12, 0.45, -0.78],
  "document": "SQL Server Tutorial",
  "metadata": {
    "file": "sql.pdf",
    "page": 5
  }
}
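On the client side, such a record could be modelled as a small data class before it is serialized for the API. The `Record` class and its field names are hypothetical and simply mirror the JSON example above. Real client libraries define their own types.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class Record:
    """One stored item: the vector plus everything needed to show the result."""
    id: str
    vector: list
    document: str
    metadata: dict = field(default_factory=dict)

record = Record(
    id="123",
    vector=[0.12, 0.45, -0.78],
    document="SQL Server Tutorial",
    metadata={"file": "sql.pdf", "page": 5},
)

# Serialize to the JSON shape shown in the example above.
payload = json.dumps(asdict(record), indent=2)
print(payload)
```

Keeping the original text and metadata next to the vector is what lets a search result point back to the right file and page.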
