- Introduction to Vector Databases
- Embeddings: Representing Data as Vectors
- Setting Up Pinecone
- CRUD Operations: Upsert, Query, Delete
- Similarity Search & Filtering
- RAG: Retrieval Augmented Generation
- Use Cases: Semantic Search & Recommendation
- Pinecone vs Alternatives
- Best Practices & Optimization
- Comprehension Quiz
1. Introduction to Vector Databases
A vector database is a specialized database that stores and manages data as vectors (high-dimensional arrays of numbers). Unlike a traditional database, which matches exact keywords, a vector database retrieves data by semantic similarity, that is, by closeness in meaning.
Suppose you search for "sepatu olahraga" (Indonesian for "sports shoes") in a traditional database: you only get documents that contain the literal phrase "sepatu olahraga". In a vector database, the same query can also surface documents about "running shoes", "sneakers untuk jogging" ("sneakers for jogging"), or "footwear aktivitas fisik" ("footwear for physical activity"), because all of these have similar meanings.
─────────────────────────────────────────────────────────────────
                    VECTOR DATABASE CONCEPT

  Input text → Embedding model → Vector (numbers) → Store in DB

  "sepatu olahraga"  → [0.23, -0.45, 0.87, ..., 0.12]   (384D)
  "running shoes"    → [0.25, -0.42, 0.85, ..., 0.15]   (384D)
  "sneakers jogging" → [0.21, -0.48, 0.82, ..., 0.10]   (384D)
  "mobil sport"      → [-0.67, 0.34, -0.21, ..., 0.78]  (384D)

  Semantic search: "sepatu lari" → query vector

  Results (by cosine similarity):
    1. "running shoes"    → 0.98 (very similar)  ✅
    2. "sepatu olahraga"  → 0.95 (similar)       ✅
    3. "sneakers jogging" → 0.93 (similar)       ✅
    4. "mobil sport"      → 0.12 (not similar)   ❌

  Data with the same meaning ends up with nearby vectors!
─────────────────────────────────────────────────────────────────
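To make the idea concrete, here is a minimal sketch of what a similarity lookup does conceptually: a brute-force cosine-similarity ranking over a handful of made-up 4-dimensional vectors. The vectors, the `cosine_top_k` helper, and the query vector are illustrative assumptions, not real embeddings. A production vector database relies on approximate nearest-neighbor indexes instead of scanning every vector like this.

```python
import numpy as np

# Toy 4-dimensional "embeddings" (made-up numbers, not from a real model).
LABELS = ["sepatu olahraga", "running shoes", "sneakers jogging", "mobil sport"]
VECTORS = np.array([
    [0.23, -0.45, 0.87, 0.12],
    [0.25, -0.42, 0.85, 0.15],
    [0.21, -0.48, 0.82, 0.10],
    [-0.67, 0.34, -0.21, 0.78],
])


def cosine_top_k(query, vectors, k=3):
    """Return (row_index, cosine_similarity) pairs for the k most similar rows."""
    query = np.asarray(query, dtype=float)
    # Cosine similarity = dot product divided by the product of the vector norms.
    sims = vectors @ query / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))
    order = np.argsort(-sims)[:k]  # indices sorted by descending similarity
    return [(int(i), float(sims[i])) for i in order]


# Hypothetical query vector standing in for the embedding of "sepatu lari".
query_vec = [0.24, -0.44, 0.86, 0.13]
for idx, score in cosine_top_k(query_vec, VECTORS):
    print(f"{LABELS[idx]}: {score:.3f}")
```

The three shoe-related entries rank at the top, while "mobil sport" (sports car) gets a negative score because its vector points in a very different direction.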
Why Do Vector Databases Matter?
| Use Case | Example Application | Why a Vector DB? |
|---|---|---|
| Semantic Search | Smart search engine | Search by meaning, not by keyword |
| RAG (Retrieval Augmented Generation) | ChatGPT + knowledge base | Lets the LLM answer questions from your data |
| Recommendation | Similar products | Find items with similar embeddings |
| Image Search | Visual image search | Image → vector → similarity lookup |
| Anomaly Detection | Fraud detection | Outliers have vectors far from any cluster |
| Clustering | Automatic grouping | Group data by semantic similarity |
What is Pinecone?
Pinecone is a managed (fully hosted) vector database and one of the most widely used. Its main advantages: no servers to set up, automatic scaling, low query latency (often quoted as under 50 ms, although this depends on index size and region), and easy integration with the AI/ML ecosystem, including OpenAI, LangChain, and LlamaIndex.
2. Embeddings: Representing Data as Vectors
Embedding is the process of converting data (text, images, audio) into high-dimensional numeric vectors. Each vector represents the "meaning" or "features" of the data as a point in a mathematical space, so that similar items land close together.
Popular Embedding Models
| Model | Dimensions | Provider | Best For |
|---|---|---|---|
| text-embedding-3-small | 1536 | OpenAI | General text, cost-effective |
| text-embedding-3-large | 3072 | OpenAI | Text, higher accuracy |
| all-MiniLM-L6-v2 | 384 | Sentence Transformers | Free, fast, runs locally |
| multilingual-e5-large | 1024 | Microsoft | Multilingual (including Indonesian) |
| embed-english-v3.0 | 1024 | Cohere | English text |
| gecko | 768 | Google | Multi-purpose |
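Because an index is created with a fixed dimension, a vector from the wrong model is rejected at upsert time. The sketch below shows one way to catch that earlier. The `MODEL_DIMENSIONS` lookup copies a subset of the table above, and `check_vector` is a hypothetical helper, not part of any SDK.

```python
# Output dimensions copied from the table above (subset).
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "all-MiniLM-L6-v2": 384,
    "multilingual-e5-large": 1024,
    "embed-english-v3.0": 1024,
}


def check_vector(model_name, vector, index_dimension):
    """Raise ValueError if the vector cannot be stored in an index of this dimension."""
    expected = MODEL_DIMENSIONS[model_name]
    if expected != index_dimension:
        raise ValueError(
            f"{model_name} produces {expected}-dim vectors, "
            f"but the index expects {index_dimension}"
        )
    if len(vector) != expected:
        raise ValueError(f"vector has {len(vector)} values, expected {expected}")
    return True


# Placeholder vector standing in for a real embedding.
vector = [0.0] * 384
check_vector("all-MiniLM-L6-v2", vector, index_dimension=384)
print("vector is compatible with the index")
```

Running a check like this before indexing turns a late API error into an early, readable one.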
# =============================================
# EMBEDDINGS with OpenAI
# =============================================
# pip install openai
import openai

client = openai.OpenAI(api_key="sk-...")

# Create an embedding for a single text
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="A complete guide to learning Python for beginners"
)
vector = response.data[0].embedding
print(f"Dimension: {len(vector)}")  # 1536
print(f"First 5 elements: {vector[:5]}")
# e.g. [0.0234, -0.0456, 0.0789, -0.0123, 0.0567] (illustrative values)

# Create embeddings for many texts at once (batch)
texts = [
    "How to learn Python for beginners",
    "Basic JavaScript tutorial",
    "MySQL database guide",
    "Special fried rice recipe",
    "Football match schedule"
]
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)
embeddings = [item.embedding for item in response.data]
print(f"Number of embeddings: {len(embeddings)}")  # 5
# =============================================
# EMBEDDINGS with Sentence Transformers (FREE, LOCAL)
# =============================================
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

# Embed a single text
vector = model.encode("Learning Python from scratch")
print(f"Dimension: {len(vector)}")  # 384

# Embed a batch
texts = [
    "How to learn Python for beginners",
    "Basic JavaScript tutorial",
    "MySQL database guide",
    "Special fried rice recipe"
]
vectors = model.encode(texts)
print(f"Shape: {vectors.shape}")  # (4, 384)

# =============================================
# MULTILINGUAL EMBEDDINGS (e.g. for Indonesian)
# =============================================
model_multi = SentenceTransformer('intfloat/multilingual-e5-large')

# Embed an Indonesian question ("What is machine learning?");
# E5 models expect the "query: " prefix on search queries
vector_id = model_multi.encode("query: Apa itu machine learning?")
print(f"Dimension: {len(vector_id)}")  # 1024
- Model consistency: use the SAME model for indexing and for querying
- Indonesian text: use a multilingual model (multilingual-e5, BGE-M3)
- Chunking: split long documents into chunks of 200-500 tokens before embedding
- Query prefix: some models require a "query:" or "passage:" prefix (the E5 family, for example)
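The prefix tip is easy to get wrong, because queries and stored passages need different prefixes. A minimal sketch, assuming an E5-style model that expects "query: " and "passage: "; the helper names `as_query` and `as_passage` are made up for illustration.

```python
def as_query(text: str) -> str:
    """Format a search query the way E5-style models expect."""
    return f"query: {text.strip()}"


def as_passage(text: str) -> str:
    """Format a document chunk the way E5-style models expect."""
    return f"passage: {text.strip()}"


documents = [
    "Machine learning adalah cabang dari kecerdasan buatan.",
    "Nasi goreng adalah makanan khas Indonesia.",
]
passages = [as_passage(doc) for doc in documents]   # use these at indexing time
question = as_query("Apa itu machine learning?")    # use this at query time

print(passages[0])
print(question)
# With sentence-transformers installed, these strings would be passed to
# model.encode(...) in place of the raw text.
```

Keeping the prefixing in two small functions makes it harder to index with one convention and query with another.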
3. Setting Up Pinecone
# =============================================
# STEP 1: Install
# =============================================
# pip install pinecone

# =============================================
# STEP 2: Initialize Pinecone
# =============================================
from pinecone import Pinecone, ServerlessSpec

# Initialize the client
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")

# =============================================
# STEP 3: Create an index (the container for your vectors)
# =============================================
# Check which indexes already exist
existing_indexes = pc.list_indexes().names()
print(f"Existing indexes: {existing_indexes}")

# Create a new index if it does not exist yet
if 'tutorial-index' not in existing_indexes:
    pc.create_index(
        name='tutorial-index',
        dimension=1536,   # Must match the embedding model's dimension
        metric='cosine',  # cosine, euclidean, dotproduct
        spec=ServerlessSpec(
            cloud='aws',
            region='us-east-1'
        )
    )
    print("Index 'tutorial-index' created!")

# Connect to the index
index = pc.Index('tutorial-index')

# Check the index statistics
stats = index.describe_index_stats()
print(f"Total vectors: {stats.total_vector_count}")
print(f"Dimension: {stats.dimension}")
print(f"Namespaces: {list(stats.namespaces.keys())}")

# =============================================
# DISTANCE METRIC: pick the one that fits
# =============================================
# cosine     → Good fit for semantic search (the most common choice)
#              Measures the angle between two vectors (range -1 to 1, 1 = same direction)
#
# euclidean  → Measures the straight-line distance between two points
#              Good fit for numeric/spatial data
#
# dotproduct → Like cosine but without length normalization
#              Good fit for vectors that are already normalized
─────────────────────────────────────────────────────────────────
                       SIMILARITY METRICS

  COSINE SIMILARITY
  ─────────────────
  Measures the ANGLE between two vectors

    v1 ●        ● v2
        \  θ   /        cos(θ) =  1 → same direction (identical)
         \    /         cos(θ) =  0 → orthogonal (unrelated)
          \  /          cos(θ) = -1 → opposite
           \/

  Range: [-1, 1], higher = more similar
  Best for: text embeddings, semantic search

  EUCLIDEAN DISTANCE
  ──────────────────
  Measures the STRAIGHT-LINE DISTANCE between two points

    (1,4) ●             d = √(Σ(a-b)²)
           \            d = 0   → identical
            \           d small → similar
             ● (4,1)    d large → different

  Range: [0, ∞), smaller = more similar
  Best for: spatial data, numerical features
─────────────────────────────────────────────────────────────────
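The three metrics are closely related, which a few lines of NumPy can show. This is a small sketch using the two points from the diagram, (1,4) and (4,1); the helper function names are illustrative. For unit-length vectors, the dot product equals the cosine similarity and the squared Euclidean distance equals 2 - 2·cos(θ), so all three metrics produce the same ranking.

```python
import numpy as np


def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def euclidean_distance(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))


def normalize(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)


a, b = [1.0, 4.0], [4.0, 1.0]
print(f"cosine:    {cosine_similarity(a, b):.4f}")   # 8/17 → 0.4706
print(f"euclidean: {euclidean_distance(a, b):.4f}")  # sqrt(18) → 4.2426

# After normalization, dot product and cosine similarity coincide,
# and squared Euclidean distance is a simple function of the cosine.
ua, ub = normalize(a), normalize(b)
dot = float(ua @ ub)
print(f"dot of unit vectors: {dot:.4f}")
print(f"d^2 vs 2 - 2cos:     {euclidean_distance(ua, ub) ** 2:.4f} vs {2 - 2 * dot:.4f}")
```

This is why the choice of metric matters less once vectors are normalized, and why `dotproduct` is recommended only for vectors that already have unit length.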
4. CRUD Operations: Upsert, Query, Delete
# =============================================
# UPSERT: store vectors (create or update)
# =============================================
# Format: a list of dicts with "id", "values", and optional "metadata"
# (the sample metadata below describes Indonesian-language documents)
vectors_to_upsert = [
    {
        "id": "doc_001",
        # Placeholder: a real vector needs all 1536 values to match the index
        "values": [0.023, -0.045, 0.078, 0.012, 0.056],
        "metadata": {
            "title": "Tutorial Python Pemula",
            "category": "programming",
            "language": "id",
            "source": "beebanelabs.com",
            "year": 2026,
            "chunk_text": "Python adalah bahasa pemrograman serbaguna..."
        }
    },
    {
        "id": "doc_002",
        "values": [0.025, -0.042, 0.085, 0.015, 0.050],
        "metadata": {
            "title": "Belajar JavaScript untuk Pemula",
            "category": "programming",
            "language": "id",
            "source": "beebanelabs.com",
            "year": 2026,
            "chunk_text": "JavaScript adalah bahasa pemrograman web..."
        }
    },
    {
        "id": "doc_003",
        "values": [-0.067, 0.034, -0.021, 0.078, 0.091],
        "metadata": {
            "title": "Resep Nasi Goreng Spesial",
            "category": "cooking",
            "language": "id",
            "source": "resepmama.com",
            "year": 2025,
            "chunk_text": "Nasi goreng adalah makanan khas Indonesia..."
        }
    }
]

# Upsert into the index
index.upsert(vectors=vectors_to_upsert)
print(f"Upserted {len(vectors_to_upsert)} vectors")

# Check the statistics after the upsert
stats = index.describe_index_stats()
print(f"Total vectors: {stats.total_vector_count}")

# =============================================
# UPSERT with a NAMESPACE (partitioning data)
# =============================================
# Namespaces separate data inside the same index
# ([...] stands for the full embedding vector)
index.upsert(
    vectors=[
        {"id": "art_001", "values": [...], "metadata": {"title": "..."}}
    ],
    namespace="articles"  # Namespace for articles
)
index.upsert(
    vectors=[
        {"id": "prod_001", "values": [...], "metadata": {"title": "..."}}
    ],
    namespace="products"  # Namespace for products
)
# =============================================
# QUERY: find similar vectors
# =============================================
# Create an embedding for the query
query_text = "how to learn coding for beginners"
query_vector = get_embedding(query_text)  # Your own embedding function

# Find the 3 most similar vectors
results = index.query(
    vector=query_vector,
    top_k=3,
    include_metadata=True
)

# Display the results
for match in results.matches:
    print(f"ID: {match.id}")
    print(f"Score: {match.score:.4f}")  # Similarity score
    print(f"Title: {match.metadata.get('title')}")
    print(f"Category: {match.metadata.get('category')}")
    print(f"Text: {match.metadata.get('chunk_text', '')[:100]}...")
    print("---")

# =============================================
# QUERY with a FILTER
# =============================================
# Filter on metadata (multiple fields are combined with AND)
filtered_results = index.query(
    vector=query_vector,
    top_k=5,
    include_metadata=True,
    filter={
        "category": {"$eq": "programming"},  # Programming only
        "year": {"$gte": 2025}               # Year >= 2025
    }
)
# Filter operators:
# $eq  : equal to
# $ne  : not equal to
# $gt  : greater than
# $gte : greater than or equal to
# $lt  : less than
# $lte : less than or equal to
# $in  : in the list
# $nin : not in the list

# =============================================
# FETCH: get vectors by ID
# =============================================
fetched = index.fetch(ids=["doc_001", "doc_002"])
for vid, vector_data in fetched.vectors.items():
    print(f"ID: {vid}")
    print(f"Metadata: {vector_data.metadata}")

# =============================================
# UPDATE: change a vector's metadata
# =============================================
index.update(
    id="doc_001",
    set_metadata={"views": 1500, "updated": True}
)

# =============================================
# DELETE: remove vectors
# =============================================
# Delete by ID
index.delete(ids=["doc_003"])

# Delete everything in a namespace
index.delete(delete_all=True, namespace="articles")

# Delete by metadata filter (support has varied by index type,
# so verify it against the Pinecone docs for your index)
index.delete(filter={"category": {"$eq": "deprecated"}})
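To clarify how the filter operators listed above combine, here is a small pure-Python evaluator that mimics their semantics on plain dictionaries. It is an illustrative sketch only, not Pinecone's implementation: the `matches` helper is hypothetical, and treating a missing metadata field as "no match" is a simplifying assumption.

```python
import operator

# Map each filter operator to a Python comparison: f(stored_value, operand).
OPERATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(metadata, filter_dict):
    """Return True if metadata satisfies every condition (implicit AND)."""
    for field, condition in filter_dict.items():
        if not isinstance(condition, dict):
            condition = {"$eq": condition}  # shorthand form: {"field": value}
        if field not in metadata:
            return False  # assumption: a missing field never matches
        value = metadata[field]
        for op, operand in condition.items():
            if not OPERATORS[op](value, operand):
                return False
    return True


records = {
    "doc_001": {"category": "programming", "year": 2026},
    "doc_003": {"category": "cooking", "year": 2025},
    "doc_old": {"category": "programming", "year": 2023},
}
query_filter = {"category": {"$eq": "programming"}, "year": {"$gte": 2025}}
selected = [rid for rid, meta in records.items() if matches(meta, query_filter)]
print(selected)  # ['doc_001']
```

Only `doc_001` passes because both conditions must hold: `doc_003` fails the category check and `doc_old` fails the year check.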
5. Similarity Search & Filtering
# =============================================
# SEMANTIC SEARCH: search by meaning
# =============================================
def semantic_search(query, top_k=5, filter_dict=None):
    """Semantic search with an optional metadata filter"""
    query_vector = get_embedding(query)
    results = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True,
        filter=filter_dict
    )
    return [
        {
            "id": match.id,
            "score": match.score,
            "title": match.metadata.get("title", ""),
            "text": match.metadata.get("chunk_text", ""),
            "category": match.metadata.get("category", ""),
            "metadata": match.metadata
        }
        for match in results.matches
    ]

# Example usage:
# "machine learning" → finds articles about ML, AI, deep learning
results = semantic_search("machine learning for beginners")
for r in results:
    print(f"[{r['score']:.3f}] {r['title']}")

# With a filter:
results = semantic_search(
    "coding tutorial",
    filter_dict={"category": "programming", "language": "id"}
)

# =============================================
# FILTERED SEARCH: vector similarity + metadata filter
# =============================================
# (Note: "hybrid search" more commonly refers to combining dense and
# sparse vectors; this example is vector search with a metadata filter.)
# Find programming articles, but only recent ones
results = index.query(
    vector=get_embedding("best web framework"),
    top_k=10,
    include_metadata=True,
    filter={
        "category": {"$in": ["programming", "web-dev"]},
        "year": {"$gte": 2025},
        "language": {"$eq": "id"}
    }
)

# =============================================
# MULTI-QUERY: combine several queries
# =============================================
import numpy as np

def multi_query_search(queries, weights=None, top_k=5):
    """Combine several queries into a single vector"""
    if weights is None:
        weights = [1.0 / len(queries)] * len(queries)
    # Create an embedding for each query
    vectors = [get_embedding(q) for q in queries]
    # Weighted average (explicit float array, independent of the input type)
    combined = np.zeros(len(vectors[0]), dtype=float)
    for vec, weight in zip(vectors, weights):
        combined += np.array(vec) * weight
    # Normalize to unit length
    combined = combined / np.linalg.norm(combined)
    return index.query(
        vector=combined.tolist(),
        top_k=top_k,
        include_metadata=True
    )

# Example: find content about both "Python" AND "data science"
results = multi_query_search(
    ["Python programming", "data science tutorial"],
    weights=[0.6, 0.4]  # Python gets higher priority
)

# =============================================
# SEARCH + RE-RANKING
# =============================================
def search_with_rerank(query, top_k=20, final_k=5):
    """Fetch many candidates, then re-rank them"""
    # Step 1: Fetch candidates from vector search
    candidates = index.query(
        vector=get_embedding(query),
        top_k=top_k,
        include_metadata=True
    )
    # Step 2: Re-rank by relevance (e.g. with an LLM)
    texts = [m.metadata.get("chunk_text", "") for m in candidates.matches]
    # Simple example: re-rank by keyword overlap
    query_words = set(query.lower().split())
    scored = []
    for match, text in zip(candidates.matches, texts):
        text_words = set(text.lower().split())
        overlap = len(query_words & text_words)
        # max(..., 1) avoids division by zero for an empty query
        final_score = match.score * 0.7 + (overlap / max(len(query_words), 1)) * 0.3
        scored.append((match, final_score))
    scored.sort(key=lambda x: x[1], reverse=True)
    return scored[:final_k]
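The weighted-average step in the multi-query search can be checked in isolation, without any API calls. The sketch below uses two made-up 2-dimensional vectors in place of real embeddings; `combine_queries` is a hypothetical helper that mirrors the combination logic above and adds a guard for the case where the weighted vectors cancel out.

```python
import numpy as np


def combine_queries(vectors, weights=None):
    """Weighted average of query vectors, rescaled to unit length."""
    vectors = np.asarray(vectors, dtype=float)
    if weights is None:
        weights = np.full(len(vectors), 1.0 / len(vectors))
    weights = np.asarray(weights, dtype=float)
    combined = weights @ vectors  # sum of weight_i * vector_i
    norm = np.linalg.norm(combined)
    if norm == 0:
        raise ValueError("weighted vectors cancel out; cannot normalize")
    return combined / norm


# Toy stand-ins for the embeddings of two different queries.
python_vec = np.array([1.0, 0.0])
data_science_vec = np.array([0.0, 1.0])

combined = combine_queries([python_vec, data_science_vec], weights=[0.6, 0.4])
print("combined:", combined)
print("similarity to 'Python' query:      ", float(combined @ python_vec))
print("similarity to 'data science' query:", float(combined @ data_science_vec))
```

The combined vector has unit length and sits closer to the query with the larger weight, which is the intended effect of `weights=[0.6, 0.4]`.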
6. RAG: Retrieval Augmented Generation
RAG (Retrieval Augmented Generation) is a technique that combines retrieval (fetching relevant data from a vector database) with generation (an LLM producing the answer). It lets an LLM answer questions based on your own data without any fine-tuning.
─────────────────────────────────────────────────────────────────
                          RAG PIPELINE

  User query: "How much is the BeebaneLabs Enterprise plan?"
                     │
                     ▼
  ┌─────────────────────────────────────┐
  │ STEP 1: RETRIEVAL                   │
  │ Query → Embedding → Vector Search   │
  │ in Pinecone                         │
  └──────────────────┬──────────────────┘
                     │
                     ▼
  Top-K results (relevant documents):
    • "Enterprise plan: Rp 5 million/month..."  (score: 0.95)
    • "Enterprise features include..."          (score: 0.89)
    • "Plan comparison..."                      (score: 0.82)
                     │
                     ▼
  ┌─────────────────────────────────────┐
  │ STEP 2: AUGMENTATION                │
  │ Combine: System Prompt +            │
  │ Context (retrieved docs) +          │
  │ User Query                          │
  └──────────────────┬──────────────────┘
                     │
                     ▼
  ┌─────────────────────────────────────┐
  │ STEP 3: GENERATION                  │
  │ LLM (GPT-4, Claude, etc.)           │
  │ generates an answer based on        │
  │ the supplied context                │
  └──────────────────┬──────────────────┘
                     │
                     ▼
  "The BeebaneLabs Enterprise plan costs Rp 5,000,000 per month.
   It includes unlimited users, priority support, and a custom
   domain."

  ✅ Accurate answers from your own data
  ✅ No fine-tuning required
  ✅ Can cite its sources
─────────────────────────────────────────────────────────────────
# =============================================
# FULL RAG PIPELINE
# =============================================
# pip install pinecone openai
import openai
from pinecone import Pinecone

# Setup
pc = Pinecone(api_key="YOUR_PINECONE_KEY")
index = pc.Index("knowledge-base")
openai_client = openai.OpenAI(api_key="YOUR_OPENAI_KEY")

def get_embedding(text, model="text-embedding-3-small"):
    """Create an embedding from text"""
    response = openai_client.embeddings.create(
        model=model, input=text
    )
    return response.data[0].embedding

def retrieve_context(query, top_k=5, namespace=None):
    """Fetch relevant context from Pinecone"""
    query_vector = get_embedding(query)
    results = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True,
        namespace=namespace
    )
    contexts = []
    sources = []
    for match in results.matches:
        if match.score > 0.7:  # Similarity threshold (tune per model and dataset)
            contexts.append(match.metadata.get("chunk_text", ""))
            sources.append({
                "id": match.id,
                "title": match.metadata.get("title", ""),
                "score": match.score
            })
    return contexts, sources

def generate_answer(query, contexts, sources):
    """Generate an answer with the LLM"""
    context_text = "\n\n---\n\n".join(contexts)
    system_prompt = """You are an AI assistant that answers
questions based on the provided context.
Rules:
1. Answer ONLY from the provided context
2. If the information is not in the context, say "I could not find
that information in our database"
3. Give clear, informative answers
4. Mention the source where relevant"""
    user_prompt = f"""Context from the database:
---
{context_text}
---
User question: {query}
Answer the question based on the context above."""
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        temperature=0.3,  # Lower = more deterministic, sticks closer to the context
        max_tokens=1000
    )
    return response.choices[0].message.content

def rag_pipeline(query, namespace=None):
    """Full RAG pipeline"""
    # Step 1: Retrieve
    contexts, sources = retrieve_context(query, namespace=namespace)
    if not contexts:
        return "Sorry, I could not find any relevant information.", []
    # Step 2 + 3: Augment + Generate
    answer = generate_answer(query, contexts, sources)
    return answer, sources

# =============================================
# EXAMPLE USAGE
# =============================================
# A question about your own data
query = "How do I install Python on Windows?"
answer, sources = rag_pipeline(query)
print("Answer:", answer)
print("\nSources:")
for s in sources:
    print(f"  - {s['title']} (score: {s['score']:.3f})")
# =============================================
# INGEST DATA: loading documents into Pinecone
# =============================================
def ingest_document(doc_id, text, metadata, chunk_size=500):
    """Split a document into chunks and load them into Pinecone"""
    # Step 1: Chunking (split long text)
    # Note: chunk_size is counted in words here, not model tokens
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_size):
        chunk = " ".join(words[i:i + chunk_size])
        chunks.append(chunk)
    # Step 2: Embedding + Upsert
    vectors = []
    for i, chunk in enumerate(chunks):
        embedding = get_embedding(chunk)
        chunk_id = f"{doc_id}_chunk_{i}"
        vectors.append({
            "id": chunk_id,
            "values": embedding,
            "metadata": {
                **metadata,
                "chunk_text": chunk,
                "chunk_index": i,
                "parent_doc_id": doc_id
            }
        })
    # Batch upsert (Pinecone caps a request at 1000 vectors;
    # 100 per batch keeps each payload small)
    batch_size = 100
    for i in range(0, len(vectors), batch_size):
        batch = vectors[i:i + batch_size]
        index.upsert(vectors=batch)
    print(f"Ingested {len(chunks)} chunks for doc {doc_id}")

# Example ingest:
ingest_document(
    doc_id="tutorial_python",
    text="Python adalah bahasa pemrograman serbaguna yang diciptakan oleh Guido van Rossum...",
    metadata={
        "title": "Tutorial Python Lengkap",
        "category": "programming",
        "source": "beebanelabs.com"
    }
)
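The ingest function above cuts the text into back-to-back chunks, so a sentence that straddles a boundary is split between two chunks. A common remedy is to let consecutive chunks overlap. The sketch below is one possible variant, not the original method: `chunk_words` is a hypothetical helper that counts words rather than model tokens, so the sizes only approximate token counts.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks where neighbors share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=2):
    print(chunk)
```

With `chunk_size=4` and `overlap=2`, each chunk repeats the last two words of the previous one, so content near a boundary appears intact in at least one chunk.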
7. Use Cases: Semantic Search & Recommendation
# =============================================
# USE CASE 1: Semantic Search Engine
# =============================================
import os
from pinecone import Pinecone

class SemanticSearchEngine:
    def __init__(self, index_name, embedding_model="text-embedding-3-small"):
        pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))
        self.index = pc.Index(index_name)
        self.model = embedding_model

    def search(self, query, filters=None, top_k=10):
        vector = get_embedding(query, self.model)
        return self.index.query(
            vector=vector,
            top_k=top_k,
            include_metadata=True,
            filter=filters
        )

    def search_with_threshold(self, query, threshold=0.75, **kwargs):
        results = self.search(query, **kwargs)
        return [m for m in results.matches if m.score >= threshold]

# Usage:
engine = SemanticSearchEngine("knowledge-base")
results = engine.search("how to deploy an application to the cloud")

# =============================================
# USE CASE 2: Product Recommendation
# =============================================
def recommend_products(product_id, top_k=5):
    """Recommend similar products"""
    # Fetch the vector of the product being viewed
    product = index.fetch(ids=[product_id])
    product_vector = product.vectors[product_id].values
    # Find products with similar embeddings
    similar = index.query(
        vector=product_vector,
        top_k=top_k + 1,  # +1 because the product itself also shows up
        include_metadata=True,
        filter={"in_stock": {"$eq": True}}
    )
    # Exclude the product being viewed
    return [m for m in similar.matches if m.id != product_id][:top_k]

# =============================================
# USE CASE 3: Duplicate Detection
# =============================================
def find_duplicates(texts, threshold=0.95):
    """Find texts that are nearly identical to indexed content"""
    embeddings = [get_embedding(t) for t in texts]
    duplicates = []
    for i in range(len(texts)):
        results = index.query(
            vector=embeddings[i],
            top_k=5,
            include_metadata=True
        )
        for match in results.matches:
            # Assumes texts[i] was indexed under the ID "doc_{i}"
            if match.score >= threshold and match.id != f"doc_{i}":
                duplicates.append({
                    "original": texts[i],
                    # "chunk_text" is the metadata key used at ingest time
                    "duplicate": match.metadata.get("chunk_text", ""),
                    "score": match.score
                })
    return duplicates

# =============================================
# USE CASE 4: Image Similarity Search
# =============================================
# pip install transformers torch pillow
# from PIL import Image
# from transformers import CLIPProcessor, CLIPModel
# def image_to_vector(image_path):
#     model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
#     processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
#
#     image = Image.open(image_path)
#     inputs = processor(images=image, return_tensors="pt")
#     vector = model.get_image_features(**inputs)
#     return vector.detach().numpy().flatten().tolist()
#
# def search_similar_images(image_path, top_k=5):
#     query_vector = image_to_vector(image_path)
#     return index.query(vector=query_vector, top_k=top_k)
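For small collections, near-duplicate detection can also be prototyped locally before involving a vector database. The sketch below compares every pair of vectors with cosine similarity; the toy vectors and the `find_near_duplicates` helper are illustrative assumptions, and the all-pairs loop only suits a few thousand items.

```python
import numpy as np


def find_near_duplicates(vectors, threshold=0.95):
    """Return (i, j, similarity) for every pair whose cosine similarity >= threshold."""
    vectors = np.asarray(vectors, dtype=float)
    # Normalize rows so that a plain dot product equals cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    pairs = []
    n = len(unit)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only: skip self and mirrored pairs
            if sims[i, j] >= threshold:
                pairs.append((i, j, float(sims[i, j])))
    return pairs


# Made-up vectors: rows 0 and 1 point in almost the same direction.
toy_vectors = [
    [1.00, 0.00],
    [0.99, 0.05],
    [0.00, 1.00],
]
for i, j, score in find_near_duplicates(toy_vectors):
    print(f"items {i} and {j} look like duplicates (similarity {score:.3f})")
```

Once the threshold behaves sensibly on a sample, the same cutoff can be reused for index-backed lookups like the one above.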
8. Pinecone vs Alternatives
| Database | Type | Strengths | Weaknesses |
|---|---|---|---|
| Pinecone | Managed cloud | Easy, fast, scalable | Vendor lock-in, cost |
| Weaviate | Open-source / cloud | Modular, GraphQL API | More complex setup |
| Qdrant | Open-source / cloud | Advanced filtering, written in Rust | Smaller ecosystem |
| Milvus | Open-source | Highly scalable, GPU support | Needs substantial infrastructure |
| ChromaDB | Open-source, embedded | Simple, Python-first | Less proven for large production workloads |
| pgvector | PostgreSQL extension | Reuses existing PG infrastructure | Can lag dedicated vector DBs at large scale |
| FAISS | Library (not a DB) | Very fast, from Meta | No database layer: you save and load index files yourself |
9. Best Practices & Optimization
- Good chunking: split documents into 200-500 tokens with an overlap of 50-100 tokens
- Model consistency: always use the SAME embedding model for indexing and querying
- Metadata filters: narrow the candidate set with metadata filters so the vector search does less work
- Namespaces: separate data per use case (articles, products, users)
- Batch upserts: do not upsert one vector at a time; batching is far more efficient
- Top-K selection: start with top_k=10-20, then re-rank for accuracy
- Threshold: set a minimum similarity score (0.7-0.8) to filter out noise
- Monitoring: track recall@k and latency to evaluate performance
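The monitoring tip mentions recall@k, which is simple to compute once you have a few queries with known relevant documents. A minimal sketch, assuming you keep a hand-labeled set of relevant IDs per test query; the `recall_at_k` helper and the example IDs are hypothetical.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # nothing to find, so define recall as 0
    hits = len(set(retrieved_ids[:k]) & relevant)
    return hits / len(relevant)


# IDs in the order the search returned them (made-up example).
retrieved = ["doc_001", "doc_007", "doc_002", "doc_004"]
# IDs a human judged relevant for this query (made-up example).
relevant = {"doc_001", "doc_004", "doc_009"}

print(f"recall@3 = {recall_at_k(retrieved, relevant, 3):.2f}")  # 0.33
print(f"recall@4 = {recall_at_k(retrieved, relevant, 4):.2f}")  # 0.67
```

Averaging this value over a fixed set of test queries gives a number you can compare before and after changing the chunk size, the embedding model, or `top_k`.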
10. Comprehension Quiz
1. What does a vector database store?
2. Why can semantic search find both "sepatu olahraga" and "running shoes"?
3. What is RAG?
4. Why is chunking needed before storing documents in a vector database?
5. Which metric is most commonly used for semantic search?
Summary
- Vector database: stores embedding vectors and searches by similarity of meaning
- Embeddings: turn text or images into high-dimensional numeric vectors
- Cosine similarity: the most common metric for semantic search
- Pinecone: a managed vector DB that is easy to set up and offers low latency
- RAG: retrieval from a vector DB combined with LLM generation, giving answers grounded in your data
- Chunking: split documents into small pieces for more precise retrieval
- Model consistency: always use the same embedding model