1. Introduction to Elasticsearch
Elasticsearch is a search engine and analytics database built on top of the Apache Lucene library. It is very fast at full-text search, filtering, and aggregations over large data sets, from millions to billions of documents.
Elasticsearch is part of the Elastic Stack (formerly the ELK Stack), which consists of:
| Component | Purpose |
|---|---|
| Elasticsearch | Search engine & analytics database |
| Logstash | Data pipeline: collect, transform, load |
| Kibana | Visualization & dashboard UI |
| Beats | Lightweight data shippers (Filebeat, Metricbeat) |
When to Use Elasticsearch?
| Scenario | Good fit? |
|---|---|
| Full-text search (e-commerce, blog, docs) | ✅ Excellent fit |
| Log monitoring & analytics | ✅ Excellent fit |
| Real-time analytics (dashboard) | ✅ Good fit |
| ACID transaction database | ❌ Poor fit; use an RDBMS |
| Relational data with JOINs | ❌ Poor fit; use an RDBMS |
| Data archiving (cold storage) | ⚠️ Possible but expensive; use object storage such as S3 |
┌─────────────────────────────────────────────────────────────────┐
│                      ELASTICSEARCH CLUSTER                      │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ Master Node  │  │ Master Node  │  │ Master Node  │           │
│  │ (eligible)   │  │ (eligible)   │  │ (active)     │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
│                                                                 │
│  Index: products (3 shards, 1 replica)                          │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                 │
│  │ Shard 0    │  │ Shard 1    │  │ Shard 2    │  ← Primary      │
│  │ (Node 1)   │  │ (Node 2)   │  │ (Node 3)   │                 │
│  └────────────┘  └────────────┘  └────────────┘                 │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                 │
│  │ Replica 0  │  │ Replica 1  │  │ Replica 2  │  ← Replicas     │
│  │ (Node 2)   │  │ (Node 3)   │  │ (Node 1)   │                 │
│  └────────────┘  └────────────┘  └────────────┘                 │
│                                                                 │
│  Client → Sends a request to any node (coordinating)            │
│  Node   → Routes to the right shard → returns the result        │
└─────────────────────────────────────────────────────────────────┘
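The routing step in the diagram can be sketched roughly as follows. The coordinating node hashes a routing key (the document `_id` by default) and takes it modulo the number of primary shards, which is also why the primary shard count is fixed when the index is created. This is only an illustration: `pick_shard` and `replica_node` are hypothetical helper names, and `zlib.crc32` stands in for the Murmur3 hash Elasticsearch uses internally, so real shard numbers will differ.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (the document _id by default) to a primary shard.

    Assumption: zlib.crc32 stands in for the Murmur3 hash Elasticsearch
    uses internally, so real shard numbers will differ.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def replica_node(primary_node: int, num_nodes: int) -> int:
    """Put the replica on the next node (1-based), as drawn in the diagram."""
    return primary_node % num_nodes + 1


if __name__ == "__main__":
    for doc_id in ["1", "2", "3", "4", "5"]:
        shard = pick_shard(doc_id, num_primary_shards=3)
        print(f"doc {doc_id} -> primary shard {shard}")
    for node in (1, 2, 3):
        print(f"primary on node {node} -> replica on node {replica_node(node, 3)}")
```

The key property is that a replica never shares a node with its own primary, so losing one node still leaves a full copy of every shard.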
2. Installation & Setup
# =============================================
# DOCKER (Fastest Option for Development)
# =============================================
# Elasticsearch + Kibana
# Security is disabled here for local development only; keep it enabled anywhere else.
docker network create elastic
docker run -d --name elasticsearch \
  --net elastic \
  -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  docker.elastic.co/elasticsearch/elasticsearch:8.14.0
docker run -d --name kibana \
  --net elastic \
  -p 5601:5601 \
  -e "ELASTICSEARCH_HOSTS=http://elasticsearch:9200" \
  docker.elastic.co/kibana/kibana:8.14.0
# Wait ~30 seconds, then test
curl http://localhost:9200
# Kibana: http://localhost:5601
# =============================================
# UBUNTU/DEBIAN (Manual Installation)
# =============================================
# apt-key is deprecated on current Debian/Ubuntu releases, so the key goes into a dedicated keyring
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | \
  sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
sudo apt install apt-transport-https
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | \
  sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update && sudo apt install elasticsearch
# Edit config
sudo nano /etc/elasticsearch/elasticsearch.yml
# cluster.name: my-cluster
# network.host: 0.0.0.0
# discovery.type: single-node
# xpack.security.enabled: false
# Start
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch
# =============================================
# TEST API
# =============================================
# Cluster health
curl -s http://localhost:9200/_cluster/health | python3 -m json.tool
# Cluster info
curl -s http://localhost:9200 | python3 -m json.tool
# Node stats
curl -s http://localhost:9200/_nodes/stats | python3 -m json.tool
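The `_cluster/health` response carries a `status` field that is `green`, `yellow`, or `red`. On a single-node setup like the one above, any index with replicas stays `yellow`, because a replica cannot be allocated on the same node as its primary. A minimal sketch of a readiness check is shown below; `is_cluster_usable` is a hypothetical helper, and the sample JSON is a trimmed, made-up response rather than real output.

```python
import json


def is_cluster_usable(health_json: str, allow_yellow: bool = True) -> bool:
    """Decide whether a cluster is usable from a _cluster/health response body."""
    status = json.loads(health_json).get("status")
    if status == "green":
        return True
    if status == "yellow":
        # Yellow: all primaries are assigned, but some replicas are not.
        return allow_yellow
    # Red (or missing status): at least one primary shard is unavailable.
    return False


# Hypothetical, trimmed response from a single-node development cluster
SAMPLE_HEALTH = '{"cluster_name": "docker-cluster", "status": "yellow", "number_of_nodes": 1}'

if __name__ == "__main__":
    print(is_cluster_usable(SAMPLE_HEALTH))
    print(is_cluster_usable(SAMPLE_HEALTH, allow_yellow=False))
```

Accepting `yellow` is reasonable for local development; a production check would more likely insist on `green`.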
3. Index & Document Management
In Elasticsearch, data is stored as documents inside an index. As an analogy: index = table, document = row, field = column. Unlike an RDBMS, however, each document is a flexible JSON object (schema-free): fields that are not declared in the mapping get a type assigned automatically through dynamic mapping.
# =============================================
# CREATE AN INDEX
# =============================================
# Create the index with settings and mappings
curl -X PUT "http://localhost:9200/products" -H 'Content-Type: application/json' -d '{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"analysis": {
"analyzer": {
"indonesian": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "stop"]
}
}
}
},
"mappings": {
"properties": {
"name": { "type": "text", "analyzer": "indonesian" },
"description": { "type": "text" },
"price": { "type": "float" },
"category": { "type": "keyword" },
"stock": { "type": "integer" },
"created_at": { "type": "date" },
"tags": { "type": "keyword" }
}
}
}'
# Inspect the index
curl -s "http://localhost:9200/products" | python3 -m json.tool
# Delete the index
curl -X DELETE "http://localhost:9200/products"
# =============================================
# INDEX DOCUMENT (Create/Update)
# =============================================
# Index a document with an auto-generated ID
curl -X POST "http://localhost:9200/products/_doc" -H 'Content-Type: application/json' -d '{
"name": "Laptop ASUS ROG Strix",
"description": "Laptop gaming dengan processor Intel i9 dan GPU RTX 4090",
"price": 25000000,
"category": "Elektronik",
"stock": 15,
"tags": ["gaming", "laptop", "asus"],
"created_at": "2026-06-26"
}'
# Index a document with a specific ID
curl -X PUT "http://localhost:9200/products/_doc/1" -H 'Content-Type: application/json' -d '{
"name": "Keyboard Mechanical Keychron K2",
"description": "Keyboard mechanical wireless dengan switch Gateron",
"price": 1200000,
"category": "Aksesoris",
"stock": 50,
"tags": ["keyboard", "mechanical", "wireless"],
"created_at": "2026-06-20"
}'
# Bulk index (many documents in one request; one JSON object per line)
curl -X POST "http://localhost:9200/_bulk" -H 'Content-Type: application/json' -d '
{"index": {"_index": "products", "_id": "2"}}
{"name": "Mouse Logitech MX Master 3", "description": "Mouse wireless ergonomis", "price": 1500000, "category": "Aksesoris", "stock": 30, "created_at": "2026-06-22"}
{"index": {"_index": "products", "_id": "3"}}
{"name": "Monitor LG 27 4K", "description": "Monitor 4K UHD untuk desain dan gaming", "price": 5500000, "category": "Elektronik", "stock": 10, "created_at": "2026-06-21"}
{"index": {"_index": "products", "_id": "4"}}
{"name": "Samsung Galaxy S24 Ultra", "description": "Smartphone flagship dengan S-Pen dan kamera 200MP", "price": 19000000, "category": "Elektronik", "stock": 25, "created_at": "2026-06-15"}
{"index": {"_index": "products", "_id": "5"}}
{"name": "Meja Gaming Secretlab", "description": "Meja gaming dengan cable management dan RGB", "price": 3500000, "category": "Furniture", "stock": 8, "created_at": "2026-06-18"}
'
# =============================================
# GET DOCUMENT
# =============================================
curl -s "http://localhost:9200/products/_doc/1" | python3 -m json.tool
# =============================================
# UPDATE DOCUMENT
# =============================================
curl -X POST "http://localhost:9200/products/_update/1" -H 'Content-Type: application/json' -d '{
"doc": {
"price": 1100000,
"stock": 45
}
}'
# Scripted update (decrement stock by params.qty)
curl -X POST "http://localhost:9200/products/_update/1" -H 'Content-Type: application/json' -d '{
"script": {
"source": "ctx._source.stock -= params.qty",
"params": { "qty": 2 }
}
}'
# =============================================
# DELETE DOCUMENT
# =============================================
curl -X DELETE "http://localhost:9200/products/_doc/1"
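The bulk request in this section uses newline-delimited JSON: an action line, then the document source on the next line, with the whole body ending in a newline. A minimal sketch of building such a body from Python data is shown below; `build_bulk_body` is a hypothetical helper, not part of any client library.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON body for the _bulk endpoint.

    docs is an iterable of (doc_id, source_dict) pairs.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: what to do and where
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself, on a single line
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body("products", [
        ("2", {"name": "Mouse Logitech MX Master 3", "price": 1500000}),
        ("3", {"name": "Monitor LG 27 4K", "price": 5500000}),
    ])
    print(body, end="")
```

Because `json.dumps` never emits raw newlines by default, each document is guaranteed to stay on one line, which the format depends on.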
4. Mappings — Schema Definition
A mapping defines how Elasticsearch stores and indexes each field. It is similar to a schema in an RDBMS, but more flexible.
# =============================================
# COMMON DATA TYPES
# =============================================
# text → Full-text searchable (analyzed and tokenized)
# keyword → Exact match (not analyzed)
# integer, long, float, double → Numeric
# boolean → true/false
# date → Date (various formats)
# object → Inner JSON object (its fields are flattened into the parent document)
# nested → Array of objects, each indexed and queryable independently
# geo_point → Lat/lon coordinates
# ip → IP address
curl -X PUT "http://localhost:9200/articles" -H 'Content-Type: application/json' -d '{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "standard",
"fields": {
"keyword": { "type": "keyword", "ignore_above": 256 }
}
},
"content": { "type": "text" },
"author": {
"properties": {
"name": { "type": "text" },
"email": { "type": "keyword" }
}
},
"tags": { "type": "keyword" },
"views": { "type": "long" },
"rating": { "type": "float" },
"published_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
},
"status": { "type": "keyword" },
"comments": {
"type": "nested",
"properties": {
"user": { "type": "keyword" },
"text": { "type": "text" },
"date": { "type": "date" }
}
},
"location": { "type": "geo_point" }
}
}
}'
# =============================================
# TEXT vs KEYWORD (Important Difference!)
# =============================================
# text: "Laptop Gaming ASUS" → split into ["laptop", "gaming", "asus"]
# → Good for: searching "laptop" finds "Laptop Gaming ASUS"
# → NOT usable for: sorting or aggregations (unless fielddata is enabled, which is costly)
# keyword: "Laptop Gaming ASUS" → kept whole as "Laptop Gaming ASUS"
# → Good for: exact filters, sorting, aggregations
# → NOT suited for: full-text search on individual words
# Multi-field: both!
# "title": { "type": "text", "fields": { "keyword": { "type": "keyword" } } }
# → title → full-text search
# → title.keyword → sort & aggregate
# =============================================
# INSPECT & UPDATE THE MAPPING
# =============================================
# Get mapping
curl -s "http://localhost:9200/articles/_mapping" | python3 -m json.tool
# Add new fields (the type of an existing field CANNOT be changed!)
curl -X PUT "http://localhost:9200/articles/_mapping" -H 'Content-Type: application/json' -d '{
"properties": {
"language": { "type": "keyword" },
"read_time_minutes": { "type": "integer" }
}
}'
Once a field is defined, its data type cannot be changed. If you need a different type, you must create a new index with the correct mapping and then copy the data from the old index into the new one with the _reindex API.
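The reindex workflow described above boils down to two requests: create the new index, then copy the documents across. The sketch below only builds the request descriptions so the order and bodies are visible. `reindex_plan` is a hypothetical helper and `articles_v2` is an assumed name for the new index.

```python
def reindex_plan(old_index, new_index, new_mappings):
    """Return the (method, path, body) requests needed to change a field type."""
    return [
        # 1. Create the new index with the corrected mapping
        ("PUT", f"/{new_index}", {"mappings": new_mappings}),
        # 2. Copy every document from the old index into the new one
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
    ]


if __name__ == "__main__":
    steps = reindex_plan(
        "articles",
        "articles_v2",  # assumed name for the replacement index
        {"properties": {"views": {"type": "integer"}}},
    )
    for method, path, body in steps:
        print(method, path, body)
```

Applications that query through an alias rather than the concrete index name can be switched to the new index afterwards without a code change.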
5. Query DSL — Search Documents
Elasticsearch uses a JSON-based Query DSL (Domain Specific Language) for searching. Queries fall into two categories: leaf queries (match, term, range) and compound queries (bool, dis_max).
# =============================================
# MATCH ALL (return every document)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": { "match_all": {} }
}'
# =============================================
# MATCH (full-text search)
# =============================================
# Search for "gaming laptop" in the name field
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"match": {
"name": "gaming laptop"
}
}
}'
# This finds "Laptop ASUS ROG Strix": match ORs the terms by default, and "laptop" appears in the name
# =============================================
# MATCH_PHRASE (exact phrase)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"match_phrase": {
"description": "wireless ergonomis"
}
}
}'
# Matches only if the words "wireless ergonomis" appear next to each other, in that order
# =============================================
# TERM (exact match, for keyword fields)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"term": {
"category": "Elektronik"
}
}
}'
# =============================================
# RANGE (numeric/date range)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"range": {
"price": {
"gte": 1000000,
"lte": 10000000
}
}
}
}'
# Date range
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"range": {
"created_at": {
"gte": "2026-06-01",
"lte": "2026-06-30",
"format": "yyyy-MM-dd"
}
}
}
}'
# =============================================
# EXISTS (does the field have a value?)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"exists": {
"field": "description"
}
}
}'
# =============================================
# WILDCARD & PREFIX
# =============================================
# Prefix
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"prefix": {
"name": "lap"
}
}
}'
# Wildcard
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"wildcard": {
"name": "*rog*"
}
}
}'
# =============================================
# SORTING & PAGINATION
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": { "match_all": {} },
"sort": [
{ "price": { "order": "desc" } },
{ "created_at": { "order": "desc" } }
],
"from": 0,
"size": 10
}'
# =============================================
# SOURCE FILTERING (return only selected fields)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": { "match_all": {} },
"_source": ["name", "price", "category"]
}'
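The difference between `match` and `term` in this section comes down to analysis: `match` analyzes the query text before looking it up, while `term` looks up the value exactly as given. The toy inverted index below illustrates that idea. It is a rough sketch, not how Lucene is implemented: `analyze` is a crude stand-in for the standard analyzer, and `TinyIndex` is a hypothetical class.

```python
import re
from collections import defaultdict


def analyze(text):
    """Crude stand-in for the standard analyzer: split on non-word chars, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


class TinyIndex:
    def __init__(self):
        self.text_terms = defaultdict(set)     # token -> doc ids (text field "name")
        self.keyword_terms = defaultdict(set)  # exact value -> doc ids (keyword field "category")

    def add(self, doc_id, name, category):
        for token in analyze(name):
            self.text_terms[token].add(doc_id)
        self.keyword_terms[category].add(doc_id)

    def match_name(self, query):
        """match: analyze the query, then OR the resulting tokens."""
        hits = set()
        for token in analyze(query):
            hits |= self.text_terms.get(token, set())
        return hits

    def term_name(self, value):
        """term on a text field: the value is looked up as-is, with no analysis."""
        return set(self.text_terms.get(value, set()))

    def term_category(self, value):
        """term on a keyword field: exact, case-sensitive lookup."""
        return set(self.keyword_terms.get(value, set()))


if __name__ == "__main__":
    idx = TinyIndex()
    idx.add("a", "Laptop ASUS ROG Strix", "Elektronik")
    idx.add("b", "Keyboard Mechanical Keychron K2", "Aksesoris")
    print(idx.match_name("gaming laptop"))   # finds doc "a" via the token "laptop"
    print(idx.term_name("Laptop"))           # empty: the indexed token is "laptop"
    print(idx.term_category("Elektronik"))   # finds doc "a"
```

This is also why the prefix and wildcard examples above use lowercase patterns against the `name` field: they are compared with the lowercased tokens, not with the original text.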
6. Bool Query: Combining Filters
The bool query is the most frequently used query. It combines several conditions with AND, OR, and NOT.
# =============================================
# BOOL QUERY STRUCTURE
# =============================================
# {
# "bool": {
# "must": [...], // AND: every clause must match (contributes to _score)
# "filter": [...], // AND: every clause must match (no scoring, cacheable)
# "should": [...], // OR: required only when there is no must/filter, or when minimum_should_match is set; otherwise it only boosts _score
# "must_not": [...] // NOT: no clause may match (no scoring)
# }
# }
# =============================================
# EXAMPLE: Find an affordable gaming laptop
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"bool": {
"must": [
{ "match": { "name": "laptop" } }
],
"filter": [
{ "term": { "category": "Elektronik" } },
{ "range": { "price": { "lte": 20000000 } } },
{ "range": { "stock": { "gt": 0 } } }
]
}
}
}'
# =============================================
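# EXAMPLE: Search limited to one of two categories
# minimum_should_match: 1 makes at least one should clause mandatory; without it, should would only boost the score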
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"bool": {
"must": [
{ "match": { "description": "wireless gaming" } }
],
"should": [
{ "term": { "category": "Elektronik" } },
{ "term": { "category": "Aksesoris" } }
],
"minimum_should_match": 1,
"must_not": [
{ "term": { "status": "deleted" } }
]
}
}
}'
# =============================================
# MUST vs FILTER: Scoring Differences
# =============================================
# must: contributes to _score (relevance score)
# filter: does NOT contribute to _score (faster, cacheable)
# Rule of thumb: use filter for yes/no conditions
# "price below 5 million" → filter (no score needed)
# "contains the word gaming" → must (score needed for ranking)
# =============================================
# NESTED QUERY (for nested fields)
# =============================================
curl -X GET "http://localhost:9200/articles/_search" -H 'Content-Type: application/json' -d '{
"query": {
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{ "match": { "comments.text": "artikel bagus" } },
{ "term": { "comments.user": "budi" } }
]
}
}
}
}
}'
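The clause semantics of the bool query can be modelled in a few lines of plain Python. The sketch below treats each clause as a predicate over a document and ignores scoring entirely. `bool_matches` and the two sample documents are illustrative assumptions, not part of any client library.

```python
def bool_matches(doc, must=(), filters=(), should=(), must_not=(),
                 minimum_should_match=None):
    """Return True if doc satisfies the bool query (matching only, no scoring)."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filters):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    if minimum_should_match is None:
        # Default: should is mandatory only when there is no must/filter clause
        if must or filters:
            minimum_should_match = 0
        else:
            minimum_should_match = 1 if should else 0
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


def has_any_word(field, words):
    return lambda doc: any(w in doc[field].lower().split() for w in words)


def field_equals(field, value):
    return lambda doc: doc.get(field) == value


MOUSE = {"description": "Mouse wireless ergonomis", "category": "Aksesoris"}
DESK = {"description": "Meja gaming dengan cable management dan RGB", "category": "Furniture"}

if __name__ == "__main__":
    query = dict(
        must=[has_any_word("description", ["wireless", "gaming"])],
        should=[field_equals("category", "Elektronik"), field_equals("category", "Aksesoris")],
        must_not=[field_equals("status", "deleted")],
    )
    for doc in (MOUSE, DESK):
        print(doc["category"],
              bool_matches(doc, **query),
              bool_matches(doc, minimum_should_match=1, **query))
```

With the default settings both documents match, because `should` is optional next to a `must` clause; adding `minimum_should_match=1` drops the desk, whose category is in neither `should` clause.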
7. Full-Text Search & Scoring
# =============================================
# MULTI_MATCH (search across several fields)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"multi_match": {
"query": "gaming wireless",
"fields": ["name^3", "description^2", "tags"],
"type": "best_fields",
"tie_breaker": 0.3
}
}
}'
# name^3 = matches in name weigh 3x the default
# best_fields = use the score of the single best-matching field
# tie_breaker = fraction (0-1) of the other matching fields' scores added on top
# =============================================
# FUZZY SEARCH (typo tolerance)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"match": {
"name": {
"query": "kyboard mekanikal",
"fuzziness": "AUTO"
}
}
}
}'
# "kyboard" → can find "keyboard"
# fuzziness: AUTO = allowed edits scale with term length (0 for 1-2 characters, 1 for 3-5, 2 for longer terms)
# =============================================
# HIGHLIGHT (mark the matching words)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"query": {
"match": {
"description": "gaming laptop"
}
},
"highlight": {
"fields": {
"description": {
"pre_tags": ["<em>"],
"post_tags": ["</em>"],
"fragment_size": 150,
"number_of_fragments": 3
}
}
}
}'
# Result: "<em>Laptop</em> <em>gaming</em> dengan processor Intel..."
# =============================================
# CUSTOM ANALYZER (for Indonesian)
# =============================================
curl -X PUT "http://localhost:9200/produk_id" -H 'Content-Type: application/json' -d '{
"settings": {
"analysis": {
"analyzer": {
"indonesian_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"indonesian_stop",
"indonesian_stemmer"
]
}
},
"filter": {
"indonesian_stop": {
"type": "stop",
"stopwords": ["dan", "di", "ke", "dari", "yang", "ini", "itu", "untuk", "dengan", "pada"]
},
"indonesian_stemmer": {
"type": "stemmer",
"language": "indonesian"
}
}
}
},
"mappings": {
"properties": {
"nama": { "type": "text", "analyzer": "indonesian_analyzer" },
"deskripsi": { "type": "text", "analyzer": "indonesian_analyzer" }
}
}
}'
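The `fuzziness: AUTO` setting used in the fuzzy search example allows a number of edits that depends on the length of the query term. The sketch below reproduces that rule together with an edit distance that counts an adjacent transposition as one edit. It is an illustration only: `auto_fuzziness`, `edit_distance`, and `fuzzy_match` are hypothetical helpers, and options such as `prefix_length` and `max_expansions` are ignored.

```python
def auto_fuzziness(term: str) -> int:
    """Allowed edits under AUTO: 0 for 1-2 chars, 1 for 3-5 chars, 2 for longer terms."""
    length = len(term)
    if length <= 2:
        return 0
    if length <= 5:
        return 1
    return 2


def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: Levenshtein plus adjacent transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def fuzzy_match(query_term: str, indexed_term: str) -> bool:
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)


if __name__ == "__main__":
    for query_term, indexed_term in [("kyboard", "keyboard"), ("mekanikal", "mechanical")]:
        print(query_term, indexed_term,
              edit_distance(query_term, indexed_term),
              fuzzy_match(query_term, indexed_term))
```

In the example query, "kyboard" is one edit away from "keyboard" and matches, while "mekanikal" is three edits away from "mechanical" and does not. The document is still found because `match` ORs its terms by default.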
8. Aggregations — Analytics
Aggregations let you analyze your data: averages, distributions, top-N, histograms, and more. They are similar to GROUP BY in SQL, but can be nested and combined much more freely.
# =============================================
# BUCKET AGGREGATION (group documents)
# =============================================
# Distribution per category
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"per_kategori": {
"terms": {
"field": "category",
"size": 10
}
}
}
}'
# Example result:
# "buckets": [
# { "key": "Elektronik", "doc_count": 3 },
# { "key": "Aksesoris", "doc_count": 2 },
# { "key": "Furniture", "doc_count": 1 }
# ]
# =============================================
# METRIC AGGREGATION (compute statistics)
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"harga_stats": {
"stats": {
"field": "price"
}
},
"harga_avg": {
"avg": { "field": "price" }
},
"harga_max": {
"max": { "field": "price" }
},
"harga_min": {
"min": { "field": "price" }
},
"total_stock": {
"sum": { "field": "stock" }
},
"jumlah_produk": {
"value_count": { "field": "price" }
}
}
}'
# stats returns: count, min, max, avg, sum
# =============================================
# SUB-AGGREGATION (bucket + metric)
# =============================================
# Average price per category
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"per_kategori": {
"terms": { "field": "category" },
"aggs": {
"avg_harga": { "avg": { "field": "price" } },
"max_harga": { "max": { "field": "price" } },
"min_harga": { "min": { "field": "price" } },
"harga_stats": { "stats": { "field": "price" } }
}
}
}
}'
# =============================================
# RANGE & HISTOGRAM
# =============================================
# Range buckets
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"harga_ranges": {
"range": {
"field": "price",
"ranges": [
{ "key": "Murah (<1jt)", "to": 1000000 },
{ "key": "Sedang (1-5jt)", "from": 1000000, "to": 5000000 },
{ "key": "Mahal (>5jt)", "from": 5000000 }
]
}
}
}
}'
# Histogram
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"harga_histogram": {
"histogram": {
"field": "price",
"interval": 2000000
}
}
}
}'
# Date histogram (time-series)
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"per_bulan": {
"date_histogram": {
"field": "created_at",
"calendar_interval": "month",
"format": "yyyy-MM"
},
"aggs": {
"total_revenue": {
"sum": { "field": "price" }
}
}
}
}
}'
# =============================================
# PERCENTILES
# =============================================
curl -X GET "http://localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"harga_percentiles": {
"percentiles": {
"field": "price",
"percents": [25, 50, 75, 95, 99]
}
}
}
}'
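To make the bucket and metric ideas concrete, the sketch below recomputes a `terms` aggregation with an `avg` sub-aggregation in plain Python, using the six sample products from section 3 (names, categories, and prices only). It also shows how a histogram assigns a value to a bucket. `terms_with_avg` and `histogram_key` are hypothetical helpers that mimic the shape of the response, not the real implementation.

```python
import math
from collections import defaultdict

# Sample products from section 3
PRODUCTS = [
    {"name": "Laptop ASUS ROG Strix", "category": "Elektronik", "price": 25000000},
    {"name": "Keyboard Mechanical Keychron K2", "category": "Aksesoris", "price": 1200000},
    {"name": "Mouse Logitech MX Master 3", "category": "Aksesoris", "price": 1500000},
    {"name": "Monitor LG 27 4K", "category": "Elektronik", "price": 5500000},
    {"name": "Samsung Galaxy S24 Ultra", "category": "Elektronik", "price": 19000000},
    {"name": "Meja Gaming Secretlab", "category": "Furniture", "price": 3500000},
]


def terms_with_avg(docs, bucket_field, metric_field):
    """Group docs by bucket_field and average metric_field inside each bucket."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[bucket_field]].append(doc[metric_field])
    buckets = [
        {"key": key, "doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in groups.items()
    ]
    # A terms aggregation returns the largest buckets first by default
    buckets.sort(key=lambda bucket: bucket["doc_count"], reverse=True)
    return buckets


def histogram_key(value, interval):
    """Lower bound of the fixed-width bucket that value falls into."""
    return math.floor(value / interval) * interval


if __name__ == "__main__":
    for bucket in terms_with_avg(PRODUCTS, "category", "price"):
        print(bucket)
    print(histogram_key(5500000, 2000000))
```

Setting `"size": 0` in the requests above has the same effect as ignoring the individual documents here: only the buckets are returned.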
9. Python Client (elasticsearch-py)
from elasticsearch import Elasticsearch
from datetime import datetime
# =============================================
# CONNECTION
# =============================================
es = Elasticsearch(
    "http://localhost:9200",
    # basic_auth=("username", "password")  # if security is enabled
)
# Check the connection
print(es.info())
# =============================================
# CREATE INDEX
# =============================================
def create_products_index():
    # Warning: this drops any existing "products" index together with its data
    if es.indices.exists(index="products"):
        es.indices.delete(index="products")
    # The 8.x client takes settings/mappings as keyword arguments; body= is deprecated
    es.indices.create(
        index="products",
        settings={
            "number_of_shards": 2,
            "number_of_replicas": 1
        },
        mappings={
            "properties": {
                "name": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword"}}
                },
                "description": {"type": "text"},
                "price": {"type": "float"},
                "category": {"type": "keyword"},
                "stock": {"type": "integer"},
                "tags": {"type": "keyword"},
                "created_at": {"type": "date"}
            }
        },
    )
# =============================================
# INDEX DOCUMENT
# =============================================
def index_product(product_id, data):
    # Build a new dict so the caller's data is not modified
    doc = {**data, "created_at": datetime.now().isoformat()}
    return es.index(index="products", id=product_id, document=doc)
# Example
index_product("1", {
    "name": "Laptop ASUS ROG Strix",
    "description": "Laptop gaming dengan Intel i9 dan RTX 4090",
    "price": 25000000,
    "category": "Elektronik",
    "stock": 15,
    "tags": ["gaming", "laptop", "asus"]
})
# =============================================
# SEARCH
# =============================================
def search_products(query_text, category=None, min_price=None, max_price=None):
    must_clauses = []
    filter_clauses = []
    if query_text:
        must_clauses.append({
            "multi_match": {
                "query": query_text,
                "fields": ["name^3", "description^2", "tags"]
            }
        })
    if category:
        filter_clauses.append({"term": {"category": category}})
    # Compare against None so that a bound of 0 is still applied
    if min_price is not None or max_price is not None:
        price_range = {}
        if min_price is not None:
            price_range["gte"] = min_price
        if max_price is not None:
            price_range["lte"] = max_price
        filter_clauses.append({"range": {"price": price_range}})
    response = es.search(
        index="products",
        query={
            "bool": {
                "must": must_clauses if must_clauses else [{"match_all": {}}],
                "filter": filter_clauses
            }
        },
        sort=[{"_score": "desc"}, {"price": "asc"}],
        size=20,
    )
    hits = response["hits"]["hits"]
    total = response["hits"]["total"]["value"]
    # The dict | dict merge requires Python 3.9+
    return {"total": total, "products": [hit["_source"] | {"_score": hit["_score"]} for hit in hits]}
# Example usage
results = search_products("gaming laptop", max_price=20000000)
print(f"Found {results['total']} products:")
for p in results["products"]:
    print(f"  - {p['name']}: Rp{p['price']:,.0f} (score: {p['_score']:.2f})")
# =============================================
# AGGREGATIONS
# =============================================
def get_category_stats():
    # size=0 skips the hits and returns only the aggregation buckets
    response = es.search(
        index="products",
        size=0,
        aggs={
            "per_category": {
                "terms": {"field": "category"},
                "aggs": {
                    "avg_price": {"avg": {"field": "price"}},
                    "max_price": {"max": {"field": "price"}},
                    "total_stock": {"sum": {"field": "stock"}}
                }
            }
        },
    )
    return response["aggregations"]["per_category"]["buckets"]
# =============================================
# BULK INDEX
# =============================================
from elasticsearch.helpers import bulk
def bulk_index_products(products):
    # One action per product; each product dict must carry an "id" key
    actions = [
        {
            "_index": "products",
            "_id": p["id"],
            "_source": p
        }
        for p in products
    ]
    return bulk(es, actions)
# bulk_index_products([
#     {"id": "10", "name": "Mouse Logitech", "price": 500000, "category": "Aksesoris"},
#     {"id": "11", "name": "Keyboard Razer", "price": 1500000, "category": "Aksesoris"},
# ])
# =============================================
# UPDATE & DELETE
# =============================================
# Update (partial document; only the listed fields change)
es.update(index="products", id="1", doc={"price": 24000000, "stock": 12})
# Delete
es.delete(index="products", id="1")
10. Best Practices & Optimization
Best Practices
| Practice | Details |
|---|---|
| Use filter for exact matches | filter clauses skip scoring and can be cached, so they are faster than must |
| Avoid deep pagination | from + size is capped at 10,000 by default (index.max_result_window). Use search_after to go deeper |
| Pick text vs keyword correctly | text for search, keyword for filter/sort/aggregate |
| Limit _source fields | Fetch only the fields you need |
| Use the bulk API | Batch inserts/updates; 1,000-5,000 documents per batch is a common starting point |
| Shard sizing | Aim for 10-50 GB per shard. Too many shards add overhead |
| Index Lifecycle Management | Use an ILM policy to manage the index lifecycle (hot → warm → cold → delete) |
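Two of the rules of thumb in the table are simple arithmetic. The sketch below estimates a primary shard count from the expected data volume and splits a stream of documents into bulk-sized batches. Both helpers are hypothetical, and the 30 GB default target is just an assumed midpoint of the 10-50 GB range.

```python
import math
from itertools import islice


def estimate_primary_shards(total_data_gb: float, target_shard_gb: float = 30.0) -> int:
    """Number of primary shards needed to keep each shard near target_shard_gb."""
    return max(1, math.ceil(total_data_gb / target_shard_gb))


def batched(items, batch_size=1000):
    """Yield lists of at most batch_size items, suitable for one bulk request each."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    print(estimate_primary_shards(300))                          # 300 GB at ~30 GB per shard
    print([len(b) for b in batched(range(2500), batch_size=1000)])
```

Both numbers are starting points to be tuned against real measurements, since the right batch size depends on document size and the right shard size on query patterns.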
Anti-Patterns
| Anti-Pattern | Why It Hurts | Fix |
|---|---|---|
| Wildcard query (*abc*) | Very slow: a leading wildcard scans every term | Use match/match_phrase |
| Deep pagination (from: 10000) | Each shard must collect and sort all preceding hits for every page | Use search_after |
| Scripted sort | Very slow on large indices | Pre-compute the value at index time |
| One giant index | Shards grow too large and become hard to manage | Use time-based indices + an alias |
| Too many fields | Mapping explosion; the default limit is 1,000 fields per index | Use the flattened field type or a leaner document structure |
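The `search_after` fix for deep pagination works by passing the sort values of the last hit on one page as the starting point for the next. A minimal sketch of building the follow-up request body is shown below. `next_page_body` is a hypothetical helper, and the `sku` tiebreaker is an assumed unique keyword field that does not exist in the sample index.

```python
import copy


def next_page_body(base_body, last_hit):
    """Build the request body for the page after last_hit."""
    body = copy.deepcopy(base_body)
    # search_after replaces offset-based paging, so "from" must not be set
    body.pop("from", None)
    # Each hit carries a "sort" array when the request specifies a sort
    body["search_after"] = last_hit["sort"]
    return body


BASE_BODY = {
    "size": 2,
    "query": {"match_all": {}},
    # The sort should end in a unique tiebreaker; "sku" is an assumed field
    "sort": [{"price": "desc"}, {"sku": "asc"}],
}

if __name__ == "__main__":
    # Hypothetical last hit of the first page
    last_hit = {"_id": "3", "sort": [5500000.0, "SKU-003"]}
    print(next_page_body(BASE_BODY, last_hit))
```

Without a unique tiebreaker, documents with equal sort values can be skipped or repeated across pages, which is why the sort in the sketch has two fields.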
11. Quiz: Test Your Understanding!
After reading the tutorial above, answer the following 5 questions: