1. Pengenalan Database Sharding
Database Sharding adalah teknik mempartisi database secara horizontal — membagi data dari satu database besar menjadi beberapa database yang lebih kecil (disebut shard). Setiap shard berisi subset dari data dan berjalan secara independen, memungkinkan aplikasi menangani volume data dan traffic yang jauh lebih besar.
Kapan Perlu Sharding?
| Indikator | Deskripsi | Threshold |
|---|---|---|
| Data Volume | Ukuran database melebihi kapasitas satu server | > 1-5 TB |
| Write Throughput | Write saturasi satu server | > 10K writes/detik |
| Read Throughput | Read tidak tertolong lagi oleh read replica | > 100K reads/detik |
| Index Size | Index tidak muat di RAM | > 50% dari RAM |
| Backup Time | Backup memakan waktu berjam-jam | > 4 jam |
Sharding menambah kompleksitas yang signifikan. Sebelum sharding, optimasi query, indexing, caching (Redis), read replicas, dan vertical scaling (upgrade server) harus dicoba terlebih dahulu. Sharding adalah opsi terakhir.
+--------------------------------------------------------------------+ | DATABASE SHARDING | | | | Aplikasi | | +----------------------------------------------------------+ | | | Router / Proxy | | | | Berdasarkan shard_key, tentukan shard tujuan | | | +----------------------------------------------------------+ | | | | | | | v v v | | +-----------+ +-----------+ +-----------+ | | | Shard 1 | | Shard 2 | | Shard 3 | | | |-----------| |-----------| |-----------| | | | User A-F | | User G-M | | User N-Z | | | | 2 juta row| | 2 juta row| | 2 juta row| | | +-----------+ +-----------+ +-----------+ | | | | | | | v v v | | +-----------+ +-----------+ +-----------+ | | | Replica 1 | | Replica 2 | | Replica 3 | | | +-----------+ +-----------+ +-----------+ | +--------------------------------------------------------------------+
2. Vertical vs Horizontal Scaling
Perbandingan
| Aspek | Vertical Scaling (Scale Up) | Horizontal Scaling (Scale Out) |
|---|---|---|
| Cara | Upgrade CPU, RAM, SSD satu server | Tambah server baru, bagi data |
| Batas | Ada batas hardware (fisik) | Hampir tidak terbatas |
| Kompleksitas | Rendah (drop-in upgrade) | Tinggi (aplikasi, routing, konsistensi) |
| Biaya | Naik eksponensial (server besar = mahal) | Linier (banyak server kecil = lebih murah) |
| Availability | Single point of failure | Redundansi per shard |
| Latency | Single server (lokal) | Bisa di-deploy multi-region |
| Kapan | Masih dalam kapasitas satu server | Sudah mentok di vertical scaling |
Scaling Journey
+--------------------------------------------------------------------+ | DATABASE SCALING JOURNEY | | | | Step 1: Optimasi Step 2: Read Replicas | | +-------------------+ +-------------------+ | | | Single Server | | Primary + | | | | + Indexes | | 2-3 Read Replicas | | | | + Query tuning | | Read/Write split | | | | + Caching (Redis) | +-------------------+ | | +-------------------+ | | | | v | | v Step 3: Vertical Scaling | | Sudah optimal? +-------------------+ | | | Bigger server | | | | 64 CPU, 512GB RAM | | | | NVMe SSD | | | +-------------------+ | | | | | v | | Step 4: SHARDING | | +-------------------+ | | | Split ke N shard | | | | Each shard has | | | | its own replicas | | | +-------------------+ | +--------------------------------------------------------------------+
3. Strategi Sharding
Ada beberapa strategi untuk membagi data ke shard. Pilihan strategi sangat mempengaruhi performa, distribusi data, dan kompleksitas operasional.
Hash-Based Sharding
// Hash-based sharding: hash(shard_key) % num_shards
// Distribusi merata, tapi range query sulit
function getShardId(userId, numShards) {
// FNV-1a hash atau murmurhash untuk distribusi merata
let hash = 2166136261; // FNV offset basis
const str = String(userId);
for (let i = 0; i < str.length; i++) {
hash ^= str.charCodeAt(i);
hash = (hash * 16777619) >>> 0; // FNV prime, unsigned
}
return hash % numShards;
}
// Contoh:
getShardId(12345, 4); // Shard 1
getShardId(67890, 4); // Shard 3
getShardId(11111, 4); // Shard 0
// Kelebihan: distribusi sangat merata
// Kekurangan: range query (WHERE user_id BETWEEN 100 AND 200)
// harus scan semua shard
Range-Based Sharding
// Range-based sharding: data dibagi berdasarkan rentang nilai
// Cocok untuk data yang punya pola range query
const shardRanges = [
{ shardId: 0, min: 1, max: 1000000 },
{ shardId: 1, min: 1000001, max: 2000000 },
{ shardId: 2, min: 2000001, max: 3000000 },
{ shardId: 3, min: 3000001, max: Infinity }
];
function getShardByRange(userId) {
for (const range of shardRanges) {
if (userId >= range.min && userId <= range.max) {
return range.shardId;
}
}
return shardRanges.length - 1;
}
// Kelebihan: range query efisien (tahu shard mana)
// Kekurangan: distribusi tidak merata (hotspot pada shard terbaru)
Geo-Based Sharding
// Geo-based sharding: data dibagi berdasarkan lokasi geografis
// Cocok untuk compliance (GDPR) dan latency
const geoShards = {
'asia': {
countries: ['ID', 'MY', 'SG', 'TH', 'PH', 'VN'],
server: 'db-sg-01.ap-southeast-1.amazonaws.com'
},
'europe': {
countries: ['DE', 'FR', 'NL', 'UK', 'ES', 'IT'],
server: 'db-fr-01.eu-west-1.amazonaws.com'
},
'americas': {
countries: ['US', 'CA', 'MX', 'BR', 'AR'],
server: 'db-us-01.us-east-1.amazonaws.com'
}
};
function getGeoShard(userCountry) {
for (const [region, config] of Object.entries(geoShards)) {
if (config.countries.includes(userCountry)) {
return { region, server: config.server };
}
}
return { region: 'asia', server: geoShards.asia.server }; // default
}
// Kelebihan: latency rendah, compliance GDPR
// Kekurangan: cross-region query lambat
Directory-Based Sharding
// Directory-based: lookup table yang memetakan key ke shard
// Paling fleksibel, tapi tambah latency (extra lookup)
// Lookup table di Redis atau database terpisah
const shardDirectory = {
'user_1001': 0,
'user_1002': 2,
'user_1003': 1,
'order_5001': 3,
'order_5002': 0,
};
async function getShardFromDirectory(key) {
// Check cache first
let shardId = await redis.get(`shard:${key}`);
if (shardId !== null) return parseInt(shardId);
// Lookup di directory database
shardId = await directoryDB.query(
'SELECT shard_id FROM shard_map WHERE entity_key = $1', [key]
);
// Cache result
await redis.set(`shard:${key}`, shardId, 'EX', 3600);
return shardId;
}
// Kelebihan: bisa memindahkan data tanpa rehash
// Kekurangan: single point of failure (directory), latency ekstra
Perbandingan Strategi
| Strategi | Distribusi | Range Query | Hotspot | Rebalance |
|---|---|---|---|---|
| Hash | Sangat merata | Sulit | Tidak ada | Sulit (rehash) |
| Range | Tidak merata | Mudah | Bisa ada | Mudah |
| Geo | Tergantung region | Per region | Bisa ada | Sedang |
| Directory | Fleksibel | Bisa diatur | Tidak ada | Mudah |
| Consistent Hashing | Cukup merata | Sulit | Minimal | Mudah |
4. Memilih Shard Key
Pemilihan shard key adalah keputusan paling kritis dalam sharding. Shard key yang salah bisa menyebabkan hotspot, distribusi tidak merata, dan query cross-shard yang mahal.
Kriteria Shard Key yang Baik
| Kriteria | Penjelasan |
|---|---|
| High Cardinality | Banyak nilai unik (user_id lebih baik dari gender) |
| Even Distribution | Data tersebar merata, tidak ada value yang dominan |
| Query-Aligned | Kebanyakan query menggunakan shard key di WHERE clause |
| Immutable | Tidak berubah setelah di-insert (mengubah = pindah shard) |
| Non-Volatile | Nilainya tidak sering berubah |
Contoh: Good vs Bad Shard Key
-- ============================================= -- GOOD: user_id (tinggi cardinality, merata) -- ============================================= -- Sebagian besar query: WHERE user_id = ? -- Router tahu persis shard mana yang dituju SELECT * FROM orders WHERE user_id = 12345; -- Hanya query 1 shard (targeted query) -- ============================================= -- GOOD: (user_id, created_at) compound key -- ============================================= -- Untuk multi-tenant: shard per user, range per waktu SELECT * FROM orders WHERE user_id = 12345 AND created_at BETWEEN '2026-01-01' AND '2026-06-30'; -- ============================================= -- BAD: country (low cardinality, tidak merata) -- ============================================= -- Indonesia: 70% data, Singapura: 5% data -- Shard Indonesia kelebihan beban, Shard SG kosong -- ============================================= -- BAD: status (sangat low cardinality) -- ============================================= -- 'active': 95%, 'banned': 5% -- Semua data masuk ke shard 'active' -- ============================================= -- BAD: email (unique tapi jarang di-query) -- ============================================= -- Query utama pakai user_id, bukan email -- Setiap query harus scan semua shard (scatter-gather)
5. Shard Routing
Shard routing adalah mekanisme untuk menentukan shard mana yang menangani query berdasarkan shard key.
Routing Architecture
+--------------------------------------------------------------------+ | SHARD ROUTING ARCHITECTURE | | | | +--------------------------+ | | | Application Layer | | | | +--------------------+ | | | | | Application Logic | | | | | | Menggunakan ORM | | | | | | dengan shard-aware | | | | | +--------------------+ | | | +-----------|-------------+ | | | | | +-----------v-------------+ | | | Shard Router / Proxy | | | | +------------------+ | | | | | Route Calculator | | | | | | hash(key) % N | | | | | +------------------+ | | | | +------------------+ | | | | | Connection Pool | | | | | | Per Shard | | | | | +------------------+ | | | +-----|------|------|----+ | | | | | | | +----v--+ +----v--+ +----v--+ | | |Shard 1| |Shard 2| |Shard 3| | | +-------+ +-------+ +-------+ | +--------------------------------------------------------------------+
Implementasi Router
// Shard Router untuk Node.js + PostgreSQL
const { Pool } = require('pg');
class ShardRouter {
constructor(shardConfigs) {
// shardConfigs: [{ id: 0, host: '...', port: 5432, database: '...' }]
this.pools = shardConfigs.map(config => ({
id: config.id,
pool: new Pool({
host: config.host,
port: config.port,
database: config.database,
user: config.user,
password: config.password,
max: 20 // connection pool per shard
})
}));
}
// Hash-based routing
getShardId(shardKey) {
let hash = 2166136261;
const str = String(shardKey);
for (let i = 0; i < str.length; i++) {
hash ^= str.charCodeAt(i);
hash = (hash * 16777619) >>> 0;
}
return hash % this.pools.length;
}
// Execute query on specific shard
async queryOnShard(shardKey, sql, params) {
const shardId = this.getShardId(shardKey);
const pool = this.pools[shardId].pool;
return pool.query(sql, params);
}
// Execute query on ALL shards (scatter-gather)
async queryAllShards(sql, params) {
const results = await Promise.all(
this.pools.map(async ({ id, pool }) => {
const result = await pool.query(sql, params);
return { shardId: id, rows: result.rows };
})
);
// Merge results
return results.flatMap(r => r.rows);
}
// Targeted query (most efficient)
async getUser(userId) {
return this.queryOnShard(userId,
'SELECT * FROM users WHERE id = $1', [userId]
);
}
// Cross-shard query (less efficient)
async searchUsers(query) {
return this.queryAllShards(
'SELECT * FROM users WHERE name ILIKE $1',
[`%${query}%`]
);
}
}
// Usage
const router = new ShardRouter([
{ id: 0, host: 'shard0.db.internal', port: 5432, database: 'app_0' },
{ id: 1, host: 'shard1.db.internal', port: 5432, database: 'app_1' },
{ id: 2, host: 'shard2.db.internal', port: 5432, database: 'app_2' },
{ id: 3, host: 'shard3.db.internal', port: 5432, database: 'app_3' },
]);
// Targeted: hanya 1 shard
const user = await router.getUser(12345);
// Scatter-gather: semua shard
const results = await router.searchUsers('Budi');
Proxy-Based Sharding
| Proxy | Database | Bahasa | Fitur |
|---|---|---|---|
| Vitess | MySQL | Go | VTGate routing, online DDL, resharding |
| Citus | PostgreSQL | C | Distributed tables, reference tables |
| ProxySQL | MySQL | C++ | Query routing, caching, connection pooling |
| ShardingSphere | MySQL, PostgreSQL | Java | JDBC proxy, Sharding-Proxy |
| PgBouncer | PostgreSQL | C | Connection pooling (bukan sharding) |
6. Cross-Shard Queries & Joins
Salah satu tantangan terbesar sharding adalah melakukan query yang membutuhkan data dari beberapa shard sekaligus.
Jenis Query
| Jenis Query | Efisiensi | Contoh |
|---|---|---|
| Targeted Query | ⭐⭐⭐⭐⭐ Sangat cepat | SELECT * FROM orders WHERE user_id = 123 |
| Scatter-Gather | ⭐⭐⭐ Cukup lambat | SELECT * FROM orders WHERE status = 'pending' |
| Cross-Shard Join | ⭐⭐ Sangat lambat | JOIN orders + users (berbeda shard) |
| Global Aggregate | ⭐⭐ Lambat | SELECT COUNT(*), SUM(total) FROM orders |
// =============================================
// CROSS-SHARD JOIN: Lakukan di aplikasi
// =============================================
async function getOrdersWithUser(userId) {
// Step 1: Targeted query ke shard yang tepat
const ordersResult = await router.queryOnShard(userId,
'SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC',
[userId]
);
// Step 2: Karena user dan order ada di shard yang sama
// (co-located), kita bisa JOIN langsung
const result = await router.queryOnShard(userId, `
SELECT o.*, u.name, u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.user_id = $1
ORDER BY o.created_at DESC
LIMIT 20
`, [userId]);
return result.rows;
}
// =============================================
// GLOBAL AGGREGATE: Scatter-gather + merge
// =============================================
async function getGlobalStats() {
// Query setiap shard untuk partial aggregation
const partialResults = await router.queryAllShards(`
SELECT
COUNT(*) AS order_count,
SUM(total) AS total_revenue,
AVG(total) AS avg_order_value
FROM orders
WHERE created_at >= NOW() - INTERVAL '30 days'
`);
// Merge di aplikasi
const merged = {
order_count: 0,
total_revenue: 0,
weighted_avg: 0
};
for (const row of partialResults) {
merged.order_count += parseInt(row.order_count);
merged.total_revenue += parseFloat(row.total_revenue);
}
merged.avg_order_value = merged.total_revenue / merged.order_count;
return merged;
}
// =============================================
// GLOBAL TOP-N: Ambil dari semua shard, sort
// =============================================
async function getTopOrders(limit = 10) {
const allResults = await router.queryAllShards(`
SELECT * FROM orders
ORDER BY total DESC
LIMIT $1
`, [limit]);
// Sort gabungan dan ambil top-N
return allResults
.sort((a, b) => b.total - a.total)
.slice(0, limit);
}
Co-Located Data
Desain shard key sehingga data yang sering di-join ada di shard yang sama. Misalnya: shard semua tabel berdasarkan user_id — users, orders, order_items, payments semua di shard yang sama untuk user tertentu. Ini mengeliminasi cross-shard join.
7. Shard Migration & Rebalancing
Saat data bertambah, shard yang ada mungkin perlu dipecah lagi (split) atau digabung (merge). Proses ini disebut resharding dan harus dilakukan tanpa downtime.
Resharding dengan Consistent Hashing
// Consistent hashing: hanya ~1/N data yang perlu dipindah
// saat menambah/mengurangi shard
class ConsistentHashRing {
constructor(nodes, virtualNodes = 150) {
this.ring = [];
this.nodeMap = new Map();
for (const node of nodes) {
for (let i = 0; i < virtualNodes; i++) {
const hash = this.hash(`${node}-${i}`);
this.ring.push({ hash, node });
this.nodeMap.set(hash, node);
}
}
this.ring.sort((a, b) => a.hash - b.hash);
}
hash(key) {
let h = 2166136261;
for (let i = 0; i < key.length; i++) {
h ^= key.charCodeAt(i);
h = (h * 16777619) >>> 0;
}
return h;
}
getNode(key) {
const hash = this.hash(key);
// Binary search untuk clockwise nearest node
let low = 0, high = this.ring.length - 1;
while (low < high) {
const mid = Math.floor((low + high) / 2);
if (this.ring[mid].hash < hash) low = mid + 1;
else high = mid;
}
return this.ring[low].node;
}
}
// Saat menambah shard baru:
// Hanya ~1/4 data yang dipindah (bukan semua)
const ring = new ConsistentHashRing(['shard0', 'shard1', 'shard2']);
console.log(ring.getNode('user_12345')); // 'shard1'
// Tambah shard3: hanya ~1/4 data dari setiap shard lama dipindah
Online Resharding (Double-Write)
+--------------------------------------------------------------------+ | ZERO-DOWNTIME RESHARDING | | | | Phase 1: DOUBLE WRITE | | +---------------------------+ | | | App writes to OLD shard | | | | App ALSO writes to NEW | | | | Backfill: copy old->new | | | +---------------------------+ | | | | | Phase 2: VERIFY | | +---------------------------+ | | | Compare data old vs new | | | | Ensure consistency | | | | Read from old (still) | | | +---------------------------+ | | | | | Phase 3: SWITCH | | +---------------------------+ | | | Read from NEW shard | | | | Stop writing to old | | | | Drop old shard (later) | | | +---------------------------+ | +--------------------------------------------------------------------+
8. Challenges & Pitfalls
Tantangan Sharding
| Challenge | Deskripsi | Solusi |
|---|---|---|
| Cross-shard query | Query yang butuh data dari banyak shard | Co-locate data, denormalisasi |
| Auto-increment ID | ID tidak unik jika di tiap shard sama | Snowflake ID, UUID, sequence service |
| Global uniqueness | Unique constraint sulit di shard berbeda | Unique check di service layer |
| Transactions | ACID transactions hanya dalam 1 shard | Saga pattern, eventual consistency |
| Schema migration | ALTER TABLE harus di semua shard | Rolling migration, proxy DDL |
| Hotspot | Shard tertentu lebih sibuk dari lain | Split shard, rebalance |
| Operational complexity | Monitoring, backup, recovery N shard | Automation, tooling |
Distributed ID Generation
// Twitter Snowflake: 64-bit ID yang unik di semua shard
// Bit layout:
// [1 bit sign][41 bits timestamp][10 bits machine][12 bits sequence]
class SnowflakeGenerator {
constructor(machineId) {
this.machineId = machineId;
this.sequence = 0;
this.lastTimestamp = -1;
this.epoch = 1609459200000; // 2021-01-01
}
generate() {
let timestamp = Date.now();
if (timestamp === this.lastTimestamp) {
this.sequence = (this.sequence + 1) & 0xFFF; // 12 bits
if (this.sequence === 0) {
while (timestamp <= this.lastTimestamp) {
timestamp = Date.now();
}
}
} else {
this.sequence = 0;
}
this.lastTimestamp = timestamp;
const id = BigInt(timestamp - this.epoch) << 22n
| BigInt(this.machineId) << 12n
| BigInt(this.sequence);
return id.toString();
}
}
// Setiap shard punya machine ID berbeda
const generator1 = new SnowflakeGenerator(1); // Shard 1
const generator2 = new SnowflakeGenerator(2); // Shard 2
console.log(generator1.generate()); // "1234567890123456789"
console.log(generator2.generate()); // "9876543210987654321"
// Keduanya unik, sortable by time
9. Implementasi Nyata
Vitess (YouTube / Slack)
# Vitess: MySQL sharding proxy (digunakan YouTube, Slack, GitHub)
# Install Vitess via operator
# helm install vitess-operator vitess/vitess-operator
# VSchema: definisi sharding
# vschema.json
{
"sharded": true,
"vindexes": {
"hash_user_id": {
"type": "hash"
}
},
"tables": {
"users": {
"column_vindexes": [
{ "column": "id", "name": "hash_user_id" }
]
},
"orders": {
"column_vindexes": [
{ "column": "user_id", "name": "hash_user_id" }
]
}
}
}
# users dan orders co-located berdasarkan user_id
# JOIN users ON orders.user_id = users.id = same shard!
Citus (PostgreSQL Distributed)
-- Citus: PostgreSQL extension untuk sharding
-- Aktifkan Citus
CREATE EXTENSION citus;
-- Tambah worker nodes
SELECT citus_add_node('worker1', 5432);
SELECT citus_add_node('worker2', 5432);
SELECT citus_add_node('worker3', 5432);
-- Distribute tabel berdasarkan user_id
SELECT create_distributed_table('users', 'id');
SELECT create_distributed_table('orders', 'user_id');
SELECT create_distributed_table('order_items', 'user_id');
-- Reference table (replika ke semua shard)
SELECT create_reference_table('categories');
SELECT create_reference_table('settings');
-- Query sekarang otomatis terdistribusi:
SELECT u.name, COUNT(o.id), SUM(o.total)
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.id = 12345
GROUP BY u.name;
-- Query hanya ke shard yang berisi user 12345
-- Cross-shard query (Citus otomatis handle)
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON u.id = o.user_id
GROUP BY u.name
ORDER BY COUNT(o.id) DESC
LIMIT 10;
10. Quiz Pemahaman
- Apa itu database sharding dan mengapa dilakukan?
Jawaban: Membagi database besar menjadi beberapa shard yang lebih kecil dan independen. Dilakukan ketika vertical scaling sudah tidak cukup untuk menangani volume data dan traffic. - Apa perbedaan hash sharding dan range sharding?
Jawaban: Hash sharding mendistribusi data merata berdasarkan hash key, range sharding membagi berdasarkan rentang nilai. Hash: distribusi merata tapi range query sulit. Range: range query mudah tapi bisa ada hotspot. - Apa itu cross-shard join dan mengapa mahal?
Jawaban: Join yang membutuhkan data dari shard berbeda. Mahal karena harus query ke beberapa shard, transfer data ke satu tempat, lalu join di aplikasi. Solusi: co-locate data. - Mengapa auto-increment ID tidak cocok untuk sharding?
Jawaban: Setiap shard punya auto-increment terpisah, menghasilkan ID duplikat. Gunakan Snowflake ID, UUID, atau distributed sequence generator. - Apa itu consistent hashing dan keuntungannya?
Jawaban: Teknik hashing di mana penambahan/penghapusan node hanya mempengaruhi ~1/N data. Tidak perlu rehash semua data saat cluster berubah ukuran. - Kapan sebaiknya TIDAK melakukan sharding?
Jawaban: Ketika data masih muat di satu server, ketika read replicas + caching masih cukup, atau ketika tim belum siap mengelola kompleksitas sharding.
Setelah memahami Database Sharding, pelajari Database Migration Tools untuk mengelola schema di lingkungan sharded, atau TimescaleDB untuk time-series data yang perlu di-shard berdasarkan waktu.