Database

Database Sharding: Panduan Horizontal Scaling Lengkap

Pelajari Database Sharding dari dasar — strategi hash, range, geo-based, routing, cross-shard queries, challenges, dan best practices untuk skalabilitas horizontal

1. Pengenalan Database Sharding

Database Sharding adalah teknik mempartisi database secara horizontal — membagi data dari satu database besar menjadi beberapa database yang lebih kecil (disebut shard). Setiap shard berisi subset dari data dan berjalan secara independen, memungkinkan aplikasi menangani volume data dan traffic yang jauh lebih besar.

Kapan Perlu Sharding?

IndikatorDeskripsiThreshold
Data VolumeUkuran database melebihi kapasitas satu server> 1-5 TB
Write ThroughputWrite saturasi satu server> 10K writes/detik
Read ThroughputRead tidak tertolong lagi oleh read replica> 100K reads/detik
Index SizeIndex tidak muat di RAM> 50% dari RAM
Backup TimeBackup memakan waktu berjam-jam> 4 jam
⚠️ Sharding Bukan Solusi Pertama

Sharding menambah kompleksitas yang signifikan. Sebelum sharding, optimasi query, indexing, caching (Redis), read replicas, dan vertical scaling (upgrade server) harus dicoba terlebih dahulu. Sharding adalah opsi terakhir.

Diagram: Database Sharding
+--------------------------------------------------------------------+
|                  DATABASE SHARDING                                   |
|                                                                    |
|  Aplikasi                                                          |
|  +----------------------------------------------------------+      |
|  |                    Router / Proxy                         |      |
|  |         Berdasarkan shard_key, tentukan shard tujuan      |      |
|  +----------------------------------------------------------+      |
|      |                    |                    |                   |
|      v                    v                    v                   |
|  +-----------+     +-----------+     +-----------+                  |
|  | Shard 1   |     | Shard 2   |     | Shard 3   |                  |
|  |-----------|     |-----------|     |-----------|                  |
|  | User A-F  |     | User G-M  |     | User N-Z  |                  |
|  | 2 juta row|     | 2 juta row|     | 2 juta row|                  |
|  +-----------+     +-----------+     +-----------+                  |
|      |                    |                    |                   |
|      v                    v                    v                   |
|  +-----------+     +-----------+     +-----------+                  |
|  | Replica 1 |     | Replica 2 |     | Replica 3 |                  |
|  +-----------+     +-----------+     +-----------+                  |
+--------------------------------------------------------------------+

2. Vertical vs Horizontal Scaling

Perbandingan

AspekVertical Scaling (Scale Up)Horizontal Scaling (Scale Out)
CaraUpgrade CPU, RAM, SSD satu serverTambah server baru, bagi data
BatasAda batas hardware (fisik)Hampir tidak terbatas
KompleksitasRendah (drop-in upgrade)Tinggi (aplikasi, routing, konsistensi)
BiayaNaik eksponensial (server besar = mahal)Linier (banyak server kecil = lebih murah)
AvailabilitySingle point of failureRedundansi per shard
LatencySingle server (lokal)Bisa di-deploy multi-region
KapanMasih dalam kapasitas satu serverSudah mentok di vertical scaling

Scaling Journey

Diagram: Database Scaling Journey
+--------------------------------------------------------------------+
|              DATABASE SCALING JOURNEY                                |
|                                                                    |
|  Step 1: Optimasi          Step 2: Read Replicas                   |
|  +-------------------+     +-------------------+                   |
|  | Single Server     |     | Primary +         |                   |
|  | + Indexes         |     | 2-3 Read Replicas |                   |
|  | + Query tuning    |     | Read/Write split  |                   |
|  | + Caching (Redis) |     +-------------------+                   |
|  +-------------------+             |                               |
|          |                        v                                |
|          v            Step 3: Vertical Scaling                     |
|  Sudah optimal?       +-------------------+                       |
|                       | Bigger server     |                       |
|                       | 64 CPU, 512GB RAM |                       |
|                       | NVMe SSD          |                       |
|                       +-------------------+                       |
|                                  |                                 |
|                                  v                                |
|                       Step 4: SHARDING                            |
|                       +-------------------+                       |
|                       | Split ke N shard  |                       |
|                       | Each shard has    |                       |
|                       | its own replicas  |                       |
|                       +-------------------+                       |
+--------------------------------------------------------------------+

3. Strategi Sharding

Ada beberapa strategi untuk membagi data ke shard. Pilihan strategi sangat mempengaruhi performa, distribusi data, dan kompleksitas operasional.

Hash-Based Sharding

JavaScript — Hash Sharding Logic
// Hash-based sharding: hash(shard_key) % num_shards
// Distribusi merata, tapi range query sulit

function getShardId(userId, numShards) {
  // FNV-1a hash atau murmurhash untuk distribusi merata
  let hash = 2166136261;  // FNV offset basis
  const str = String(userId);
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = (hash * 16777619) >>> 0;  // FNV prime, unsigned
  }
  return hash % numShards;
}

// Contoh:
getShardId(12345, 4);  // Shard 1
getShardId(67890, 4);  // Shard 3
getShardId(11111, 4);  // Shard 0

// Kelebihan: distribusi sangat merata
// Kekurangan: range query (WHERE user_id BETWEEN 100 AND 200)
// harus scan semua shard

Range-Based Sharding

JavaScript — Range Sharding Logic
// Range-based sharding: data dibagi berdasarkan rentang nilai
// Cocok untuk data yang punya pola range query

const shardRanges = [
  { shardId: 0, min: 1,      max: 1000000 },
  { shardId: 1, min: 1000001, max: 2000000 },
  { shardId: 2, min: 2000001, max: 3000000 },
  { shardId: 3, min: 3000001, max: Infinity }
];

function getShardByRange(userId) {
  for (const range of shardRanges) {
    if (userId >= range.min && userId <= range.max) {
      return range.shardId;
    }
  }
  return shardRanges.length - 1;
}

// Kelebihan: range query efisien (tahu shard mana)
// Kekurangan: distribusi tidak merata (hotspot pada shard terbaru)

Geo-Based Sharding

JavaScript — Geo Sharding Logic
// Geo-based sharding: data dibagi berdasarkan lokasi geografis
// Cocok untuk compliance (GDPR) dan latency

const geoShards = {
  'asia': {
    countries: ['ID', 'MY', 'SG', 'TH', 'PH', 'VN'],
    server: 'db-sg-01.ap-southeast-1.amazonaws.com'
  },
  'europe': {
    countries: ['DE', 'FR', 'NL', 'UK', 'ES', 'IT'],
    server: 'db-fr-01.eu-west-1.amazonaws.com'
  },
  'americas': {
    countries: ['US', 'CA', 'MX', 'BR', 'AR'],
    server: 'db-us-01.us-east-1.amazonaws.com'
  }
};

function getGeoShard(userCountry) {
  for (const [region, config] of Object.entries(geoShards)) {
    if (config.countries.includes(userCountry)) {
      return { region, server: config.server };
    }
  }
  return { region: 'asia', server: geoShards.asia.server }; // default
}

// Kelebihan: latency rendah, compliance GDPR
// Kekurangan: cross-region query lambat

Directory-Based Sharding

JavaScript — Directory Sharding
// Directory-based: lookup table yang memetakan key ke shard
// Paling fleksibel, tapi tambah latency (extra lookup)

// Lookup table di Redis atau database terpisah
const shardDirectory = {
  'user_1001':  0,
  'user_1002':  2,
  'user_1003':  1,
  'order_5001': 3,
  'order_5002': 0,
};

async function getShardFromDirectory(key) {
  // Check cache first
  let shardId = await redis.get(`shard:${key}`);
  if (shardId !== null) return parseInt(shardId);

  // Lookup di directory database
  shardId = await directoryDB.query(
    'SELECT shard_id FROM shard_map WHERE entity_key = $1', [key]
  );

  // Cache result
  await redis.set(`shard:${key}`, shardId, 'EX', 3600);
  return shardId;
}

// Kelebihan: bisa memindahkan data tanpa rehash
// Kekurangan: single point of failure (directory), latency ekstra

Perbandingan Strategi

StrategiDistribusiRange QueryHotspotRebalance
HashSangat merataSulitTidak adaSulit (rehash)
RangeTidak merataMudahBisa adaMudah
GeoTergantung regionPer regionBisa adaSedang
DirectoryFleksibelBisa diaturTidak adaMudah
Consistent HashingCukup merataSulitMinimalMudah

4. Memilih Shard Key

Pemilihan shard key adalah keputusan paling kritis dalam sharding. Shard key yang salah bisa menyebabkan hotspot, distribusi tidak merata, dan query cross-shard yang mahal.

Kriteria Shard Key yang Baik

KriteriaPenjelasan
High CardinalityBanyak nilai unik (user_id lebih baik dari gender)
Even DistributionData tersebar merata, tidak ada value yang dominan
Query-AlignedKebanyakan query menggunakan shard key di WHERE clause
ImmutableTidak berubah setelah di-insert (mengubah = pindah shard)
Non-VolatileNilainya tidak sering berubah

Contoh: Good vs Bad Shard Key

SQL — Shard Key Examples
-- =============================================
-- GOOD: user_id (tinggi cardinality, merata)
-- =============================================
-- Sebagian besar query: WHERE user_id = ?
-- Router tahu persis shard mana yang dituju
SELECT * FROM orders WHERE user_id = 12345;
-- Hanya query 1 shard (targeted query)

-- =============================================
-- GOOD: (user_id, created_at) compound key
-- =============================================
-- Untuk multi-tenant: shard per user, range per waktu
SELECT * FROM orders
WHERE user_id = 12345
  AND created_at BETWEEN '2026-01-01' AND '2026-06-30';

-- =============================================
-- BAD: country (low cardinality, tidak merata)
-- =============================================
-- Indonesia: 70% data, Singapura: 5% data
-- Shard Indonesia kelebihan beban, Shard SG kosong

-- =============================================
-- BAD: status (sangat low cardinality)
-- =============================================
-- 'active': 95%, 'banned': 5%
-- Semua data masuk ke shard 'active'

-- =============================================
-- BAD: email (unique tapi jarang di-query)
-- =============================================
-- Query utama pakai user_id, bukan email
-- Setiap query harus scan semua shard (scatter-gather)

5. Shard Routing

Shard routing adalah mekanisme untuk menentukan shard mana yang menangani query berdasarkan shard key.

Routing Architecture

Diagram: Shard Routing
+--------------------------------------------------------------------+
|                  SHARD ROUTING ARCHITECTURE                          |
|                                                                    |
|  +--------------------------+                                     |
|  |     Application Layer     |                                     |
|  |  +--------------------+  |                                     |
|  |  | Application Logic   |  |                                     |
|  |  | Menggunakan ORM     |  |                                     |
|  |  | dengan shard-aware  |  |                                     |
|  |  +--------------------+  |                                     |
|  +-----------|-------------+                                      |
|              |                                                     |
|  +-----------v-------------+                                      |
|  |    Shard Router / Proxy  |                                      |
|  |  +------------------+   |                                      |
|  |  | Route Calculator  |   |                                      |
|  |  | hash(key) % N     |   |                                      |
|  |  +------------------+   |                                      |
|  |  +------------------+   |                                      |
|  |  | Connection Pool   |   |                                      |
|  |  | Per Shard          |   |                                      |
|  |  +------------------+   |                                      |
|  +-----|------|------|----+                                       |
|        |      |      |                                            |
|   +----v--+ +----v--+ +----v--+                                   |
|   |Shard 1| |Shard 2| |Shard 3|                                   |
|   +-------+ +-------+ +-------+                                   |
+--------------------------------------------------------------------+

Implementasi Router

JavaScript — Shard Router Implementation
// Shard Router untuk Node.js + PostgreSQL

const { Pool } = require('pg');

class ShardRouter {
  constructor(shardConfigs) {
    // shardConfigs: [{ id: 0, host: '...', port: 5432, database: '...' }]
    this.pools = shardConfigs.map(config => ({
      id: config.id,
      pool: new Pool({
        host: config.host,
        port: config.port,
        database: config.database,
        user: config.user,
        password: config.password,
        max: 20  // connection pool per shard
      })
    }));
  }

  // Hash-based routing
  getShardId(shardKey) {
    let hash = 2166136261;
    const str = String(shardKey);
    for (let i = 0; i < str.length; i++) {
      hash ^= str.charCodeAt(i);
      hash = (hash * 16777619) >>> 0;
    }
    return hash % this.pools.length;
  }

  // Execute query on specific shard
  async queryOnShard(shardKey, sql, params) {
    const shardId = this.getShardId(shardKey);
    const pool = this.pools[shardId].pool;
    return pool.query(sql, params);
  }

  // Execute query on ALL shards (scatter-gather)
  async queryAllShards(sql, params) {
    const results = await Promise.all(
      this.pools.map(async ({ id, pool }) => {
        const result = await pool.query(sql, params);
        return { shardId: id, rows: result.rows };
      })
    );
    // Merge results
    return results.flatMap(r => r.rows);
  }

  // Targeted query (most efficient)
  async getUser(userId) {
    return this.queryOnShard(userId,
      'SELECT * FROM users WHERE id = $1', [userId]
    );
  }

  // Cross-shard query (less efficient)
  async searchUsers(query) {
    return this.queryAllShards(
      'SELECT * FROM users WHERE name ILIKE $1',
      [`%${query}%`]
    );
  }
}

// Usage
const router = new ShardRouter([
  { id: 0, host: 'shard0.db.internal', port: 5432, database: 'app_0' },
  { id: 1, host: 'shard1.db.internal', port: 5432, database: 'app_1' },
  { id: 2, host: 'shard2.db.internal', port: 5432, database: 'app_2' },
  { id: 3, host: 'shard3.db.internal', port: 5432, database: 'app_3' },
]);

// Targeted: hanya 1 shard
const user = await router.getUser(12345);

// Scatter-gather: semua shard
const results = await router.searchUsers('Budi');

Proxy-Based Sharding

ProxyDatabaseBahasaFitur
VitessMySQLGoVTGate routing, online DDL, resharding
CitusPostgreSQLCDistributed tables, reference tables
ProxySQLMySQLC++Query routing, caching, connection pooling
ShardingSphereMySQL, PostgreSQLJavaJDBC proxy, Sharding-Proxy
PgBouncerPostgreSQLCConnection pooling (bukan sharding)

6. Cross-Shard Queries & Joins

Salah satu tantangan terbesar sharding adalah melakukan query yang membutuhkan data dari beberapa shard sekaligus.

Jenis Query

Jenis QueryEfisiensiContoh
Targeted Query⭐⭐⭐⭐⭐ Sangat cepatSELECT * FROM orders WHERE user_id = 123
Scatter-Gather⭐⭐⭐ Cukup lambatSELECT * FROM orders WHERE status = 'pending'
Cross-Shard Join⭐⭐ Sangat lambatJOIN orders + users (berbeda shard)
Global Aggregate⭐⭐ LambatSELECT COUNT(*), SUM(total) FROM orders
JavaScript — Cross-Shard Operations
// =============================================
// CROSS-SHARD JOIN: Lakukan di aplikasi
// =============================================
async function getOrdersWithUser(userId) {
  // Step 1: Targeted query ke shard yang tepat
  const ordersResult = await router.queryOnShard(userId,
    'SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC',
    [userId]
  );

  // Step 2: Karena user dan order ada di shard yang sama
  // (co-located), kita bisa JOIN langsung
  const result = await router.queryOnShard(userId, `
    SELECT o.*, u.name, u.email
    FROM orders o
    JOIN users u ON o.user_id = u.id
    WHERE o.user_id = $1
    ORDER BY o.created_at DESC
    LIMIT 20
  `, [userId]);

  return result.rows;
}

// =============================================
// GLOBAL AGGREGATE: Scatter-gather + merge
// =============================================
async function getGlobalStats() {
  // Query setiap shard untuk partial aggregation
  const partialResults = await router.queryAllShards(`
    SELECT
      COUNT(*) AS order_count,
      SUM(total) AS total_revenue,
      AVG(total) AS avg_order_value
    FROM orders
    WHERE created_at >= NOW() - INTERVAL '30 days'
  `);

  // Merge di aplikasi
  const merged = {
    order_count: 0,
    total_revenue: 0,
    weighted_avg: 0
  };

  for (const row of partialResults) {
    merged.order_count += parseInt(row.order_count);
    merged.total_revenue += parseFloat(row.total_revenue);
  }
  merged.avg_order_value = merged.total_revenue / merged.order_count;

  return merged;
}

// =============================================
// GLOBAL TOP-N: Ambil dari semua shard, sort
// =============================================
async function getTopOrders(limit = 10) {
  const allResults = await router.queryAllShards(`
    SELECT * FROM orders
    ORDER BY total DESC
    LIMIT $1
  `, [limit]);

  // Sort gabungan dan ambil top-N
  return allResults
    .sort((a, b) => b.total - a.total)
    .slice(0, limit);
}

Co-Located Data

💡 Co-Located Sharding

Desain shard key sehingga data yang sering di-join ada di shard yang sama. Misalnya: shard semua tabel berdasarkan user_id — users, orders, order_items, payments semua di shard yang sama untuk user tertentu. Ini mengeliminasi cross-shard join.

7. Shard Migration & Rebalancing

Saat data bertambah, shard yang ada mungkin perlu dipecah lagi (split) atau digabung (merge). Proses ini disebut resharding dan harus dilakukan tanpa downtime.

Resharding dengan Consistent Hashing

JavaScript — Consistent Hashing
// Consistent hashing: hanya ~1/N data yang perlu dipindah
// saat menambah/mengurangi shard

class ConsistentHashRing {
  constructor(nodes, virtualNodes = 150) {
    this.ring = [];
    this.nodeMap = new Map();

    for (const node of nodes) {
      for (let i = 0; i < virtualNodes; i++) {
        const hash = this.hash(`${node}-${i}`);
        this.ring.push({ hash, node });
        this.nodeMap.set(hash, node);
      }
    }
    this.ring.sort((a, b) => a.hash - b.hash);
  }

  hash(key) {
    let h = 2166136261;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = (h * 16777619) >>> 0;
    }
    return h;
  }

  getNode(key) {
    const hash = this.hash(key);
    // Binary search untuk clockwise nearest node
    let low = 0, high = this.ring.length - 1;
    while (low < high) {
      const mid = Math.floor((low + high) / 2);
      if (this.ring[mid].hash < hash) low = mid + 1;
      else high = mid;
    }
    return this.ring[low].node;
  }
}

// Saat menambah shard baru:
// Hanya ~1/4 data yang dipindah (bukan semua)
const ring = new ConsistentHashRing(['shard0', 'shard1', 'shard2']);
console.log(ring.getNode('user_12345')); // 'shard1'

// Tambah shard3: hanya ~1/4 data dari setiap shard lama dipindah

Online Resharding (Double-Write)

Diagram: Zero-Downtime Resharding
+--------------------------------------------------------------------+
|              ZERO-DOWNTIME RESHARDING                                |
|                                                                    |
|  Phase 1: DOUBLE WRITE                                             |
|  +---------------------------+                                     |
|  | App writes to OLD shard   |                                     |
|  | App ALSO writes to NEW    |                                     |
|  | Backfill: copy old->new   |                                     |
|  +---------------------------+                                     |
|              |                                                     |
|  Phase 2: VERIFY                                                   |
|  +---------------------------+                                     |
|  | Compare data old vs new   |                                     |
|  | Ensure consistency        |                                     |
|  | Read from old (still)     |                                     |
|  +---------------------------+                                     |
|              |                                                     |
|  Phase 3: SWITCH                                                   |
|  +---------------------------+                                     |
|  | Read from NEW shard       |                                     |
|  | Stop writing to old       |                                     |
|  | Drop old shard (later)    |                                     |
|  +---------------------------+                                     |
+--------------------------------------------------------------------+

8. Challenges & Pitfalls

Tantangan Sharding

ChallengeDeskripsiSolusi
Cross-shard queryQuery yang butuh data dari banyak shardCo-locate data, denormalisasi
Auto-increment IDID tidak unik jika di tiap shard samaSnowflake ID, UUID, sequence service
Global uniquenessUnique constraint sulit di shard berbedaUnique check di service layer
TransactionsACID transactions hanya dalam 1 shardSaga pattern, eventual consistency
Schema migrationALTER TABLE harus di semua shardRolling migration, proxy DDL
HotspotShard tertentu lebih sibuk dari lainSplit shard, rebalance
Operational complexityMonitoring, backup, recovery N shardAutomation, tooling

Distributed ID Generation

JavaScript — Snowflake ID
// Twitter Snowflake: 64-bit ID yang unik di semua shard
// Bit layout:
// [1 bit sign][41 bits timestamp][10 bits machine][12 bits sequence]

class SnowflakeGenerator {
  constructor(machineId) {
    this.machineId = machineId;
    this.sequence = 0;
    this.lastTimestamp = -1;
    this.epoch = 1609459200000; // 2021-01-01
  }

  generate() {
    let timestamp = Date.now();

    if (timestamp === this.lastTimestamp) {
      this.sequence = (this.sequence + 1) & 0xFFF; // 12 bits
      if (this.sequence === 0) {
        while (timestamp <= this.lastTimestamp) {
          timestamp = Date.now();
        }
      }
    } else {
      this.sequence = 0;
    }

    this.lastTimestamp = timestamp;

    const id = BigInt(timestamp - this.epoch) << 22n
             | BigInt(this.machineId) << 12n
             | BigInt(this.sequence);

    return id.toString();
  }
}

// Setiap shard punya machine ID berbeda
const generator1 = new SnowflakeGenerator(1); // Shard 1
const generator2 = new SnowflakeGenerator(2); // Shard 2

console.log(generator1.generate()); // "1234567890123456789"
console.log(generator2.generate()); // "9876543210987654321"
// Keduanya unik, sortable by time

9. Implementasi Nyata

Vitess (YouTube / Slack)

YAML — Vitess Sharding
# Vitess: MySQL sharding proxy (digunakan YouTube, Slack, GitHub)

# Install Vitess via operator
# helm install vitess-operator vitess/vitess-operator

# VSchema: definisi sharding
# vschema.json
{
  "sharded": true,
  "vindexes": {
    "hash_user_id": {
      "type": "hash"
    }
  },
  "tables": {
    "users": {
      "column_vindexes": [
        { "column": "id", "name": "hash_user_id" }
      ]
    },
    "orders": {
      "column_vindexes": [
        { "column": "user_id", "name": "hash_user_id" }
      ]
    }
  }
}

# users dan orders co-located berdasarkan user_id
# JOIN users ON orders.user_id = users.id = same shard!

Citus (PostgreSQL Distributed)

SQL — Citus Sharding
-- Citus: PostgreSQL extension untuk sharding

-- Aktifkan Citus
CREATE EXTENSION citus;

-- Tambah worker nodes
SELECT citus_add_node('worker1', 5432);
SELECT citus_add_node('worker2', 5432);
SELECT citus_add_node('worker3', 5432);

-- Distribute tabel berdasarkan user_id
SELECT create_distributed_table('users', 'id');
SELECT create_distributed_table('orders', 'user_id');
SELECT create_distributed_table('order_items', 'user_id');

-- Reference table (replika ke semua shard)
SELECT create_reference_table('categories');
SELECT create_reference_table('settings');

-- Query sekarang otomatis terdistribusi:
SELECT u.name, COUNT(o.id), SUM(o.total)
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.id = 12345
GROUP BY u.name;
-- Query hanya ke shard yang berisi user 12345

-- Cross-shard query (Citus otomatis handle)
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON u.id = o.user_id
GROUP BY u.name
ORDER BY COUNT(o.id) DESC
LIMIT 10;

10. Quiz Pemahaman

📝 Quiz: Database Sharding
  1. Apa itu database sharding dan mengapa dilakukan?
    Jawaban: Membagi database besar menjadi beberapa shard yang lebih kecil dan independen. Dilakukan ketika vertical scaling sudah tidak cukup untuk menangani volume data dan traffic.
  2. Apa perbedaan hash sharding dan range sharding?
    Jawaban: Hash sharding mendistribusi data merata berdasarkan hash key, range sharding membagi berdasarkan rentang nilai. Hash: distribusi merata tapi range query sulit. Range: range query mudah tapi bisa ada hotspot.
  3. Apa itu cross-shard join dan mengapa mahal?
    Jawaban: Join yang membutuhkan data dari shard berbeda. Mahal karena harus query ke beberapa shard, transfer data ke satu tempat, lalu join di aplikasi. Solusi: co-locate data.
  4. Mengapa auto-increment ID tidak cocok untuk sharding?
    Jawaban: Setiap shard punya auto-increment terpisah, menghasilkan ID duplikat. Gunakan Snowflake ID, UUID, atau distributed sequence generator.
  5. Apa itu consistent hashing dan keuntungannya?
    Jawaban: Teknik hashing di mana penambahan/penghapusan node hanya mempengaruhi ~1/N data. Tidak perlu rehash semua data saat cluster berubah ukuran.
  6. Kapan sebaiknya TIDAK melakukan sharding?
    Jawaban: Ketika data masih muat di satu server, ketika read replicas + caching masih cukup, atau ketika tim belum siap mengelola kompleksitas sharding.
🎯 Langkah Selanjutnya

Setelah memahami Database Sharding, pelajari Database Migration Tools untuk mengelola schema di lingkungan sharded, atau TimescaleDB untuk time-series data yang perlu di-shard berdasarkan waktu.