1. Pengenalan TimescaleDB
TimescaleDB adalah ekstensi PostgreSQL untuk data time-series. Dibangun di atas PostgreSQL, TimescaleDB menambahkan fitur khusus untuk data berbasis waktu — hypertables, compression, dan continuous aggregates — sambil mempertahankan kompatibilitas penuh dengan SQL dan ekosistem PostgreSQL.
Mengapa Time-Series Data Berbeda?
| Karakteristik | Database Relasional Biasa | Time-Series Database |
|---|---|---|
| Pola Insert | Bervariasi (CRUD campur) | Insert-heavy (append-only) |
| Pola Query | Point lookup, join kompleks | Range scan, aggregation per waktu |
| Data Volume | Jutaan baris | Miliaran baris |
| Retensi | Tidak terbatas | Perlu auto-expire data lama |
| Indexing | B-tree, hash, GIN | Time-based partitioning |
| Compression | Opsional | Kritis (hemat storage) |
TimescaleDB vs Alternatif
| Aspek | TimescaleDB | InfluxDB | Apache Druid |
|---|---|---|---|
| Base | PostgreSQL | Custom | Custom |
| Query Language | SQL | Flux / InfluxQL | SQL (Druid SQL) |
| Full SQL Support | Ya (100%) | Terbatas | Terbatas |
| JOIN dengan tabel lain | Ya | Tidak | Terbatas |
| ACID Transactions | Ya | Tidak | Tidak |
| Compression | 95%+ ratio | Baik | Sangat baik |
| Ecosystem | Seluruh PostgreSQL | Grafana, Telegraf | Grafana, Superset |
+--------------------------------------------------------------------+ | ARSITEKTUR TIMESCALEDB | | | | +----------------------------------------------------------+ | | | PostgreSQL | | | | +----------------------------------------------------+ | | | | | TimescaleDB Extension | | | | | | | | | | | | +----------+ +----------+ +----------+ | | | | | | |Hypertable| |Continuous| |Compression| | | | | | | | (tabel) | |Aggregate | | Policies | | | | | | | +----------+ +----------+ +----------+ | | | | | | | | | | | | | | | +----------+ +----------+ +----------+ | | | | | | | Chunk 1 | | Chunk 2 | | Chunk N | | | | | | | | (partisi)| | (partisi)| | (partisi)| | | | | | | +----------+ +----------+ +----------+ | | | | | +----------------------------------------------------+ | | | | | | | | Standard PostgreSQL: tables, indexes, views, functions | | | +----------------------------------------------------------+ | +--------------------------------------------------------------------+
2. Instalasi & Setup
# ============================================= # Docker (recommended untuk development) # ============================================= docker run -d --name timescaledb \ -p 5432:5432 \ -e POSTGRES_PASSWORD=*** \ timescale/timescaledb:latest-pg16 # ============================================= # Ubuntu/Debian # ============================================= # Tambah repository Timescale sudo add-apt-repository ppa:timescale/timescaledb-ppa sudo apt update sudo apt install timescaledb-2-postgresql-16 # Atau gunakan script setup # https://docs.timescale.com/install/latest/installation-linux/ # ============================================= # Aktifkan ekstensi di PostgreSQL # ============================================= psql -U postgres -d your_database -- Aktifkan ekstensi CREATE EXTENSION IF NOT EXISTS timescaledb; -- Verifikasi SELECT default_version, installed_version FROM pg_available_extensions WHERE name = 'timescaledb';
3. Hypertables
Hypertable adalah konsep inti TimescaleDB. Secara luar, hypertable terlihat seperti tabel PostgreSQL biasa. Di balik layar, TimescaleDB otomatis mempartisi data berdasarkan kolom waktu (time column) menjadi chunks yang lebih kecil.
Membuat Hypertable
-- =============================================
-- BUAT TABEL STANDAR dulu, lalu convert ke hypertable
-- =============================================
-- Tabel sensor readings (IoT)
CREATE TABLE sensor_readings (
time TIMESTAMPTZ NOT NULL,
sensor_id INTEGER NOT NULL,
temperature DOUBLE PRECISION,
humidity DOUBLE PRECISION,
pressure DOUBLE PRECISION,
battery DOUBLE PRECISION
);
-- Convert ke hypertable (time_column = 'time')
SELECT create_hypertable('sensor_readings', 'time');
-- =============================================
-- HYPERTABLE dengan partisi tambahan
-- =============================================
-- Partition by time DAN sensor_id (multi-dimensional)
CREATE TABLE sensor_data (
time TIMESTAMPTZ NOT NULL,
location_id INTEGER NOT NULL,
sensor_id INTEGER NOT NULL,
metric_name TEXT NOT NULL,
metric_value DOUBLE PRECISION
);
SELECT create_hypertable(
'sensor_data',
'time',
partitioning_column => 'location_id',
number_partitions => 4
);
-- =============================================
-- TABEL LAINNYA: metrik aplikasi
-- =============================================
CREATE TABLE app_metrics (
time TIMESTAMPTZ NOT NULL,
service TEXT NOT NULL,
endpoint TEXT NOT NULL,
response_ms DOUBLE PRECISION,
status_code INTEGER,
is_error BOOLEAN DEFAULT FALSE
);
SELECT create_hypertable('app_metrics', 'time');
-- =============================================
-- TABEL: transaksi keuangan
-- =============================================
CREATE TABLE financial_ticks (
time TIMESTAMPTZ NOT NULL,
symbol TEXT NOT NULL,
price NUMERIC(15,4),
volume BIGINT,
bid NUMERIC(15,4),
ask NUMERIC(15,4)
);
SELECT create_hypertable('financial_ticks', 'time');
-- Cek hypertables
SELECT * FROM timescaledb_information.hypertables;
Mengelola Chunks
-- Lihat semua chunks dari sebuah hypertable
SELECT chunk_name, range_start, range_end, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_readings'
ORDER BY range_start DESC;
-- Lihat ukuran hypertable (termasuk index dan chunks)
SELECT
hypertable_name,
pg_size_pretty(hypertable_size(format('%I.%I', hypertable_schema, hypertable_name)::regclass)) AS total_size,
num_chunks
FROM timescaledb_information.hypertables;
-- Set chunk interval (default 7 hari)
SELECT set_chunk_time_interval('sensor_readings', INTERVAL '1 day');
-- Add compression policy (otomatis compress chunk lama)
SELECT add_compression_policy('sensor_readings', INTERVAL '7 days');
Insert Data ke Hypertable
-- Insert data biasa (sama seperti PostgreSQL)
INSERT INTO sensor_readings (time, sensor_id, temperature, humidity, pressure, battery)
VALUES
(NOW() - INTERVAL '5 minutes', 1, 25.5, 65.2, 1013.25, 85.0),
(NOW() - INTERVAL '4 minutes', 1, 25.7, 64.8, 1013.20, 84.5),
(NOW() - INTERVAL '3 minutes', 2, 22.3, 70.1, 1013.30, 92.0),
(NOW() - INTERVAL '2 minutes', 2, 22.5, 69.8, 1013.28, 91.5),
(NOW() - INTERVAL '1 minute', 1, 26.0, 63.5, 1013.15, 84.0);
-- Generate data untuk testing (1 juta baris)
INSERT INTO sensor_readings (time, sensor_id, temperature, humidity, pressure, battery)
SELECT
NOW() - (random() * INTERVAL '30 days'),
(random() * 100)::INT + 1,
20 + random() * 15,
40 + random() * 40,
1000 + random() * 30,
50 + random() * 50
FROM generate_series(1, 1000000);
-- Bulk insert untuk performa
-- Gunakan COPY atau INSERT ... VALUES batch
COPY sensor_readings (time, sensor_id, temperature, humidity, pressure, battery)
FROM '/path/to/data.csv'
WITH (FORMAT csv, HEADER true);
4. Time-Series Queries
TimescaleDB menambahkan fungsi-fungsi khusus untuk query time-series yang tidak ada di PostgreSQL standar.
Fungsi TimeBucket
-- =============================================
-- time_bucket: Group by interval waktu
-- (mirip date_trunc tapi lebih fleksibel)
-- =============================================
-- Rata-rata per 5 menit
SELECT
time_bucket('5 minutes', time) AS bucket,
sensor_id,
AVG(temperature) AS avg_temp,
AVG(humidity) AS avg_humidity,
MAX(temperature) AS max_temp,
MIN(temperature) AS min_temp,
COUNT(*) AS readings
FROM sensor_readings
WHERE time >= NOW() - INTERVAL '1 day'
GROUP BY bucket, sensor_id
ORDER BY bucket DESC, sensor_id;
-- Rata-rata per jam
SELECT
time_bucket('1 hour', time) AS jam,
AVG(temperature) AS avg_temp,
MAX(temperature) AS max_temp,
MIN(temperature) AS min_temp
FROM sensor_readings
WHERE time >= NOW() - INTERVAL '7 days'
GROUP BY jam
ORDER BY jam DESC;
-- Rata-rata per hari (dengan custom origin)
SELECT
time_bucket('1 day', time, TIMESTAMPTZ '2026-01-01') AS hari,
AVG(temperature) AS avg_temp
FROM sensor_readings
GROUP BY hari
ORDER BY hari DESC;
Fungsi Time-Series Lainnya
-- =============================================
-- first() dan last(): Ambil nilai pertama/terakhir
-- dalam interval
-- =============================================
SELECT
time_bucket('1 hour', time) AS jam,
sensor_id,
first(temperature, time) AS temp_awal,
last(temperature, time) AS temp_akhir,
last(temperature, time) - first(temperature, time) AS perubahan
FROM sensor_readings
WHERE time >= NOW() - INTERVAL '24 hours'
GROUP BY jam, sensor_id
ORDER BY jam DESC;
-- =============================================
-- time_bucket_gapfill: Isi gap waktu yang kosong
-- =============================================
SELECT
time_bucket_gapfill('1 hour', time) AS jam,
sensor_id,
AVG(temperature) AS avg_temp,
interpolate(AVG(temperature)) AS interpolated_temp,
locf(AVG(temperature)) AS last_known_temp
FROM sensor_readings
WHERE time >= NOW() - INTERVAL '24 hours'
AND sensor_id = 1
GROUP BY jam, sensor_id
ORDER BY jam;
-- =============================================
-- Moving Average dengan window function
-- =============================================
SELECT
time,
sensor_id,
temperature,
AVG(temperature) OVER (
PARTITION BY sensor_id
ORDER BY time
ROWS BETWEEN 9 PRECEDING AND CURRENT ROW
) AS moving_avg_10,
temperature - LAG(temperature) OVER (
PARTITION BY sensor_id ORDER BY time
) AS delta
FROM sensor_readings
WHERE sensor_id = 1
AND time >= NOW() - INTERVAL '2 hours'
ORDER BY time DESC;
5. Compression
Kompression adalah fitur kritis untuk time-series data karena volume data yang sangat besar. TimescaleDB menawarkan native columnar compression yang bisa mengurangi storage hingga 95%+.
Aktifkan Compression
-- =============================================
-- LANGKAH 1: Tentukan segmentby dan orderby
-- =============================================
ALTER TABLE sensor_readings SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'sensor_id',
timescaledb.compress_orderby = 'time DESC'
);
-- =============================================
-- LANGKAH 2: Auto-compress chunk yang lebih lama
-- dari 7 hari
-- =============================================
SELECT add_compression_policy('sensor_readings', INTERVAL '7 days');
-- =============================================
-- Manual compress: compress chunk tertentu
-- =============================================
-- Cari chunk yang belum di-compress
SELECT chunk_name, range_start, range_end, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_readings'
AND is_compressed = FALSE
ORDER BY range_start;
-- Compress secara manual
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');
SELECT compress_chunk('_timescaledb_internal._hyper_1_2_chunk');
-- =============================================
-- Lihat hasil kompresi
-- =============================================
SELECT
pg_size_pretty(before_compression_total_bytes) AS sebelum,
pg_size_pretty(after_compression_total_bytes) AS sesudah,
ROUND(
(1.0 - after_compression_total_bytes::float / before_compression_total_bytes) * 100,
2
) AS persen_penghematan
FROM hypertable_compression_stats('sensor_readings');
Atur segmentby berdasarkan kolom yang sering di-group atau filter. Atur orderby berdasarkan kolom yang sering di-sort. Misal: segmentby = 'sensor_id' dan orderby = 'time DESC' untuk query per sensor dalam rentang waktu.
6. Continuous Aggregates
Continuous Aggregate adalah materialized view yang otomatis di-update saat data baru masuk. Berbeda dari materialized view biasa yang harus di-refresh manual, continuous aggregate hanya meng-update chunk yang berubah.
Membuat Continuous Aggregate
-- =============================================
-- CONTINUOUS AGGREGATE: Statistik per jam
-- =============================================
CREATE MATERIALIZED VIEW stats_per_jam
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS jam,
sensor_id,
AVG(temperature) AS avg_temp,
MIN(temperature) AS min_temp,
MAX(temperature) AS max_temp,
AVG(humidity) AS avg_humidity,
AVG(pressure) AS avg_pressure,
COUNT(*) AS jumlah_pembacaan
FROM sensor_readings
GROUP BY jam, sensor_id;
-- Query continuous aggregate (sangat cepat!)
SELECT * FROM stats_per_jam
WHERE jam >= NOW() - INTERVAL '24 hours'
AND sensor_id = 1
ORDER BY jam DESC;
-- =============================================
-- CONTINUOUS AGGREGATE: Statistik per hari
-- =============================================
CREATE MATERIALIZED VIEW stats_per_hari
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 day', time) AS hari,
sensor_id,
AVG(temperature) AS avg_temp,
MIN(temperature) AS min_temp,
MAX(temperature) AS max_temp,
STDDEV(temperature) AS std_temp,
COUNT(*) AS readings_count
FROM sensor_readings
GROUP BY hari, sensor_id;
-- =============================================
-- Auto-refresh policy
-- =============================================
-- Refresh otomatis setiap 1 jam
SELECT add_continuous_aggregate_policy('stats_per_jam',
start_offset => INTERVAL '3 days',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour'
);
-- Refresh otomatis setiap 1 hari
SELECT add_continuous_aggregate_policy('stats_per_hari',
start_offset => INTERVAL '30 days',
end_offset => INTERVAL '1 day',
schedule_interval => INTERVAL '1 day'
);
-- Manual refresh
CALL refresh_continuous_aggregate('stats_per_jam', NOW() - INTERVAL '1 day', NOW());
-- Lihat semua continuous aggregates
SELECT * FROM timescaledb_information.continuous_aggregates;
+--------------------------------------------------------------------+ | CONTINUOUS AGGREGATES HIERARCHY | | | | Raw Data (Hypertable) | | +--------------------------------------------------------------+ | | | sensor_readings: Miliaran baris | | | | Per-second / per-minute data | | | +--------------------------------------------------------------+ | | | | | v | | stats_per_jam (Continuous Aggregate) | | +--------------------------------------------------------------+ | | | Rata-rata per jam per sensor | | | | Jauh lebih sedikit baris | | | +--------------------------------------------------------------+ | | | | | v | | stats_per_hari (Continuous Aggregate) | | +--------------------------------------------------------------+ | | | Rata-rata per hari per sensor | | | | Paling cepat untuk query jangka panjang | | | +--------------------------------------------------------------+ | +--------------------------------------------------------------------+
7. Retention Policies
Data time-series sering memiliki masa retensi — data lama tidak lagi relevan dan harus dihapus untuk menghemat storage.
-- =============================================
-- Auto-drop chunk yang lebih lama dari 90 hari
-- =============================================
SELECT add_retention_policy('sensor_readings', INTERVAL '90 days');
-- Retensi berbeda per hypertable
SELECT add_retention_policy('app_metrics', INTERVAL '30 days');
SELECT add_retention_policy('financial_ticks', INTERVAL '365 days');
-- Lihat semua retention policies
SELECT * FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';
-- Manual drop chunk lama
SELECT drop_chunks('sensor_readings', INTERVAL '90 days');
-- Preview chunk yang akan di-drop (tanpa benar-benar drop)
SELECT show_chunks('sensor_readings', older_than => INTERVAL '90 days');
-- Hapus retention policy
SELECT remove_retention_policy('sensor_readings');
-- =============================================
-- Data Tiering: Pindahkan data lama ke storage murah
-- =============================================
-- TimescaleDB mendukung multi-tier storage
-- Data panas (recent) di SSD, data dingin (lama) di object storage
-- Contoh setup tiering (memerlukan license Tiered)
-- SELECT add_tiering_policy('sensor_readings', INTERVAL '30 days');
8. Use Cases: IoT, Monitoring, Finance
IoT: Monitoring Sensor
-- Deteksi anomali: sensor dengan temperatur di atas rata-rata
WITH sensor_avg AS (
SELECT
sensor_id,
AVG(temperature) AS avg_temp,
STDDEV(temperature) AS std_temp
FROM sensor_readings
WHERE time >= NOW() - INTERVAL '24 hours'
GROUP BY sensor_id
)
SELECT
sr.time,
sr.sensor_id,
sr.temperature,
sa.avg_temp,
sa.std_temp,
(sr.temperature - sa.avg_temp) / NULLIF(sa.std_temp, 0) AS z_score
FROM sensor_readings sr
JOIN sensor_avg sa ON sr.sensor_id = sa.sensor_id
WHERE sr.time >= NOW() - INTERVAL '1 hour'
AND ABS((sr.temperature - sa.avg_temp) / NULLIF(sa.std_temp, 0)) > 3
ORDER BY ABS((sr.temperature - sa.avg_temp) / NULLIF(sa.std_temp, 0)) DESC
LIMIT 20;
-- Sensor yang offline (tidak ada data dalam 30 menit terakhir)
SELECT
s.sensor_id,
MAX(s.time) AS last_reading,
NOW() - MAX(s.time) AS offline_duration
FROM sensor_readings s
WHERE time >= NOW() - INTERVAL '2 hours'
GROUP BY s.sensor_id
HAVING NOW() - MAX(s.time) > INTERVAL '30 minutes'
ORDER BY offline_duration DESC;
Monitoring: Aplikasi Performance
-- Endpoint terlambat (response time > 1 detik)
SELECT
time_bucket('5 minutes', time) AS bucket,
endpoint,
AVG(response_ms) AS avg_ms,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY response_ms) AS p95_ms,
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY response_ms) AS p99_ms,
COUNT(*) AS total_requests,
SUM(CASE WHEN is_error THEN 1 ELSE 0 END) AS errors,
ROUND(SUM(CASE WHEN is_error THEN 1 ELSE 0 END)::NUMERIC / COUNT(*) * 100, 2) AS error_rate
FROM app_metrics
WHERE time >= NOW() - INTERVAL '1 hour'
GROUP BY bucket, endpoint
HAVING AVG(response_ms) > 1000
ORDER BY avg_ms DESC;
-- Status code distribution per jam
SELECT
time_bucket('1 hour', time) AS jam,
status_code,
COUNT(*) AS jumlah
FROM app_metrics
WHERE time >= NOW() - INTERVAL '24 hours'
GROUP BY jam, status_code
ORDER BY jam DESC, jumlah DESC;
Finance: Stock Ticks
-- OHLCV (Open, High, Low, Close, Volume) per 5 menit
SELECT
time_bucket('5 minutes', time) AS bucket,
symbol,
first(price, time) AS open,
MAX(price) AS high,
MIN(price) AS low,
last(price, time) AS close,
SUM(volume) AS volume
FROM financial_ticks
WHERE time >= NOW() - INTERVAL '1 day'
AND symbol = 'BBCA'
GROUP BY bucket, symbol
ORDER BY bucket DESC;
-- VWAP (Volume Weighted Average Price)
SELECT
time_bucket('1 hour', time) AS jam,
symbol,
SUM(price * volume)::NUMERIC / NULLIF(SUM(volume), 0) AS vwap,
SUM(volume) AS total_volume
FROM financial_ticks
WHERE time >= NOW() - INTERVAL '24 hours'
GROUP BY jam, symbol
ORDER BY jam DESC;
9. Best Practices & Performance
| Praktik | Penjelasan |
|---|---|
| Chunk interval yang tepat | Default 7 hari. Untuk data volume sangat tinggi, gunakan 1 hari atau bahkan 1 jam |
| Segmentby yang tepat | Pilih kolom yang sering di-GROUP BY atau filter untuk compression |
| Continuous aggregates | Buat aggregate per jam/hari untuk query yang sering diulang |
| Retention policies | Hapus data lama secara otomatis untuk hemat storage |
| Bulk insert | Gunakan COPY atau batch INSERT, bukan satu-satu |
| Jangan UPDATE/DELETE frequent | Time-series data bersifat append-only. Update jarang terjadi |
| Index minimal | Tambah index hanya pada kolom yang sering di-filter (selain time) |
| EXPLAIN ANALYZE | Selalu cek execution plan untuk query besar |
Jangan gunakan SELECT * pada hypertable dengan miliaran baris. Selalu filter berdasarkan rentang waktu (WHERE time BETWEEN ...) agar query memanfaatkan chunk pruning. Tanpa filter waktu, query scan semua chunks — sangat lambat.
10. Quiz Pemahaman
- Apa itu hypertable dan bagaimana cara membuatnya?
Jawaban: Hypertable adalah tabel yang otomatis dipartisi berdasarkan waktu menjadi chunks. Dibuat dengan SELECT create_hypertable('nama_tabel', 'kolom_waktu'). - Apa keuntungan continuous aggregate dibanding materialized view biasa?
Jawaban: Continuous aggregate otomatis meng-update hanya chunk yang berubah, sedangkan materialized view biasa harus di-refresh manual (full refresh). - Apa fungsi time_bucket()?
Jawaban: Mengelompokkan timestamp ke interval tertentu (5 menit, 1 jam, 1 hari) untuk aggregation. Lebih fleksibel dari date_trunc(). - Apa itu segmentby dalam compression?
Jawaban: Kolom yang digunakan untuk membagi data sebelum dikompres. Query yang filter berdasarkan segmentby kolom akan lebih cepat karena hanya decompress segmen yang relevan. - Mengapa filter waktu sangat penting di hypertable?
Jawaban: Tanpa filter waktu, query harus scan semua chunks (full table scan). Dengan filter waktu, TimescaleDB hanya membaca chunks yang relevan (chunk pruning).
Pelajari Database Sharding untuk strategi skalabilitas horizontal, atau kembali ke SQL Stored Procedures untuk membuat fungsi-fungsi yang berinteraksi dengan data time-series.