1. Introduction to SVM
Support Vector Machine (SVM) is one of the most elegant and powerful machine learning algorithms, developed by Vladimir Vapnik and colleagues between 1963 and 1995. SVM began as the Maximum Margin Classifier and was later extended with the Kernel Trick by Boser, Guyon, and Vapnik in 1992.
The core idea is simple but powerful: SVM looks for the hyperplane that separates the classes with the largest margin. The margin is the distance between the hyperplane and the nearest data points of each class; those nearest points are called support vectors.
When to Use SVM?
[Flowchart: ML problem → Tabular data? If no (image/text) → neural network. If yes → Many features? → SVM with an RBF or linear kernel.]
SVM is a very good fit for:
- High-dimensional data
- Text classification
- Gene expression data
- Small to medium datasets
| SVM application | Type | Description |
|---|---|---|
| Text/document classification | Text | Spam filtering, sentiment analysis, topic classification |
| Face recognition | Image | Face detection (before the deep learning era) |
| Fraud detection | Tabular | Flagging suspicious credit card transactions |
| Biomedicine | Genomic | Cancer classification from gene expression |
| Handwriting recognition | Image | Recognizing handwritten MNIST digits |
| Sentiment analysis | Text | Positive vs. negative film/product reviews |
2. Hyperplane & Margin
A hyperplane is the decision boundary that separates the classes in the data. In 2D the hyperplane is a line, in 3D it is a plane, and in n dimensions it is an (n-1)-dimensional flat subspace.
Margin Concept
[Figure: scatter plot with Feature 1 (x) on the horizontal axis and Feature 2 (y) on the vertical axis. The two classes lie on opposite sides of the hyperplane w·x + b = 0, and two parallel margin lines pass through the support vectors of each class.]
Margin = distance between the hyperplane and the support vectors
SVM's goal: MAXIMIZE this margin!
The Math Behind SVM
For linearly separable data, SVM looks for the hyperplane w·x + b = 0 that maximizes the margin:
- Hyperplane equation: w·x + b = 0, where w = weight vector (normal to the hyperplane) and b = bias
- Margin width: 2 / ||w||, the distance between the two margin boundaries (each lies 1 / ||w|| from the hyperplane)
- Optimization goal: minimize ||w||² / 2 subject to the constraint yᵢ(w·xᵢ + b) ≥ 1 for every data point, with labels yᵢ ∈ {-1, +1}
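The formulas above can be checked numerically with a few lines of NumPy. This is only a minimal sketch: the vector `w`, the bias `b`, and the test point are made-up values chosen so the arithmetic is easy to follow, and the helper names `signed_distance` and `margin_width` are hypothetical.

```python
import numpy as np

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w·x + b = 0."""
    w = np.asarray(w, dtype=float)
    return (np.dot(w, x) + b) / np.linalg.norm(w)

def margin_width(w):
    """Distance between the boundaries w·x + b = +1 and w·x + b = -1."""
    return 2.0 / np.linalg.norm(w)

# Illustrative values, not taken from any trained model
w = np.array([3.0, 4.0])   # ||w|| = 5
b = -5.0
point = np.array([3.0, 4.0])

dist = signed_distance(w, b, point)   # (9 + 16 - 5) / 5
width = margin_width(w)               # 2 / 5
print(f"Signed distance: {dist:.2f}")
print(f"Margin width:    {width:.2f}")
```

Because the margin width is 2 / ||w||, making ||w|| smaller makes the margin wider, which is why the optimization goal is stated as minimizing ||w||² / 2.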
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import make_blobs
from sklearn.inspection import DecisionBoundaryDisplay
# Generate a 2D dataset of two blobs, intended to be linearly separable
X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=42)
# Train a linear SVM
svm_linear = SVC(kernel='linear', C=1.0)
svm_linear.fit(X, y)
# Get the coefficients and the intercept
w = svm_linear.coef_[0]
b = svm_linear.intercept_[0]
print(f"Weight vector (w): {w}")
print(f"Bias (b): {b:.4f}")
print(f"Support vectors: {len(svm_linear.support_vectors_)}")
# Plot
plt.figure(figsize=(12, 6))
# Subplot 1: hyperplane, margins, and support vectors
plt.subplot(1, 2, 1)
plt.scatter(X[y==0, 0], X[y==0, 1], c='blue', label='Class 0',
            edgecolors='k', s=50)
plt.scatter(X[y==1, 0], X[y==1, 1], c='red', label='Class 1',
            edgecolors='k', s=50)
# Hyperplane: solve w·x + b = 0 for the second coordinate
xx = np.linspace(X[:, 0].min()-1, X[:, 0].max()+1, 100)
yy = -(w[0] * xx + b) / w[1]
# Margin boundaries: w·x + b = ±1
margin_up = -(w[0] * xx + b - 1) / w[1]    # w·x + b = +1
margin_down = -(w[0] * xx + b + 1) / w[1]  # w·x + b = -1
plt.plot(xx, yy, 'k-', linewidth=2, label='Hyperplane (w·x + b = 0)')
plt.plot(xx, margin_up, 'k--', linewidth=1, label='Margin (w·x + b = 1)')
plt.plot(xx, margin_down, 'k--', linewidth=1, label='Margin (w·x + b = -1)')
# Highlight the support vectors
sv = svm_linear.support_vectors_
plt.scatter(sv[:, 0], sv[:, 1], s=200, facecolors='none',
            edgecolors='green', linewidths=3, label='Support Vectors')
plt.fill_between(xx, margin_down, margin_up, alpha=0.1, color='green')
plt.title('SVM: Hyperplane & Margin')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend(fontsize=8)
plt.grid(True, alpha=0.3)
# Subplot 2: predicted-class regions
plt.subplot(1, 2, 2)
DecisionBoundaryDisplay.from_estimator(
    svm_linear, X, response_method='predict',
    cmap='RdYlBu', alpha=0.3, ax=plt.gca()
)
plt.scatter(X[y==0, 0], X[y==0, 1], c='blue', edgecolors='k', s=50)
plt.scatter(X[y==1, 0], X[y==1, 1], c='red', edgecolors='k', s=50)
plt.scatter(sv[:, 0], sv[:, 1], s=200, facecolors='none',
            edgecolors='green', linewidths=3)
plt.title('Decision Boundary')
plt.tight_layout()
plt.show()
# Margin width: 2 / ||w||
margin = 2.0 / np.linalg.norm(w)
print(f"\nMargin: {margin:.4f}")
print(f"Number of support vectors: {len(svm_linear.support_)}")
3. Support Vectors & Soft Margin
Support vectors are the data points of each class closest to the hyperplane, the ones that "hold up" the hyperplane. They alone determine its position and orientation: the other points could be removed and retraining would give the same model!
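One way to see this in practice is to retrain on the support vectors alone and compare the two models. The sketch below assumes two well-separated blobs generated with `make_blobs` (illustrative data, not from the tutorial) and tightens the solver tolerance with `tol=1e-6` so that numerical noise stays small; the weight vectors should then agree closely, though not necessarily digit for digit.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated blobs (illustrative data, an assumption of this sketch)
X, y = make_blobs(n_samples=200, centers=[[-3, -3], [3, 3]],
                  cluster_std=1.0, random_state=0)

full = SVC(kernel='linear', C=1.0, tol=1e-6).fit(X, y)

# Keep only the support vectors and retrain from scratch
sv_idx = full.support_
reduced = SVC(kernel='linear', C=1.0, tol=1e-6).fit(X[sv_idx], y[sv_idx])

same_predictions = np.array_equal(full.predict(X), reduced.predict(X))
print(f"Training points:  {len(X)}")
print(f"Support vectors:  {len(sv_idx)}")
print(f"Same predictions: {same_predictions}")
print(f"w (full):    {full.coef_[0]}")
print(f"w (reduced): {reduced.coef_[0]}")
```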
Hard Margin vs Soft Margin
| Aspect | Hard Margin | Soft Margin |
|---|---|---|
| Constraint | Every point MUST be separated correctly (no misses) | Some points MAY be misclassified (with a penalty) |
| Data it can handle | Only linearly separable data | Overlapping / non-separable data |
| Robust to outliers | No, very sensitive | Yes, more robust |
| Parameters | No tunable parameter | Parameter C (trade-off) |
| Practical use | Rarely used (real data is almost never clean) | Very common (the sklearn default) |
Parameter C: Trade-off
Parameter C controls the trade-off between a wide margin and few classification errors:
Small C (e.g. 0.01): wide margin, more misclassifications, high bias, tendency to underfit.
Large C (e.g. 100): narrow margin, fewer misclassifications, high variance, tendency to overfit.
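The trade-off can be made concrete by writing out the standard soft-margin objective, 0.5·||w||² + C·Σ max(0, 1 − yᵢ(w·xᵢ + b)). The sketch below is a hand-rolled illustration with toy numbers and labels in {-1, +1}; the function name `soft_margin_objective` is hypothetical, and this is not how sklearn evaluates the objective internally.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum of hinge losses, with labels y in {-1, +1}."""
    margins = y * (X @ w + b)
    hinge = np.maximum(0.0, 1.0 - margins)
    return 0.5 * np.dot(w, w) + C * hinge.sum()

# Toy 2D data: the second point sits inside the margin (hinge loss 0.5)
X = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1])
w = np.array([1.0, 0.0])
b = 0.0

obj_small_c = soft_margin_objective(w, b, X, y, C=1.0)
obj_large_c = soft_margin_objective(w, b, X, y, C=10.0)
print(f"Objective with C=1:  {obj_small_c:.2f}")
print(f"Objective with C=10: {obj_large_c:.2f}")
```

The same margin violation costs ten times more with C=10, so the optimizer is pushed toward a narrower margin that violates fewer points.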
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# Dataset with noise (overlapping classes)
X, y = make_classification(
    n_samples=200, n_features=2, n_redundant=0,
    n_informative=2, n_clusters_per_class=1,
    flip_y=0.1, random_state=42  # about 10% of labels assigned at random
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Grid used to draw the decision regions (same for every C)
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min()-1, X[:, 0].max()+1, 300),
    np.linspace(X[:, 1].min()-1, X[:, 1].max()+1, 300)
)
fig, axes = plt.subplots(1, 4, figsize=(20, 5))
C_values = [0.001, 0.01, 1, 100]
for ax, C in zip(axes, C_values):
    svm = SVC(kernel='linear', C=C)
    svm.fit(X_train, y_train)
    train_acc = svm.score(X_train, y_train)
    test_acc = svm.score(X_test, y_test)
    n_sv = len(svm.support_)
    # Decision regions
    Z = svm.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contourf(xx, yy, Z, alpha=0.3, cmap='RdYlBu')
    ax.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap='RdYlBu',
               edgecolors='k', s=40)
    # Highlight the support vectors (these come from the training set)
    sv = svm.support_vectors_
    ax.scatter(sv[:, 0], sv[:, 1], s=200, facecolors='none',
               edgecolors='green', linewidths=2)
    ax.set_title(f'C = {C}\nTrain: {train_acc:.2f}, Test: {test_acc:.2f}\n'
                 f'SVs: {n_sv}', fontsize=10)
    ax.grid(True, alpha=0.3)
plt.suptitle('Effect of Parameter C on a Linear SVM', fontsize=14)
plt.tight_layout()
plt.show()
4. Kernel Trick & Kernel Types
One of SVM's biggest strengths is the Kernel Trick: the ability to handle data that is not linearly separable by projecting it into a higher-dimensional space without explicitly computing the coordinates in that space.
What Is the Kernel Trick?
[Figure: on the left, 2D points that cannot be separated by a straight line; after a transform into 3D, on the right, a flat plane (hyperplane) separates them.]
Kernel function: K(xᵢ, xⱼ) = φ(xᵢ) · φ(xⱼ)
→ No need to compute φ(x) explicitly!
→ Only the inner product (a similarity measure) in the higher-dimensional space is computed
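A small worked example makes the identity K(x, y) = φ(x) · φ(y) tangible. The sketch below uses the degree-2 polynomial kernel K(x, y) = (x·y)² in 2D, whose explicit feature map is φ(x) = (x₁², √2·x₁x₂, x₂²). This particular kernel and the input vectors are chosen purely for illustration.

```python
import numpy as np

def phi(x):
    """Explicit 3D feature map for the kernel K(x, y) = (x·y)^2 in 2D."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly2_kernel(x, y):
    """The same quantity, computed directly in the original 2D space."""
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

explicit = np.dot(phi(x), phi(y))   # inner product in 3D
implicit = poly2_kernel(x, y)       # (1*3 + 2*4)^2
print(f"phi(x)·phi(y) = {explicit:.1f}")
print(f"(x·y)^2       = {implicit:.1f}")
```

Both routes give the same number, but the kernel route never builds the 3D vectors. For the RBF kernel the corresponding feature space is infinite-dimensional, so the shortcut is the only practical option.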
Kernel Types
| Kernel | Formula | Parameters | Good for |
|---|---|---|---|
| linear | K(x,y) = x·y | none | Linearly separable data, text (high-dimensional) |
| poly | K(x,y) = (γx·y + r)^d | degree, gamma, coef0 | Polynomial patterns |
| rbf (Gaussian) | K(x,y) = exp(-γ‖x-y‖²) | gamma | Non-linear data (the default, most common) |
| sigmoid | K(x,y) = tanh(γx·y + r) | gamma, coef0 | Mimicking a neural network |
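The four formulas in the table translate almost directly into NumPy. The functions below are a hand-written sketch for building intuition: their names and default parameter values are this example's own assumptions, not sklearn's API or defaults.

```python
import numpy as np

def linear_kernel(X, Y):
    return X @ Y.T

def poly_kernel(X, Y, degree=3, gamma=1.0, coef0=0.0):
    return (gamma * (X @ Y.T) + coef0) ** degree

def rbf_kernel(X, Y, gamma=1.0):
    # Squared Euclidean distance between every row of X and every row of Y
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

def sigmoid_kernel(X, Y, gamma=1.0, coef0=0.0):
    return np.tanh(gamma * (X @ Y.T) + coef0)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K_lin = linear_kernel(X, X)
K_rbf = rbf_kernel(X, X, gamma=0.5)
print("Linear Gram matrix:\n", K_lin)
print("RBF Gram matrix:\n", np.round(K_rbf, 3))
```

Each function returns the Gram matrix of pairwise kernel values. For RBF the diagonal is always 1, because every point is at distance 0 from itself.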
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import make_circles, make_moons
from sklearn.model_selection import train_test_split
# Dataset 1: concentric circles (should need a non-linear kernel)
X_circles, y_circles = make_circles(n_samples=300, noise=0.1,
                                    factor=0.3, random_state=42)
# Dataset 2: two moons
X_moons, y_moons = make_moons(n_samples=300, noise=0.2, random_state=42)
datasets = [
    ('Concentric Circles', X_circles, y_circles),
    ('Two Moons', X_moons, y_moons)
]
kernels = ['linear', 'poly', 'rbf']
kernel_names = ['Linear', 'Polynomial (degree=3)', 'RBF (Gaussian)']
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
for row, (name, X, y) in enumerate(datasets):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )
    # Grid used to draw the decision regions for this dataset
    xx, yy = np.meshgrid(
        np.linspace(X[:, 0].min()-0.5, X[:, 0].max()+0.5, 300),
        np.linspace(X[:, 1].min()-0.5, X[:, 1].max()+0.5, 300)
    )
    for col, (kernel, k_name) in enumerate(zip(kernels, kernel_names)):
        # degree is only used by the 'poly' kernel
        svm = SVC(kernel=kernel, degree=3, C=1.0, gamma='scale')
        svm.fit(X_train, y_train)
        # Decision regions
        Z = svm.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
        ax = axes[row, col]
        ax.contourf(xx, yy, Z, alpha=0.3, cmap='RdYlBu')
        ax.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap='RdYlBu',
                   edgecolors='k', s=30)
        # Support vectors
        sv = svm.support_vectors_
        ax.scatter(sv[:, 0], sv[:, 1], s=100, facecolors='none',
                   edgecolors='green', linewidths=2)
        acc_train = svm.score(X_train, y_train)
        acc_test = svm.score(X_test, y_test)
        ax.set_title(f'{name}\n{k_name}\n'
                     f'Train: {acc_train:.2f} | Test: {acc_test:.2f}',
                     fontsize=10)
        ax.grid(True, alpha=0.3)
plt.suptitle('SVM: Kernel Comparison Across Datasets', fontsize=14)
plt.tight_layout()
plt.show()
Tips for Choosing a Kernel
- Linear: Start here if the data is high-dimensional (text, genomics) or the number of features exceeds the number of samples
- RBF (default): The safest choice for non-linear data of moderate dimensionality
- Polynomial: Suitable when the data has a polynomial relationship (rarely better than RBF)
- Rule of thumb: Try linear first → if it performs poorly, try RBF → if that is still not enough, fine-tune gamma and C
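The rule of thumb can be turned into a quick cross-validation comparison. The sketch below assumes a synthetic `make_circles` dataset, which is a case where a linear boundary is expected to fail; on real data the gap between the two kernels may be much smaller, or even reversed.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Concentric circles: not separable by a straight line (illustrative data)
X, y = make_circles(n_samples=300, noise=0.05, factor=0.3, random_state=0)

scores = {}
for kernel in ['linear', 'rbf']:
    model = make_pipeline(StandardScaler(),
                          SVC(kernel=kernel, C=1.0, gamma='scale'))
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{kernel:>6}: CV accuracy = {scores[kernel]:.3f}")

best_kernel = max(scores, key=scores.get)
print(f"Best kernel by CV accuracy: {best_kernel}")
```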
5. Implementing SVM Classification
We now build a complete SVM classifier, from preprocessing through evaluation, using the classic Iris dataset.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import (
    accuracy_score, classification_report,
    confusion_matrix
)
from sklearn.pipeline import Pipeline
import seaborn as sns
# =============================================
# 1. LOAD DATA
# =============================================
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = iris.target
print("=" * 60)
print("DATASET IRIS")
print("=" * 60)
print(f"Samples: {X.shape[0]}, Features: {X.shape[1]}")
print(f"Classes: {iris.target_names}")
print(f"Class distribution: {np.bincount(y)}")
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# =============================================
# 2. PREPROCESSING & TRAINING
# =============================================
# SVM is VERY SENSITIVE to feature scaling!
# Always scale the features, e.g. with StandardScaler
svm_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', SVC(kernel='rbf', C=1.0, gamma='scale', random_state=42))
])
svm_pipeline.fit(X_train, y_train)
# =============================================
# 3. EVALUATION
# =============================================
y_pred = svm_pipeline.predict(X_test)
print("\n" + "=" * 60)
print("SVM RESULTS")
print("=" * 60)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.4f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))
# Cross-validation (the pipeline refits the scaler inside every fold)
cv_scores = cross_val_score(svm_pipeline, X, y, cv=5, scoring='accuracy')
print(f"CV Accuracy: {cv_scores.mean():.4f} (±{cv_scores.std():.4f})")
# Confusion matrix
plt.figure(figsize=(8, 6))
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=iris.target_names,
            yticklabels=iris.target_names)
plt.title('SVM: Confusion Matrix (Iris Dataset)')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.tight_layout()
plt.show()
# =============================================
# 4. FEATURE SCALING EFFECT DEMO
# =============================================
# Without scaling
svm_no_scale = SVC(kernel='rbf', C=1.0, gamma='scale')
svm_no_scale.fit(X_train, y_train)
acc_no_scale = svm_no_scale.score(X_test, y_test)
# With scaling
svm_scaled = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', SVC(kernel='rbf', C=1.0, gamma='scale'))
])
svm_scaled.fit(X_train, y_train)
acc_scaled = svm_scaled.score(X_test, y_test)
print("\n=== Why Feature Scaling Matters for SVM ===")
print(f"Without scaling: {acc_no_scale:.4f}")
print(f"With scaling:    {acc_scaled:.4f}")
print(f"Difference:      {acc_scaled - acc_no_scale:.4f}")
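On Iris the four features have similar ranges, so the scaling demo above may show only a small difference. The reason scaling matters becomes clearer with features on very different scales: the distances that kernels such as RBF rely on are then dominated by the largest-valued feature. The sketch below uses made-up income and age values (hypothetical data) and standardizes by hand with the same formula StandardScaler uses, z = (x − mean) / std.

```python
import numpy as np

# Two features on very different scales (hypothetical values):
# column 0 is an income in dollars, column 1 is an age in years
X = np.array([
    [50_000.0, 25.0],
    [50_100.0, 65.0],
    [90_000.0, 26.0],
])

def pairwise_dist(A):
    """Euclidean distance between every pair of rows."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

# Standardize by hand: zero mean, unit variance per column
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

d_raw = pairwise_dist(X)
d_scaled = pairwise_dist(X_scaled)
print(f"Raw:    d(0,1) = {d_raw[0, 1]:.1f}, d(0,2) = {d_raw[0, 2]:.1f}")
print(f"Scaled: d(0,1) = {d_scaled[0, 1]:.3f}, d(0,2) = {d_scaled[0, 2]:.3f}")
```

Before scaling, the 40-year age gap between rows 0 and 1 barely registers next to the income gap between rows 0 and 2. After scaling, both differences contribute on a comparable footing.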
6. Multi-Class Classification with SVM
At its core, SVM is a binary classifier: it can only separate 2 classes. For multi-class problems (>2 classes) there are two main strategies:
One-vs-One (OvO) vs One-vs-Rest (OvR)
| | One-vs-Rest (OvR) | One-vs-One (OvO) |
|---|---|---|
| Classifiers for 3 classes (A, B, C) | A vs (B, C); B vs (A, C); C vs (A, B) | A vs B; A vs C; B vs C |
| Total for 3 classes | 3 classifiers | 3 classifiers |
| Prediction | Class with the highest score | Majority vote |
| Total for N classes | N classifiers | N(N-1)/2 classifiers |
| Trade-off | Fewer models, can be faster | More models, more accurate on small datasets |
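The classifier counts in the comparison above can be reproduced by enumerating the binary problems each strategy creates. This is a plain-Python sketch with hypothetical helper names; it only lists the sub-problems and does not train anything.

```python
from itertools import combinations

def ovr_classifiers(classes):
    """One binary problem per class: that class vs. all the others."""
    return [(c, tuple(o for o in classes if o != c)) for c in classes]

def ovo_classifiers(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))

classes = ['A', 'B', 'C']
ovr = ovr_classifiers(classes)
ovo = ovo_classifiers(classes)
print("OvR:", ovr)
print("OvO:", ovo)

# For 10 classes (e.g. the digits 0-9) the counts diverge
digits = list(range(10))
n_ovr, n_ovo = len(ovr_classifiers(digits)), len(ovo_classifiers(digits))
print(f"10 classes: OvR = {n_ovr} classifiers, OvO = {n_ovo} classifiers")
```

OvO trains more models, but each one sees only the samples of its two classes. Since kernel SVM training time grows faster than linearly with the number of samples, many small problems can still be affordable.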
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report
# Digits dataset (10 classes: 0-9)
digits = load_digits()
X, y = digits.data, digits.target
print(f"Dataset Digits: {X.shape[0]} samples, {X.shape[1]} features, "
      f"10 classes")
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# --- One-vs-Rest (OvR) ---
# SVC always trains one-vs-one internally; decision_function_shape only
# changes the shape of decision_function's output. Wrapping SVC in
# OneVsRestClassifier trains one binary classifier per class.
svm_ovr = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', OneVsRestClassifier(SVC(kernel='rbf', C=10, gamma='scale')))
])
svm_ovr.fit(X_train, y_train)
acc_ovr = svm_ovr.score(X_test, y_test)
# --- One-vs-One (OvO) ---
# This is SVC's native multi-class strategy
svm_ovo = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', SVC(kernel='rbf', C=10, gamma='scale',
                decision_function_shape='ovo'))
])
svm_ovo.fit(X_train, y_train)
acc_ovo = svm_ovo.score(X_test, y_test)
print("=" * 50)
print("MULTI-CLASS SVM RESULTS")
print("=" * 50)
print(f"OvR Accuracy: {acc_ovr:.4f}")
print(f"OvO Accuracy: {acc_ovo:.4f}")
n_ovr = len(svm_ovr.named_steps['svm'].estimators_)
print(f"\nNumber of classifiers OvR: {n_ovr}")
print(f"Number of classifiers OvO: {10*9//2}")
# Classification report
y_pred = svm_ovr.predict(X_test)
print("\nClassification Report (OvR):")
print(classification_report(y_test, y_pred))
7. Support Vector Regression (SVR)
SVM can also be used for regression, that is, predicting continuous values. This variant is called Support Vector Regression (SVR). The idea is similar to SVM classification, but instead of a hyperplane that separates classes, SVR looks for a hyperplane that "hugs" the data within an epsilon-tube (ε-tube).
SVR Concept
[Figure: data points scattered around a fitted line (the prediction), with Features (x) on the horizontal axis and Target (y) on the vertical axis; the upper and lower ε-tube boundaries run parallel to the line.]
- Points inside the ε-tube: error = 0 (no penalty)
- Points outside the ε-tube: error = |y - f(x)| - ε
SVR's goal: fit an ε-tube that captures AS MANY data points AS POSSIBLE
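The penalty rule above is usually called the ε-insensitive loss. A minimal NumPy sketch follows, with toy targets and a constant prediction chosen for illustration; the function name is hypothetical.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon-tube, |y - f(x)| - epsilon outside it."""
    residual = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return np.maximum(0.0, residual - epsilon)

# Toy values: residuals of 0.00, 0.05 and 1.00 against a tube of width 0.1
y_true = np.array([1.00, 1.05, 2.00])
y_pred = np.array([1.00, 1.00, 1.00])
losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
print(losses)
```

Only the third point lies outside the tube, so it is the only one that contributes to the loss. This is also why points well inside the tube do not end up as support vectors.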
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
# Generate non-linear data
np.random.seed(42)
X = np.sort(5 * np.random.rand(300, 1), axis=0)
y = np.sin(X).ravel() + 0.2 * np.random.randn(300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
# Scale features (fit on the training set only)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Compare SVR with several kernels
kernels_params = {
    'Linear SVR': {'kernel': 'linear', 'C': 1.0, 'epsilon': 0.1},
    'RBF SVR (C=1)': {'kernel': 'rbf', 'C': 1.0, 'gamma': 'scale', 'epsilon': 0.1},
    'RBF SVR (C=100)': {'kernel': 'rbf', 'C': 100.0, 'gamma': 'scale', 'epsilon': 0.1},
    'Poly SVR (degree=3)': {'kernel': 'poly', 'degree': 3, 'C': 1.0, 'epsilon': 0.1},
}
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
results = {}
for ax, (name, params) in zip(axes.ravel(), kernels_params.items()):
    svr = SVR(**params)
    svr.fit(X_train_scaled, y_train)
    y_pred = svr.predict(X_test_scaled)
    r2 = r2_score(y_test, y_pred)
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    mae = mean_absolute_error(y_test, y_pred)
    n_sv = len(svr.support_)
    results[name] = {'R²': r2, 'RMSE': rmse, 'MAE': mae, 'SVs': n_sv}
    # Plot the fitted curve over the training range
    X_plot = np.linspace(X_train_scaled.min(), X_train_scaled.max(), 500).reshape(-1, 1)
    y_plot = svr.predict(X_plot)
    ax.scatter(X_train_scaled, y_train, c='steelblue', s=15, alpha=0.5, label='Train')
    ax.scatter(X_test_scaled, y_test, c='orange', s=15, alpha=0.5, label='Test')
    ax.plot(X_plot, y_plot, 'r-', linewidth=2, label='Prediction')
    # Epsilon tube
    ax.fill_between(X_plot.ravel(), y_plot - svr.epsilon, y_plot + svr.epsilon,
                    alpha=0.15, color='red', label=f'ε-tube ({svr.epsilon})')
    # Support vectors
    ax.scatter(X_train_scaled[svr.support_], y_train[svr.support_],
               s=100, facecolors='none', edgecolors='green', linewidths=2,
               label=f'SVs ({n_sv})')
    ax.set_title(f'{name}\nR²={r2:.3f} | RMSE={rmse:.3f}', fontsize=10)
    ax.legend(fontsize=7)
    ax.grid(True, alpha=0.3)
plt.suptitle('Support Vector Regression: Kernel Comparison', fontsize=14)
plt.tight_layout()
plt.show()
# Summary table
print("\n" + "=" * 65)
print("SVR RESULTS SUMMARY")
print("=" * 65)
print(f"{'Model':<25} {'R²':>8} {'RMSE':>8} {'MAE':>8} {'SVs':>6}")
print("-" * 65)
for name, metrics in results.items():
    print(f"{name:<25} {metrics['R²']:>8.4f} {metrics['RMSE']:>8.4f} "
          f"{metrics['MAE']:>8.4f} {metrics['SVs']:>6d}")
Dua hyperparameter terpenting pada SVM dengan kernel RBF adalah C dan gamma (Ξ³). Memahami interaksi antara keduanya sangat krusial untuk mendapatkan performa terbaik.
C dan Gamma: Interaksi
| Gamma Kecil | Gamma Besar | |
|---|---|---|
| C Kecil | Bias tinggi, variance rendah β Underfitting | Bias rendah, variance tinggi β Model kompleks |
| C Besar | Bias sedang, variance sedang β Model moderat | Bias rendah, variance tinggi β Overfitting |
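To see why a large gamma makes the model more complex, it helps to look at how quickly the RBF kernel value K = exp(−γ·d²) decays with distance. The sketch below evaluates it for two points at distance 1 under a few illustrative gamma values; the helper name `rbf_similarity` is hypothetical.

```python
import numpy as np

def rbf_similarity(distance, gamma):
    """RBF kernel value for two points a given Euclidean distance apart."""
    return np.exp(-gamma * distance ** 2)

gammas = [0.01, 0.1, 1.0, 10.0]
similarities = [rbf_similarity(1.0, g) for g in gammas]
for g, s in zip(gammas, similarities):
    print(f"gamma = {g:>5}: K at distance 1 = {s:.5f}")
```

With a small gamma, distant points still look similar, so each support vector influences a wide region and the boundary stays smooth. With a large gamma, the influence of each support vector shrinks to its immediate neighborhood, which lets the boundary wrap around individual points.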
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import warnings
warnings.filterwarnings('ignore')
# Load the digits dataset
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# Pipeline with scaling
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', SVC(kernel='rbf', random_state=42))
])
# Grid search over C and gamma
param_grid = {
    'svm__C': [0.01, 0.1, 1, 10, 100, 1000],
    'svm__gamma': ['scale', 'auto', 0.001, 0.01, 0.1, 1, 10]
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
grid_search = GridSearchCV(
    pipe, param_grid,
    cv=cv,
    scoring='accuracy',
    n_jobs=-1,
    verbose=1,
    refit=True
)
grid_search.fit(X_train, y_train)
print("=" * 60)
print("GRID SEARCH RESULTS")
print("=" * 60)
print(f"Best Score (CV): {grid_search.best_score_:.4f}")
print(f"Best Parameters: {grid_search.best_params_}")
print(f"Test Score: {grid_search.best_estimator_.score(X_test, y_test):.4f}")
# Heatmap of C vs gamma
results = grid_search.cv_results_
C_values = param_grid['svm__C']
# Keep only the numeric gamma values ('scale' and 'auto' are skipped)
gamma_values = [g for g in param_grid['svm__gamma'] if not isinstance(g, str)]
score_matrix = np.zeros((len(C_values), len(gamma_values)))
for params, score in zip(results['params'], results['mean_test_score']):
    gamma = params['svm__gamma']
    if isinstance(gamma, str):
        continue
    i = C_values.index(params['svm__C'])
    j = gamma_values.index(gamma)
    score_matrix[i, j] = score
plt.figure(figsize=(10, 6))
plt.imshow(score_matrix, cmap='YlOrRd', aspect='auto', interpolation='nearest')
plt.colorbar(label='CV Accuracy')
plt.xticks(range(len(gamma_values)), gamma_values)
plt.yticks(range(len(C_values)), C_values)
plt.xlabel('Gamma')
plt.ylabel('C')
plt.title('Grid Search: Accuracy Heatmap (C vs Gamma)')
# Annotate cells
for i in range(len(C_values)):
    for j in range(len(gamma_values)):
        plt.text(j, i, f'{score_matrix[i, j]:.3f}',
                 ha='center', va='center', fontsize=8,
                 color='white' if score_matrix[i, j] > 0.95 else 'black')
plt.tight_layout()
plt.show()
- Always scale! SVM is very sensitive to feature scale. Use StandardScaler
- C: Start from 1.0. Larger = more complex (risk of overfitting), smaller = simpler (risk of underfitting)
- Gamma: 'scale' (the sklearn default) = 1 / (n_features * X.var()). A good starting point
- RandomizedSearchCV is faster than GridSearchCV for initial exploration
- SVM training costs between O(n²) and O(n³), so for datasets with more than 50k samples consider LinearSVC or SGDClassifier
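As a companion to the tip about RandomizedSearchCV, the sketch below samples C and gamma from log-uniform distributions instead of enumerating a grid. The Iris dataset, the sampling ranges, and `n_iter=20` are assumptions chosen so that the example runs quickly; they are not tuned recommendations.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', SVC(kernel='rbf')),
])

# Sample C and gamma on a log scale instead of enumerating a fixed grid
param_distributions = {
    'svm__C': loguniform(1e-2, 1e3),
    'svm__gamma': loguniform(1e-4, 1e1),
}

search = RandomizedSearchCV(pipe, param_distributions, n_iter=20, cv=5,
                            scoring='accuracy', random_state=42)
search.fit(X, y)
print(f"Best CV accuracy: {search.best_score_:.4f}")
print(f"Best parameters:  {search.best_params_}")
```

The cost is fixed by `n_iter` rather than by the size of the grid, so a wide range can be explored first and a narrower GridSearchCV run afterwards around the best region.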
9. Strengths & Weaknesses
SVM Strengths
| Strength | Explanation |
|---|---|
| Effective in high dimensions | Performs well even when the number of features exceeds the number of samples (e.g., text classification) |
| Margin maximization | Generalizes well because a wide margin is robust to noise |
| Memory efficient | Stores only the support vectors (a subset of the data), not the whole dataset |
| Versatile kernels | Different kernels are available for different kinds of data |
| Global optimum | Convex optimization, so training does not get stuck in a local minimum (unlike neural networks) |
| Works well with small data | Performs well even with little data |
SVM Weaknesses
| Weakness | Explanation |
|---|---|
| Sensitive to scaling | Feature scaling (StandardScaler) is REQUIRED |
| Slow on large data | Training costs between O(n²) and O(n³), very slow for datasets with more than 100k samples |
| No direct probabilities | Does not output probabilities directly (needs probability=True, which uses Platt scaling) |
| Complex tuning | C, gamma, kernel type, degree: many interacting hyperparameters |
| No missing-value handling | Imputation is needed before training |
| Hard to interpret | The model is hard to interpret (especially with non-linear kernels) |
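The point about probabilities can be illustrated by comparing the raw decision scores with the calibrated output obtained through `probability=True`. This is a small sketch on Iris, used here as an assumed example dataset. Note that enabling `probability=True` makes training slower, because sklearn fits the calibration with an internal cross-validation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

model = make_pipeline(
    StandardScaler(),
    SVC(kernel='rbf', probability=True, random_state=42),
)
model.fit(X, y)

scores = model.decision_function(X[:3])   # raw scores, not probabilities
proba = model.predict_proba(X[:3])        # Platt-scaled estimates
print("Decision scores:\n", np.round(scores, 3))
print("Probabilities:\n", np.round(proba, 3))
```

The decision scores are unbounded and only their ordering is meaningful, while each row of the probability output sums to 1.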
10. Quiz: Test Your Understanding!
After reading the tutorial above, answer the following 5 questions to test your understanding of SVM: