1. Introduction to YOLO & Object Detection
YOLO (You Only Look Once) is a family of deep learning architectures for object detection designed to be both fast and accurate. Unlike earlier two-stage detectors, which first propose candidate regions and then classify each one, YOLO predicts all bounding boxes and classes in a single forward pass over the whole image.
Object detection is the task of identifying what objects are in an image and where they are located (as bounding boxes). This differs from image classification, which only answers "what is in this image?".
┌────────────────────────────────────────────────────────────────────┐
│  IMAGE CLASSIFICATION              OBJECT DETECTION (YOLO)         │
│                                                                    │
│  ┌──────────────┐                  ┌──────────────┐                │
│  │   🐕   🐱    │                  │ ┌──┐  ┌──┐   │                │
│  │              │                  │ │🐕│  │🐱│   │                │
│  │              │                  │ └──┘  └──┘   │                │
│  └──────────────┘                  └──────────────┘                │
│  Output: "dog"                     Output:                         │
│                                    - dog [x:10, y:20, w:80, h:60]  │
│                                    - cat [x:120, y:30, w:70, h:55] │
│                                    + confidence score              │
│                                                                    │
│  Classification = "WHAT?"          Detection = "WHAT + WHERE?"     │
└────────────────────────────────────────────────────────────────────┘
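To make the difference concrete, the two kinds of output in the diagram can be written down as plain data structures. This is only a minimal sketch: the `Detection` class and the `xywh_to_xyxy` helper are hypothetical names introduced for illustration, the confidence values are made up, and the box is assumed to be a top-left corner plus width and height in pixels.

```python
from dataclasses import dataclass

# A classifier returns a single label for the whole image.
classification_output = "dog"


@dataclass
class Detection:
    """One detected object (hypothetical structure, for illustration only)."""
    label: str
    confidence: float
    x: float  # left edge in pixels (assumed top-left origin)
    y: float  # top edge in pixels
    w: float  # box width in pixels
    h: float  # box height in pixels


def xywh_to_xyxy(det):
    """Convert (x, y, w, h) to corner format (x1, y1, x2, y2)."""
    return (det.x, det.y, det.x + det.w, det.y + det.h)


# A detector returns a list: one entry per object, each with a location.
detection_output = [
    Detection("dog", 0.92, 10, 20, 80, 60),   # confidence is illustrative
    Detection("cat", 0.88, 120, 30, 70, 55),  # confidence is illustrative
]

for det in detection_output:
    print(det.label, det.confidence, xywh_to_xyxy(det))
```

The corner format produced by `xywh_to_xyxy` is the same layout that the `xyxy` attribute uses in the inference code later in this tutorial.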
Why Is YOLO Popular?
| Advantage | Explanation |
|---|---|
| Real-time speed | 30-100+ FPS, even on edge devices (depending on model size and hardware) |
| High accuracy | mAP50-95 of roughly 50-55% on the COCO dataset for the larger models |
| End-to-end | One framework for detection, segmentation, and pose |
| Easy to use | Simple API through Ultralytics |
| Well-documented | Large community, many tutorials |
| Multi-platform | Deploys to GPU, CPU, mobile, and edge devices |
2. The Evolution of YOLO: v1 to v11
YOLO Evolution Timeline
| Version | Year | Author | Key Innovations |
|---|---|---|---|
| YOLOv1 | 2016 | Joseph Redmon | Single-pass, real-time detection |
| YOLOv2 | 2017 | Joseph Redmon | Batch normalization, anchor boxes, multi-scale training |
| YOLOv3 | 2018 | Joseph Redmon | FPN-style multi-scale detection, Darknet-53 |
| YOLOv4 | 2020 | Alexey Bochkovskiy | CSPDarknet53, SPP, PANet |
| YOLOv5 | 2020 | Ultralytics | PyTorch native, auto-anchor, easy training |
| YOLOv8 | 2023 | Ultralytics | Anchor-free, C2f module, unified API |
| YOLOv9 | 2024 | Chien-Yao Wang | GELAN, PGI (programmable gradient information) |
| YOLOv10 | 2024 | Tsinghua University | NMS-free, consistent dual assignments |
| YOLOv11 | 2024 | Ultralytics | C3k2 block, SPPF enhancement, efficiency |
- Beginners / production → YOLOv8 or YOLOv11 (easy API, complete documentation)
- Edge/mobile → YOLOv8n or YOLOv11n (nano, very lightweight)
- Maximum accuracy → YOLOv9 or YOLOv11x (extra-large)
- Real-time without NMS → YOLOv10
3. Setup & Installation
# =============================================
# Installing YOLO (Ultralytics)
# =============================================
# Install ultralytics (covers YOLOv8 through YOLOv11)
pip install ultralytics
# Verify the installation
python -c "from ultralytics import YOLO; print('YOLO is ready!')"
# For GPU (CUDA)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
# For export to ONNX/TensorRT
pip install onnx onnxruntime-gpu
pip install tensorrt # Requires an NVIDIA GPU
# Check the GPU (get_device_name is only called when CUDA is available, otherwise it raises an error)
python -c "import torch; ok = torch.cuda.is_available(); print('CUDA:', ok, '| Device:', torch.cuda.get_device_name(0) if ok else 'none')"
4. Inference with a Pre-trained Model
# =============================================
# Inference with a Pre-trained YOLO Model
# =============================================
from ultralytics import YOLO
import cv2
# Load a pre-trained model (COCO dataset, 80 classes)
# Model sizes: n (nano), s (small), m (medium), l (large), x (extra-large)
model = YOLO("yolo11n.pt")  # YOLOv11 nano
# ----- 1. Detection on an image -----
results = model("gambar/street.jpg")
# Inspect the results
for result in results:
    boxes = result.boxes  # Bounding boxes
    print(f"Objects detected: {len(boxes)}")
    for box in boxes:
        # Bounding box coordinates (corner format, in pixels)
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        confidence = box.conf[0].item()
        class_id = int(box.cls[0].item())
        class_name = model.names[class_id]
        print(f"  {class_name}: {confidence:.2f} "
              f"[{x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f}]")
    # Save the image with bounding boxes drawn
    result.save(filename="hasil_deteksi.jpg")
# ----- 2. Detection on a video -----
# stream=True yields one result per frame instead of loading them all into memory.
# To write an annotated video to disk instead, pass save=True to this call.
results = model("video/jalan.mp4", stream=True, conf=0.5)
for frame_result in results:
    # Process each frame manually
    annotated_frame = frame_result.plot()  # BGR image with boxes drawn
    cv2.imshow("YOLO Detection", annotated_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# ----- 3. Detection from a webcam -----
results = model(source=0, stream=True, conf=0.5)  # source=0 is the default webcam
for result in results:
    annotated = result.plot()
    cv2.imshow("Webcam YOLO", annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()
# ----- 4. Batch processing -----
image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
results = model(image_paths, batch=8)  # Up to 8 images per batch
# ----- 5. Inference configuration -----
results = model(
    "gambar.jpg",
    conf=0.5,        # Minimum confidence threshold
    iou=0.45,        # NMS IoU threshold
    max_det=100,     # Maximum detections per image
    classes=[0, 2],  # Only person and car (COCO class IDs)
    imgsz=640,       # Input image size
    half=True,       # FP16 inference (faster; needs a supported GPU)
)
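The `conf`, `iou`, and `max_det` arguments above control how raw predictions are filtered. The following is a simplified, pure-Python sketch of confidence filtering followed by greedy non-maximum suppression (NMS). It is not the Ultralytics implementation: it ignores class labels, and `box_iou` and `simple_nms` are names made up for this illustration.

```python
def box_iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def simple_nms(boxes, scores, conf=0.5, iou=0.45, max_det=100):
    """Return the indices of the boxes that survive, highest score first."""
    # 1. Drop predictions below the confidence threshold.
    candidates = [i for i, s in enumerate(scores) if s >= conf]
    # 2. Visit the remaining boxes from most to least confident.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # 3. Keep a box only if it does not overlap an already kept box too much.
        if all(box_iou(boxes[i], boxes[j]) <= iou for j in kept):
            kept.append(i)
        # 4. Stop once the detection limit is reached.
        if len(kept) == max_det:
            break
    return kept


boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 300, 300), (0, 0, 50, 50)]
scores = [0.90, 0.80, 0.75, 0.30]
print(simple_nms(boxes, scores))  # -> [0, 2]
```

Box 1 is dropped because it overlaps the higher-scoring box 0 almost completely, and box 3 is dropped because its score is below `conf`. Raising `iou` keeps more overlapping boxes; raising `conf` discards more low-confidence ones.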
Model Sizes & Performance
| Model | Parameters | mAP50-95 | FPS (GPU) | File Size |
|---|---|---|---|---|
| YOLO11n | 2.6M | 39.5 | ~500 | ~5 MB |
| YOLO11s | 9.4M | 47.0 | ~350 | ~19 MB |
| YOLO11m | 20.1M | 51.5 | ~200 | ~40 MB |
| YOLO11l | 25.3M | 53.4 | ~150 | ~50 MB |
| YOLO11x | 56.9M | 54.7 | ~100 | ~114 MB |
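As a rough sanity check on the table, file size can be estimated from the parameter count, and per-image latency from FPS. The sketch below assumes the weights are stored as 16-bit floats (2 bytes per parameter) and ignores any file metadata; that assumption is mine rather than something stated in the table, although it lines up with the listed sizes. The FPS figures are the approximate values from the table.

```python
# (parameters in millions, approximate GPU FPS), copied from the table above
models = {
    "YOLO11n": (2.6, 500),
    "YOLO11s": (9.4, 350),
    "YOLO11m": (20.1, 200),
    "YOLO11l": (25.3, 150),
    "YOLO11x": (56.9, 100),
}

BYTES_PER_PARAM = 2  # assumption: FP16 weights


def estimate_size_mb(params_millions, bytes_per_param=BYTES_PER_PARAM):
    """Approximate weight size in MB (1 MB = 1e6 bytes)."""
    return params_millions * 1e6 * bytes_per_param / 1e6


def latency_ms(fps):
    """Average time per image in milliseconds at a given throughput."""
    return 1000.0 / fps


for name, (params_m, fps) in models.items():
    size = estimate_size_mb(params_m)
    print(f"{name}: ~{size:.1f} MB, ~{latency_ms(fps):.1f} ms per image")
```

Under these assumptions, going from nano to extra-large costs about 22 times the storage and 5 times the latency for roughly 15 points of mAP50-95, which is the trade-off to weigh when picking a size.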
5. Preparing a Custom Dataset
To train YOLO on custom objects, you need to prepare a dataset in YOLO format: images plus text label files containing bounding box coordinates.
┌─────────────────────────────────────────────────────────────────┐
│                       YOLO DATASET FORMAT                       │
│                                                                 │
│  dataset/                                                       │
│  ├── data.yaml            ← Dataset configuration               │
│  ├── train/                                                     │
│  │   ├── images/          ← Training images                     │
│  │   │   ├── img001.jpg                                         │
│  │   │   ├── img002.jpg                                         │
│  │   │   └── ...                                                │
│  │   └── labels/          ← Training labels                     │
│  │       ├── img001.txt   ← Label for img001.jpg                │
│  │       ├── img002.txt                                         │
│  │       └── ...                                                │
│  └── val/                                                       │
│      ├── images/          ← Validation images                   │
│      └── labels/          ← Validation labels                   │
│                                                                 │
│  Label format (.txt):                                           │
│  class_id x_center y_center width height                        │
│  0 0.5 0.4 0.3 0.2                                              │
│  1 0.7 0.6 0.15 0.25                                            │
│                                                                 │
│  (All values are normalized to 0-1 relative to the image size)  │
└─────────────────────────────────────────────────────────────────┘
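The label format can be illustrated with a small conversion routine. This sketch assumes the source boxes are given as pixel corners (x1, y1, x2, y2); the helper names and the 640x480 image size are chosen purely for illustration.

```python
def xyxy_to_yolo(box, img_w, img_h):
    """Pixel corners (x1, y1, x2, y2) -> normalized (x_center, y_center, width, height)."""
    x1, y1, x2, y2 = box
    return (
        (x1 + x2) / 2 / img_w,
        (y1 + y2) / 2 / img_h,
        (x2 - x1) / img_w,
        (y2 - y1) / img_h,
    )


def yolo_to_xyxy(label, img_w, img_h):
    """Normalized (x_center, y_center, width, height) -> pixel corners (x1, y1, x2, y2)."""
    xc, yc, w, h = label
    return (
        (xc - w / 2) * img_w,
        (yc - h / 2) * img_h,
        (xc + w / 2) * img_w,
        (yc + h / 2) * img_h,
    )


def format_label_line(class_id, box, img_w, img_h):
    """Build one line of a YOLO .txt label file."""
    values = xyxy_to_yolo(box, img_w, img_h)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# Example: a 640x480 image with one box for class 0.
line = format_label_line(0, (224, 144, 416, 240), 640, 480)
print(line)  # -> 0 0.500000 0.400000 0.300000 0.200000
```

The example box reproduces the first sample label line, `0 0.5 0.4 0.3 0.2`. Because the values are normalized, the same label stays valid if the image is resized.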
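# =============================================
# Preparing a YOLO Dataset
# =============================================
# ----- 1. data.yaml -----
# Saved as dataset/data.yaml
from pathlib import Path
yaml_content = """
# Dataset Configuration
path: /path/to/dataset  # Root directory
train: train/images     # Training images path (relative to path)
val: val/images         # Validation images path (relative to path)
# Class names
names:
  0: helm           # helmet
  1: rompi_safety   # safety vest
  2: orang          # person
  3: kendaraan      # vehicle
  4: alat_berat     # heavy equipment
"""
# Write the configuration to disk
Path("dataset").mkdir(exist_ok=True)
Path("dataset/data.yaml").write_text(yaml_content)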
# ----- 2. Annotating with Roboflow (recommended) -----
# 1. Open roboflow.com and create a new project
# 2. Upload the images
# 3. Annotate the bounding boxes
# 4. Export in "YOLOv8" format
# 5. Download dataset.zip
# ----- 3. Converting from other formats -----
# COCO → YOLO format
from ultralytics.data.converter import convert_coco
convert_coco(
    labels_dir="coco_annotations/",
    save_dir="dataset_yolo/",
    use_segments=False,   # True for segmentation
    use_keypoints=False,
    cls91to80=False,      # Keep the dataset's own class IDs (the default remaps COCO's 91 IDs to 80)
)
# ----- 4. Splitting the dataset -----
import os
import shutil
import random
def split_dataset(image_dir, label_dir, train_ratio=0.8, seed=42):
    """Split a dataset into train and val subsets."""
    # Sort first so the shuffle is reproducible for a given seed
    images = sorted(f for f in os.listdir(image_dir)
                    if f.lower().endswith(('.jpg', '.png', '.jpeg')))
    random.Random(seed).shuffle(images)
    split_idx = int(len(images) * train_ratio)
    train_imgs = images[:split_idx]
    val_imgs = images[split_idx:]
    for split, img_list in [("train", train_imgs), ("val", val_imgs)]:
        os.makedirs(f"dataset/{split}/images", exist_ok=True)
        os.makedirs(f"dataset/{split}/labels", exist_ok=True)
        for img in img_list:
            # Copy the image
            shutil.copy2(f"{image_dir}/{img}", f"dataset/{split}/images/{img}")
            # Copy the matching label (images without a label file are treated as background)
            label = img.rsplit(".", 1)[0] + ".txt"
            label_path = f"{label_dir}/{label}"
            if os.path.exists(label_path):
                shutil.copy2(label_path, f"dataset/{split}/labels/{label}")
split_dataset("raw/images", "raw/labels")
print("Dataset split successfully!")
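Before training, it can help to check that the label files actually follow the expected format. The following is a minimal sketch with hypothetical helpers (`validate_label_line`, `validate_label_text`); it only checks the five-field detection format against an assumed number of classes, not segmentation or pose labels.

```python
def validate_label_line(line, num_classes):
    """Return a list of problems found in one label line (empty if it is valid)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    problems = []
    try:
        class_id = int(parts[0])
    except ValueError:
        return [f"class_id is not an integer: {parts[0]!r}"]
    if not 0 <= class_id < num_classes:
        problems.append(f"class_id {class_id} out of range")
    try:
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return problems + ["coordinates must be numbers"]
    for name, value in zip(("x_center", "y_center", "width", "height"), coords):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name}={value} is outside 0-1")
    return problems


def validate_label_text(text, num_classes):
    """Validate every non-empty line; return {line_number: problems}."""
    report = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip():
            problems = validate_label_line(line, num_classes)
            if problems:
                report[number] = problems
    return report


# Two valid lines and one with a bad class ID and an unnormalized coordinate.
sample = "0 0.5 0.4 0.3 0.2\n1 0.7 0.6 0.15 0.25\n7 1.2 0.5 0.1 0.1\n"
print(validate_label_text(sample, num_classes=5))
```

To check a real dataset, the same function could be applied to the contents of each `.txt` file under `dataset/train/labels` and `dataset/val/labels`.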
6. Training a Custom Model
# =============================================
# Training a Custom YOLO Model
# =============================================
from ultralytics import YOLO
# Load a pre-trained model (transfer learning)
model = YOLO("yolo11n.pt")
# Training
results = model.train(
    data="dataset/data.yaml",  # Path to data.yaml
    epochs=100,                # Number of epochs
    imgsz=640,                 # Image size
    batch=16,                  # Batch size (adjust to available VRAM)
    name="helm_detector",      # Experiment name
    # Hyperparameters
    # Note: with the default optimizer="auto", Ultralytics may choose its own
    # lr0 and momentum; set optimizer explicitly (e.g. "SGD") to apply these values.
    lr0=0.01,                  # Initial learning rate
    lrf=0.01,                  # Final LR as a fraction of lr0 (final LR = lr0 * lrf)
    momentum=0.937,            # SGD momentum
    weight_decay=0.0005,       # L2 regularization
    warmup_epochs=3,           # Warmup epochs
    warmup_momentum=0.8,       # Warmup momentum
    # Augmentation
    hsv_h=0.015,               # HSV hue augmentation
    hsv_s=0.7,                 # HSV saturation
    hsv_v=0.4,                 # HSV value
    degrees=0.0,               # Rotation (+/- degrees)
    translate=0.1,             # Translation
    scale=0.5,                 # Scale
    flipud=0.0,                # Vertical flip probability
    fliplr=0.5,                # Horizontal flip probability
    mosaic=1.0,                # Mosaic augmentation
    mixup=0.0,                 # MixUp augmentation
    # Hardware
    device=0,                  # Device (0, 1, or "cpu")
    workers=8,                 # Data loader workers
    # Other
    patience=50,               # Early stopping patience (epochs without improvement)
    resume=False,              # Resume training
    amp=True,                  # Automatic Mixed Precision
)
# Training results are saved in:
# runs/detect/helm_detector/
# ├── weights/
# │   ├── best.pt   ← Best model
# │   └── last.pt   ← Model from the last epoch
# ├── results.csv   ← Metrics per epoch
# ├── confusion_matrix.png
# ├── results.png
# └── ...
# =============================================
# Resuming Training (if interrupted)
# =============================================
model = YOLO("runs/detect/helm_detector/weights/last.pt")
model.train(resume=True)
# =============================================
# Fine-tuning from a custom model
# =============================================
# Fine-tune an already trained model on a new dataset
model = YOLO("runs/detect/helm_detector/weights/best.pt")
model.train(
    data="dataset_new/data.yaml",
    epochs=50,
    lr0=0.001,   # Smaller learning rate for fine-tuning
    freeze=10,   # Freeze the first 10 layers
)
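The comment on `lrf` in the training call says the final learning rate is `lr0 * lrf`. The sketch below shows what that looks like over the course of training, assuming a linear decay from `lr0` down to `lr0 * lrf` and ignoring warmup. The actual schedule depends on the trainer settings and library version, so treat this as an illustration of the arithmetic rather than a description of the library's behavior; `linear_lr` is a hypothetical helper.

```python
def linear_lr(epoch, epochs, lr0=0.01, lrf=0.01):
    """Learning rate at a given epoch under an assumed linear decay (no warmup)."""
    # The factor goes from 1.0 at epoch 0 down to lrf at the final epoch.
    factor = (1 - epoch / epochs) * (1.0 - lrf) + lrf
    return lr0 * factor


epochs = 100
for epoch in (0, 25, 50, 75, 100):
    print(f"epoch {epoch:3d}: lr = {linear_lr(epoch, epochs):.6f}")
```

With `lr0=0.01` and `lrf=0.01`, the rate starts at 0.01 and ends at 0.0001, a hundredfold reduction.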
7. Evaluation & Visualization
# =============================================
# Evaluating a YOLO Model
# =============================================
from ultralytics import YOLO
# Load the trained model
model = YOLO("runs/detect/helm_detector/weights/best.pt")
# Evaluate on the validation set
metrics = model.val(data="dataset/data.yaml")
# Print metrics
print(f"mAP50: {metrics.box.map50:.4f}")     # mAP @ IoU=0.50
print(f"mAP50-95: {metrics.box.map:.4f}")    # mAP @ IoU=0.50:0.95
print(f"Precision: {metrics.box.mp:.4f}")    # Mean precision
print(f"Recall: {metrics.box.mr:.4f}")       # Mean recall
# Per-class metrics: metrics.names maps class index -> name,
# and metrics.box.maps holds the per-class mAP50-95
for i, name in metrics.names.items():
    print(f"  {name}: mAP50-95={metrics.box.maps[i]:.4f}")
# Confusion matrix
# Saved automatically to runs/detect/val/confusion_matrix.png
# =============================================
# Inference with the trained model
# =============================================
model = YOLO("runs/detect/helm_detector/weights/best.pt")
# Detect objects in a new image
results = model("test_gambar.jpg", conf=0.5)
for r in results:
    # Visualization
    annotated = r.plot()
    # Save
    r.save(filename="hasil_custom_deteksi.jpg")
    # Export to a list of dictionaries
    detections = []
    for box in r.boxes:
        detections.append({
            "class": model.names[int(box.cls)],
            "confidence": float(box.conf),
            "bbox": box.xyxy[0].tolist()
        })
    print(f"Detections: {detections}")
Understanding the Metrics
| Metric | Explanation | Rule-of-Thumb Target |
|---|---|---|
| mAP50 | Mean Average Precision at IoU=0.50 | > 0.7 |
| mAP50-95 | mAP averaged over IoU thresholds from 0.50 to 0.95 | > 0.5 |
| Precision | Of all detections, the fraction that are correct | > 0.8 |
| Recall | Of all ground-truth objects, the fraction that are detected | > 0.7 |
| F1-Score | Harmonic mean of precision and recall | > 0.75 |
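The precision, recall, and F1 definitions in the table reduce to simple arithmetic on counts of true positives (TP), false positives (FP), and false negatives (FN). Here a detection counts as a true positive when it matches a ground-truth object of the same class with sufficient IoU. A minimal sketch with made-up counts (the numbers are illustrative, not results from any model):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts: 80 correct detections, 20 false alarms, 40 missed objects.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
# -> precision=0.800 recall=0.667 f1=0.727
```

In this example the model is rarely wrong when it fires (precision 0.8) but misses a third of the objects (recall 0.667). Lowering the `conf` threshold at inference time typically trades precision for recall.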
8. Segmentation, Classification & Pose
# =============================================
# YOLO for Other Tasks
# =============================================
from ultralytics import YOLO
# ----- 1. Instance Segmentation -----
seg_model = YOLO("yolo11n-seg.pt")
results = seg_model("gambar.jpg")
for r in results:
    r.save()  # Save the image with masks drawn
    # Access the masks
    if r.masks is not None:
        masks = r.masks.data  # Mask tensor
        print(f"Number of masks: {len(masks)}")
# ----- 2. Image Classification -----
cls_model = YOLO("yolo11n-cls.pt")
results = cls_model("gambar.jpg")
for r in results:
    probs = r.probs
    print(f"Top-1: {r.names[probs.top1]} ({probs.top1conf:.2f})")
    print(f"Top-5: {[r.names[i] for i in probs.top5]}")
# ----- 3. Pose Estimation -----
pose_model = YOLO("yolo11n-pose.pt")
results = pose_model("orang.jpg")
for r in results:
    if r.keypoints is not None:
        keypoints = r.keypoints.xy  # 17 keypoints per person (COCO format)
        print(f"Keypoints shape: {keypoints.shape}")
# ----- 4. Oriented Bounding Box (OBB) -----
obb_model = YOLO("yolo11n-obb.pt")
results = obb_model("satellite.jpg")
# For detecting rotated objects (drone and satellite imagery)
9. Deploying to Production
# =============================================
# Exporting the Model to Production Formats
# =============================================
from ultralytics import YOLO
model = YOLO("runs/detect/helm_detector/weights/best.pt")
# Export to various formats
model.export(format="onnx", imgsz=640, simplify=True)  # ONNX
model.export(format="torchscript")                     # TorchScript
model.export(format="engine", half=True)               # TensorRT (NVIDIA)
model.export(format="coreml")                          # CoreML (Apple)
model.export(format="tflite")                          # TFLite (mobile)
model.export(format="ncnn")                            # NCNN (mobile)
# =============================================
# Deploying as an API with FastAPI
# =============================================
# Requires: pip install fastapi uvicorn python-multipart
from fastapi import FastAPI, UploadFile
from ultralytics import YOLO
import io
from PIL import Image
app = FastAPI()
model = YOLO("best.pt")  # Loaded once at startup, reused for every request
@app.post("/detect")
async def detect(file: UploadFile):
    image_bytes = await file.read()
    # Convert to RGB so PNGs with alpha or grayscale images are handled consistently
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    results = model(image, conf=0.5)
    detections = []
    for r in results:
        for box in r.boxes:
            detections.append({
                "class": model.names[int(box.cls)],
                "confidence": round(float(box.conf), 3),
                "bbox": [round(x, 1) for x in box.xyxy[0].tolist()]
            })
    return {"detections": detections, "count": len(detections)}
# Run (with this file saved as app.py): uvicorn app:app --host 0.0.0.0 --port 8000
# =============================================
# Deploying with Streamlit (UI)
# =============================================
# pip install streamlit
# file: app_streamlit.py
"""
import streamlit as st
from ultralytics import YOLO
from PIL import Image
model = YOLO("best.pt")
st.title("🔍 YOLO Object Detection")
uploaded = st.file_uploader("Upload an image", type=["jpg", "png", "jpeg"])
if uploaded:
    image = Image.open(uploaded)
    st.image(image, caption="Original image")
    results = model(image)
    # plot() returns a BGR array, so tell Streamlit the channel order
    st.image(results[0].plot(), caption="Detection result", channels="BGR")
    for box in results[0].boxes:
        st.write(f"- {model.names[int(box.cls)]}: {float(box.conf):.1%}")
"""
10. Comprehension Quiz
1. What is YOLO's main advantage over other object detectors?
2. What does a YOLO label file contain?
3. What is the role of data.yaml in YOLO training?
4. What is the difference between YOLO models with the suffixes -n, -s, -m, -l, and -x?
5. Why is augmentation important when training YOLO?
Summary
- YOLO: real-time object detection that finds all objects in a single pass
- Ultralytics: a unified framework for YOLOv8 through YOLOv11 with a simple API
- Custom dataset: YOLO format (class x_center y_center width height, normalized)
- Training: use transfer learning from a pre-trained model
- Metrics: mAP50, mAP50-95, precision, recall
- Multi-task: YOLO supports detection, segmentation, classification, pose, and OBB
- Deployment: export to ONNX/TensorRT, serve with FastAPI/Streamlit