YOLO Object Detection, v8 to v11: Training, Inference & Deployment

📋 Table of Contents

Introduction to YOLO & Object Detection
The Evolution of YOLO: v1 to v11
Setup & Installation
Inference with a Pre-trained Model
Preparing a Custom Dataset
Training a Custom Model
Evaluation & Visualization
Segmentation, Classification & Pose
Deployment to Production
Comprehension Quiz

1. Introduction to YOLO & Object Detection

YOLO (You Only Look Once) is a family of deep learning architectures for object detection designed to be fast without giving up much accuracy. Unlike two-stage detectors, which first propose candidate regions and then classify each one, YOLO predicts every box and class in a single forward pass over the whole image.

Object detection is the task of identifying what objects appear in an image and where each one is located (as a bounding box). Image classification, by contrast, only answers "what is in this image?" with a single label.

Diagram: Image Classification vs Object Detection

┌─────────────────────────────────────────────────────────────────┐
│  IMAGE CLASSIFICATION          OBJECT DETECTION (YOLO)          │
│                                                                  │
│  ┌──────────────┐              ┌──────────────┐                │
│  │  🐕 🐱      │              │ ┌──┐  ┌──┐  │                │
│  │              │              │ │🐕│  │🐱│  │                │
│  │              │              │ └──┘  └──┘  │                │
│  └──────────────┘              └──────────────┘                │
│  Output: "dog"                 Output:                          │
│                                - dog [x:10, y:20, w:80, h:60]   │
│                                - cat [x:120, y:30, w:70, h:55]  │
│                                + confidence score               │
│                                                                  │
│  Classification = "WHAT?"     Detection = "WHAT + WHERE?"       │
└─────────────────────────────────────────────────────────────────┘

Why Is YOLO Popular?

Advantage	Explanation
Real-time speed	30-100+ FPS depending on model size and hardware, including capable edge devices
High accuracy	mAP50-95 of roughly 39-55 on the COCO dataset, depending on model size
End-to-end	One framework covers detection, segmentation, and pose
Easy to use	Simple API through Ultralytics
Well-documented	Large community, many tutorials
Multi-platform	Deploys to GPU, CPU, mobile, and edge devices

2. The Evolution of YOLO: v1 to v11

YOLO Evolution Timeline

Version	Year	Author	Key Innovations
YOLOv1	2016	Joseph Redmon	Single-pass, real-time detection
YOLOv2	2017	Joseph Redmon	Batch normalization, anchor boxes, multi-scale training
YOLOv3	2018	Joseph Redmon	FPN-style multi-scale detection, Darknet-53
YOLOv4	2020	Alexey Bochkovskiy	CSPDarknet53, SPP, PANet
YOLOv5	2020	Ultralytics	PyTorch native, auto-anchor, easy training
YOLOv8	2023	Ultralytics	Anchor-free, C2f module, unified API
YOLOv9	2024	Chien-Yao Wang	GELAN, PGI (programmable gradient information)
YOLOv10	2024	Tsinghua University	NMS-free, consistent dual assignments
YOLOv11	2024	Ultralytics	C3k2 block, SPPF enhancement, efficiency

💡 Tips for Choosing a YOLO Version

Beginners / production → YOLOv8 or YOLOv11 (easy API, thorough documentation)
Edge/mobile → YOLOv8n or YOLOv11n (nano, very lightweight)
Maximum accuracy → YOLOv9 or YOLOv11x (extra-large)
Real-time without NMS → YOLOv10

3. Setup & Installation

Bash: Installing Ultralytics

# =============================================
# Installing YOLO (Ultralytics)
# =============================================

# Install ultralytics (covers YOLOv8 through v11)
pip install ultralytics

# Verify the installation
python -c "from ultralytics import YOLO; print('YOLO is ready!')"

# For GPU (CUDA 12.1 build shown; use the index URL that matches your CUDA version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# For export to ONNX/TensorRT
pip install onnx onnxruntime-gpu
pip install tensorrt  # Requires an NVIDIA GPU

# Check the GPU (prints a fallback message instead of crashing when CUDA is unavailable)
python -c "import torch; ok = torch.cuda.is_available(); print('CUDA:', ok); print('Device:', torch.cuda.get_device_name(0) if ok else 'CPU only')"

4. Inference with a Pre-trained Model

Python: Basic YOLO Inference

# =============================================
# Inference with Pre-trained YOLO
# =============================================
from ultralytics import YOLO
import cv2

# Load a pre-trained model (COCO dataset, 80 classes)
# Model sizes: n(nano), s(small), m(medium), l(large), x(extra-large)
model = YOLO("yolo11n.pt")  # YOLOv11 nano

# ----- 1. Detection on an image -----
results = model("gambar/street.jpg")

# Inspect the results
for result in results:
    boxes = result.boxes          # Bounding boxes
    print(f"Objects detected: {len(boxes)}")
    
    for box in boxes:
        # Bounding box corners in pixels
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        confidence = box.conf[0].item()
        class_id = int(box.cls[0].item())
        class_name = model.names[class_id]
        
        print(f"  {class_name}: {confidence:.2f} "
              f"[{x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f}]")
    
    # Save the image with bounding boxes drawn
    result.save(filename="hasil_deteksi.jpg")

# ----- 2. Detection on a video -----
# stream=True returns a generator, so frames are processed one at a time.
# To have Ultralytics write the annotated video itself, also pass save=True.
results = model("video/jalan.mp4", stream=True, conf=0.5)

for frame_result in results:
    annotated_frame = frame_result.plot()  # BGR array with boxes drawn
    cv2.imshow("YOLO Detection", annotated_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# ----- 3. Detection from a webcam -----
results = model(source=0, stream=True, conf=0.5)  # source=0 = default webcam

for result in results:
    annotated = result.plot()
    cv2.imshow("Webcam YOLO", annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()

# ----- 4. Batch processing -----
# A list of paths is loaded and run as a single batch.
# (The batch= argument applies to directory, video, and .txt sources.)
image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
results = model(image_paths)

# ----- 5. Inference configuration -----
results = model(
    "gambar.jpg",
    conf=0.5,       # Minimum confidence threshold
    iou=0.45,       # NMS IoU threshold
    max_det=100,    # Maximum detections per image
    classes=[0, 2], # Only person and car (COCO class IDs)
    imgsz=640,      # Input image size
    half=True,      # FP16 inference (faster; requires a CUDA GPU)
)
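
The conf, iou, and max_det arguments above correspond to filtering steps applied to the raw predictions. As a rough illustration of what those thresholds do, here is a minimal pure-Python sketch of confidence filtering followed by greedy non-maximum suppression. It is not the Ultralytics implementation (which is vectorized and handles classes separately); the helper names iou and nms and the sample boxes are made up for this example.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, conf=0.5, iou_thres=0.45, max_det=100):
    """Greedy NMS over a list of (box, score) pairs (single class assumed)."""
    # Step 1: drop predictions below the confidence threshold
    candidates = [d for d in detections if d[1] >= conf]
    # Step 2: visit boxes from highest to lowest score and keep a box
    # only if it does not overlap an already-kept box by more than iou_thres
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        if len(kept) >= max_det:
            break
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


# Hypothetical raw predictions: (box, confidence)
dets = [
    ((10, 10, 110, 110), 0.90),
    ((12, 12, 112, 112), 0.80),    # near-duplicate of the first box
    ((200, 200, 260, 260), 0.70),
    ((300, 300, 340, 340), 0.30),  # below the confidence threshold
]

for box, score in nms(dets):
    print(box, score)
```

Raising conf removes more weak predictions; lowering iou makes suppression more aggressive, which can merge objects that genuinely sit close together.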

Model Sizes & Performance

Model	Parameters	mAP50-95	FPS (GPU, approx.)	File Size
YOLO11n	2.6M	39.5	~500	~5 MB
YOLO11s	9.4M	47.0	~350	~19 MB
YOLO11m	20.1M	51.5	~200	~40 MB
YOLO11l	25.3M	53.4	~150	~50 MB
YOLO11x	56.9M	54.7	~100	~114 MB

5. Preparing a Custom Dataset

To train YOLO on custom objects, prepare the dataset in YOLO format: each image is paired with a text label file of the same base name that lists one bounding box per line.

Diagram: YOLO Dataset Format

┌─────────────────────────────────────────────────────────────────┐
│                    DATASET YOLO FORMAT                           │
│                                                                  │
│  dataset/                                                       │
│  ├── data.yaml              ← Dataset configuration             │
│  ├── train/                                                      │
│  │   ├── images/             ← Training images                  │
│  │   │   ├── img001.jpg                                           │
│  │   │   ├── img002.jpg                                           │
│  │   │   └── ...                                                  │
│  │   └── labels/             ← Training labels                  │
│  │       ├── img001.txt      ← Label for img001.jpg             │
│  │       ├── img002.txt                                           │
│  │       └── ...                                                  │
│  └── val/                                                        │
│      ├── images/             ← Validation images                │
│      └── labels/             ← Validation labels                │
│                                                                  │
│  Label format (.txt):                                            │
│  class_id  x_center  y_center  width  height                    │
│  0         0.5       0.4       0.3    0.2                       │
│  1         0.7       0.6       0.15   0.25                      │
│                                                                  │
│  (All values are normalized to 0-1 relative to the image size)  │
└─────────────────────────────────────────────────────────────────┘
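
Annotation tools often report boxes as pixel corners, while the label format above expects normalized centers and sizes. The sketch below shows the arithmetic in both directions. The function names and the 640x480 example image are assumptions made for illustration; they are not part of any library.

```python
def xyxy_to_yolo(x1, y1, x2, y2, img_w, img_h):
    """Pixel corners -> (x_center, y_center, width, height), normalized to 0-1."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return x_center, y_center, width, height


def yolo_to_xyxy(x_center, y_center, width, height, img_w, img_h):
    """Normalized YOLO box -> pixel corners (x1, y1, x2, y2)."""
    x1 = (x_center - width / 2) * img_w
    y1 = (y_center - height / 2) * img_h
    x2 = (x_center + width / 2) * img_w
    y2 = (y_center + height / 2) * img_h
    return x1, y1, x2, y2


def format_label_line(class_id, box):
    """Build one line of a YOLO .txt label file."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


# Hypothetical example: a 640x480 image with a box from (224, 144) to (416, 240)
box = xyxy_to_yolo(224, 144, 416, 240, img_w=640, img_h=480)
print(format_label_line(0, box))
```

With these example numbers the result matches the first sample line in the diagram (class 0, center 0.5/0.4, size 0.3/0.2).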

Python: Dataset Preparation

# =============================================
# YOLO Dataset Preparation
# =============================================
import os
import random
import shutil

# ----- 1. data.yaml -----
yaml_content = """
# Dataset Configuration
path: /path/to/dataset   # Root directory
train: train/images      # Training images path
val: val/images          # Validation images path

# Class names
names:
  0: helm           # helmet
  1: rompi_safety   # safety vest
  2: orang          # person
  3: kendaraan      # vehicle
  4: alat_berat     # heavy equipment
"""

# Save as dataset/data.yaml
os.makedirs("dataset", exist_ok=True)
with open("dataset/data.yaml", "w") as f:
    f.write(yaml_content.lstrip())

# ----- 2. Annotating with Roboflow (recommended) -----
# 1. Open roboflow.com and create a new project
# 2. Upload the images
# 3. Annotate the bounding boxes
# 4. Export in "YOLOv8" format
# 5. Download dataset.zip

# ----- 3. Converting from other formats -----
# COCO → YOLO format
from ultralytics.data.converter import convert_coco

convert_coco(
    labels_dir="coco_annotations/",
    save_dir="dataset_yolo/",
    use_segments=False,  # True for segmentation
    use_keypoints=False,
    cls91to80=False,     # Keep custom category IDs; True remaps the official COCO 91 IDs to 80
)

# ----- 4. Splitting the dataset -----
def split_dataset(image_dir, label_dir, train_ratio=0.8):
    """Split a dataset into train and val subsets."""
    images = [f for f in os.listdir(image_dir)
              if f.lower().endswith(('.jpg', '.png', '.jpeg'))]
    random.shuffle(images)
    
    split_idx = int(len(images) * train_ratio)
    train_imgs = images[:split_idx]
    val_imgs = images[split_idx:]
    
    for split, img_list in [("train", train_imgs), ("val", val_imgs)]:
        os.makedirs(f"dataset/{split}/images", exist_ok=True)
        os.makedirs(f"dataset/{split}/labels", exist_ok=True)
        
        for img in img_list:
            # Copy the image
            shutil.copy2(f"{image_dir}/{img}", f"dataset/{split}/images/{img}")
            # Copy the matching label, if one exists
            label = img.rsplit(".", 1)[0] + ".txt"
            label_path = f"{label_dir}/{label}"
            if os.path.exists(label_path):
                shutil.copy2(label_path, f"dataset/{split}/labels/{label}")

split_dataset("raw/images", "raw/labels")
print("Dataset split complete!")
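
Before training, it can help to sanity-check the label files, since a malformed line or an out-of-range class ID tends to surface only later as a confusing training error. The following is a small sketch under the assumptions of this section (five fields per line, values normalized to 0-1, the dataset/<split>/images and labels layout). The function names are hypothetical helpers, not part of Ultralytics.

```python
from pathlib import Path


def check_label_text(text, num_classes):
    """Return a list of problems found in the contents of one YOLO label file."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are harmless
        parts = line.split()
        if len(parts) != 5:
            problems.append(f"line {lineno}: expected 5 fields, got {len(parts)}")
            continue
        try:
            class_id = int(parts[0])
            coords = [float(v) for v in parts[1:]]
        except ValueError:
            problems.append(f"line {lineno}: non-numeric value")
            continue
        if not 0 <= class_id < num_classes:
            problems.append(f"line {lineno}: class_id {class_id} out of range")
        if any(not 0.0 <= v <= 1.0 for v in coords):
            problems.append(f"line {lineno}: coordinate outside 0-1")
    return problems


def check_split(split_dir, num_classes):
    """Check every image in <split_dir>/images against <split_dir>/labels."""
    split_dir = Path(split_dir)
    problems = []
    for img in sorted((split_dir / "images").iterdir()):
        if img.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        label = split_dir / "labels" / (img.stem + ".txt")
        if not label.exists():
            problems.append(f"{img.name}: no label file")
            continue
        for p in check_label_text(label.read_text(), num_classes):
            problems.append(f"{label.name}: {p}")
    return problems


# Only runs when the directory from the split step above actually exists
if Path("dataset/train/images").is_dir():
    for problem in check_split("dataset/train", num_classes=5):
        print(problem)
```

An image without a label file is not necessarily an error: it can serve as a background example. The check only reports it so that missing labels are a deliberate choice.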

6. Training a Custom Model

Python: Training YOLO

# =============================================
# Training a Custom YOLO Model
# =============================================
from ultralytics import YOLO

# Load a pre-trained model (transfer learning)
model = YOLO("yolo11n.pt")

# Training
results = model.train(
    data="dataset/data.yaml",    # Path to data.yaml
    epochs=100,                   # Number of epochs
    imgsz=640,                    # Image size
    batch=16,                     # Batch size (adjust to fit VRAM)
    name="helm_detector",         # Experiment name
    
    # Hyperparameters
    optimizer="SGD",              # The default "auto" ignores lr0 and momentum
    lr0=0.01,                     # Initial learning rate
    lrf=0.01,                     # Final LR fraction (final LR = lr0 * lrf)
    momentum=0.937,               # SGD momentum
    weight_decay=0.0005,          # L2 regularization
    warmup_epochs=3,              # Warmup epochs
    warmup_momentum=0.8,          # Warmup momentum
    
    # Augmentation
    hsv_h=0.015,                  # HSV-Hue augmentation
    hsv_s=0.7,                    # HSV-Saturation
    hsv_v=0.4,                    # HSV-Value
    degrees=0.0,                  # Rotation (+/- degrees)
    translate=0.1,                # Translation
    scale=0.5,                    # Scale
    flipud=0.0,                   # Vertical flip probability
    fliplr=0.5,                   # Horizontal flip probability
    mosaic=1.0,                   # Mosaic augmentation
    mixup=0.0,                    # MixUp augmentation
    
    # Hardware
    device=0,                     # GPU device (0, 1, "cpu")
    workers=8,                    # Data loader workers
    
    # Other
    patience=50,                  # Early stopping patience (epochs without improvement)
    resume=False,                 # Resume training
    amp=True,                     # Automatic Mixed Precision
)

# Training results are saved to:
# runs/detect/helm_detector/
# ├── weights/
# │   ├── best.pt      ← Best model
# │   └── last.pt      ← Model from the last epoch
# ├── results.csv      ← Metrics per epoch
# ├── confusion_matrix.png
# ├── results.png
# └── ...
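
The lr0 and lrf settings above together define how the learning rate falls over training: lrf is a fraction of lr0, not an absolute rate. The sketch below illustrates this with a linear decay, which to my understanding is what the trainer uses when cosine scheduling (cos_lr) is not enabled. It ignores the warmup epochs, and the function name linear_lr is made up here, so treat it as an approximation of the trainer, not a copy of it.

```python
def linear_lr(epoch, epochs=100, lr0=0.01, lrf=0.01):
    """Learning rate at a given epoch under linear decay from lr0 to lr0 * lrf."""
    # The factor goes from 1.0 at epoch 0 down to lrf at the final epoch
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


for epoch in (0, 25, 50, 75, 100):
    print(f"epoch {epoch:3d}: lr = {linear_lr(epoch):.6f}")
```

With the values used in the training call, the rate starts at 0.01 and ends at 0.01 * 0.01 = 0.0001.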

Python: Resume & Fine-tune Training

# =============================================
# Resume Training (if it was interrupted)
# =============================================
model = YOLO("runs/detect/helm_detector/weights/last.pt")
model.train(resume=True)

# =============================================
# Fine-tune from a custom model
# =============================================
# Fine-tune an already-trained model on a new dataset
model = YOLO("runs/detect/helm_detector/weights/best.pt")
model.train(
    data="dataset_new/data.yaml",
    epochs=50,
    optimizer="SGD",  # With the default "auto", lr0 would be ignored
    lr0=0.001,        # Smaller learning rate for fine-tuning
    freeze=10,        # Freeze the first 10 layers
)

7. Evaluation & Visualization

Python: Model Evaluation

# =============================================
# Evaluating a YOLO Model
# =============================================
from ultralytics import YOLO

# Load the trained model
model = YOLO("runs/detect/helm_detector/weights/best.pt")

# Evaluate on the validation set
metrics = model.val(data="dataset/data.yaml")

# Print metrics
print(f"mAP50:     {metrics.box.map50:.4f}")      # mAP @ IoU=0.50
print(f"mAP50-95:  {metrics.box.map:.4f}")         # mAP @ IoU=0.50:0.95
print(f"Precision: {metrics.box.mp:.4f}")           # Mean precision
print(f"Recall:    {metrics.box.mr:.4f}")           # Mean recall

# Per-class metrics
# metrics.names is a dict {class_id: name}; box.maps holds mAP50-95 per class ID
for class_id, name in metrics.names.items():
    print(f"  {name}: mAP50-95={metrics.box.maps[class_id]:.4f}")

# Confusion matrix
# Saved automatically to the validation run directory,
# e.g. runs/detect/val/confusion_matrix.png

# =============================================
# Inference with the trained model
# =============================================
model = YOLO("runs/detect/helm_detector/weights/best.pt")

# Detect on a new image
results = model("test_gambar.jpg", conf=0.5)

for r in results:
    # Visualize (BGR array with boxes drawn)
    annotated = r.plot()
    
    # Save
    r.save(filename="hasil_custom_deteksi.jpg")
    
    # Export to a list of dictionaries
    detections = []
    for box in r.boxes:
        detections.append({
            "class": model.names[int(box.cls)],
            "confidence": float(box.conf),
            "bbox": box.xyxy[0].tolist()
        })
    print(f"Detections: {detections}")
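
Once detections are in plain dictionaries like the list built above, downstream code no longer needs the model or tensors. As a small illustration, the sketch below counts detections per class and serializes the result to JSON. The sample data and the summarize helper are hypothetical and only mirror the dictionary layout ("class", "confidence", "bbox") used in this section.

```python
import json
from collections import Counter


def summarize(detections, min_conf=0.0):
    """Count detections per class, skipping those below min_conf."""
    return Counter(d["class"] for d in detections if d["confidence"] >= min_conf)


# Hypothetical detections in the same layout as the loop above produces
sample = [
    {"class": "helm", "confidence": 0.91, "bbox": [34.0, 20.0, 120.0, 98.0]},
    {"class": "helm", "confidence": 0.62, "bbox": [210.0, 25.0, 290.0, 101.0]},
    {"class": "orang", "confidence": 0.88, "bbox": [20.0, 15.0, 140.0, 400.0]},
    {"class": "rompi_safety", "confidence": 0.55, "bbox": [40.0, 110.0, 130.0, 260.0]},
]

counts = summarize(sample, min_conf=0.6)
print(dict(counts))

# Plain dicts and lists serialize directly, e.g. for logging or an API response
print(json.dumps({"counts": dict(counts), "total": sum(counts.values())}))
```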

Understanding the Metrics

Metric	Explanation	Rule-of-thumb Target (task-dependent)
mAP50	Mean Average Precision @ IoU=0.50	> 0.7
mAP50-95	mAP averaged over IoU thresholds from 0.50 to 0.95	> 0.5
Precision	Of all detections, the share that are correct	> 0.8
Recall	Of all ground-truth objects, the share that were detected	> 0.7
F1-Score	Harmonic mean of precision and recall	> 0.75
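
To make the precision, recall, and F1 rows concrete, the sketch below computes them from raw counts of true positives, false positives, and false negatives. The counts are invented for illustration; in practice they depend on the confidence and IoU thresholds chosen for matching detections to ground truth.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0   # correct share of all detections
    recall = tp / (tp + fn) if tp + fn else 0.0      # detected share of all real objects
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts: 80 correct detections, 20 false alarms, 40 missed objects
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"Precision: {p:.3f}  Recall: {r:.3f}  F1: {f1:.3f}")
```

Here precision looks acceptable while recall does not, which is why the two are usually read together instead of relying on either one alone.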

8. Segmentation, Classification & Pose

Python: Other YOLO Tasks

# =============================================
# YOLO for Different Tasks
# =============================================
from ultralytics import YOLO

# ----- 1. Instance Segmentation -----
seg_model = YOLO("yolo11n-seg.pt")
results = seg_model("gambar.jpg")
for r in results:
    r.save()  # Save with masks drawn
    # Access the masks
    if r.masks is not None:
        masks = r.masks.data  # Mask tensor, one mask per instance
        print(f"Number of masks: {len(masks)}")

# ----- 2. Image Classification -----
cls_model = YOLO("yolo11n-cls.pt")
results = cls_model("gambar.jpg")
for r in results:
    probs = r.probs
    print(f"Top-1: {r.names[probs.top1]} ({probs.top1conf:.2f})")
    print(f"Top-5: {[r.names[i] for i in probs.top5]}")

# ----- 3. Pose Estimation -----
pose_model = YOLO("yolo11n-pose.pt")
results = pose_model("orang.jpg")
for r in results:
    if r.keypoints is not None:
        keypoints = r.keypoints.xy  # 17 keypoints per person (COCO format)
        print(f"Keypoints shape: {keypoints.shape}")

# ----- 4. Oriented Bounding Box (OBB) -----
# For detecting rotated objects (drone and satellite imagery)
obb_model = YOLO("yolo11n-obb.pt")
results = obb_model("satellite.jpg")
for r in results:
    if r.obb is not None:
        print(f"Number of oriented boxes: {len(r.obb)}")
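
The pose model returns 17 keypoints per person in COCO order, but the tensor itself carries no names. The sketch below attaches the standard COCO keypoint names to one person's points. The label_keypoints helper and the dummy coordinates are assumptions made for illustration; with Ultralytics, one person's points could come from something like r.keypoints.xy[0].tolist().

```python
# Standard COCO keypoint order (17 points)
COCO_KEYPOINT_NAMES = [
    "nose",
    "left_eye", "right_eye",
    "left_ear", "right_ear",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_hip", "right_hip",
    "left_knee", "right_knee",
    "left_ankle", "right_ankle",
]


def label_keypoints(points):
    """Map one person's 17 (x, y) points to their COCO keypoint names."""
    if len(points) != len(COCO_KEYPOINT_NAMES):
        raise ValueError(f"expected {len(COCO_KEYPOINT_NAMES)} points, got {len(points)}")
    return {name: (float(x), float(y))
            for name, (x, y) in zip(COCO_KEYPOINT_NAMES, points)}


# Dummy coordinates standing in for one detected person
dummy_points = [(float(i), float(i * 2)) for i in range(17)]
named = label_keypoints(dummy_points)
print(named["nose"], named["left_wrist"])
```

Named access makes rules such as "is the wrist above the shoulder" easier to read than indexing positions 9 and 5 directly.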

9. Deployment to Production

Python: Export & Deploy YOLO

# =============================================
# Exporting the Model to Production Formats
# =============================================
from ultralytics import YOLO

model = YOLO("runs/detect/helm_detector/weights/best.pt")

# Export to various formats (pick the ones your target platform needs)
model.export(format="onnx", imgsz=640, simplify=True)    # ONNX
model.export(format="torchscript")                         # TorchScript
model.export(format="engine", half=True)                   # TensorRT (NVIDIA)
model.export(format="coreml")                              # CoreML (Apple)
model.export(format="tflite")                              # TFLite (Mobile)
model.export(format="ncnn")                                # NCNN (Mobile)

# =============================================
# Deploying as an API with FastAPI
# =============================================
# pip install fastapi uvicorn python-multipart
# file: app.py
from fastapi import FastAPI, UploadFile
from ultralytics import YOLO
import io
from PIL import Image

app = FastAPI()
model = YOLO("best.pt")  # Loaded once at startup, reused across requests

@app.post("/detect")
async def detect(file: UploadFile):
    image_bytes = await file.read()
    # Convert to RGB so PNGs with an alpha channel or grayscale images also work
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    
    results = model(image, conf=0.5)
    
    detections = []
    for r in results:
        for box in r.boxes:
            detections.append({
                "class": model.names[int(box.cls)],
                "confidence": round(float(box.conf), 3),
                "bbox": [round(x, 1) for x in box.xyxy[0].tolist()]
            })
    
    return {"detections": detections, "count": len(detections)}

# Run: uvicorn app:app --host 0.0.0.0 --port 8000

# =============================================
# Deploying with Streamlit (UI)
# =============================================
# pip install streamlit
# file: app_streamlit.py
"""
import streamlit as st
from ultralytics import YOLO
from PIL import Image

model = YOLO("best.pt")
st.title("🔍 YOLO Object Detection")
uploaded = st.file_uploader("Upload an image", type=["jpg", "png", "jpeg"])

if uploaded:
    image = Image.open(uploaded).convert("RGB")
    st.image(image, caption="Original image")
    
    results = model(image)
    # plot() returns a BGR array, so tell Streamlit the channel order
    st.image(results[0].plot(), caption="Detection result", channels="BGR")
    
    for box in results[0].boxes:
        st.write(f"- {model.names[int(box.cls)]}: {float(box.conf):.1%}")
"""

10. Comprehension Quiz

Summary

📝 Key Points

YOLO: real-time object detection that finds every object in a single pass
Ultralytics: a unified framework for YOLOv8 through v11 with a simple API
Custom dataset: YOLO format (class x_center y_center width height, normalized)
Training: use transfer learning from a pre-trained model
Metrics: mAP50, mAP50-95, precision, recall
Multi-task: YOLO handles detection, segmentation, classification, pose, and OBB
Deployment: export to ONNX/TensorRT, serve with FastAPI/Streamlit