
YOLO Object Detection, v8 to v11: Training, Inference, Custom Datasets & Deployment

A complete YOLO object detection tutorial with Ultralytics: core concepts, dataset preparation, custom model training, inference, and deployment to production.

1. Introduction to YOLO & Object Detection

YOLO (You Only Look Once) is a family of deep learning architectures for object detection, known for combining high speed with good accuracy. Unlike two-stage detectors, which first propose candidate regions and then classify each one, YOLO predicts all bounding boxes and class scores in a single forward pass over the whole image.

Object detection is the task of identifying what objects are in an image and where they are located (as bounding boxes). This differs from image classification, which only answers "what is in this image?".

Diagram: Image Classification vs Object Detection
┌─────────────────────────────────────────────────────────────────┐
│  IMAGE CLASSIFICATION          OBJECT DETECTION (YOLO)          │
│                                                                  │
│  ┌──────────────┐              ┌──────────────┐                │
│  │  🐕 🐱      │              │ ┌──┐  ┌──┐  │                │
│  │              │              │ │🐕│  │🐱│  │                │
│  │              │              │ └──┘  └──┘  │                │
│  └──────────────┘              └──────────────┘                │
│  Output: "dog"                 Output:                          │
│                                - dog [x:10, y:20, w:80, h:60]   │
│                                - cat [x:120, y:30, w:70, h:55]  │
│                                + confidence score               │
│                                                                  │
│  Classification = "WHAT?"      Detection = "WHAT + WHERE?"      │
└─────────────────────────────────────────────────────────────────┘
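
The boxes in the diagram can be compared numerically with Intersection over Union (IoU), the overlap measure used for NMS and mAP. The following is a minimal sketch in plain Python. It assumes the diagram's `[x, y, w, h]` values mean the top-left corner plus width and height, and `shifted_dog` is a hypothetical prediction added only for illustration.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, w, h] (top-left corner plus size) to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def iou(a, b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


dog = xywh_to_xyxy([10, 20, 80, 60])          # the "dog" box from the diagram
cat = xywh_to_xyxy([120, 30, 70, 55])         # the "cat" box from the diagram
shifted_dog = xywh_to_xyxy([30, 20, 80, 60])  # hypothetical prediction, 20 px to the right

print(dog)                    # [10, 20, 90, 80]
print(iou(dog, cat))          # 0.0 (the boxes do not overlap)
print(iou(dog, shifted_dog))  # 0.6
```

An IoU of 1.0 means two boxes coincide exactly, and 0.0 means they do not overlap at all.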

Why Is YOLO Popular?

Advantage         Explanation
Real-time speed   Roughly 30-100+ FPS depending on model size and hardware; nano models can run on edge devices
High accuracy     mAP50-95 of up to about 55% on the COCO dataset (YOLO11x: 54.7)
End-to-end        One framework for detection, segmentation, and pose estimation
Easy to use       Simple API through Ultralytics
Well-documented   Large community and many tutorials
Multi-platform    Deploys to GPU, CPU, mobile, and edge devices

2. The Evolution of YOLO: v1 to v11

YOLO Evolution Timeline

Version   Year  Author                Key Innovations
YOLOv1    2016  Joseph Redmon         Real-time, single-pass detection
YOLOv2    2017  Joseph Redmon         Batch normalization, anchor boxes, multi-scale training
YOLOv3    2018  Joseph Redmon         FPN-style multi-scale detection, Darknet-53
YOLOv4    2020  Alexey Bochkovskiy    CSPDarknet53, SPP, PANet
YOLOv5    2020  Ultralytics           Native PyTorch, auto-anchor, easy training
YOLOv8    2023  Ultralytics           Anchor-free, C2f module, unified API
YOLOv9    2024  Chien-Yao Wang        GELAN, Programmable Gradient Information (PGI)
YOLOv10   2024  Tsinghua University   NMS-free, consistent dual assignments
YOLOv11   2024  Ultralytics           C3k2 block, SPPF enhancement, efficiency
💡 Tips for Choosing a YOLO Version
  • Beginners / production → YOLOv8 or YOLOv11 (easy API, thorough documentation)
  • Edge / mobile → YOLOv8n or YOLOv11n (nano, very lightweight)
  • Maximum accuracy → YOLOv9 or YOLOv11x (extra-large)
  • Real-time without NMS → YOLOv10

3. Setup & Installation

Bash: Installing Ultralytics
# =============================================
# Installing YOLO (Ultralytics)
# =============================================

# Install ultralytics (covers YOLOv8 through YOLOv11)
pip install ultralytics

# Verify the installation
python -c "from ultralytics import YOLO; print('YOLO is ready!')"

# For GPU (CUDA 12.1 wheels; pick the index URL that matches your CUDA version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# For export to ONNX/TensorRT
pip install onnx onnxruntime-gpu
pip install tensorrt  # Requires an NVIDIA GPU

# Check the GPU (get_device_name is only called when CUDA is available)
python -c "import torch; ok = torch.cuda.is_available(); print(f'CUDA: {ok}, Device: {torch.cuda.get_device_name(0) if ok else None}')"

4. Inference with a Pre-trained Model

Python: Basic YOLO Inference
# =============================================
# Inference with a pre-trained YOLO model
# =============================================
from ultralytics import YOLO
import cv2

# Load a pre-trained model (COCO dataset, 80 classes)
# Model sizes: n (nano), s (small), m (medium), l (large), x (extra-large)
model = YOLO("yolo11n.pt")  # YOLOv11 Nano

# ----- 1. Detection on an image -----
results = model("gambar/street.jpg")

# Inspect the results
for result in results:
    boxes = result.boxes          # Bounding boxes
    print(f"Objects detected: {len(boxes)}")
    
    for box in boxes:
        # Bounding box corners in pixels
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        confidence = box.conf[0].item()
        class_id = int(box.cls[0].item())
        class_name = model.names[class_id]
        
        print(f"  {class_name}: {confidence:.2f} "
              f"[{x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f}]")
    
    # Save the image with bounding boxes drawn
    result.save(filename="hasil_deteksi.jpg")

# ----- 2. Detection on a video -----
results = model("video/jalan.mp4", stream=True, conf=0.5)

for frame_result in results:
    # stream=True returns a generator, so frames are processed one at a time.
    # To write an annotated video, pass save=True to model(...) rather than
    # calling frame_result.save() on every frame.
    annotated_frame = frame_result.plot()  # Frame with boxes drawn (BGR array)
    cv2.imshow("YOLO Detection", annotated_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# ----- 3. Detection from a webcam -----
results = model(source=0, stream=True, conf=0.5)  # source=0 is the default webcam

for result in results:
    annotated = result.plot()
    cv2.imshow("Webcam YOLO", annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()

# ----- 4. Batch processing -----
image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
results = model(image_paths)  # A list of images runs as one batch; one result per image

# ----- 5. Inference configuration -----
results = model(
    "gambar.jpg",
    conf=0.5,       # Minimum confidence threshold
    iou=0.45,       # IoU threshold for NMS
    max_det=100,    # Maximum detections per image
    classes=[0, 2], # Only person and car (COCO class IDs)
    imgsz=640,      # Input image size
    half=True,      # FP16 inference (faster; takes effect on CUDA GPUs)
)
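
The `conf`, `iou`, and `max_det` arguments above control post-processing. As a rough illustration of what they do, here is a simplified, single-class greedy non-maximum suppression (NMS) in plain Python. It is a sketch of the general technique, not Ultralytics' implementation (which, among other things, handles classes separately by default); the function names and sample boxes are made up for the example.

```python
def iou(a, b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, conf=0.5, iou_thres=0.45, max_det=100):
    """Greedy NMS over a list of (box_xyxy, score) pairs for a single class."""
    # 1. Drop candidates below the confidence threshold
    candidates = [d for d in detections if d[1] >= conf]
    # 2. Visit the remaining candidates from highest to lowest score
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        # 3. Keep a box only if it does not overlap a kept box too much
        if all(iou(box, kept_box) <= iou_thres for kept_box, _ in kept):
            kept.append((box, score))
        # 4. Stop once the detection budget is reached
        if len(kept) >= max_det:
            break
    return kept


# Hypothetical raw detections: two near-duplicates, one separate box, one weak box
raw = [
    ([10, 20, 90, 80], 0.90),
    ([12, 22, 92, 82], 0.80),    # overlaps the first box heavily
    ([120, 30, 190, 85], 0.70),
    ([0, 0, 5, 5], 0.30),        # below conf=0.5
]
for box, score in nms(raw):
    print(box, score)
```

With the default thresholds, the second box is suppressed by the first and the last box is filtered out by `conf`, leaving two detections.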

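Model Sizes & Performance

Model     Parameters  mAP50-95  FPS (GPU)  File Size
YOLO11n   2.6M        39.5      ~500       ~5 MB
YOLO11s   9.4M        47.0      ~350       ~19 MB
YOLO11m   20.1M       51.5      ~200       ~40 MB
YOLO11l   25.3M       53.4      ~150       ~50 MB
YOLO11x   56.9M       54.7      ~100       ~114 MB

FPS figures are approximate and depend on the GPU, batch size, and precision used.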
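
The file sizes in the table line up with a simple back-of-the-envelope estimate: parameter count times bytes per parameter. The sketch below assumes the weights are stored as 16-bit floats (2 bytes each) and ignores any metadata in the file; the helper name is made up for the example.

```python
# Parameter counts in millions, taken from the table above
PARAMS_MILLIONS = {
    "YOLO11n": 2.6,
    "YOLO11s": 9.4,
    "YOLO11m": 20.1,
    "YOLO11l": 25.3,
    "YOLO11x": 56.9,
}


def weight_size_mb(params_millions, bytes_per_param):
    """Approximate size of the weights alone, in MB (1 MB = 10**6 bytes)."""
    # (params_millions * 1e6 params) * bytes_per_param / 1e6 bytes per MB
    return params_millions * bytes_per_param


for name, params in PARAMS_MILLIONS.items():
    fp16 = weight_size_mb(params, 2)
    fp32 = weight_size_mb(params, 4)
    print(f"{name}: FP16 ~{fp16:.1f} MB, FP32 ~{fp32:.1f} MB")
```

Under that assumption YOLO11n comes out at about 5.2 MB and YOLO11x at about 113.8 MB, close to the listed ~5 MB and ~114 MB. The same weights held in FP32 would take roughly twice as much memory.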

5. Preparing a Custom Dataset

To train YOLO on custom objects, you need a dataset in YOLO format: images plus one text label file per image, listing the class and bounding box coordinates of each object.

Diagram: YOLO Dataset Format
┌─────────────────────────────────────────────────────────────────┐
│                    YOLO DATASET FORMAT                           │
│                                                                  │
│  dataset/                                                       │
│  ├── data.yaml              ← Dataset configuration             │
│  ├── train/                                                      │
│  │   ├── images/             ← Training images                  │
│  │   │   ├── img001.jpg                                           │
│  │   │   ├── img002.jpg                                           │
│  │   │   └── ...                                                  │
│  │   └── labels/             ← Training labels                  │
│  │       ├── img001.txt      ← Labels for img001.jpg            │
│  │       ├── img002.txt                                           │
│  │       └── ...                                                  │
│  └── val/                                                        │
│      ├── images/             ← Validation images                │
│      └── labels/             ← Validation labels                │
│                                                                  │
│  Label format (.txt):                                            │
│  class_id  x_center  y_center  width  height                    │
│  0         0.5       0.4       0.3    0.2                       │
│  1         0.7       0.6       0.15   0.25                      │
│                                                                  │
│  (All values are normalized to 0-1 relative to the image size)  │
└─────────────────────────────────────────────────────────────────┘
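
To make the label format concrete, the sketch below converts a pixel-space box into a YOLO label line and back. The helper names and the 640x480 image size are assumptions chosen for illustration; the sample box was picked so that it reproduces the first label row in the diagram.

```python
def to_yolo_label(class_id, x1, y1, x2, y2, img_w, img_h):
    """Turn pixel corners (x1, y1, x2, y2) into a normalized YOLO label line."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def from_yolo_label(line, img_w, img_h):
    """Turn a YOLO label line back into (class_id, [x1, y1, x2, y2]) in pixels."""
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return int(class_id), [xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2]


# Hypothetical box on a 640x480 image
line = to_yolo_label(0, 224, 144, 416, 240, img_w=640, img_h=480)
print(line)  # 0 0.500000 0.400000 0.300000 0.200000

class_id, box = from_yolo_label(line, img_w=640, img_h=480)
print(class_id, [round(v) for v in box])  # 0 [224, 144, 416, 240]
```

Because the values are normalized, the same label stays valid if the image is resized without cropping.
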
Python: Dataset Preparation
# =============================================
# Preparing a YOLO dataset
# =============================================
from pathlib import Path

# ----- 1. data.yaml -----
# Written to dataset/data.yaml below
yaml_content = """
# Dataset configuration
path: /path/to/dataset   # Root directory
train: train/images      # Training images, relative to path
val: val/images          # Validation images, relative to path

# Class names
names:
  0: helm           # helmet
  1: rompi_safety   # safety vest
  2: orang          # person
  3: kendaraan      # vehicle
  4: alat_berat     # heavy equipment
"""
Path("dataset").mkdir(exist_ok=True)
Path("dataset/data.yaml").write_text(yaml_content.lstrip())

# ----- 2. Annotating with Roboflow (recommended) -----
# 1. Open roboflow.com and create a new project
# 2. Upload images
# 3. Annotate bounding boxes
# 4. Export in "YOLOv8" format
# 5. Download dataset.zip

# ----- 3. Converting from other formats -----
# COCO → YOLO format
from ultralytics.data.converter import convert_coco

convert_coco(
    labels_dir="coco_annotations/",  # Directory holding the COCO JSON files
    save_dir="dataset_yolo/",
    use_segments=False,   # True for segmentation
    use_keypoints=False,  # True for pose keypoints
    cls91to80=False,      # Keep category IDs as-is; True (the default) applies the COCO 91→80 ID mapping
)

# ----- 4. Splitting the dataset -----
import os
import shutil
import random

def split_dataset(image_dir, label_dir, train_ratio=0.8, seed=42):
    """Split a dataset into train and val subsets."""
    # Sort first so that the seeded shuffle is reproducible across runs
    images = sorted(f for f in os.listdir(image_dir)
                    if f.lower().endswith(('.jpg', '.png', '.jpeg')))
    random.Random(seed).shuffle(images)
    
    split_idx = int(len(images) * train_ratio)
    train_imgs = images[:split_idx]
    val_imgs = images[split_idx:]
    
    for split, img_list in [("train", train_imgs), ("val", val_imgs)]:
        os.makedirs(f"dataset/{split}/images", exist_ok=True)
        os.makedirs(f"dataset/{split}/labels", exist_ok=True)
        
        for img in img_list:
            # Copy the image
            shutil.copy2(f"{image_dir}/{img}", f"dataset/{split}/images/{img}")
            # Copy the matching label file, if one exists
            label = img.rsplit(".", 1)[0] + ".txt"
            label_path = f"{label_dir}/{label}"
            if os.path.exists(label_path):
                shutil.copy2(label_path, f"dataset/{split}/labels/{label}")

split_dataset("raw/images", "raw/labels")
print("Dataset split complete!")
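
Before training, it can be worth checking the label files for formatting mistakes. The sketch below is one possible validator for detection labels (five fields per line); the function names are made up for the example, and segmentation or pose labels, which have more fields, would need different rules.

```python
from pathlib import Path


def check_label_line(line, num_classes):
    """Return a list of problems in one detection label line (empty if valid)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        x_center, y_center, width, height = (float(v) for v in parts[1:])
    except ValueError:
        return ["non-numeric value"]

    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class_id {class_id} outside 0..{num_classes - 1}")
    if not all(0.0 <= v <= 1.0 for v in (x_center, y_center, width, height)):
        problems.append("value outside the 0-1 range")
    if width <= 0 or height <= 0:
        problems.append("width and height must be positive")
    return problems


def check_label_dir(label_dir, num_classes):
    """Yield (file name, line number, problem) for every issue found."""
    for path in sorted(Path(label_dir).glob("*.txt")):
        for line_no, line in enumerate(path.read_text().splitlines(), start=1):
            if not line.strip():
                continue  # blank lines are skipped
            for problem in check_label_line(line, num_classes):
                yield path.name, line_no, problem


print(check_label_line("0 0.5 0.4 0.3 0.2", num_classes=5))    # []
print(check_label_line("7 0.5 0.4 0.3 0.2", num_classes=5))    # ['class_id 7 outside 0..4']
print(check_label_line("1 0.7 1.6 0.15 0.25", num_classes=5))  # ['value outside the 0-1 range']
```

To scan a whole split, iterate over `check_label_dir("dataset/train/labels", num_classes=5)` and print each reported tuple.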

6. Training a Custom Model

Python: Training YOLO
# =============================================
# Training a custom YOLO model
# =============================================
from ultralytics import YOLO

# Load a pre-trained model (transfer learning)
model = YOLO("yolo11n.pt")

# Training
results = model.train(
    data="dataset/data.yaml",    # Path to data.yaml
    epochs=100,                   # Number of epochs
    imgsz=640,                    # Image size
    batch=16,                     # Batch size (adjust to available VRAM)
    name="helm_detector",         # Experiment name
    
    # Hyperparameters
    optimizer="SGD",              # Set explicitly: with the default "auto", lr0 and momentum are ignored
    lr0=0.01,                     # Initial learning rate
    lrf=0.01,                     # Final LR factor (final LR = lr0 * lrf)
    momentum=0.937,               # SGD momentum
    weight_decay=0.0005,          # L2 regularization
    warmup_epochs=3,              # Warmup epochs
    warmup_momentum=0.8,          # Warmup momentum
    
    # Augmentation
    hsv_h=0.015,                  # HSV hue augmentation
    hsv_s=0.7,                    # HSV saturation
    hsv_v=0.4,                    # HSV value
    degrees=0.0,                  # Rotation (+/- degrees)
    translate=0.1,                # Translation
    scale=0.5,                    # Scale
    flipud=0.0,                   # Vertical flip probability
    fliplr=0.5,                   # Horizontal flip probability
    mosaic=1.0,                   # Mosaic augmentation
    mixup=0.0,                    # MixUp augmentation
    
    # Hardware
    device=0,                     # GPU index; use "cpu" to train without a GPU
    workers=8,                    # Data loader workers
    
    # Other
    patience=50,                  # Early stopping patience (epochs without improvement)
    resume=False,                 # Resume training
    amp=True,                     # Automatic Mixed Precision
)

# Training results are saved to:
# runs/detect/helm_detector/
# ├── weights/
# │   ├── best.pt      ← Best model
# │   └── last.pt      ← Model from the last epoch
# ├── results.csv      ← Metrics per epoch
# ├── confusion_matrix.png
# ├── results.png
# └── ...
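
To see how `lr0`, `lrf`, and `epochs` interact, the sketch below computes a learning rate that decays linearly from `lr0` to `lr0 * lrf`. This assumes a linear schedule and ignores warmup; the scheduler actually used can differ depending on training options, so treat it as an illustration of the final-LR comment above rather than a description of the trainer's internals.

```python
def linear_lr(epoch, epochs=100, lr0=0.01, lrf=0.01):
    """Learning rate at a given epoch for a linear decay from lr0 to lr0 * lrf."""
    # The factor goes from 1.0 at epoch 0 down to lrf at the final epoch
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


for epoch in (0, 25, 50, 75, 100):
    print(f"epoch {epoch:3d}: lr = {linear_lr(epoch):.6f}")
```

With the values from the training call, the rate starts at 0.01 and ends at 0.0001, which is `lr0 * lrf`.
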
Python: Resume & Fine-tune Training
# =============================================
# Resume training (if it was interrupted)
# =============================================
from ultralytics import YOLO

model = YOLO("runs/detect/helm_detector/weights/last.pt")
model.train(resume=True)

# =============================================
# Fine-tune from a custom model
# =============================================
# Fine-tune an already-trained model on a new dataset
model = YOLO("runs/detect/helm_detector/weights/best.pt")
model.train(
    data="dataset_new/data.yaml",
    epochs=50,
    lr0=0.001,   # Smaller LR for fine-tuning (only honored when optimizer is set explicitly)
    freeze=10,   # Freeze the first 10 layers
)

7. Evaluation & Visualization

Python: Model Evaluation
# =============================================
# Evaluating a YOLO model
# =============================================
from ultralytics import YOLO

# Load the trained model
model = YOLO("runs/detect/helm_detector/weights/best.pt")

# Evaluate on the validation set
metrics = model.val(data="dataset/data.yaml")

# Print metrics
print(f"mAP50:     {metrics.box.map50:.4f}")      # mAP @ IoU=0.50
print(f"mAP50-95:  {metrics.box.map:.4f}")         # mAP @ IoU=0.50:0.95
print(f"Precision: {metrics.box.mp:.4f}")           # Mean precision
print(f"Recall:    {metrics.box.mr:.4f}")           # Mean recall

# Per-class metrics (metrics.box.maps holds mAP50-95 for each class)
for i, name in model.names.items():
    print(f"  {name}: mAP50-95={metrics.box.maps[i]:.4f}")

# Confusion matrix
# Saved automatically to runs/detect/val/confusion_matrix.png

# =============================================
# Inference with the trained model
# =============================================
model = YOLO("runs/detect/helm_detector/weights/best.pt")

# Detect on a new image
results = model("test_gambar.jpg", conf=0.5)

for r in results:
    # Visualization (BGR array with boxes drawn)
    annotated = r.plot()
    
    # Save
    r.save(filename="hasil_custom_deteksi.jpg")
    
    # Export to a list of dictionaries
    detections = []
    for box in r.boxes:
        detections.append({
            "class": model.names[int(box.cls)],
            "confidence": float(box.conf),
            "bbox": box.xyxy[0].tolist()
        })
    print(f"Detections: {detections}")

Understanding the Metrics

Metric      Explanation                                           Rule-of-Thumb Target
mAP50       Mean Average Precision at IoU = 0.50                  > 0.7
mAP50-95    mAP averaged over IoU thresholds from 0.50 to 0.95    > 0.5
Precision   Of all detections, the fraction that are correct      > 0.8
Recall      Of all ground-truth objects, the fraction detected    > 0.7
F1-Score    Harmonic mean of precision and recall                 > 0.75
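
As a worked example of the definitions in the table, the sketch below computes precision, recall, and F1 from counts of true positives, false positives, and false negatives. The counts are hypothetical, and the helper name is made up for the example.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from detection counts at one IoU threshold."""
    precision = tp / (tp + fp) if tp + fp else 0.0   # correct detections / all detections
    recall = tp / (tp + fn) if tp + fn else 0.0      # correct detections / all real objects
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 80 correct detections, 20 false alarms, 40 missed objects
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")  # precision=0.80 recall=0.67 f1=0.73

# mAP50-95 averages AP over ten IoU thresholds: 0.50, 0.55, ..., 0.95
iou_thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]
print(iou_thresholds)
```

Raising the confidence threshold usually trades recall for precision, which is why the two are reported together.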

8. Segmentation, Classification & Pose

Python: Other YOLO Tasks
# =============================================
# YOLO for other tasks
# =============================================
from ultralytics import YOLO

# ----- 1. Instance Segmentation -----
seg_model = YOLO("yolo11n-seg.pt")
results = seg_model("gambar.jpg")
for r in results:
    r.save()  # Save the image with masks drawn
    # Access the masks
    if r.masks is not None:
        masks = r.masks.data  # Mask tensor
        print(f"Number of masks: {len(masks)}")

# ----- 2. Image Classification -----
cls_model = YOLO("yolo11n-cls.pt")
results = cls_model("gambar.jpg")
for r in results:
    probs = r.probs
    print(f"Top-1: {r.names[probs.top1]} ({probs.top1conf:.2f})")
    print(f"Top-5: {[r.names[i] for i in probs.top5]}")

# ----- 3. Pose Estimation -----
pose_model = YOLO("yolo11n-pose.pt")
results = pose_model("orang.jpg")
for r in results:
    if r.keypoints is not None:
        keypoints = r.keypoints.xy  # 17 keypoints per person (COCO format)
        print(f"Keypoints shape: {keypoints.shape}")

# ----- 4. Oriented Bounding Box (OBB) -----
# For detecting rotated objects (drone and satellite imagery)
obb_model = YOLO("yolo11n-obb.pt")
results = obb_model("satellite.jpg")

9. Deployment to Production

Python: Export & Deploy YOLO
# =============================================
# Exporting the model to production formats
# =============================================
from ultralytics import YOLO

model = YOLO("runs/detect/helm_detector/weights/best.pt")

# Export to various formats
model.export(format="onnx", imgsz=640, simplify=True)    # ONNX
model.export(format="torchscript")                         # TorchScript
model.export(format="engine", half=True)                   # TensorRT (NVIDIA)
model.export(format="coreml")                              # CoreML (Apple)
model.export(format="tflite")                              # TFLite (Mobile)
model.export(format="ncnn")                                # NCNN (Mobile)

# =============================================
# Deploy as an API with FastAPI
# =============================================
# pip install fastapi uvicorn python-multipart
# file: app.py
from fastapi import FastAPI, UploadFile
from ultralytics import YOLO
import io
from PIL import Image

app = FastAPI()
model = YOLO("best.pt")  # Loaded once at startup and reused for every request

@app.post("/detect")
async def detect(file: UploadFile):
    image_bytes = await file.read()
    # Convert to RGB so that PNGs with alpha or palette images are handled too
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    
    results = model(image, conf=0.5)
    
    detections = []
    for r in results:
        for box in r.boxes:
            detections.append({
                "class": model.names[int(box.cls)],
                "confidence": round(float(box.conf), 3),
                "bbox": [round(x, 1) for x in box.xyxy[0].tolist()]
            })
    
    return {"detections": detections, "count": len(detections)}

# Run: uvicorn app:app --host 0.0.0.0 --port 8000

# =============================================
# Deploy with Streamlit (UI)
# =============================================
# pip install streamlit
# file: app_streamlit.py
# Run: streamlit run app_streamlit.py
"""
import streamlit as st
from ultralytics import YOLO
from PIL import Image

model = YOLO("best.pt")
st.title("🔍 YOLO Object Detection")
uploaded = st.file_uploader("Upload an image", type=["jpg", "png", "jpeg"])

if uploaded:
    image = Image.open(uploaded).convert("RGB")
    st.image(image, caption="Original image")
    
    results = model(image)
    # plot() returns a BGR array, so tell Streamlit the channel order
    st.image(results[0].plot(), channels="BGR", caption="Detection result")
    
    for box in results[0].boxes:
        st.write(f"- {model.names[int(box.cls)]}: {float(box.conf):.1%}")
"""

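When an exported model runs outside the Ultralytics Python API, pre-processing and post-processing typically become your own responsibility. As one illustration, the sketch below computes letterbox resize parameters and maps a predicted box back to original image coordinates. It assumes the exported model takes a square 640x640 letterboxed input and returns boxes in that input space; both are assumptions to verify against your own export, and the function names are made up for the example.

```python
def letterbox_params(img_w, img_h, target=640):
    """Scale factor and padding that fit an image into a target x target square."""
    scale = min(target / img_w, target / img_h)   # keep the aspect ratio
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    pad_x = (target - new_w) / 2   # padding on the left (and right)
    pad_y = (target - new_h) / 2   # padding on the top (and bottom)
    return scale, pad_x, pad_y


def to_original(box, scale, pad_x, pad_y):
    """Map an [x1, y1, x2, y2] box from model-input pixels back to the original image."""
    x1, y1, x2, y2 = box
    return [(x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale]


scale, pad_x, pad_y = letterbox_params(1280, 720)   # hypothetical 1280x720 frame
print(scale, pad_x, pad_y)                          # 0.5 0.0 140.0

model_space_box = [100, 240, 300, 340]              # hypothetical box in 640x640 input space
print(to_original(model_space_box, scale, pad_x, pad_y))  # [200.0, 200.0, 600.0, 400.0]
```

Inside the Ultralytics API this bookkeeping is handled for you, so `box.xyxy` is already in original image coordinates.
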
10. Comprehension Quiz

1. What is YOLO's main advantage over other object detectors?

2. What does a YOLO label file contain?

3. What is the role of data.yaml in YOLO training?

4. What is the difference between YOLO models with the suffixes n, s, m, l, and x?

5. Why is augmentation important when training YOLO?

Summary

📝 Key Points
  • YOLO: real-time object detection that finds every object in a single pass
  • Ultralytics: a unified framework for YOLOv8 through YOLOv11 with a simple API
  • Custom dataset: YOLO format (class x_center y_center width height, normalized)
  • Training: use transfer learning from a pre-trained model
  • Metrics: mAP50, mAP50-95, precision, recall
  • Multi-task: YOLO handles detection, segmentation, classification, pose, and OBB
  • Deployment: export to ONNX/TensorRT, deploy with FastAPI/Streamlit