Python

Selenium: A Complete Guide to Web Automation in Python

A complete tutorial on Selenium WebDriver in Python: browser setup, element selectors, waits, form automation, web scraping, headless mode, and practical web automation examples.

1. Introduction to Selenium

Selenium is an open-source framework for automating web browsers. With Selenium you can control a browser (Chrome, Firefox, Edge, Safari) programmatically: clicking buttons, filling in forms, extracting data, navigating between pages, and more.

What Can Selenium Do?

Use Case | Example
Web Scraping | Extracting data from websites that require JavaScript rendering
Automated Testing | Testing a website's UI automatically
Form Automation | Filling in and submitting forms automatically
Monitoring | Tracking prices, stock, or changes on a website
Data Entry | Entering data into systems that have no API
Screenshot | Capturing web page screenshots automatically

Selenium vs. Alternatives

Tool | Strengths | Weaknesses
Selenium | Multi-browser, mature, flexible | Relatively slow, resource-heavy
Playwright | Fast, modern, auto-wait | Newer, smaller community
Puppeteer | Fast (Node.js), headless-native | Chromium-focused, requires Node.js
BeautifulSoup | Fast, lightweight | Cannot handle JavaScript
Scrapy | Very fast for crawling | Complex setup, no JavaScript handling on its own
Diagram: Selenium Architecture
┌─────────────────────────────────────────────────────────┐
│               SELENIUM ARCHITECTURE                     │
│                                                         │
│  Python Script                                          │
│  ┌──────────────┐                                       │
│  │  Your Code   │                                       │
│  │  (commands)  │                                       │
│  └──────┬───────┘                                       │
│         │                                               │
│         ▼                                               │
│  ┌──────────────┐         ┌──────────────────┐         │
│  │  Selenium    │ ──HTTP──▶  Browser Driver   │         │
│  │  WebDriver   │         │  (ChromeDriver)   │         │
│  │  (library)   │ ◀────── │  (GeckoDriver)    │         │
│  └──────────────┘         └────────┬─────────┘         │
│                                    │                    │
│                                    ▼                    │
│                          ┌──────────────────┐          │
│                          │   Web Browser    │          │
│                          │  (Chrome/Firefox) │          │
│                          │                  │          │
│                          │  ┌────────────┐  │          │
│                          │  │ Web Page   │  │          │
│                          │  │ (HTML/CSS/ │  │          │
│                          │  │  JavaScript)│  │          │
│                          │  └────────────┘  │          │
│                          └──────────────────┘          │
└─────────────────────────────────────────────────────────┘
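
To make the diagram more concrete, here is a rough sketch of the kind of HTTP requests that travel between the library and the browser driver under the W3C WebDriver protocol. The helper names (`navigate_request`, `find_element_request`) and the session id are hypothetical; the real library builds and sends these messages internally, while this sketch only constructs them as plain data.

```python
import json


def navigate_request(session_id: str, url: str) -> dict:
    """Describe the W3C 'Navigate To' command as plain data (nothing is sent)."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/url",
        "body": json.dumps({"url": url}),
    }


def find_element_request(session_id: str, using: str, value: str) -> dict:
    """Describe the W3C 'Find Element' command as plain data (nothing is sent)."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/element",
        "body": json.dumps({"using": using, "value": value}),
    }


if __name__ == "__main__":
    # "abc123" is a made-up session id; a real one is issued by the driver.
    print(navigate_request("abc123", "https://example.com"))
    print(find_element_request("abc123", "css selector", "h1"))
```

Calls such as driver.get(...) and driver.find_element(...) correspond to requests of this shape, which is why every Selenium command involves a round trip to the driver process.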

2. Installation and Setup

Terminal
# Install Selenium
pip install selenium

# Verify the installation
python -c "import selenium; print(selenium.__version__)"
# Output: 4.x.x

# Selenium 4.6+ ships with Selenium Manager,
# which downloads the matching browser driver automatically.
# There is no need to download ChromeDriver/GeckoDriver by hand.

# On older versions (<4.6), download the driver yourself:
# ChromeDriver: https://chromedriver.chromium.org/
# GeckoDriver (Firefox): https://github.com/mozilla/geckodriver

# Optional extra (mainly useful on versions without Selenium Manager)
pip install webdriver-manager  # Auto-manages driver versions
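
If a script has to run on machines with different Selenium versions, a small check can tell whether Selenium Manager is available. This is a minimal sketch: `has_selenium_manager` is a hypothetical helper, and it assumes a plain numeric version string such as "4.10.0".

```python
def has_selenium_manager(version: str) -> bool:
    """Return True if the given Selenium version string is 4.6 or newer."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    # Compare as numbers, not strings, so that 4.10 counts as newer than 4.6.
    return (major, minor) >= (4, 6)


if __name__ == "__main__":
    for candidate in ("3.141.0", "4.5.0", "4.6.0", "4.10.0"):
        print(candidate, has_selenium_manager(candidate))
```

In a real script the argument would typically be selenium.__version__.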

Quick Start

Python — Quick Start
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time

# Selenium 4.6+ manages the driver automatically
driver = webdriver.Chrome()  # Open Chrome
# Or: webdriver.Firefox()    # Open Firefox
# Or: webdriver.Edge()       # Open Edge

try:
    # Open a web page
    driver.get("https://www.google.com")

    # Print the page title
    print(driver.title)  # Google

    # Find an element and type text into it
    search_box = driver.find_element(By.NAME, "q")
    search_box.send_keys("Selenium Python tutorial")

    # Press Enter
    search_box.send_keys(Keys.RETURN)

    # Pause briefly, then print the URL
    # (a fixed sleep is a crude wait; section 6 covers proper waits)
    time.sleep(2)
    print(driver.current_url)

finally:
    # Always close the browser
    driver.quit()

3. Selenium WebDriver Basics

WebDriver is the main interface for controlling the browser. These are the basic operations you need to know.

Python — WebDriver Basics
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options

# ========================================
# Browser Options
# ========================================
options = Options()

# Headless mode (no UI)
# options.add_argument('--headless')

# Window size
options.add_argument('--window-size=1920,1080')

# Disable GPU (sometimes needed on servers)
options.add_argument('--disable-gpu')

# Custom user agent
options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64)')

# Disable notifications
options.add_argument('--disable-notifications')

# Disable extensions
options.add_argument('--disable-extensions')

# Indonesian language
options.add_argument('--lang=id')

# Incognito mode
options.add_argument('--incognito')

driver = webdriver.Chrome(options=options)

try:
    # ========================================
    # Basic Navigation
    # ========================================
    driver.get("https://www.example.com")  # Open a URL

    print(driver.title)       # Page title
    print(driver.current_url) # Current URL
    print(driver.page_source) # HTML source of the page

    # History navigation
    driver.get("https://www.example.com/page1")
    driver.back()      # Go back to the previous page
    driver.forward()   # Go forward to the next page
    driver.refresh()   # Reload the page

    # ========================================
    # Window Management
    # ========================================
    # Get the window size and position
    print(driver.get_window_size())
    print(driver.get_window_position())

    # Set the size and position
    driver.set_window_size(1280, 720)
    driver.set_window_position(0, 0)
    driver.maximize_window()
    driver.minimize_window()
    driver.fullscreen_window()

    # ========================================
    # Tab/Window Management
    # ========================================
    # Remember the tab we started in
    original_tab = driver.current_window_handle

    # Open a new tab
    driver.execute_script("window.open('https://example.com', '_blank');")

    # Get all window handles (a list of opaque ID strings)
    print(driver.window_handles)

    # Switch to the new tab
    new_tab = [h for h in driver.window_handles if h != original_tab][0]
    driver.switch_to.window(new_tab)

    # Close the current tab only (this is not quit)
    driver.close()

    # Switch back to the first tab; after close() the driver
    # has no active window until you switch explicitly
    driver.switch_to.window(original_tab)

    # ========================================
    # Frames and Alerts
    # (illustrative: these calls need a page that actually
    # contains a frame or has an alert open)
    # ========================================
    # Switch to a frame: pick ONE of these alternatives
    driver.switch_to.frame("frame_name")     # by name or id
    # driver.switch_to.frame(0)              # by index
    # iframe = driver.find_element(By.TAG_NAME, "iframe")
    # driver.switch_to.frame(iframe)         # by element

    # Return to the main content
    driver.switch_to.default_content()

    # Handle an alert
    alert = driver.switch_to.alert
    print(alert.text)     # Alert text
    alert.accept()        # Click OK
    # alert.dismiss()     # Click Cancel

    # ========================================
    # Screenshots
    # ========================================
    driver.save_screenshot("screenshot.png")  # Current viewport only, not the full page

    # Screenshot of a specific element
    element = driver.find_element(By.ID, "main-content")
    element.screenshot("element.png")

    # Screenshot as bytes
    png_data = driver.get_screenshot_as_png()

    # ========================================
    # Execute JavaScript
    # ========================================
    # Run JS (this opens an alert, so accept it before continuing)
    driver.execute_script("alert('Hello from Selenium!');")
    driver.switch_to.alert.accept()

    # Return a value from JS
    title = driver.execute_script("return document.title;")

    # Scroll the page
    driver.execute_script("window.scrollTo(0, 500);")  # Scroll to 500px from the top
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")  # Bottom

    # Scroll to an element
    element = driver.find_element(By.ID, "target")
    driver.execute_script("arguments[0].scrollIntoView();", element)

finally:
    driver.quit()  # Close the browser and all tabs

4. Element Selectors

Selectors are used to locate HTML elements on a web page. Selenium supports several kinds of selector.

Selector Types

Selector | By | Example HTML | Python Code
ID | By.ID | <div id="main"> | find_element(By.ID, "main")
Name | By.NAME | <input name="email"> | find_element(By.NAME, "email")
Class | By.CLASS_NAME | <div class="card"> | find_element(By.CLASS_NAME, "card")
Tag | By.TAG_NAME | <h1>...</h1> | find_element(By.TAG_NAME, "h1")
CSS Selector | By.CSS_SELECTOR | - | find_element(By.CSS_SELECTOR, "div.card h2")
XPath | By.XPATH | - | find_element(By.XPATH, "//div[@id='main']")
Link Text | By.LINK_TEXT | <a>Klik</a> | find_element(By.LINK_TEXT, "Klik")
Partial Link | By.PARTIAL_LINK_TEXT | - | find_element(By.PARTIAL_LINK_TEXT, "Klik")
Python — Element Selectors
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()

try:
    driver.get("https://example.com")

    # ========================================
    # find_element: find ONE element
    # Returns a WebElement; raises NoSuchElementException if nothing matches
    # ========================================

    # By ID (fast and generally the recommended first choice)
    header = driver.find_element(By.ID, "header")

    # By Name
    email_input = driver.find_element(By.NAME, "email")

    # By Class Name (returns the first match only)
    card = driver.find_element(By.CLASS_NAME, "card")

    # By Tag Name
    h1 = driver.find_element(By.TAG_NAME, "h1")

    # By Link Text (the exact link text)
    link = driver.find_element(By.LINK_TEXT, "Klik di sini")

    # By Partial Link Text (part of the link text)
    link = driver.find_element(By.PARTIAL_LINK_TEXT, "Klik")

    # ========================================
    # CSS Selector: THE MOST FLEXIBLE
    # ========================================
    # By ID
    el = driver.find_element(By.CSS_SELECTOR, "#main-content")

    # By class
    el = driver.find_element(By.CSS_SELECTOR, ".card")

    # Tag with a class
    el = driver.find_element(By.CSS_SELECTOR, "div.card")

    # Nested selector
    el = driver.find_element(By.CSS_SELECTOR, "div.container h1.title")

    # Attribute selectors
    el = driver.find_element(By.CSS_SELECTOR, "[data-testid='submit']")
    el = driver.find_element(By.CSS_SELECTOR, "input[type='email']")
    el = driver.find_element(By.CSS_SELECTOR, "a[href*='example']")  # contains
    el = driver.find_element(By.CSS_SELECTOR, "input[name^='user']") # starts with
    el = driver.find_element(By.CSS_SELECTOR, "input[name$='name']") # ends with

    # Child selectors
    el = driver.find_element(By.CSS_SELECTOR, "ul > li:first-child")
    el = driver.find_element(By.CSS_SELECTOR, "div > p:nth-child(2)")

    # ========================================
    # XPath: THE MOST POWERFUL
    # ========================================
    # By attribute
    el = driver.find_element(By.XPATH, "//div[@id='main']")

    # Contains text
    el = driver.find_element(By.XPATH, "//button[contains(text(), 'Submit')]")

    # Contains attribute value
    el = driver.find_element(By.XPATH, "//div[contains(@class, 'card')]")

    # Ancestor-descendant
    el = driver.find_element(By.XPATH, "//div[@class='form']//input[@name='email']")

    # Following sibling
    el = driver.find_element(By.XPATH, "//label[@for='email']/following-sibling::input")

    # OR condition
    el = driver.find_element(By.XPATH, "//input[@name='email' or @name='username']")

    # Index (XPath indexes start at 1)
    el = driver.find_element(By.XPATH, "(//div[@class='item'])[3]")  # Third item

    # ========================================
    # find_elements: find MANY elements
    # Returns a list (possibly empty; never raises for "not found")
    # ========================================
    all_links = driver.find_elements(By.TAG_NAME, "a")
    print(f"Number of links: {len(all_links)}")

    for link in all_links:
        print(f"Text: {link.text}, Href: {link.get_attribute('href')}")

    all_cards = driver.find_elements(By.CSS_SELECTOR, ".card")

    # ========================================
    # Chaining: find an element inside another element
    # ========================================
    form = driver.find_element(By.ID, "registration-form")
    email_input = form.find_element(By.NAME, "email")  # Searches inside the form only
    submit_btn = form.find_element(By.CSS_SELECTOR, "button[type='submit']")

finally:
    driver.quit()
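
One practical wrinkle with the XPath "contains text" pattern: the search text has to be embedded as a quoted XPath string, which breaks when the text itself contains quotes. The sketch below shows one common workaround using XPath's concat() function. Both helper names are hypothetical, and the locator is returned as a plain ("xpath", ...) tuple, which has the same value as (By.XPATH, ...).

```python
def xpath_literal(text: str) -> str:
    """Return `text` as a valid XPath 1.0 string literal."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Text contains both quote types: split on apostrophes and
    # glue the pieces back together with concat().
    pieces = [f"'{piece}'" for piece in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def button_with_text(text: str) -> tuple:
    """Build a locator tuple for a <button> whose text contains `text`."""
    return ("xpath", f"//button[contains(text(), {xpath_literal(text)})]")


if __name__ == "__main__":
    print(button_with_text("Submit"))
    print(button_with_text("Don't save"))
```

The resulting tuple can be unpacked into a lookup, for example driver.find_element(*button_with_text("Don't save")).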

5. Interacting with Elements

Python — Element Interaction
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains

driver = webdriver.Chrome()

try:
    driver.get("https://example.com")

    # ========================================
    # Filling In Forms
    # ========================================
    # Text input
    username = driver.find_element(By.NAME, "username")
    username.clear()  # Remove any existing text
    username.send_keys("budi_santoso")  # Type text

    # Password
    password = driver.find_element(By.NAME, "password")
    password.send_keys("rahasia123")

    # Email
    email = driver.find_element(By.NAME, "email")
    email.send_keys("budi@mail.com")

    # Textarea
    bio = driver.find_element(By.NAME, "bio")
    bio.send_keys("I am a Python developer")

    # Keyboard shortcuts (on macOS use Keys.COMMAND instead of Keys.CONTROL)
    username.send_keys(Keys.CONTROL + "a")  # Select all
    username.send_keys(Keys.CONTROL + "c")  # Copy
    username.send_keys(Keys.TAB)            # Tab to the next element
    username.send_keys(Keys.ENTER)          # Press Enter
    username.send_keys(Keys.ESCAPE)         # Press Escape

    # ========================================
    # Clicking Elements
    # ========================================
    # Button
    submit = driver.find_element(By.ID, "submit-btn")
    submit.click()

    # Link
    link = driver.find_element(By.LINK_TEXT, "Selengkapnya")
    link.click()

    # Checkbox (click only if it is not already ticked)
    checkbox = driver.find_element(By.NAME, "agree")
    if not checkbox.is_selected():
        checkbox.click()

    # Radio button
    radio = driver.find_element(By.CSS_SELECTOR, "input[value='male']")
    radio.click()

    # ========================================
    # Dropdown / Select
    # ========================================
    from selenium.webdriver.support.ui import Select

    select_element = driver.find_element(By.ID, "provinsi")
    select = Select(select_element)

    # Select by value
    select.select_by_value("jawa-barat")

    # Select by visible text
    select.select_by_visible_text("Jawa Barat")

    # Select by index
    select.select_by_index(0)

    # Deselect (only valid for multi-select; raises NotImplementedError otherwise)
    if select.is_multiple:
        select.deselect_all()

    # Check the selected option
    print(select.first_selected_option.text)

    # All options
    for option in select.options:
        print(option.text)

    # ========================================
    # Reading Element Properties
    # ========================================
    element = driver.find_element(By.ID, "content")

    # Text
    print(element.text)           # Visible text

    # Attributes
    print(element.get_attribute("href"))
    print(element.get_attribute("class"))
    print(element.get_attribute("data-id"))
    print(element.get_attribute("value"))

    # Properties
    print(element.tag_name)       # HTML tag
    print(element.size)           # {'width': 300, 'height': 200}
    print(element.location)       # {'x': 50, 'y': 100}
    print(element.is_displayed()) # True/False
    print(element.is_enabled())   # True/False

    # CSS properties
    print(element.value_of_css_property("color"))
    print(element.value_of_css_property("font-size"))

    # ========================================
    # Drag and Drop
    # ========================================
    source = driver.find_element(By.ID, "draggable")
    target = driver.find_element(By.ID, "droppable")

    actions = ActionChains(driver)
    actions.drag_and_drop(source, target).perform()

    # ========================================
    # Mouse Actions
    # ========================================
    link = driver.find_element(By.ID, "menu-item")
    actions = ActionChains(driver)

    # Hover
    actions.move_to_element(link).perform()

    # Right click
    actions.context_click(link).perform()

    # Double click
    actions.double_click(link).perform()

    # ========================================
    # File Upload
    # ========================================
    upload_input = driver.find_element(By.ID, "file-upload")
    upload_input.send_keys("/path/to/file.pdf")
    # The file path must be an absolute path!
finally:
    driver.quit()
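
Because the upload path must be absolute, it can help to resolve and validate it before handing it to the file input. The following is a minimal sketch; `upload_file` is a hypothetical helper that accepts any object with a send_keys() method, such as the WebElement for an <input type="file">.

```python
from pathlib import Path


def upload_file(upload_input, path):
    """Send the absolute form of `path` to a file input element.

    Raises FileNotFoundError early instead of letting the browser
    reject a path that does not exist.
    """
    absolute = Path(path).expanduser().resolve()
    if not absolute.is_file():
        raise FileNotFoundError(f"No such file: {absolute}")
    upload_input.send_keys(str(absolute))
    return absolute
```

With the example above, the call would look like upload_file(upload_input, "reports/file.pdf"), where the relative path is only an illustration.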

6. Waits (Explicit & Implicit)

Waits are essential in Selenium because web pages load elements asynchronously. Without appropriate waits, a script tends to look for an element before it exists and fails with NoSuchElementException.

Python — Waits
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

driver = webdriver.Chrome()

try:
    driver.get("https://example.com")

    # ========================================
    # 1. Implicit Wait: set ONCE, applies to every lookup
    # Selenium keeps retrying the lookup for up to this long
    # ========================================
    driver.implicitly_wait(10)  # Wait at most 10 seconds

    # Every find_element call now waits up to 10 seconds
    element = driver.find_element(By.ID, "dynamic-content")  # Waits automatically

    # ========================================
    # 2. Explicit Wait: MORE PRECISE (RECOMMENDED)
    # Wait until a specific condition is met
    # ========================================
    # Turn the implicit wait off first so the two mechanisms do not mix
    driver.implicitly_wait(0)
    wait = WebDriverWait(driver, 10)  # 10-second timeout

    # Wait until the element is clickable
    button = wait.until(EC.element_to_be_clickable((By.ID, "submit")))
    button.click()

    # Wait until the element is present in the DOM
    element = wait.until(EC.presence_of_element_located((By.ID, "content")))

    # Wait until the element is visible
    element = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, "modal")))

    # Wait until the element disappears
    wait.until(EC.invisibility_of_element_located((By.CLASS_NAME, "loading")))

    # Wait until the text appears in the element
    wait.until(EC.text_to_be_present_in_element((By.ID, "status"), "Selesai"))

    # Wait until the URL changes
    wait.until(EC.url_contains("/dashboard"))

    # Wait until the title changes
    wait.until(EC.title_contains("Dashboard"))

    # Wait until an alert appears
    alert = wait.until(EC.alert_is_present())

    # Wait until enough elements exist
    wait.until(lambda d: len(d.find_elements(By.CLASS_NAME, "item")) >= 5)

    # ========================================
    # 3. Custom Wait Conditions
    # ========================================
    def page_fully_loaded(driver):
        """True once the document has finished loading."""
        return driver.execute_script("return document.readyState") == "complete"

    wait.until(page_fully_loaded)

    # Wait until AJAX requests finish (jQuery sites only)
    def ajax_complete(driver):
        # Pages without jQuery count as "complete" instead of raising a JS error
        return driver.execute_script("return !window.jQuery || jQuery.active === 0")

    wait.until(ajax_complete)

    # Wait until the element has some text
    def element_has_text(driver):
        el = driver.find_element(By.ID, "result")
        return el.text.strip() != ""

    wait.until(element_has_text)

    # ========================================
    # 4. Handling Timeouts
    # ========================================
    try:
        element = WebDriverWait(driver, 5).until(
            EC.presence_of_element_located((By.ID, "not-exist"))
        )
    except TimeoutException:
        print("Element not found within 5 seconds!")

    # Custom poll frequency (default is 0.5 seconds)
    wait = WebDriverWait(driver, 10, poll_frequency=1)

finally:
    driver.quit()

# ========================================
# RECOMMENDATIONS:
# - Use Implicit Wait ONLY if all elements have similar load times
# - Use Explicit Wait for more precise control
# - Do NOT combine the two (their timeouts can conflict)
# - Always handle TimeoutException with try/except
# ========================================
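
To show what an explicit wait does under the hood, here is a simplified, Selenium-free sketch of the polling loop. It is not the library's actual implementation: `wait_until` and `WaitTimeout` are hypothetical names, and LookupError is used here as a stand-in for "element not found yet" (the real WebDriverWait ignores NoSuchElementException by default and polls every 0.5 seconds).

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy in time."""


def wait_until(condition, timeout=10.0, poll_frequency=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition()
            if value:
                return value
        except LookupError:
            # Assumption for this sketch: "not found yet" is retried, not fatal.
            pass
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"Condition not met within {timeout} seconds")
        time.sleep(poll_frequency)
```

The key design point is that the wait returns as soon as the condition holds, so a generous timeout costs nothing when the page is fast, unlike a fixed time.sleep().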

7. Headless Mode

Headless mode runs the browser without a UI (no visible window). It is especially useful on servers, in CI/CD pipelines, and for automation that needs no visual output.

Python — Headless Mode
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

# ========================================
# Chrome Headless
# ========================================
options = Options()

# New headless mode (Chrome 109+)
options.add_argument('--headless=new')

# Legacy flag (before Chrome 109)
# options.add_argument('--headless')

# Window size (important in headless mode!)
options.add_argument('--window-size=1920,1080')

# Extras for servers
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')

driver = webdriver.Chrome(options=options)

try:
    driver.get("https://example.com")
    print(f"Title: {driver.title}")

    # Screenshots still work
    driver.save_screenshot("headless_screenshot.png")

    # Elements can still be located and read
    heading = driver.find_element(By.TAG_NAME, "h1")
    print(f"Heading: {heading.text}")

    # JavaScript can still be executed
    height = driver.execute_script("return document.body.scrollHeight")
    print(f"Page height: {height}px")

finally:
    driver.quit()

# ========================================
# Firefox Headless
# ========================================
options = webdriver.FirefoxOptions()
options.add_argument('--headless')
driver = webdriver.Firefox(options=options)
try:
    driver.get("https://example.com")
    print(f"Title: {driver.title}")
finally:
    driver.quit()

# ========================================
# Mode Comparison
# ========================================
# +------------------+----------+-----------+
# | Feature          | Normal   | Headless  |
# +------------------+----------+-----------+
# | Browser UI       | ✅ Yes   | ❌ No     |
# | Speed            | 🟡 Medium| 🟢 Fast   |
# | Resource usage   | 🔴 High  | 🟢 Low    |
# | Screenshot       | ✅ Yes   | ✅ Yes    |
# | JavaScript       | ✅ Yes   | ✅ Yes    |
# | Server/CI        | ❌ Hard  | ✅ Easy   |
# | Debugging        | 🟢 Easy  | 🔴 Hard   |
# +------------------+----------+-----------+
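
Since debugging is harder in headless mode, one common arrangement is to make headless the default and allow it to be switched off from the environment. The sketch below only builds the list of Chrome flags; `chrome_arguments` and the `HEADLESS` variable name are assumptions made for this example, not Selenium features.

```python
import os


def chrome_arguments(env=None):
    """Build Chrome command-line flags from environment settings.

    HEADLESS unset or "1" means headless; HEADLESS=0 shows the browser window.
    """
    env = os.environ if env is None else env
    args = ["--window-size=1920,1080"]
    if env.get("HEADLESS", "1") != "0":
        args.append("--headless=new")
    return args


if __name__ == "__main__":
    print(chrome_arguments({}))                 # default: headless
    print(chrome_arguments({"HEADLESS": "0"}))  # visible window for debugging
```

Each returned flag would then be passed to options.add_argument(...) before creating the driver.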

8. Web Scraping with Selenium

Selenium is well suited to scraping websites that render their content with JavaScript (SPAs, dynamic content).

Python — Web Scraping
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import json
import time

def setup_driver(headless=True):
    """Set up a driver configured for scraping"""
    options = Options()
    if headless:
        options.add_argument('--headless=new')
    options.add_argument('--window-size=1920,1080')
    options.add_argument('--no-sandbox')
    options.add_argument('--disable-dev-shm-usage')
    options.add_argument('--disable-blink-features=AutomationControlled')
    options.add_experimental_option("excludeSwitches", ["enable-automation"])

    # Disable images for better performance
    prefs = {"profile.managed_default_content_settings.images": 2}
    options.add_experimental_option("prefs", prefs)

    return webdriver.Chrome(options=options)

def scrape_products(url, max_pages=3):
    """Example: scrape products from an e-commerce site"""
    driver = setup_driver()
    all_products = []

    try:
        for page in range(1, max_pages + 1):
            print(f"Scraping page {page}...")
            driver.get(f"{url}?page={page}")

            # Wait for the products to appear
            WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.CLASS_NAME, "product-card"))
            )

            # Scroll to load every product (lazy loading)
            last_height = driver.execute_script(
                "return document.body.scrollHeight"
            )
            while True:
                driver.execute_script(
                    "window.scrollTo(0, document.body.scrollHeight);"
                )
                time.sleep(1)
                new_height = driver.execute_script(
                    "return document.body.scrollHeight"
                )
                if new_height == last_height:
                    break  # Height stopped growing: nothing more to load
                last_height = new_height

            # Collect all products
            products = driver.find_elements(By.CLASS_NAME, "product-card")
            for product in products:
                try:
                    name = product.find_element(By.CLASS_NAME, "name").text
                    price = product.find_element(By.CLASS_NAME, "price").text
                    rating = product.find_element(By.CLASS_NAME, "rating").text

                    all_products.append({
                        'name': name,
                        'price': price,
                        'rating': rating,
                        'page': page
                    })
                except Exception:
                    continue  # Skip cards with missing fields

            print(f"  Found {len(products)} products")

    finally:
        driver.quit()

    return all_products

def scrape_infinite_scroll(url, max_scrolls=10):
    """Example: scrape a page that uses infinite scroll"""
    driver = setup_driver()
    items = []

    try:
        driver.get(url)
        wait = WebDriverWait(driver, 10)

        for i in range(max_scrolls):
            # Scroll to the bottom
            driver.execute_script(
                "window.scrollTo(0, document.body.scrollHeight);"
            )

            # Give new content time to appear
            time.sleep(2)

            # Check whether there is a "Load More" button
            try:
                load_more = driver.find_element(
                    By.CSS_SELECTOR, "button.load-more"
                )
                load_more.click()
                time.sleep(1)
            except Exception:
                pass  # No "Load More" button

        # Collect all items once scrolling is done
        elements = driver.find_elements(By.CSS_SELECTOR, ".item")
        for el in elements:
            items.append({
                'text': el.text,
                'link': el.find_element(By.TAG_NAME, "a").get_attribute("href")
            })

    finally:
        driver.quit()

    return items

# Example usage
# products = scrape_products("https://example.com/products")
# with open('products.json', 'w') as f:
#     json.dump(products, f, indent=2)
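
The lazy-loading loop in scrape_products runs until the page height stops changing, which may never happen on a feed that loads forever. A bounded variant is sketched below. `scroll_to_bottom` is a hypothetical helper; it only assumes the driver has an execute_script() method, and the `max_rounds` default is an arbitrary choice.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll until the page height stops growing or max_rounds is reached.

    Returns the number of scrolls performed.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # Give lazy-loaded content time to arrive
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            return round_number  # Height is stable: nothing more loaded
        last_height = new_height
    return max_rounds  # Gave up: the page kept growing
```

The cap trades completeness for a guaranteed finish, so on very long feeds it may stop before all items have loaded.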

9. Advanced Techniques

Python — Advanced Selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# ========================================
# 1. Page Object Model (Best Practice untuk Testing)
# ========================================
class LoginPage:
    """Page Object for the login page"""

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    # Locators
    USERNAME_INPUT = (By.ID, "username")
    PASSWORD_INPUT = (By.NAME, "password")
    SUBMIT_BUTTON = (By.CSS_SELECTOR, "button[type='submit']")
    ERROR_MESSAGE = (By.CLASS_NAME, "error-msg")

    def open(self, url):
        self.driver.get(url)
        return self

    def enter_username(self, username):
        field = self.wait.until(
            EC.presence_of_element_located(self.USERNAME_INPUT)
        )
        field.clear()
        field.send_keys(username)
        return self

    def enter_password(self, password):
        field = self.driver.find_element(*self.PASSWORD_INPUT)
        field.clear()
        field.send_keys(password)
        return self

    def click_submit(self):
        self.driver.find_element(*self.SUBMIT_BUTTON).click()
        return self

    def login(self, username, password):
        self.enter_username(username)
        self.enter_password(password)
        self.click_submit()
        return self

    def get_error_message(self):
        try:
            el = self.wait.until(
                EC.visibility_of_element_located(self.ERROR_MESSAGE)
            )
            return el.text
        except Exception:
            return None

    def is_logged_in(self):
        return "/dashboard" in self.driver.current_url

# Using the Page Object
driver = webdriver.Chrome()
try:
    login_page = LoginPage(driver)
    login_page.open("https://example.com/login")
    login_page.login("budi", "rahasia123")

    if login_page.is_logged_in():
        print("Login succeeded!")
    else:
        error = login_page.get_error_message()
        print(f"Login failed: {error}")
finally:
    driver.quit()

# ========================================
# 2. Network Logging (Chrome Performance Log)
# ========================================
import json  # Needed to decode the performance log entries

options = webdriver.ChromeOptions()
options.set_capability('goog:loggingPrefs', {'performance': 'ALL'})

driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example.com")

    # Collect all network requests
    logs = driver.get_log('performance')
    for log in logs:
        message = json.loads(log['message'])['message']
        if message['method'] == 'Network.requestWillBeSent':
            url = message['params']['request']['url']
            print(f"Request: {url}")
finally:
    driver.quit()

# ========================================
# 3. Cookie Management
# ========================================
driver = webdriver.Chrome()
try:
    driver.get("https://example.com")

    # Add a cookie (the page for that domain must be open first)
    driver.add_cookie({
        'name': 'session',
        'value': 'abc123',
        'domain': '.example.com',
        'path': '/',
        'secure': True
    })

    # Read cookies
    cookies = driver.get_cookies()
    for cookie in cookies:
        print(f"{cookie['name']}: {cookie['value']}")

    # Delete cookies
    driver.delete_cookie("session")
    driver.delete_all_cookies()

finally:
    driver.quit()

# ========================================
# 4. Handling Multiple Windows
# ========================================
driver = webdriver.Chrome()
try:
    driver.get("https://example.com")
    main_window = driver.current_window_handle

    # Click a link that opens a new tab
    driver.find_element(By.LINK_TEXT, "External Link").click()

    # Wait for the new window
    WebDriverWait(driver, 10).until(
        lambda d: len(d.window_handles) > 1
    )

    # Switch to the new window
    for handle in driver.window_handles:
        if handle != main_window:
            driver.switch_to.window(handle)
            break

    print(f"New window title: {driver.title}")

    # Return to the main window
    driver.switch_to.window(main_window)

finally:
    driver.quit()
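
The loop that searches for a handle different from the main window can be factored into a small pure function, which also makes the "exactly one new window" assumption explicit. This is only a sketch; `find_new_handle` is a hypothetical helper that works on plain lists of handle strings.

```python
def find_new_handle(handles_before, handles_after):
    """Return the single handle that is in handles_after but not handles_before."""
    known = set(handles_before)
    new_handles = [h for h in handles_after if h not in known]
    if len(new_handles) != 1:
        raise LookupError(
            f"Expected exactly one new window, found {len(new_handles)}"
        )
    return new_handles[0]


if __name__ == "__main__":
    # Handle strings are opaque IDs; these values are made up.
    print(find_new_handle(["A1"], ["A1", "B2"]))
```

In a script, the "before" list would be captured from driver.window_handles just before the click, and the result passed to driver.switch_to.window(...).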
⚠️ Web Scraping Ethics

1. Always respect the website's robots.txt.
2. Do not scrape too fast; add a delay between requests.
3. Pay attention to the website's Terms of Service.
4. Avoid scraping personal data.
5. Use an official API if one is available.

Selenium should be used for scraping only when there is no other alternative.
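
The first two points can be partly automated with Python's standard library. The sketch below parses robots.txt rules from a string and yields only the permitted URLs, pausing between them. The function names, the "my-scraper" user agent, and the two-second delay are assumptions for illustration; in practice the rules would be fetched from the target site's own /robots.txt.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robot_rules(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file into a rules object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_urls(urls, rules, user_agent="my-scraper", delay=2.0):
    """Yield only URLs allowed by robots.txt, sleeping between them."""
    first = True
    for url in urls:
        if not rules.can_fetch(user_agent, url):
            continue  # Disallowed by robots.txt: skip it
        if not first:
            time.sleep(delay)  # Spread requests out
        first = False
        yield url


if __name__ == "__main__":
    rules = build_robot_rules("User-agent: *\nDisallow: /private/\n")
    pages = ["https://example.com/products", "https://example.com/private/data"]
    for page in polite_urls(pages, rules, delay=0):
        print("Allowed:", page)
```

Each yielded URL could then be passed to driver.get(...), which keeps the politeness logic separate from the browser code.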

10. Quiz: Test Your Understanding!

After reading the tutorial above, answer the following 5 questions to test your understanding of Selenium:

Question 1: What does driver.quit() do compared with driver.close()?

a) There is no difference
b) quit() closes the current tab, close() closes the browser
c) quit() closes the whole browser and all tabs, close() closes only the current tab
d) close() is safer than quit()

Question 2: What is the difference between Implicit Wait and Explicit Wait?

a) There is no difference; they do the same thing
b) Implicit Wait applies globally, Explicit Wait targets a specific condition on a particular element
c) Implicit Wait is faster than Explicit Wait
d) Explicit Wait can only be used in headless mode

Question 3: Which selector is the fastest for locating an element?

a) XPath
b) CSS Selector
c) By.ID
d) By.TAG_NAME

Question 4: What is the advantage of using Headless Mode?

a) It makes the website faster
b) It removes the need for JavaScript
c) It is faster, uses fewer resources, and can run on a server without a display
d) It removes anti-bot detection

Question 5: What is the Page Object Model in Selenium?

a) A model for calculating the size of a web page
b) A design pattern that separates locators and actions from test scripts
c) A library for generating web pages automatically
d) A Selenium feature for optimizing page loading