Python

Python Excel: openpyxl & pandas

Tutorial lengkap manipulasi Excel dengan Python — openpyxl, pandas, baca tulis file .xlsx, chart, formula, formatting, dan otomasi spreadsheet

1. Pengenalan Library Excel

Python memiliki beberapa library untuk memanipulasi file Excel (.xlsx, .xls). Dua yang paling populer adalah openpyxl untuk manipulasi detail dan pandas untuk analisis data. Dalam tutorial ini, kita akan membahas keduanya secara mendalam.

Perbandingan Library Excel Python

Library Format Keunggulan Kekurangan
openpyxl.xlsxManipulasi detail, chart, formula, formattingLebih lambat untuk file besar
pandas.xlsx, .xls, .csvAnalisis data, filter, groupby, pivotKurang untuk formatting detail
xlrd.xlsBaca format lama Excel 97-2003Tidak support .xlsx
xlsxwriter.xlsxChart lebih kaya, conditional formattingHanya bisa menulis, tidak bisa baca
xlwings.xlsxKontrol Excel langsung, VBA replacementButuh Excel terinstall

Instalasi

# Instal openpyxl
pip install openpyxl

# Instal pandas (termasuk openpyxl sebagai dependency)
pip install pandas openpyxl

# Instal xlswriter (opsional, untuk chart lebih kaya)
pip install xlsxwriter

# Verifikasi
python -c "import openpyxl; print(openpyxl.__version__)"
python -c "import pandas; print(pandas.__version__)"
Diagram: Kapan Menggunakan Library Mana
ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”
│              PILIH LIBRARY YANG TEPAT                           │
│                                                                 │
│  ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”                                               │
│  │  Kebutuhan   │                                               │
│  │  Anda?       │                                               │
│  ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜                                               │
│         │                                                       │
│    ā”Œā”€ā”€ā”€ā”€ā”“ā”€ā”€ā”€ā”€ā”                                                  │
│    ā–¼         ā–¼                                                  │
│  ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”  │
│  │ Baca/  │ │ Manipulasi     │ │ Analisis Data              │  │
│  │ Tulis  │ │ Detail (cell,  │ │ (filter, groupby, pivot)   │  │
│  │ Data   │ │ chart, format) │ │                            │  │
│  ā””ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”˜ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜  │
│      │              │                     │                     │
│      ā–¼              ā–¼                     ā–¼                     │
│  ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”    ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”         ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”               │
│  │ pandas │    │ openpyxl │         │  pandas  │               │
│  ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜    ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜         ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜               │
ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜

2. openpyxl: Dasar-Dasar

openpyxl adalah library untuk membaca, menulis, dan memanipulasi file Excel 2010+ (.xlsx). Library ini mendukung manipulasi cell, row, column, sheet, chart, formula, dan formatting.

Konsep Dasar

"""Konsep dasar openpyxl — workbook, sheet, cell."""
from openpyxl import Workbook, load_workbook

# === Membuat Workbook Baru ===
wb = Workbook()  # Workbook = 1 file Excel

# Mendapatkan sheet aktif (default: "Sheet")
ws = wb.active
ws.title = "Data Penjualan"

# === Mengakses Cell ===
# Cara 1: Menggunakan koordinat
cell = ws["A1"]
cell.value = "Nama Produk"
print(f"A1 = {cell.value}")  # A1 = Nama Produk

# Cara 2: Menggunakan row dan column (1-based)
ws.cell(row=2, column=1, value="Laptop ASUS")
ws.cell(row=3, column=1, value="Keyboard Logitech")

# === Menambahkan Sheet Baru ===
ws2 = wb.create_sheet("Rekap")
ws3 = wb.create_sheet("Grafik", 0)  # Insert di posisi 0

# === Mengakses Sheet ===
print(f"Semua sheet: {wb.sheetnames}")
# ['Grafik', 'Data Penjualan', 'Rekap']

# Akses sheet berdasarkan nama
sheet = wb["Data Penjualan"]

# === Menyimpan File ===
wb.save("contoh.xlsx")
print("āœ… File disimpan: contoh.xlsx")

Properti Cell

"""Properti cell di openpyxl."""
from openpyxl import Workbook
from openpyxl.utils import get_column_letter

wb = Workbook()
ws = wb.active

# Isi data
ws["A1"] = "Hello"
ws["B1"] = "World"
ws["A2"] = 42
ws["B2"] = 3.14
ws["A3"] = True

# Properti cell
cell = ws["A1"]
print(f"Value    : {cell.value}")
print(f"Row      : {cell.row}")
print(f"Column   : {cell.column}")
print(f"Coordinate: {cell.coordinate}")

# Konversi kolom angka ke huruf
print(f"Column letter: {get_column_letter(1)}")  # A
print(f"Column letter: {get_column_letter(26)}")  # Z
print(f"Column letter: {get_column_letter(27)}")  # AA

# Iterasi rows dan columns
for row in ws.iter_rows(min_row=1, max_row=3, min_col=1, max_col=2):
    for cell in row:
        print(f"  {cell.coordinate}: {cell.value} ({type(cell.value).__name__})")

# Iterasi semua rows
print("\nAll rows:")
for row in ws.iter_rows(values_only=True):
    print(row)

# Dimensi sheet
print(f"\nMin row: {ws.min_row}, Max row: {ws.max_row}")
print(f"Min col: {ws.min_column}, Max col: {ws.max_column}")

Menulis Data Sederhana

"""Menulis berbagai tipe data ke Excel."""
from openpyxl import Workbook
from datetime import datetime, date

wb = Workbook()
ws = wb.active
ws.title = "Laporan"

# === Header ===
headers = ["ID", "Nama", "Departemen", "Gaji", "Tanggal Masuk", "Aktif"]
for col, header in enumerate(headers, 1):
    ws.cell(row=1, column=col, value=header)

# === Data ===
employees = [
    [1, "Budi Santoso", "Engineering", 15000000, datetime(2023, 1, 15), True],
    [2, "Ani Wijaya", "Marketing", 12000000, datetime(2023, 3, 20), True],
    [3, "Citra Dewi", "Engineering", 16000000, datetime(2022, 11, 5), True],
    [4, "Deni Kurniawan", "Finance", 13500000, datetime(2024, 1, 10), False],
    [5, "Eka Putri", "HR", 11000000, datetime(2024, 5, 1), True],
    [6, "Fajar Ramadhan", "Engineering", 14500000, datetime(2023, 8, 25), True],
    [7, "Gita Sari", "Marketing", 12500000, datetime(2022, 6, 30), True],
    [8, "Hadi Prasetyo", "Finance", 14000000, datetime(2023, 12, 15), True],
]

for row_idx, emp in enumerate(employees, 2):
    for col_idx, value in enumerate(emp, 1):
        ws.cell(row=row_idx, column=col_idx, value=value)

# Set format tanggal
from openpyxl.styles import numbers
for row in range(2, len(employees) + 2):
    ws.cell(row=row, column=5).number_format = "DD-MMM-YYYY"
    ws.cell(row=row, column=4).number_format = '#,##0'

wb.save("employees.xlsx")
print("āœ… File employees.xlsx berhasil dibuat")

Menulis dari List of Dict

"""Menulis data dari list of dictionary ke Excel."""
from openpyxl import Workbook


def write_dicts_to_excel(data: list[dict], filename: str, sheet_name: str = "Data"):
    """Menulis list of dict ke Excel, keys menjadi header."""
    wb = Workbook()
    ws = wb.active
    ws.title = sheet_name

    if not data:
        wb.save(filename)
        return

    # Tulis header dari keys
    headers = list(data[0].keys())
    for col, header in enumerate(headers, 1):
        ws.cell(row=1, column=col, value=header)

    # Tulis data
    for row_idx, item in enumerate(data, 2):
        for col_idx, key in enumerate(headers, 1):
            ws.cell(row=row_idx, column=col_idx, value=item.get(key))

    # Auto-adjust column width
    for col in range(1, len(headers) + 1):
        max_length = max(
            len(str(ws.cell(row=r, column=col).value or ""))
            for r in range(1, len(data) + 2)
        )
        ws.column_dimensions[chr(64 + col)].width = min(max_length + 4, 50)

    wb.save(filename)
    print(f"āœ… {len(data)} baris ditulis ke {filename}")


# Contoh penggunaan
products = [
    {"SKU": "LP001", "Nama": "Laptop ASUS ROG", "Harga": 25000000, "Stok": 15},
    {"SKU": "KB002", "Nama": "Keyboard Mechanical", "Harga": 850000, "Stok": 200},
    {"SKU": "MS003", "Nama": "Mouse Logitech MX", "Harga": 1200000, "Stok": 150},
    {"SKU": "MN004", "Nama": "Monitor LG 27\"", "Harga": 4500000, "Stok": 50},
    {"SKU": "HD005", "Nama": "Headset Sony WH-1000", "Harga": 5000000, "Stok": 75},
]

write_dicts_to_excel(products, "products.xlsx", "Katalog")

Append Rows

"""Menambahkan baris baru dengan ws.append() — cara cepat."""
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Header
ws.append(["Produk", "Kategori", "Harga", "Stok"])

# Data — satu per satu
ws.append(["Laptop", "Elektronik", 15000000, 10])
ws.append(["Mouse", "Aksesoris", 250000, 200])
ws.append(["Monitor", "Elektronik", 3500000, 25])

# Data — batch dari list
batch_data = [
    ["Keyboard", "Aksesoris", 500000, 150],
    ["Printer", "Elektronik", 2000000, 30],
    ["Webcam", "Aksesoris", 750000, 80],
]
for row in batch_data:
    ws.append(row)

wb.save("append_example.xlsx")
print(f"āœ… Total {ws.max_row} baris (termasuk header)")

4. Membaca Data dari Excel

Membaca Workbook

"""Membaca data dari file Excel."""
from openpyxl import load_workbook

# Load workbook (data_only=True membaca nilai hasil formula, bukan formula-nya)
wb = load_workbook("employees.xlsx", data_only=True)

# Info workbook
print(f"Sheet names: {wb.sheetnames}")
print(f"Active sheet: {wb.active.title}")

# Akses sheet
ws = wb["Laporan"]

# Membaca cell individual
print(f"A1: {ws['A1'].value}")
print(f"B2: {ws['B2'].value}")

# Membaca semua data
print("\n=== Semua Data ===")
for row in ws.iter_rows(min_row=1, max_row=ws.max_row, values_only=True):
    print(row)

# Membaca dengan header
print("\n=== Data per Row ===")
headers = [cell.value for cell in ws[1]]  # Row 1 = header
for row in ws.iter_rows(min_row=2, values_only=True):
    row_dict = dict(zip(headers, row))
    print(row_dict)

Filter dan Proses Data

"""Filter dan proses data dari Excel."""
from openpyxl import load_workbook

wb = load_workbook("employees.xlsx", data_only=True)
ws = wb["Laporan"]

# === Filter: Karyawan Engineering ===
print("=== Karyawan Engineering ===")
headers = [cell.value for cell in ws[1]]
for row in ws.iter_rows(min_row=2, values_only=True):
    data = dict(zip(headers, row))
    if data.get("Departemen") == "Engineering":
        print(f"  {data['Nama']} - Rp{data['Gaji']:,.0f}")

# === Hitung rata-rata gaji ===
salaries = []
for row in ws.iter_rows(min_row=2, values_only=True):
    gaji = row[3]  # Kolom D = Gaji
    if gaji and isinstance(gaji, (int, float)):
        salaries.append(gaji)

avg_salary = sum(salaries) / len(salaries)
print(f"\nšŸ“Š Rata-rata gaji: Rp{avg_salary:,.0f}")
print(f"šŸ“Š Gaji tertinggi: Rp{max(salaries):,.0f}")
print(f"šŸ“Š Gaji terendah : Rp{min(salaries):,.0f}")

# === Cari karyawan tertentu ===
def find_employee(ws, name: str):
    """Mencari karyawan berdasarkan nama."""
    headers = [cell.value for cell in ws[1]]
    for row in ws.iter_rows(min_row=2, values_only=True):
        data = dict(zip(headers, row))
        if name.lower() in str(data.get("Nama", "")).lower():
            return data
    return None

result = find_employee(ws, "budi")
if result:
    print(f"\nšŸ” Ditemukan: {result['Nama']} - {result['Departemen']}")

Membaca Multiple Sheets

"""Membaca data dari semua sheet."""
from openpyxl import load_workbook

wb = load_workbook("employees.xlsx", data_only=True)

all_data = {}
for sheet_name in wb.sheetnames:
    ws = wb[sheet_name]
    sheet_data = []
    headers = None

    for row_idx, row in enumerate(ws.iter_rows(values_only=True)):
        if row_idx == 0:
            headers = list(row)
        else:
            if headers:
                sheet_data.append(dict(zip(headers, row)))

    all_data[sheet_name] = sheet_data
    print(f"šŸ“„ {sheet_name}: {len(sheet_data)} baris")

# Akses data dari sheet tertentu
for item in all_data.get("Laporan", []):
    print(f"  {item}")

5. Formatting & Styling

openpyxl mendukung formatting lengkap seperti font, alignment, border, fill, dan number format untuk membuat spreadsheet yang profesional.

"""Formatting dan styling Excel yang profesional."""
from openpyxl import Workbook
from openpyxl.styles import (
    Font, PatternFill, Alignment, Border, Side, NamedStyle
)
from openpyxl.utils import get_column_letter

wb = Workbook()
ws = wb.active
ws.title = "Laporan Penjualan"

# === Data ===
headers = ["No", "Produk", "Kategori", "Harga", "Qty", "Total"]
data = [
    [1, "Laptop ASUS ROG", "Elektronik", 25000000, 3, 75000000],
    [2, "Keyboard Mech", "Aksesoris", 850000, 20, 17000000],
    [3, "Monitor LG 27\"", "Elektronik", 4500000, 8, 36000000],
    [4, "Mouse Logitech", "Aksesoris", 1200000, 15, 18000000],
    [5, "Printer Canon", "Elektronik", 2800000, 5, 14000000],
]

# === Styles ===
# Header style
header_font = Font(name="Calibri", size=12, bold=True, color="FFFFFF")
header_fill = PatternFill(start_color="2F5496", end_color="2F5496", fill_type="solid")
header_alignment = Alignment(horizontal="center", vertical="center")

# Data style
data_font = Font(name="Calibri", size=11)
currency_format = '#,##0'
number_alignment = Alignment(horizontal="right")

# Border
thin_border = Border(
    left=Side(style="thin"),
    right=Side(style="thin"),
    top=Side(style="thin"),
    bottom=Side(style="thin"),
)

# === Title Row ===
ws.merge_cells("A1:F1")
title_cell = ws["A1"]
title_cell.value = "LAPORAN PENJUALAN Q1 2026"
title_cell.font = Font(name="Calibri", size=16, bold=True, color="2F5496")
title_cell.alignment = Alignment(horizontal="center", vertical="center")
ws.row_dimensions[1].height = 40

# === Subtitle ===
ws.merge_cells("A2:F2")
ws["A2"].value = "Periode: Januari - Maret 2026"
ws["A2"].font = Font(name="Calibri", size=11, italic=True, color="666666")
ws["A2"].alignment = Alignment(horizontal="center")

# === Headers (Row 3) ===
for col, header in enumerate(headers, 1):
    cell = ws.cell(row=3, column=col, value=header)
    cell.font = header_font
    cell.fill = header_fill
    cell.alignment = header_alignment
    cell.border = thin_border

# === Data (Row 4+) ===
# Alternating row colors
even_fill = PatternFill(start_color="D6E4F0", end_color="D6E4F0", fill_type="solid")
odd_fill = PatternFill(start_color="FFFFFF", end_color="FFFFFF", fill_type="solid")

for row_idx, row_data in enumerate(data, 4):
    fill = even_fill if (row_idx - 4) % 2 == 0 else odd_fill
    for col_idx, value in enumerate(row_data, 1):
        cell = ws.cell(row=row_idx, column=col_idx, value=value)
        cell.font = data_font
        cell.border = thin_border
        cell.fill = fill

        # Format kolom angka
        if col_idx in [4, 6]:  # Harga, Total
            cell.number_format = currency_format
            cell.alignment = number_alignment
        elif col_idx == 5:  # Qty
            cell.alignment = number_alignment

# === Summary Row ===
summary_row = len(data) + 4
summary_fill = PatternFill(start_color="FFD700", end_color="FFD700", fill_type="solid")
summary_font = Font(name="Calibri", size=11, bold=True)

ws.cell(row=summary_row, column=1, value="").border = thin_border
ws.cell(row=summary_row, column=2, value="TOTAL").font = summary_font
ws.cell(row=summary_row, column=2).border = thin_border
ws.cell(row=summary_row, column=2).fill = summary_fill
ws.cell(row=summary_row, column=3, value="").border = thin_border
ws.cell(row=summary_row, column=3).fill = summary_fill

# SUM formula
total_cell = ws.cell(row=summary_row, column=6)
total_cell.value = f"=SUM(F4:F{summary_row - 1})"
total_cell.font = summary_font
total_cell.number_format = currency_format
total_cell.border = thin_border
total_cell.fill = summary_fill
total_cell.alignment = number_alignment

for col in [4, 5]:
    ws.cell(row=summary_row, column=col).border = thin_border
    ws.cell(row=summary_row, column=col).fill = summary_fill

# === Column Widths ===
col_widths = {"A": 6, "B": 25, "C": 15, "D": 18, "E": 10, "F": 20}
for col_letter, width in col_widths.items():
    ws.column_dimensions[col_letter].width = width

# === Freeze Panes (header tetap terlihat saat scroll) ===
ws.freeze_panes = "A4"

# === Auto Filter ===
ws.auto_filter.ref = f"A3:F{summary_row - 1}"

wb.save("formatted_report.xlsx")
print("āœ… Laporan terformat berhasil dibuat!")
šŸ’” Tips Formatting
  • Gunakan NamedStyle untuk style yang bisa digunakan ulang
  • freeze_panes sangat penting untuk spreadsheet dengan banyak data
  • auto_filter memudahkan user menyaring data di Excel
  • merge_cells untuk title/header yang lebar

6. Formula & Fungsi Excel

Anda bisa menulis formula Excel langsung dari Python. Formula akan dihitung saat file dibuka di Excel (atau saat menggunakan data_only=True saat membaca).

Formula yang Sering Digunakan

Kategori Formula Contoh
MatematikaSUM, AVERAGE, MIN, MAX, COUNT=SUM(A1:A10)
TeksCONCATENATE, LEFT, RIGHT, LEN=CONCATENATE(A1," ",B1)
KondisionalIF, IFS, SWITCH=IF(A1>100,"High","Low")
LookupVLOOKUP, INDEX, MATCH, XLOOKUP=VLOOKUP(A1,Sheet2!A:B,2,0)
TanggalTODAY, YEAR, MONTH, DATEDIF=YEAR(A1)
StatistikSTDEV, MEDIAN, PERCENTILE=STDEV(A1:A100)

7. Membuat Chart

openpyxl mendukung pembuatan berbagai jenis chart langsung di Excel, termasuk bar chart, line chart, pie chart, dan scatter chart.

"""Membuat berbagai chart di Excel."""
from openpyxl import Workbook
from openpyxl.chart import BarChart, LineChart, PieChart, Reference
from openpyxl.chart.series import DataPoint
from openpyxl.chart.label import DataLabelList
from openpyxl.styles import Font, Alignment

wb = Workbook()

# ==========================================
# Sheet 1: Bar Chart — Penjualan per Produk
# ==========================================
ws1 = wb.active
ws1.title = "Bar Chart"

# Data
ws1.append(["Produk", "Q1", "Q2", "Q3", "Q4"])
ws1.append(["Laptop", 150, 180, 200, 220])
ws1.append(["Desktop", 100, 90, 85, 80])
ws1.append(["Monitor", 80, 100, 120, 140])
ws1.append(["Keyboard", 200, 210, 190, 230])
ws1.append(["Mouse", 300, 320, 310, 350])

# Membuat Bar Chart
bar_chart = BarChart()
bar_chart.type = "col"  # "col" = vertikal, "bar" = horizontal
bar_chart.title = "Penjualan per Produk (Unit)"
bar_chart.y_axis.title = "Jumlah Unit"
bar_chart.x_axis.title = "Produk"
bar_chart.style = 10  # Style 1-48

# Data reference
categories = Reference(ws1, min_col=1, min_row=2, max_row=6)
data = Reference(ws1, min_col=2, max_col=5, min_row=1, max_row=6)

bar_chart.add_data(data, titles_from_data=True)
bar_chart.set_categories(categories)
bar_chart.shape = 4
bar_chart.width = 20
bar_chart.height = 12

ws1.add_chart(bar_chart, "A9")

# ==========================================
# Sheet 2: Line Chart — Trend Bulanan
# ==========================================
ws2 = wb.create_sheet("Line Chart")

ws2.append(["Bulan", "Pendapatan", "Pengeluaran", "Profit"])
revenue_data = [
    ["Jan", 50, 30, 20], ["Feb", 55, 32, 23], ["Mar", 48, 28, 20],
    ["Apr", 62, 35, 27], ["Mei", 70, 38, 32], ["Jun", 75, 40, 35],
    ["Jul", 68, 37, 31], ["Agu", 72, 39, 33], ["Sep", 80, 42, 38],
    ["Okt", 85, 45, 40], ["Nov", 90, 48, 42], ["Des", 95, 50, 45],
]
for row in revenue_data:
    ws2.append(row)

# Line Chart
line_chart = LineChart()
line_chart.title = "Trend Pendapatan & Pengeluaran 2026 (Jutaan Rp)"
line_chart.y_axis.title = "Jutaan Rupiah"
line_chart.x_axis.title = "Bulan"
line_chart.style = 10
line_chart.width = 22
line_chart.height = 14

cats = Reference(ws2, min_col=1, min_row=2, max_row=13)
values = Reference(ws2, min_col=2, max_col=4, min_row=1, max_row=13)

line_chart.add_data(values, titles_from_data=True)
line_chart.set_categories(cats)

# Style lines
line_chart.series[0].graphicalProperties.line.width = 28000  # Width in EMUs
line_chart.series[1].graphicalProperties.line.width = 28000
line_chart.series[2].graphicalProperties.line.width = 28000

ws2.add_chart(line_chart, "A16")

# ==========================================
# Sheet 3: Pie Chart — Market Share
# ==========================================
ws3 = wb.create_sheet("Pie Chart")

ws3.append(["Brand", "Market Share (%)"])
ws3.append(["ASUS", 28])
ws3.append(["Lenovo", 22])
ws3.append(["HP", 18])
ws3.append(["Dell", 15])
ws3.append(["Acer", 10])
ws3.append(["Lainnya", 7])

pie_chart = PieChart()
pie_chart.title = "Market Share Laptop Indonesia 2026"
pie_chart.style = 10
pie_chart.width = 16
pie_chart.height = 12

labels = Reference(ws3, min_col=1, min_row=2, max_row=7)
data_ref = Reference(ws3, min_col=2, min_row=1, max_row=7)

pie_chart.add_data(data_ref, titles_from_data=True)
pie_chart.set_categories(labels)

# Data labels
pie_chart.dataLabels = DataLabelList()
pie_chart.dataLabels.showPercent = True
pie_chart.dataLabels.showCatName = True
pie_chart.dataLabels.showVal = False

ws3.add_chart(pie_chart, "A10")

# ==========================================
# Sheet 4: Combo Chart — Bar + Line
# ==========================================
ws4 = wb.create_sheet("Combo Chart")

ws4.append(["Tahun", "Pendapatan (M)", "Growth (%)"])
ws4.append([2020, 100, None])
ws4.append([2021, 120, 20])
ws4.append([2022, 145, 20.8])
ws4.append([2023, 170, 17.2])
ws4.append([2024, 210, 23.5])
ws4.append([2025, 250, 19.0])
ws4.append([2026, 300, 20.0])

from openpyxl.chart import BarChart, LineChart

# Bar chart untuk pendapatan
combo_bar = BarChart()
combo_bar.type = "col"
combo_bar.title = "Pendapatan & Growth Tahunan"
combo_bar.y_axis.title = "Pendapatan (Miliar Rp)"
combo_bar.style = 10
combo_bar.width = 18
combo_bar.height = 12

bar_data = Reference(ws4, min_col=2, min_row=1, max_row=8)
bar_cats = Reference(ws4, min_col=1, min_row=2, max_row=8)
combo_bar.add_data(bar_data, titles_from_data=True)
combo_bar.set_categories(bar_cats)

# Line chart untuk growth
combo_line = LineChart()
line_data = Reference(ws4, min_col=3, min_row=1, max_row=8)
combo_line.add_data(line_data, titles_from_data=True)
combo_line.y_axis.title = "Growth (%)"
combo_line.y_axis.axId = 200

# Gabungkan
combo_bar.y_axis.crosses = "min"
combo_bar += combo_line

ws4.add_chart(combo_bar, "A11")

# Simpan
wb.save("charts_example.xlsx")
print("āœ… Contoh chart Excel berhasil dibuat!")

8. pandas: Read/Write Excel

pandas adalah library analisis data yang sangat powerful. pandas menggunakan openpyxl (atau xlrd untuk .xls) sebagai backend untuk membaca dan menulis file Excel.

Membaca Excel dengan pandas

"""Membaca file Excel menggunakan pandas."""
import pandas as pd

# === Membaca Sheet Default ===
df = pd.read_excel("employees.xlsx", sheet_name="Laporan")
print(f"Shape: {df.shape}")  # (baris, kolom)
print(f"Columns: {list(df.columns)}")
print(f"\n{df.head()}")

# === Membaca Sheet Spesifik ===
df = pd.read_excel("employees.xlsx", sheet_name="Laporan", usecols="A:D")
print(df.head())

# === Membaca dengan Filter Baris ===
# Hanya baca baris tertentu
df = pd.read_excel(
    "employees.xlsx",
    sheet_name="Laporan",
    skiprows=1,      # Skip 1 baris pertama
    nrows=5,         # Ambil 5 baris
)
print(df.head())

# === Membaca Semua Sheet ===
all_sheets = pd.read_excel(
    "employees.xlsx",
    sheet_name=None,  # None = semua sheet
    engine="openpyxl",
)
for name, sheet_df in all_sheets.items():
    print(f"\nšŸ“„ Sheet '{name}': {sheet_df.shape}")

# === Membaca dengan Tipe Data Spesifik ===
df = pd.read_excel(
    "employees.xlsx",
    dtype={
        "ID": "int64",
        "Nama": "str",
        "Gaji": "float64",
    },
    parse_dates=["Tanggal Masuk"],  # Auto-parse tanggal
)

Menulis Excel dengan pandas

"""Menulis DataFrame ke file Excel menggunakan pandas."""
import pandas as pd
import numpy as np
from datetime import datetime

# === Membuat DataFrame Contoh ===
np.random.seed(42)
df = pd.DataFrame({
    "Tanggal": pd.date_range("2026-01-01", periods=100, freq="D"),
    "Produk": np.random.choice(["Laptop", "Mouse", "Keyboard", "Monitor"], 100),
    "Kategori": np.random.choice(["Elektronik", "Aksesoris"], 100),
    "Qty": np.random.randint(1, 50, 100),
    "Harga": np.random.choice([25000000, 850000, 500000, 4500000], 100),
})
df["Total"] = df["Qty"] * df["Harga"]

# === Menulis ke Excel ===
# Cara dasar
df.to_excel("output_pandas.xlsx", index=False, sheet_name="Data")

# === Menulis dengan Formatting ===
with pd.ExcelWriter("output_styled.xlsx", engine="openpyxl") as writer:
    # Sheet 1: Data lengkap
    df.to_excel(writer, sheet_name="Semua Data", index=False)

    # Sheet 2: Rekap per Produk
    rekap_produk = df.groupby("Produk").agg({
        "Qty": "sum",
        "Total": "sum",
    }).reset_index()
    rekap_produk.to_excel(writer, sheet_name="Rekap Produk", index=False)

    # Sheet 3: Rekap per Bulan
    df["Bulan"] = df["Tanggal"].dt.to_period("M")
    rekap_bulan = df.groupby("Bulan").agg({
        "Qty": "sum",
        "Total": "sum",
    }).reset_index()
    rekap_bulan["Bulan"] = rekap_bulan["Bulan"].astype(str)
    rekap_bulan.to_excel(writer, sheet_name="Rekap Bulanan", index=False)

    # Sheet 4: Summary
    summary = pd.DataFrame({
        "Metrik": ["Total Pendapatan", "Total Qty", "Rata-rata/Transaksi", "Jumlah Transaksi"],
        "Nilai": [
            f"Rp{df['Total'].sum():,.0f}",
            df["Qty"].sum(),
            f"Rp{df['Total'].mean():,.0f}",
            len(df),
        ],
    })
    summary.to_excel(writer, sheet_name="Summary", index=False)

print("āœ… File output_styled.xlsx berhasil dibuat dengan 4 sheet!")

Formatting Excel dari pandas

"""Formatting Excel dari pandas menggunakan openpyxl engine."""
import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill, Alignment, Border, Side
from openpyxl.chart import BarChart, Reference

# Buat data
df = pd.DataFrame({
    "Departemen": ["Engineering", "Marketing", "Finance", "HR", "Operations"],
    "Karyawan": [25, 15, 10, 8, 20],
    "Budget (Jt)": [500, 300, 200, 150, 400],
    "Actual (Jt)": [480, 320, 190, 145, 380],
})
df["Utilisasi (%)"] = (df["Actual (Jt)"] / df["Budget (Jt)"] * 100).round(1)

# Tulis data dulu
df.to_excel("formatted_pandas.xlsx", index=False, sheet_name="Budget")

# Buka dan format
wb = load_workbook("formatted_pandas.xlsx")
ws = wb["Budget"]

# Format header
header_font = Font(bold=True, color="FFFFFF", size=12)
header_fill = PatternFill("solid", fgColor="2F5496")

for cell in ws[1]:
    cell.font = header_font
    cell.fill = header_fill
    cell.alignment = Alignment(horizontal="center")

# Format data cells
for row in ws.iter_rows(min_row=2, max_row=ws.max_row, min_col=1, max_col=ws.max_column):
    for cell in row:
        cell.border = Border(
            left=Side("thin"), right=Side("thin"),
            top=Side("thin"), bottom=Side("thin"),
        )
        if isinstance(cell.value, (int, float)):
            cell.alignment = Alignment(horizontal="right")

# Conditional formatting — manual coloring
green_fill = PatternFill("solid", fgColor="C6EFCE")
red_fill = PatternFill("solid", fgColor="FFC7CE")

for row in range(2, ws.max_row + 1):
    utilization = ws.cell(row=row, column=5).value
    if utilization and utilization > 100:
        ws.cell(row=row, column=5).fill = red_fill
    elif utilization:
        ws.cell(row=row, column=5).fill = green_fill

# Tambah chart
chart = BarChart()
chart.title = "Budget vs Actual (Jutaan Rp)"
chart.width = 18
chart.height = 10

cats = Reference(ws, min_col=1, min_row=2, max_row=6)
data = Reference(ws, min_col=3, max_col=4, min_row=1, max_row=6)
chart.add_data(data, titles_from_data=True)
chart.set_categories(cats)

ws.add_chart(chart, "A9")

# Auto-width columns
for col in range(1, 6):
    max_len = max(len(str(ws.cell(row=r, column=col).value or ""))
                  for r in range(1, ws.max_row + 1))
    ws.column_dimensions[chr(64 + col)].width = max_len + 4

wb.save("formatted_pandas.xlsx")
print("āœ… Formatted pandas Excel berhasil dibuat!")

9. Multi-Sheet & Advanced

Append ke Sheet yang Sudah Ada

"""Append data ke Excel yang sudah ada."""
import pandas as pd
from openpyxl import load_workbook

def append_to_excel(filepath: str, new_data: pd.DataFrame, sheet_name: str = "Data"):
    """Append DataFrame ke sheet yang sudah ada."""
    try:
        # Baca data yang sudah ada
        existing = pd.read_excel(filepath, sheet_name=sheet_name)

        # Gabungkan
        combined = pd.concat([existing, new_data], ignore_index=True)

        # Tulis ulang
        with pd.ExcelWriter(filepath, engine="openpyxl", mode="a",
                           if_sheet_exists="replace") as writer:
            combined.to_excel(writer, sheet_name=sheet_name, index=False)

        print(f"āœ… {len(new_data)} baris ditambahkan (total: {len(combined)})")
    except FileNotFoundError:
        new_data.to_excel(filepath, index=False, sheet_name=sheet_name)
        print(f"āœ… File baru dibuat dengan {len(new_data)} baris")


# Contoh penggunaan
new_rows = pd.DataFrame({
    "Nama": ["User Baru 1", "User Baru 2"],
    "Departemen": ["IT", "Sales"],
    "Gaji": [15000000, 12000000],
})

append_to_excel("employees.xlsx", new_rows, "Laporan")

Merge Multiple Excel Files

"""Menggabungkan beberapa file Excel menjadi satu."""
import pandas as pd
from pathlib import Path


def merge_excel_files(input_dir: str, output_file: str, pattern: str = "*.xlsx"):
    """Gabungkan semua file Excel di direktori."""
    files = list(Path(input_dir).glob(pattern))
    all_data = []

    for file in files:
        try:
            df = pd.read_excel(file, engine="openpyxl")
            df["Source_File"] = file.name  # Track sumber data
            all_data.append(df)
            print(f"  šŸ“„ {file.name}: {len(df)} baris")
        except Exception as e:
            print(f"  āŒ Error reading {file.name}: {e}")

    if all_data:
        combined = pd.concat(all_data, ignore_index=True)
        combined.to_excel(output_file, index=False, sheet_name="Combined")
        print(f"\nāœ… {len(combined)} baris dari {len(files)} file → {output_file}")
        return combined
    else:
        print("āŒ Tidak ada data yang bisa digabungkan")
        return None


# Contoh penggunaan
# merge_excel_files("./data/", "merged_output.xlsx")

Excel dengan Data Validation

"""Menambahkan data validation (dropdown) di Excel."""
from openpyxl import Workbook
from openpyxl.worksheet.datavalidation import DataValidation
from openpyxl.styles import Font

wb = Workbook()
ws = wb.active
ws.title = "Form Input"

# Header
headers = ["Nama", "Departemen", "Status", "Prioritas", "Tanggal"]
for col, header in enumerate(headers, 1):
    ws.cell(row=1, column=col, value=header).font = Font(bold=True)

# Data Validation — Departemen (dropdown)
dept_dv = DataValidation(
    type="list",
    formula1='"Engineering,Marketing,Finance,HR,Operations"',
    allow_blank=True,
)
dept_dv.error = "Departemen tidak valid"
dept_dv.errorTitle = "Error"
dept_dv.prompt = "Pilih departemen"
dept_dv.promptTitle = "Departemen"
ws.add_data_validation(dept_dv)
dept_dv.add("B2:B100")

# Data Validation — Status (dropdown)
status_dv = DataValidation(
    type="list",
    formula1='"Active,Inactive,Pending"',
    allow_blank=True,
)
ws.add_data_validation(status_dv)
status_dv.add("C2:C100")

# Data Validation — Prioritas (dropdown)
priority_dv = DataValidation(
    type="list",
    formula1='"Low,Medium,High,Critical"',
    allow_blank=True,
)
ws.add_data_validation(priority_dv)
priority_dv.add("D2:D100")

# Data Validation — Tanggal
from datetime import date
date_dv = DataValidation(
    type="date",
    operator="greaterThan",
    formula1="2020-01-01",
    allow_blank=True,
)
ws.add_data_validation(date_dv)
date_dv.add("E2:E100")

# Data Validation — Nama (panjang teks)
name_dv = DataValidation(
    type="textLength",
    operator="lessThan",
    formula1="100",
)
name_dv.error = "Nama terlalu panjang (maks 100 karakter)"
ws.add_data_validation(name_dv)
name_dv.add("A2:A100")

# Column widths
for col in range(1, 6):
    ws.column_dimensions[chr(64 + col)].width = 18

wb.save("form_with_validation.xlsx")
print("āœ… Form dengan data validation berhasil dibuat")

10. Otomasi Excel

Batch Processing — Generate Report Otomatis

"""Otomasi pembuatan laporan Excel dari data."""
import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill, Alignment, Border, Side
from datetime import datetime
import os


def generate_monthly_report(data_file: str, output_dir: str = "reports"):
    """Generate laporan bulanan dari data mentah."""

    # Baca data mentah
    df = pd.read_excel(data_file, engine="openpyxl")
    df["Tanggal"] = pd.to_datetime(df["Tanggal"])
    df["Bulan"] = df["Tanggal"].dt.strftime("%Y-%m")

    os.makedirs(output_dir, exist_ok=True)

    # Generate per bulan
    for bulan, group in df.groupby("Bulan"):
        output_file = os.path.join(output_dir, f"laporan_{bulan}.xlsx")

        with pd.ExcelWriter(output_file, engine="openpyxl") as writer:
            # Sheet 1: Detail Transaksi
            group.to_excel(writer, sheet_name="Detail", index=False)

            # Sheet 2: Rekap Produk
            rekap = group.groupby("Produk").agg({
                "Qty": "sum",
                "Total": ["sum", "count"],
            }).round(0)
            rekap.columns = ["Total Qty", "Jumlah Transaksi", "Total Revenue"]
            rekap.to_excel(writer, sheet_name="Rekap Produk")

            # Sheet 3: Harian
            daily = group.groupby(df["Tanggal"].dt.date).agg({
                "Total": "sum",
            }).round(0)
            daily.to_excel(writer, sheet_name="Harian")

        # Format output
        wb = load_workbook(output_file)
        for sheet_name in wb.sheetnames:
            ws = wb[sheet_name]
            # Header formatting
            for cell in ws[1]:
                cell.font = Font(bold=True, color="FFFFFF")
                cell.fill = PatternFill("solid", fgColor="2F5496")
            # Auto-width
            for col in range(1, ws.max_column + 1):
                max_len = max(len(str(ws.cell(row=r, column=col).value or ""))
                             for r in range(1, min(ws.max_row + 1, 20)))
                ws.column_dimensions[chr(64 + col)].width = min(max_len + 4, 40)

        wb.save(output_file)
        print(f"  āœ… {output_file} ({len(group)} transaksi)")


# Contoh penggunaan:
# generate_monthly_report("data_penjualan.xlsx")

Excel Converter — Format ke Format

"""Konversi antar format file."""
import pandas as pd


def convert_excel_to_csv(excel_file: str, sheet_name: str = None, output_csv: str = None):
    """Konversi Excel ke CSV."""
    df = pd.read_excel(excel_file, sheet_name=sheet_name, engine="openpyxl")
    if output_csv is None:
        output_csv = excel_file.replace(".xlsx", ".csv")
    df.to_csv(output_csv, index=False, encoding="utf-8-sig")  # BOM untuk Excel Indonesia
    print(f"āœ… {excel_file} → {output_csv} ({len(df)} baris)")


def convert_csv_to_excel(csv_file: str, output_excel: str = None):
    """Konversi CSV ke Excel dengan formatting."""
    df = pd.read_csv(csv_file)
    if output_excel is None:
        output_excel = csv_file.replace(".csv", ".xlsx")
    df.to_excel(output_excel, index=False, sheet_name="Data")
    print(f"āœ… {csv_file} → {output_excel} ({len(df)} baris)")


def convert_json_to_excel(json_file: str, output_excel: str = None):
    """Konversi JSON ke Excel."""
    df = pd.read_json(json_file)
    if output_excel is None:
        output_excel = json_file.replace(".json", ".xlsx")
    df.to_excel(output_excel, index=False, sheet_name="Data")
    print(f"āœ… {json_file} → {output_excel} ({len(df)} baris)")


# Contoh
# convert_excel_to_csv("report.xlsx", "Sheet1")
# convert_csv_to_excel("data.csv")
# convert_json_to_excel("api_response.json")

11. Best Practice

Handling File Besar

"""Best practice untuk file Excel besar."""
import pandas as pd
from openpyxl import load_workbook
import gc


# === 1. Baca dalam Chunk ===
def read_large_excel_chunked(filepath: str, chunk_size: int = 10000):
    """Baca file besar dalam chunk."""
    total_rows = 0
    for chunk in pd.read_excel(
        filepath,
        sheet_name="Data",
        engine="openpyxl",
        chunksize=chunk_size,  # Untuk .csv, tidak untuk .xlsx
    ):
        total_rows += len(chunk)
        # Proses chunk di sini
        yield chunk

# NOTE: pd.read_excel() tidak support chunksize langsung.
# Gunakan openpyxl untuk streaming:
def stream_large_excel(filepath: str, batch_size: int = 1000):
    """Stream data dari file Excel besar menggunakan openpyxl."""
    wb = load_workbook(filepath, read_only=True, data_only=True)
    ws = wb.active

    headers = None
    batch = []

    for row_idx, row in enumerate(ws.iter_rows(values_only=True)):
        if row_idx == 0:
            headers = list(row)
            continue

        batch.append(dict(zip(headers, row)))

        if len(batch) >= batch_size:
            df = pd.DataFrame(batch)
            yield df
            batch = []
            gc.collect()

    # Sisa batch
    if batch:
        yield pd.DataFrame(batch)

    wb.close()


# Contoh penggunaan
# for chunk_df in stream_large_excel("big_file.xlsx", batch_size=5000):
#     # Proses per batch
#     result = chunk_df.groupby("Kategori")["Total"].sum()
#     print(result)


# === 2. Optimasi Memori ===
def optimize_excel_memory(df: pd.DataFrame) -> pd.DataFrame:
    """Optimasi tipe data DataFrame untuk menghemat memori."""
    for col in df.columns:
        col_type = df[col].dtype

        if col_type == "object":
            # Cek apakah bisa jadi category
            if df[col].nunique() / len(df) < 0.5:
                df[col] = df[col].astype("category")

        elif col_type == "int64":
            # Downcast integer
            df[col] = pd.to_numeric(df[col], downcast="integer")

        elif col_type == "float64":
            # Downcast float
            df[col] = pd.to_numeric(df[col], downcast="float")

    return df


# === 3. Error Handling ===
def safe_read_excel(filepath: str, **kwargs) -> pd.DataFrame:
    """Baca Excel dengan error handling."""
    try:
        df = pd.read_excel(filepath, engine="openpyxl", **kwargs)
        print(f"āœ… Berhasil membaca {filepath}: {df.shape}")
        return df
    except FileNotFoundError:
        print(f"āŒ File tidak ditemukan: {filepath}")
    except ValueError as e:
        print(f"āŒ Error membaca {filepath}: {e}")
    except Exception as e:
        print(f"āŒ Unexpected error: {e}")
    return pd.DataFrame()

Checklist Best Practice

šŸ“‹ Checklist Best Practice Excel di Python
  • āœ… Selalu gunakan engine="openpyxl" secara eksplisit untuk file .xlsx
  • āœ… Gunakan with statement saat menggunakan ExcelWriter
  • āœ… Set index=False saat to_excel() untuk menghindari kolom index
  • āœ… Gunakan read_only=True di openpyxl untuk file besar
  • āœ… Tutup workbook setelah selesai (wb.close())
  • āœ… Gunakan data_only=True saat membaca jika ingin nilai formula, bukan formula
  • āœ… Tambahkan freeze_panes untuk header yang tetap terlihat
  • āœ… Set auto_filter untuk memudahkan filtering
  • āœ… Format angka (number_format) untuk keterbacaan
  • āœ… Gunakan utf-8-sig encoding untuk CSV yang dibuka di Excel Indonesia

12. Quiz Pemahaman

Uji pemahaman Anda tentang manipulasi Excel dengan Python:

1. Library apa yang paling cocok untuk membuat chart di Excel?

2. Parameter apa yang digunakan untuk membaca nilai formula (bukan formula-nya)?

3. Bagaimana cara menulis DataFrame pandas ke Excel?

4. Metode apa yang digunakan untuk auto-adjust lebar kolom di openpyxl?

5. Encoding apa yang tepat agar CSV bisa dibuka benar di Excel Indonesia?

šŸ” Zoom
100%
šŸŽØ Tema