🩺 Skin Disease Classification — EfficientNet-B4 & ViT-Small-384 Ensemble

This repo contains 4 model weight files (.pth) for classifying skin diseases from dermoscopy images, trained with two deep learning architectures: EfficientNet-B4 and ViT-Small-Patch16-384. The models feed an ensemble pipeline that combines their predictions by voting to improve accuracy.

📁 Model list

File	Architecture	Classes	Excluded class	Input	~Size
`last_effnetb4.pth`	EfficientNet-B4	7	(none)	380×380	68 MB
`last_effnetb4_main.pth`	EfficientNet-B4	6	`akiec` (AK)	380×380	68 MB
`last_vit_small.pth`	ViT-Small-Patch16-384	7	(none)	384×384	83 MB
`best_vit_small_384_marcoF1(2).pth`	ViT-Small-Patch16-384	6	`akiec` (AK)	384×384	83 MB

🦠 Class mapping

7 classes (used by `last_effnetb4.pth` and `last_vit_small.pth`)

label_mapping_7 = {
    'AK': 0,    # Actinic Keratosis (precancerous)
    'BCC': 1,   # Basal Cell Carcinoma
    'BKL': 2,   # Benign Keratosis
    'DF': 3,    # Dermatofibroma
    'MEL': 4,   # Melanoma ⚠️
    'NV': 5,    # Melanocytic Nevi (moles)
    'VASC': 6,  # Vascular Lesions
}

6 classes (used by `last_effnetb4_main.pth` and `best_vit_small_384_marcoF1(2).pth`)

label_mapping_6 = {
    'BCC': 0,   # Basal Cell Carcinoma
    'BKL': 1,   # Benign Keratosis
    'DF': 2,    # Dermatofibroma
    'MEL': 3,   # Melanoma ⚠️
    'NV': 4,    # Melanocytic Nevi
    'VASC': 5,  # Vascular Lesions
}
# ❌ akiec (Actinic Keratosis, labeled AK above) was removed from the training data
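
Because the 6-class mapping drops AK, every remaining label sits one index lower than in the 7-class mapping. The sketch below shows one possible way to place a 6-class probability vector into the 7-class layout so outputs from both model families can be compared. The helper name `expand_to_7` and the choice to fill AK with 0.0 are assumptions made for illustration, not part of the original pipeline.

```python
label_mapping_7 = {'AK': 0, 'BCC': 1, 'BKL': 2, 'DF': 3, 'MEL': 4, 'NV': 5, 'VASC': 6}
label_mapping_6 = {'BCC': 0, 'BKL': 1, 'DF': 2, 'MEL': 3, 'NV': 4, 'VASC': 5}

def expand_to_7(probs_6):
    """Place 6-class probabilities into the 7-class index layout (AK gets 0.0)."""
    if len(probs_6) != len(label_mapping_6):
        raise ValueError("expected 6 probabilities")
    probs_7 = [0.0] * len(label_mapping_7)
    for label, idx_6 in label_mapping_6.items():
        # Look up where this label lives in the 7-class layout.
        probs_7[label_mapping_7[label]] = probs_6[idx_6]
    return probs_7

# Hypothetical 6-class output where MEL (index 3) is the top class.
example = expand_to_7([0.05, 0.05, 0.10, 0.60, 0.15, 0.05])
print(example)
```

After expansion, the top index of `example` is 4, which is MEL in the 7-class mapping.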

📊 Evaluation results (test set)

Dataset after preprocessing: 18,533 images (ISIC2019 + HAM10000), stratified 80/10/10 split.
Test set: ~1,854 images.
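
A stratified 80/10/10 split keeps each class's share roughly constant across the train, validation, and test sets. The following is a minimal standard-library sketch of the idea. The function name `stratified_split`, the seed, and the order of the two 10% parts are assumptions, since the source only states the ratio.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=42):
    """Split sample indices into train/val/test while keeping per-class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class only
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to test
    return train, val, test

# Toy labels: 100 'NV' and 20 'DF' samples.
labels = ['NV'] * 100 + ['DF'] * 20
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

In this toy case the minority class DF still contributes 2 of its 20 samples to the test set instead of being left out by chance.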

Model	Classes	Accuracy	Macro F1	Macro AUROC
`last_effnetb4.pth`	7	89.7%	0.88	n/a
`last_effnetb4_main.pth`	6	comparable	comparable	n/a
`last_vit_small.pth`	7	89.6%	0.87	n/a
`best_vit_small_384_marcoF1(2).pth`	6	comparable	≥ stage 1	n/a

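The primary metric optimized during training is Macro F1. It matters more than Accuracy here because the data is imbalanced and this is a medical task: Macro F1 weights every class equally, so rare classes count as much as the dominant NV class.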
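
To make the difference between the two metrics concrete, here is a small standard-library sketch of Macro F1 on made-up labels. The function `macro_f1` and the toy data are illustrative assumptions; the reported numbers above were not produced by this code.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples scores 0 here (an assumption).
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes

# Toy data: 8 samples of class 0 (majority), 2 of class 1 (minority).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10  # a classifier that always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred, num_classes=2)
print(f"accuracy={accuracy:.2f}, macro_f1={score:.2f}")
```

The always-majority classifier reaches 0.80 accuracy but only about 0.44 Macro F1, because the missed minority class contributes an F1 of 0.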

🗃️ Training data

ISIC2019 and HAM10000 combined, after handling class imbalance:

Label	Images	Source
AK (Actinic Keratosis)	867	ISIC2019
BCC (Basal Cell Carcinoma)	3,323	ISIC2019
BKL (Benign Keratosis)	2,624	ISIC2019
DF (Dermatofibroma)	239	ISIC2019
MEL (Melanoma)	4,522	ISIC2019
NV (Melanocytic Nevi)	6,705	HAM10000 (replaces the ISIC2019 NV images)
VASC (Vascular Lesions)	253	ISIC2019
Total	18,533

Imbalance handling:

Removed the ~12,000 original NV images from ISIC2019 and replaced them with NV images from HAM10000
Dropped the SCC label (8 classes reduced to 7)
CrossEntropyLoss with balanced class weights (class_weight='balanced')
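
As an illustration of what balanced class weights look like for this dataset, the sketch below applies the common heuristic `n_samples / (n_classes * count)` (the one scikit-learn uses for `class_weight='balanced'`) to the class counts in the table above. It assumes the weights are derived from the full dataset counts; the actual training code may have computed them on the training split only.

```python
class_counts = {
    'AK': 867, 'BCC': 3323, 'BKL': 2624, 'DF': 239,
    'MEL': 4522, 'NV': 6705, 'VASC': 253,
}

n_samples = sum(class_counts.values())
n_classes = len(class_counts)

# Balanced heuristic: weight_c = n_samples / (n_classes * count_c)
class_weights = {
    label: n_samples / (n_classes * count)
    for label, count in class_counts.items()
}

for label, weight in class_weights.items():
    print(f"{label:5s} {weight:.3f}")
```

Rare classes such as DF and VASC receive the largest weights, while NV falls below 1. Ordered by label index, these values could be passed as the `weight` tensor of `torch.nn.CrossEntropyLoss`.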

🏗️ Architecture & training details

Model 1 & 3: EfficientNet-B4 (`last_effnetb4.pth` & `last_effnetb4_main.pth`)

The two EfficientNet models share the same architecture but differ in dataset (7 vs. 6 classes) and in hyperparameters:

Parameter	`last_effnetb4.pth` (7 classes)	`last_effnetb4_main.pth` (6 classes)
Dataset	ISIC2019 + HAM10000, 7 classes	ISIC2019 + HAM10000, 6 classes (without `akiec`)
Input size	380×380	380×380
Batch size	16	64
Optimizer	AdamW (lr=1e-4, wd=0.01)	AdamW (lr=1e-4, wd=0.01)
Scheduler	ReduceLROnPlateau (patience=5)	ReduceLROnPlateau (patience=5)
Label smoothing	0.05	0.1
Early stopping patience	8	10
Max epochs	50	50
Training stages	1 stage	1 stage

Architecture:

import torch.nn as nn
from efficientnet_pytorch import EfficientNet
model = EfficientNet.from_pretrained('efficientnet-b4')
model._fc = nn.Linear(model._fc.in_features, num_classes)  # num_classes is 7 or 6

Data augmentation (train):

Resize(380×380) → RandomRotation(±15°) → RandomHorizontalFlip
→ ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.05)
→ RandomAffine(translate=0.05, scale=0.95-1.05)
→ ToTensor → RandomErasing(p=0.3) → Normalize(ImageNet)

Model 2 & 4: ViT-Small-Patch16-384 (`last_vit_small.pth` & `best_vit_small_384_marcoF1(2).pth`)

Parameter	`last_vit_small.pth` (7 classes)	`best_vit_small_384_marcoF1(2).pth` (6 classes)
Dataset	7 classes	6 classes (without `akiec`)
Input size	384×384	384×384
Batch size	64	64
Dropout	0.2	0.2
Training stages	2 stages	2 stages
Stage 1	Freeze backbone, train only `head` (lr=1e-5, 30 epochs)	Freeze backbone, train only `head` (lr=1e-5, 30 epochs)
Stage 2	Unfreeze all layers (lr=5e-5, 30 epochs)	Unfreeze all layers (lr=5e-5, 30 epochs)
Scheduler	ReduceLROnPlateau (patience=3, factor=0.5)	ReduceLROnPlateau (patience=3, factor=0.5)
Early stopping patience	7	7
Best model saved by	Macro F1	Macro F1
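
The last two rows (early stopping with patience 7, best checkpoint chosen by Macro F1) can be pictured with the standard-library sketch below. The `EarlyStopping` class and the validation scores are hypothetical and only illustrate the bookkeeping; this is not the original training loop.

```python
class EarlyStopping:
    """Track the best validation score; stop after `patience` epochs without improvement."""

    def __init__(self, patience=7):
        self.patience = patience
        self.best_score = float("-inf")
        self.bad_epochs = 0

    def step(self, score):
        """Return (is_best, should_stop) for this epoch's validation score."""
        if score > self.best_score:
            self.best_score = score
            self.bad_epochs = 0
            return True, False
        self.bad_epochs += 1
        return False, self.bad_epochs >= self.patience

# Made-up validation Macro F1 values, for illustration only.
val_macro_f1 = [0.70, 0.75, 0.78, 0.77, 0.78, 0.76, 0.77, 0.75, 0.74, 0.76, 0.73]

stopper = EarlyStopping(patience=7)
best_epoch = stopped_epoch = None
for epoch, score in enumerate(val_macro_f1, start=1):
    is_best, should_stop = stopper.step(score)
    if is_best:
        best_epoch = epoch  # a real loop would save the checkpoint here
    if should_stop:
        stopped_epoch = epoch
        break
print(best_epoch, stopped_epoch)
```

With these made-up scores the best checkpoint comes from epoch 3 and training stops at epoch 10, after seven epochs without a strict improvement.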

Architecture:

import timm
model = timm.create_model('vit_small_patch16_384', pretrained=True,
                           num_classes=num_classes,  # 7 or 6
                           drop_rate=0.2)

Data augmentation (train — ViT):

ToPILImage → RandomRotation(±30°) → RandomHorizontalFlip → RandomVerticalFlip
→ ColorJitter(0.3, 0.3, 0.3, 0.1) → RandomAffine(translate=0.1, scale=0.9-1.1)
→ ToTensor → RandomErasing(p=0.3, scale=0.02-0.2) → Normalize(ImageNet)

Note: ViT uses stronger augmentation than EfficientNet (rotation of ±30° instead of ±15°, an added vertical flip, and stronger ColorJitter).

🚀 Usage

Installation

pip install torch torchvision timm efficientnet_pytorch huggingface_hub

Download all models

from huggingface_hub import hf_hub_download

REPO_ID = "jamus0702/skin-disease-classification"

for filename in [
    "last_effnetb4.pth",
    "last_effnetb4_main.pth",
    "last_vit_small.pth",
    "best_vit_small_384_marcoF1(2).pth",
]:
    hf_hub_download(repo_id=REPO_ID, filename=filename,
                    local_dir="./weights", local_dir_use_symlinks=False)
    print(f"✅ {filename}")

Load EfficientNet-B4 (7 classes)

import torch
from efficientnet_pytorch import EfficientNet
from torchvision import transforms
from PIL import Image

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

label_mapping = {'AK':0,'BCC':1,'BKL':2,'DF':3,'MEL':4,'NV':5,'VASC':6}
idx_to_label = {v: k for k, v in label_mapping.items()}

model = EfficientNet.from_name('efficientnet-b4')
model._fc = torch.nn.Linear(model._fc.in_features, 7)
model.load_state_dict(torch.load("weights/last_effnetb4.pth", map_location=device))
model = model.to(device).eval()

transform = transforms.Compose([
    transforms.Resize((380, 380)),
    transforms.ToTensor(),
    transforms.Normalize([0.485,0.456,0.406],[0.229,0.224,0.225]),
])

image = Image.open("your_image.jpg").convert("RGB")
with torch.no_grad():
    probs = torch.softmax(model(transform(image).unsqueeze(0).to(device)), dim=1)[0]
    idx = probs.argmax().item()
print(f"{idx_to_label[idx]}: {probs[idx]*100:.1f}%")

Load ViT-Small-384 (6 classes)

import torch, timm
from torchvision import transforms
from PIL import Image

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

label_mapping = {'BCC':0,'BKL':1,'DF':2,'MEL':3,'NV':4,'VASC':5}
idx_to_label = {v: k for k, v in label_mapping.items()}

model = timm.create_model('vit_small_patch16_384', pretrained=False,
                           num_classes=6, drop_rate=0.2)
model.load_state_dict(torch.load("weights/best_vit_small_384_marcoF1(2).pth",
                                  map_location=device))
model = model.to(device).eval()

transform = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize([0.485,0.456,0.406],[0.229,0.224,0.225]),
])

image = Image.open("your_image.jpg").convert("RGB")
with torch.no_grad():
    probs = torch.softmax(model(transform(image).unsqueeze(0).to(device)), dim=1)[0]
    idx = probs.argmax().item()
print(f"{idx_to_label[idx]}: {probs[idx]*100:.1f}%")
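
Combining models (sketch)

This card mentions an ensemble pipeline but does not spell out its voting rule. The sketch below therefore assumes plain soft voting: each label's probability is averaged over the models that can predict it, and the result is renormalized. The function `soft_vote` and the example probabilities are hypothetical.

```python
ALL_LABELS = ['AK', 'BCC', 'BKL', 'DF', 'MEL', 'NV', 'VASC']

def soft_vote(model_outputs):
    """Average per-label probabilities over the models that predict each label."""
    if not model_outputs:
        raise ValueError("need at least one model output")
    averaged = {}
    for label in ALL_LABELS:
        votes = [out[label] for out in model_outputs if label in out]
        averaged[label] = sum(votes) / len(votes) if votes else 0.0
    total = sum(averaged.values())
    # Renormalize so the combined scores sum to 1.
    return {label: p / total for label, p in averaged.items()}

# Made-up outputs: one 7-class model and one 6-class model (no AK).
out_7 = {'AK': 0.10, 'BCC': 0.05, 'BKL': 0.05, 'DF': 0.05,
         'MEL': 0.50, 'NV': 0.20, 'VASC': 0.05}
out_6 = {'BCC': 0.05, 'BKL': 0.05, 'DF': 0.05,
         'MEL': 0.70, 'NV': 0.10, 'VASC': 0.05}

combined = soft_vote([out_7, out_6])
prediction = max(combined, key=combined.get)
print(prediction, round(combined[prediction], 3))
```

The per-model dictionaries can be built from the loading examples above, for instance `{idx_to_label[i]: float(p) for i, p in enumerate(probs)}`. Keying by label rather than by index avoids mixing up the 6-class and 7-class index layouts.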

⚠️ Disclaimer

This is an academic research model and does not replace professional medical diagnosis. Results are for reference only; users should consult a dermatologist.

📚 References

Tan, M., & Le, Q. (2019). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. ICML 2019.
Dosovitskiy, A., et al. (2021). An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale. ICLR 2021.
Tschandl, P., et al. (2018). The HAM10000 dataset. Scientific Data.
ISIC Archive. (2019). Skin Lesion Analysis Toward Melanoma Detection.
