# HIKARI-Subaru-8B-SkinGroup

*Healthcare-oriented Intelligent Knowledge Augmented Retrieval and Inference*

Named after Subaru (昴), the Japanese name for the Pleiades star cluster: a group of stars, for a model that classifies skin disease groups.

📦 **Model Type: Merged Full Model**

This is a fully merged model: the LoRA adapter weights have been merged directly into the base model weights.
✅ No adapter loading needed. Load and run directly with `transformers`, vLLM, or SGLang.

💾 **Size:** ~17 GB (4 safetensors shards)
## Overview
HIKARI-Subaru is Stage 1 of the HIKARI dermatology pipeline: a 4-class skin disease group classifier that divides skin lesions into broad disease families. Its output feeds the Stage 2 disease classifier as context, enabling more precise 10-class diagnosis.
| Property | Value |
|---|---|
| Task | 4-class skin disease group classification (Stage 1) |
| Base model | Qwen/Qwen3-VL-8B-Thinking |
| Training | Unsloth + LoRA (Fuzzy Top-K sampling) |
| Val accuracy | 88.68% on SkinCAP validation set |
| Model type | Merged full model |
## 🩺 4 Disease Groups

| Group | Included Diseases |
|---|---|
| inflammatory | atopic_dermatitis, psoriasis, seborrheic_dermatitis, urticaria |
| infectious | acne_vulgaris, tinea_versicolor |
| neoplastic | melanocytic_nevi, sccis (Bowen's disease), skin_tag |
| photodermatoses | photodermatoses |
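For downstream bookkeeping, it can help to have the table above as code. A minimal sketch; the `GROUP_OF_DISEASE` dict and `group_of` helper are illustrative and not shipped with the model:

```python
# Illustrative mapping from the 10 Stage 2 disease labels to their
# Stage 1 group, mirroring the table above (not part of the model).
GROUP_OF_DISEASE = {
    "atopic_dermatitis": "inflammatory",
    "psoriasis": "inflammatory",
    "seborrheic_dermatitis": "inflammatory",
    "urticaria": "inflammatory",
    "acne_vulgaris": "infectious",
    "tinea_versicolor": "infectious",
    "melanocytic_nevi": "neoplastic",
    "sccis": "neoplastic",  # Bowen's disease
    "skin_tag": "neoplastic",
    "photodermatoses": "photodermatoses",
}

def group_of(disease: str) -> str:
    """Return the Stage 1 group for a Stage 2 disease label."""
    return GROUP_OF_DISEASE[disease.strip().lower()]
```

This is handy, for example, to check that Stage 2's disease prediction is consistent with Stage 1's group prediction.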
## 🔧 Usage

### Stage 1 in the Full HIKARI Pipeline
```
📷 Image
   │
   ▼
[Stage 1] HIKARI-Subaru-8B-SkinGroup ──► group label   ← YOU ARE HERE
   │
   ▼
[Stage 2] HIKARI-Sirius-8B-SkinDx-RAG ──► disease label
   │
   ▼
[Stage 3] HIKARI-Vega-8B-SkinCaption-Fused ──► clinical caption
```
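The cascade above is plain function composition: each stage's label is threaded into the next stage's prompt as context. A minimal sketch, where `run_pipeline` and the three stage callables are hypothetical names standing in for the real Subaru / Sirius / Vega models:

```python
# Illustrative sketch of the 3-stage HIKARI cascade (names hypothetical).
def run_pipeline(image, classify_group, classify_disease, caption):
    group = classify_group(image)                           # Stage 1: Subaru
    disease = classify_disease(image, group=group)          # Stage 2: Sirius, conditioned on group
    report = caption(image, group=group, disease=disease)   # Stage 3: Vega
    return {"group": group, "disease": disease, "caption": report}
```

Each callable would wrap one model's inference (as in the snippets below for Stage 1).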
### Quick Inference with `transformers`

```python
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
import torch
from PIL import Image

model_id = "E27085921/HIKARI-Subaru-8B-SkinGroup"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

image = Image.open("skin_lesion.jpg").convert("RGB")
PROMPT = (
    "Look at this skin lesion image. Based on the visual features, classify it into "
    "one of these groups: inflammatory, infectious, neoplastic, or photodermatoses. "
    "What group does this lesion belong to?"
)

messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": PROMPT},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

# Greedy decoding for a deterministic label
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32, do_sample=False)

# Decode only the newly generated tokens
group = processor.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0].strip()
print(group)  # → "inflammatory"
```
### Production: vLLM BnB-4bit ⚡

```python
from vllm import LLM, SamplingParams
from transformers import AutoProcessor
from PIL import Image

model_id = "E27085921/HIKARI-Subaru-8B-SkinGroup"
llm = LLM(
    model=model_id,
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    trust_remote_code=True,
    max_model_len=2048,
    gpu_memory_utilization=0.88,
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
sp = SamplingParams(max_tokens=32, temperature=0.0)

PROMPT = (
    "Look at this skin lesion image. Based on the visual features, classify it into "
    "one of these groups: inflammatory, infectious, neoplastic, or photodermatoses. "
    "What group does this lesion belong to?"
)

def classify_group(image: Image.Image) -> str:
    messages = [{"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": PROMPT},
    ]}]
    text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    # One image entry per <|vision_start|> placeholder in the rendered prompt
    n = max(text.count("<|vision_start|>"), 1)
    out = llm.generate({"prompt": text, "multi_modal_data": {"image": [image] * n}}, sp)
    return out[0].outputs[0].text.strip()

img = Image.open("skin_lesion.jpg").convert("RGB")
print(classify_group(img))  # → "inflammatory"
```
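For throughput, vLLM accepts all requests in a single `generate` call rather than one call per image. A batched sketch of the helper above (`classify_groups_batch` is an illustrative name, not part of the model; `prompt` would be the `PROMPT` string from the snippet above):

```python
def classify_groups_batch(llm, processor, images, prompt, sp):
    """Build one prompt per image and submit them to vLLM in a single
    generate call, letting vLLM batch and schedule them internally."""
    requests = []
    for image in images:
        messages = [{"role": "user", "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": prompt},
        ]}]
        text = processor.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
        requests.append({"prompt": text, "multi_modal_data": {"image": image}})
    outs = llm.generate(requests, sp)
    return [o.outputs[0].text.strip() for o in outs]
```

This amortizes scheduling overhead across the whole batch, which matters when classifying a large image set.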
## 🌟 HIKARI Model Family

| Model | Task | Metric | Type |
|---|---|---|---|
| HIKARI-Subaru-8B-SkinGroup (this model) | 4-class group (Stage 1) | 88.68% | Merged |
| ↳ HIKARI-Sirius-8B-SkinDx-RAG | 10-class disease (Stage 2) | 85.86% | Merged + LoRA |
| ↳ HIKARI-Vega-8B-SkinCaption-Fused | Clinical caption (Stage 3) | BLEU-4: 29.33 | Merged + LoRA |
## 📖 Citation

```bibtex
@misc{hikari2026,
  title       = {HIKARI: RAG-in-Training for Skin Disease Diagnosis
                 with Cascaded Vision-Language Models},
  author      = {Watin Promfiy and Pawitra Boonprasart},
  year        = {2026},
  institution = {King Mongkut's Institute of Technology Ladkrabang,
                 Department of Information Technology, Bangkok, Thailand}
}
```
Made with ❤️ at King Mongkut's Institute of Technology Ladkrabang (KMITL), Department of Information Technology