---
license: apache-2.0
base_model: Qwen/Qwen3-VL-8B-Thinking
tags:
- dermatology
- medical
- lora
- peft
- skin-disease
- qwen3-vl
language:
- en
- th
pipeline_tag: image-text-to-text
---
<p align="center">
<img src="HIKARI_logo.png" alt="HIKARI" width="100%"/>
</p>
<h1 align="center">HIKARI-Rigel-8B-SkinCaption-LoRA</h1>
<p align="center">
<img src="https://img.shields.io/badge/Type-LoRA%20Adapter-blueviolet?style=flat-square"/>
<img src="https://img.shields.io/badge/Size-~1.1%20GB-lightblue?style=flat-square"/>
<img src="https://img.shields.io/badge/Base-Qwen3--VL--8B--Thinking-blue?style=flat-square"/>
<img src="https://img.shields.io/badge/License-Apache%202.0-orange?style=flat-square"/>
</p>
---
## πŸ”Œ Model Type: LoRA Adapter
> This is a **LoRA adapter** (~1.1 GB) β€” it must be loaded **on top of** the base model `Qwen/Qwen3-VL-8B-Thinking`.
>
> βœ… **Advantage:** Lightweight β€” download only ~1.1 GB instead of ~17 GB.
>
> ⚠️ **Requirement:** You must separately load `Qwen/Qwen3-VL-8B-Thinking` (base model, ~17 GB) first.
>
> πŸ’Ύ If you prefer a standalone ready-to-use model, see the merged version:
> **[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)** (~17 GB)
---
## What is this adapter?
A LoRA adapter for **[HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)**, fine-tuned for clinical skin lesion caption generation (checkpoint-init, ablation baseline). Caption quality: **BLEU-4 = 9.82**.
This adapter is the ablation baseline. For the best-performing caption model, see [HIKARI-Vega-8B-SkinCaption-Fused-LoRA](https://huggingface.co/E27085921/HIKARI-Vega-8B-SkinCaption-Fused-LoRA).
See the full model card at **[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)** for complete details, usage examples, and performance comparison.
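BLEU-4 scores a generated caption by its overlapping n-grams (up to length 4) with a reference caption, scaled by a brevity penalty. As a toy illustration of what the metric measures (not the project's evaluation script, which would use corpus-level BLEU with smoothing), a minimal sentence-level BLEU-4 in plain Python:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(reference, hypothesis):
    """Toy sentence-level BLEU-4: clipped n-gram precisions (n=1..4),
    geometric mean, times a brevity penalty. Returns a 0-100 score."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, 5):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_counts & ref_counts).values())  # clipped matches
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # any empty precision zeroes the geometric mean
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / 4)
    brevity = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return 100 * brevity * geo_mean
```

A perfect match scores 100; a caption sharing no 4-grams with the reference scores 0, which is why even fluent clinical captions typically land in the single digits to low tens on this metric.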
---
## Usage
```python
from peft import PeftModel
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
import torch
from PIL import Image
# Step 1: Load base model (Qwen3-VL-8B-Thinking, ~17 GB)
base = Qwen3VLForConditionalGeneration.from_pretrained(
"Qwen/Qwen3-VL-8B-Thinking",
torch_dtype=torch.bfloat16,
device_map="auto",
trust_remote_code=True,
)
# Step 2: Apply LoRA adapter (~1.1 GB)
model = PeftModel.from_pretrained(base, "E27085921/HIKARI-Rigel-8B-SkinCaption-LoRA")
processor = AutoProcessor.from_pretrained("E27085921/HIKARI-Rigel-8B-SkinCaption-LoRA", trust_remote_code=True)
# Step 3: Inference (minimal sketch; the prompt text below is illustrative)
image = Image.open("skin_lesion.jpg").convert("RGB")
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Describe this skin lesion."},
    ]}
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
caption = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(caption)
```
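As an alternative to downloading the published merged checkpoint, you can fold this adapter into the base model yourself with PEFT's `merge_and_unload()`. A minimal sketch (the output directory name is illustrative):

```python
import torch
from peft import PeftModel
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration

# Load base + adapter, then fold the LoRA deltas into the base weights.
base = Qwen3VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-8B-Thinking",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "E27085921/HIKARI-Rigel-8B-SkinCaption-LoRA")
merged = model.merge_and_unload()  # returns the plain model, no PEFT wrapper

# Save a self-contained checkpoint (~17 GB on disk, comparable to the merged repo).
merged.save_pretrained("hikari-rigel-merged")
processor = AutoProcessor.from_pretrained("E27085921/HIKARI-Rigel-8B-SkinCaption-LoRA")
processor.save_pretrained("hikari-rigel-merged")
```

After merging, the saved directory loads directly with `from_pretrained` and no longer needs `peft` at inference time.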
For complete inference examples including vLLM and SGLang production code, see:
**[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)**
---
## πŸ“„ Citation
```bibtex
@misc{hikari2026,
  title       = {HIKARI: RAG-in-Training for Skin Disease Diagnosis with Cascaded Vision-Language Models},
  author      = {Watin Promfiy and Pawitra Boonprasart},
  year        = {2026},
  institution = {King Mongkut's Institute of Technology Ladkrabang, Department of Information Technology, Bangkok, Thailand}
}
```
<p align="center">Made with ❀️ at <b>King Mongkut's Institute of Technology Ladkrabang (KMITL)</b></p>