---
license: apache-2.0
base_model: Qwen/Qwen3-VL-8B-Thinking
tags:
- dermatology
- medical
- lora
- peft
- skin-disease
- qwen3-vl
language:
- en
- th
pipeline_tag: image-text-to-text
---

<p align="center">
  <img src="HIKARI_logo.png" alt="HIKARI" width="100%"/>
</p>

<h1 align="center">HIKARI-Rigel-8B-SkinCaption-LoRA</h1>

<p align="center">
  <img src="https://img.shields.io/badge/Type-LoRA%20Adapter-blueviolet?style=flat-square"/>
  <img src="https://img.shields.io/badge/Size-~1.1%20GB-lightblue?style=flat-square"/>
  <img src="https://img.shields.io/badge/Base-Qwen3--VL--8B--Thinking-blue?style=flat-square"/>
  <img src="https://img.shields.io/badge/License-Apache%202.0-orange?style=flat-square"/>
</p>

---
## Model Type: LoRA Adapter

> This is a **LoRA adapter** (~1.1 GB); it must be loaded **on top of** the base model `Qwen/Qwen3-VL-8B-Thinking`.
>
> ✅ **Advantage:** lightweight; you download only ~1.1 GB instead of ~17 GB.
>
> ⚠️ **Requirement:** you must separately load the base model `Qwen/Qwen3-VL-8B-Thinking` (~17 GB) first.
>
> 💾 If you prefer a standalone, ready-to-use model, see the merged version:
> **[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)** (~17 GB)

---
## What is this adapter?

LoRA adapter for **[HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)**: clinical skin lesion caption generation (checkpoint-init, ablation baseline). Metric: **BLEU-4: 9.82**.
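
The BLEU-4 score above measures 4-gram overlap between generated and reference captions. As an illustrative sketch only (this is *not* the authors' evaluation pipeline, and real toolkits differ in corpus-level aggregation and smoothing details; the caption tokens below are hypothetical), a smoothed sentence-level BLEU-4 can be computed like this:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(reference, hypothesis):
    """Sentence-level BLEU-4 with add-one smoothing on each n-gram precision."""
    precisions = []
    for n in range(1, 5):
        hyp, ref = ngrams(hypothesis, n), ngrams(reference, n)
        overlap = sum((hyp & ref).values())             # clipped n-gram matches
        total = max(sum(hyp.values()), 1)
        precisions.append((overlap + 1) / (total + 1))  # add-one smoothing
    # Brevity penalty discourages overly short hypotheses
    bp = min(1.0, math.exp(1 - len(reference) / max(len(hypothesis), 1)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 4)

# Hypothetical caption pair, for illustration only
ref = "erythematous plaque with silvery scale on the elbow".split()
hyp = "erythematous plaque with fine scale on the elbow".split()
print(f"BLEU-4: {bleu4(ref, hyp) * 100:.2f}")
```

A perfect match scores 1.0 (reported as 100); partial n-gram overlap, as above, scores proportionally lower.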

This is the ablation baseline adapter. For the best-performing caption model, see [HIKARI-Vega-8B-SkinCaption-Fused-LoRA](https://huggingface.co/E27085921/HIKARI-Vega-8B-SkinCaption-Fused-LoRA).

See the full model card at **[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)** for complete details, usage examples, and performance comparisons.

---

## Usage

```python
from peft import PeftModel
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
import torch
from PIL import Image

# Step 1: load the base model (Qwen3-VL-8B-Thinking, ~17 GB)
base = Qwen3VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-8B-Thinking",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Step 2: apply the LoRA adapter (~1.1 GB)
model = PeftModel.from_pretrained(base, "E27085921/HIKARI-Rigel-8B-SkinCaption-LoRA")
processor = AutoProcessor.from_pretrained("E27085921/HIKARI-Rigel-8B-SkinCaption-LoRA", trust_remote_code=True)

# Step 3: inference (minimal example; see E27085921/HIKARI-Rigel-8B-SkinCaption for more)
image = Image.open("skin_lesion.jpg").convert("RGB")
messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": "Describe this skin lesion."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
caption = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(caption)
```

For complete inference examples, including vLLM and SGLang production code, see:
**[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)**

---

## Citation

```bibtex
@misc{hikari2026,
  title       = {HIKARI: RAG-in-Training for Skin Disease Diagnosis
                 with Cascaded Vision-Language Models},
  author      = {Watin Promfiy and Pawitra Boonprasart},
  year        = {2026},
  institution = {King Mongkut's Institute of Technology Ladkrabang,
                 Department of Information Technology, Bangkok, Thailand}
}
```

<p align="center">Made with ❤️ at <b>King Mongkut's Institute of Technology Ladkrabang (KMITL)</b></p>