---
license: apache-2.0
base_model: Qwen/Qwen3-VL-8B-Thinking
tags:
- dermatology
- medical
- lora
- peft
- skin-disease
- qwen3-vl
language:
- en
- th
pipeline_tag: image-text-to-text
---

<p align="center">
  <img src="HIKARI_logo.png" alt="HIKARI" width="100%"/>
</p>

<h1 align="center">HIKARI-Rigel-8B-SkinCaption-LoRA</h1>

<p align="center">
  <img src="https://img.shields.io/badge/Type-LoRA%20Adapter-blueviolet?style=flat-square"/>
  <img src="https://img.shields.io/badge/Size-~1.1%20GB-lightblue?style=flat-square"/>
  <img src="https://img.shields.io/badge/Base-Qwen3--VL--8B--Thinking-blue?style=flat-square"/>
  <img src="https://img.shields.io/badge/License-Apache%202.0-orange?style=flat-square"/>
</p>

---
## Model Type: LoRA Adapter

> This is a **LoRA adapter** (~1.1 GB); it must be loaded **on top of** the base model `Qwen/Qwen3-VL-8B-Thinking`.
>
> ✅ **Advantage:** lightweight; you download only ~1.1 GB instead of ~17 GB.
>
> ⚠️ **Requirement:** you must separately load the base model `Qwen/Qwen3-VL-8B-Thinking` (~17 GB) first.
>
> 💾 If you prefer a standalone, ready-to-use model, see the merged version:
> **[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)** (~17 GB)

---
## What is this adapter?

LoRA adapter for **[HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)**: clinical skin lesion caption generation (checkpoint-init, ablation baseline). Metric: **BLEU-4: 9.82**.
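
The BLEU-4 score above measures 4-gram overlap between generated and reference captions. As an illustrative sketch only (this is *not* the authors' evaluation pipeline, and real toolkits differ in corpus-level aggregation and smoothing details; the caption tokens below are hypothetical), a smoothed sentence-level BLEU-4 can be computed like this:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(reference, hypothesis):
    """Sentence-level BLEU-4 with add-one smoothing on each n-gram precision."""
    precisions = []
    for n in range(1, 5):
        hyp, ref = ngrams(hypothesis, n), ngrams(reference, n)
        overlap = sum((hyp & ref).values())             # clipped n-gram matches
        total = max(sum(hyp.values()), 1)
        precisions.append((overlap + 1) / (total + 1))  # add-one smoothing
    # Brevity penalty discourages overly short hypotheses
    bp = min(1.0, math.exp(1 - len(reference) / max(len(hypothesis), 1)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 4)

# Hypothetical caption pair, for illustration only
ref = "erythematous plaque with silvery scale on the elbow".split()
hyp = "erythematous plaque with fine scale on the elbow".split()
print(f"BLEU-4: {bleu4(ref, hyp) * 100:.2f}")
```

A perfect match scores 1.0 (reported as 100); partial n-gram overlap, as above, scores proportionally lower.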

This is the ablation baseline adapter. For the best-performing caption model, see [HIKARI-Vega-8B-SkinCaption-Fused-LoRA](https://huggingface.co/E27085921/HIKARI-Vega-8B-SkinCaption-Fused-LoRA).

See the full model card at **[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)** for complete details, usage examples, and performance comparisons.

---

## Usage

```python
from peft import PeftModel
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
import torch
from PIL import Image

# Step 1: load the base model (Qwen3-VL-8B-Thinking, ~17 GB)
base = Qwen3VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-8B-Thinking",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Step 2: apply the LoRA adapter (~1.1 GB)
model = PeftModel.from_pretrained(base, "E27085921/HIKARI-Rigel-8B-SkinCaption-LoRA")
processor = AutoProcessor.from_pretrained("E27085921/HIKARI-Rigel-8B-SkinCaption-LoRA", trust_remote_code=True)

# Step 3: inference (minimal example; see E27085921/HIKARI-Rigel-8B-SkinCaption for more)
image = Image.open("skin_lesion.jpg").convert("RGB")
messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": "Describe this skin lesion."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
caption = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(caption)
```

For complete inference examples, including vLLM and SGLang production code, see:
**[E27085921/HIKARI-Rigel-8B-SkinCaption](https://huggingface.co/E27085921/HIKARI-Rigel-8B-SkinCaption)**

---

## Citation

```bibtex
@misc{hikari2026,
  title       = {HIKARI: RAG-in-Training for Skin Disease Diagnosis
                 with Cascaded Vision-Language Models},
  author      = {Watin Promfiy and Pawitra Boonprasart},
  year        = {2026},
  institution = {King Mongkut's Institute of Technology Ladkrabang,
                 Department of Information Technology, Bangkok, Thailand}
}
```

<p align="center">Made with ❤️ at <b>King Mongkut's Institute of Technology Ladkrabang (KMITL)</b></p>