---
license: apache-2.0
base_model: Qwen/Qwen3.6-35B-A3B
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- vision-language-model
- medical-imaging
- brain-ct
- stroke
- region-classification
- lora
language:
- en
- ko
---
<div align="center">
# JOOMED · Brain-CT Lesion Region Classifier
A stroke-focused region classification model, fine-tuned with LoRA from **Qwen3.6-35B-A3B** (MoE Vision-Language Model)
[![Base](https://img.shields.io/badge/base-Qwen3.6--35B--A3B-1f6feb)](https://huggingface.co/Qwen/Qwen3.6-35B-A3B)
[![License](https://img.shields.io/badge/license-Apache--2.0-16a34a)](./LICENSE)
[![Task](https://img.shields.io/badge/task-image--text--to--text-8957e5)](#)
</div>
---
Given a CT summary image, the model classifies the **anatomical region** and the **vascular territory** in which the lesion lies,
distinguishing left from right (L/R), and outputs the result as **structured JSON**.
```json
{"anatomical_regions": ["basal_ganglia_thalamus_right"],
"vascular_territories": ["MCA_right", "PCA_right"]}
```
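A minimal sketch of how the JSON output above could be turned into Python sets for downstream use. The helper name `parse_regions` is hypothetical, and the sketch assumes the generated text contains a single JSON object, possibly surrounded by extra text.

```python
import json


def parse_regions(raw: str) -> dict[str, set[str]]:
    """Extract the two label lists from a generated string as sets."""
    # Assumption: the answer is one JSON object; anything outside the
    # outermost braces is ignored.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    obj = json.loads(raw[start:end + 1])
    # Missing keys are treated as empty label sets.
    return {
        key: set(obj.get(key, []))
        for key in ("anatomical_regions", "vascular_territories")
    }


example = (
    '{"anatomical_regions": ["basal_ganglia_thalamus_right"],\n'
    '"vascular_territories": ["MCA_right", "PCA_right"]}'
)
parsed = parse_regions(example)
print(sorted(parsed["vascular_territories"]))
```

Sets are used because label order carries no meaning in this task.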
## Model Overview
| | |
|---|---|
| **Base model** | `Qwen/Qwen3.6-35B-A3B`: MoE VLM, 35B total / ~3B active (`qwen3_5_moe`) |
| **Adaptation** | LoRA (r=8, α=16, attention `q/k/v/o_proj`) → **full model with the adapter merged** into the base |
| **Input → Output** | PNG (CT summary) → region-classification JSON |
| **Label space** | anatomical: 23 groups · vascular: 14 groups (coarse, left/right preserved) |
| **Project** | RQT-25-090047: Multimodal AI stroke clinical-support LLM (JLK Inc.) |
## Performance
Region extraction accuracy (set-based per-sample F1, macro-averaged):
| Evaluation set | n | Anatomical F1 | Vascular F1 | Mean |
|---|---:|---:|---:|---:|
| Full test set | 12,140 | **0.741** | **0.802** | **0.771** |
Compared with the non-fine-tuned base model (0.29 / 0.30), this is an improvement of **more than 2.5×**. The gain comes mainly from higher recall
(anatomical +0.17, vascular +0.13), which substantially reduces the number of missed lesions.
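The set-based per-sample F1 reported above can be sketched as follows. This is one plausible reading of the metric, not the project's evaluation script: it assumes F1 is computed per image from the predicted and reference label sets and then averaged over images, and that an empty prediction against an empty reference scores 1.0. The function names are hypothetical.

```python
def set_prf(pred: set[str], gold: set[str]) -> tuple[float, float, float]:
    """Precision, recall and F1 between two label sets for one sample."""
    if not pred and not gold:
        return 1.0, 1.0, 1.0  # assumption: empty vs. empty counts as perfect
    tp = len(pred & gold)  # labels present in both sets
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


def mean_f1(pairs: list[tuple[set[str], set[str]]]) -> float:
    """Average the per-sample F1 over all (prediction, reference) pairs."""
    scores = [set_prf(pred, gold)[2] for pred, gold in pairs]
    return sum(scores) / len(scores)


pairs = [
    ({"MCA_right", "PCA_right"}, {"MCA_right"}),  # one extra predicted label
    ({"ACA_left"}, {"ACA_left"}),                 # exact match
]
print(round(mean_f1(pairs), 3))
```

Tracking precision and recall separately makes it possible to see whether a change in F1 comes from fewer missed labels or fewer spurious ones.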
<details>
<summary>Auxiliary text metrics (measured on region labels serialized as text · not a measure of report quality)</summary>
| BLEU-1 | METEOR | BERTScore F1 | G-Eval |
|---:|---:|---:|---:|
| 0.798 | 0.794 | 0.952 | 3.68 |
> Because the outputs are short label sentences, these scores measure something different from metrics computed on free-text radiology reports. Take care when comparing them.
</details>
## Usage
```python
from transformers import AutoModelForImageTextToText, AutoProcessor
from PIL import Image
model_id = "JLKGroup/JOOMED"
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="bfloat16", device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)
img = Image.open("ct_summary.png").convert("RGB")
prompt = "์ฃผ์–ด์ง„ CT summary ์ด๋ฏธ์ง€์—์„œ ๋ณ‘๋ณ€์ด ์†ํ•˜๋Š” anatomical region๊ณผ vascular territory๋ฅผ JSON์œผ๋กœ ๋‹ตํ•˜๋ผ."
messages = [{"role": "user", "content": [{"type": "image", "image": img}, {"type": "text", "text": prompt}]}]
text = processor.apply_chat_template(
messages, tokenize=False, add_generation_prompt=True, enable_thinking=False # ํ•™์Šต ํ…œํ”Œ๋ฆฟ๊ณผ ์ผ์น˜
)
inputs = processor(text=[text], images=[img], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(processor.tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
> **Tip:** Always set `enable_thinking=False` at inference so that the prompt matches the template used during training.
## Label Scheme
| Axis | Groups |
|---|---|
| Anatomical (×L/R) | frontal · parietal · temporal · occipital · insula · limbic · basal_ganglia_thalamus · cerebellum · brainstem · ventricle · white_matter_other |
| Vascular (×L/R) | ACA · MCA · PCA · basilar · cerebellar · anterior_choroidal · lateral_ventricle |
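As an illustration, the label vocabulary can be enumerated from the table and used to flag unexpected labels in a model answer. The `_left` / `_right` suffix pattern is inferred from the `_right` labels in the example output; `_left` is an assumption, as are all names in the sketch. The enumeration may also be incomplete: it yields 22 anatomical labels, whereas the model overview states 23 anatomical groups.

```python
# Group names copied from the label table.
ANATOMICAL = [
    "frontal", "parietal", "temporal", "occipital", "insula", "limbic",
    "basal_ganglia_thalamus", "cerebellum", "brainstem", "ventricle",
    "white_matter_other",
]
VASCULAR = [
    "ACA", "MCA", "PCA", "basilar", "cerebellar",
    "anterior_choroidal", "lateral_ventricle",
]
SIDES = ("left", "right")  # assumption: suffixes mirror the "_right" example


def with_sides(groups: list[str]) -> set[str]:
    """Expand each group into its left and right variants."""
    return {f"{group}_{side}" for group in groups for side in SIDES}


ANATOMICAL_LABELS = with_sides(ANATOMICAL)
VASCULAR_LABELS = with_sides(VASCULAR)


def unknown_labels(labels: list[str], vocabulary: set[str]) -> list[str]:
    """Return the labels that are not part of the expected vocabulary."""
    return sorted(set(labels) - vocabulary)


print(len(VASCULAR_LABELS))
print(unknown_labels(["MCA_right", "MCA_center"], VASCULAR_LABELS))
```

A check like this is useful because a generative model can emit strings outside the trained label set.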
## Limitations and Cautions
- This is a **model for medical research use**. Do not use it as the sole basis for clinical decision-making.
## License
This model follows the Apache-2.0 license of the base model `Qwen3.6-35B-A3B`.