Pest Detection VLM - Qwen3.5-9B LoRA

A vision-language LoRA adapter (PEFT) fine-tuned from unsloth/Qwen3.5-9B.
It identifies 19 Korean crop pest species from crop photos.
Given a photo of a leaf, a fruit, or a whole plant, it outputs the pest's Korean name when a pest is detected and 정상 ("normal") when none is detected.

  • 20-class classifier: 19 pest species + '정상' (normal, no pest)
  • Base model: unsloth/Qwen3.5-9B (vision-language, hybrid linear + self attention)
  • Adapter type: LoRA (PEFT), rank 64, alpha 128
  • Language: Korean
  • Size: 693 MB of adapter weights

Class list

  • 정상
  • 검거세미밤나방
  • 꽃노랑총채벌레
  • 담배가루이
  • 담배거세미나방
  • 담배나방
  • 도둑나방
  • 먹노린재
  • 목화바둑명나방
  • 무잎벌
  • 배추좀나방
  • 배추흰나비
  • 벼룩잎벌레
  • 복숭아혹진딧물
  • 홍비단노린재
  • 썩덩나무노린재
  • 열대거세미나방
  • 큰28점박이무당벌레
  • 톱다리개미허리노린재
  • 파밤나방

⚠ The one thing you must read before deploying

This LoRA cannot be deployed through the GGUF / llama.cpp / Ollama path.

The adapter's adapter_config.json lists in_proj_qkv, in_proj_z, in_proj_a, in_proj_b, and out_proj in target_modules.
These are the linear_attn projections of the Qwen3.5 hybrid architecture. The GGUF conversion path (_reorder_v_heads in convert_hf_to_gguf.py) does not preserve the LoRA deltas on them, which can cause the output to collapse.

Summary:

  • GGUF conversion after merge_and_unload or save_pretrained_merged: high risk
  • Runtime LoRA with FastVisionModel + PeftModel.from_pretrained: recommended
  • Keep deployment on an HF Transformers/Unsloth-based server
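
Before attempting any conversion, it can help to check which modules an adapter actually targets. The following is a minimal sketch rather than part of the original tooling: the helper names `find_linear_attn_targets` and `load_adapter_config` are hypothetical, and the code assumes `target_modules` in `adapter_config.json` is a list of plain module names like the ones listed above.

```python
import json
from pathlib import Path

# Linear-attention projections named in this card. LoRA deltas on these
# are the ones reported as not surviving the GGUF conversion path.
LINEAR_ATTN_MODULES = {"in_proj_qkv", "in_proj_z", "in_proj_a", "in_proj_b", "out_proj"}


def find_linear_attn_targets(config: dict) -> list:
    """Return the targeted module names that fall in the linear-attention set."""
    targets = config.get("target_modules") or []
    if isinstance(targets, str):
        # PEFT also accepts a single string here; this sketch only handles plain names.
        targets = [targets]
    # Compare on the last path component so "linear_attn.in_proj_z" also matches.
    return sorted(t for t in targets if t.split(".")[-1] in LINEAR_ATTN_MODULES)


def load_adapter_config(adapter_dir: str) -> dict:
    """Read adapter_config.json from a local adapter directory (hypothetical path)."""
    return json.loads(Path(adapter_dir, "adapter_config.json").read_text(encoding="utf-8"))


# Illustrative config only, not the real adapter_config.json of this adapter.
example = {"target_modules": ["q_proj", "in_proj_qkv", "out_proj"]}
risky = find_linear_attn_targets(example)
print("risky for GGUF conversion:" if risky else "no linear-attention targets:", risky)
```

An empty result would not by itself prove that a conversion is safe; it only rules out the specific failure described in this section.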

Training configuration

| Hyperparameter | Value |
|---|---|
| LoRA rank | 64 |
| LoRA alpha | 128 |
| Use rsLoRA | True |
| Fine-tune vision layers | False |
| Learning rate | 0.000116 |
| Warmup ratio | 0.03 |
| Weight decay | 0.013802 |
| LR scheduler | linear |
| Optimizer | adamw_torch |
| Per-device batch size | 1 |
| Gradient accumulation steps | 8 |
| Effective batch size | 8 |
| Max sequence length | 1024 |
| Epochs | 1 |
| Tight-crop probability | 0.4561 |
| Precision | bf16 |
| Gradient checkpointing | True |
| Training time | 1115 minutes |

  • Training hardware: RTX A40 48 GB
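
Two of the values above are derived quantities, and a short sketch of the arithmetic may make them concrete. It assumes the usual definitions: plain LoRA scales the update by alpha / r, rsLoRA scales it by alpha / sqrt(r), and the effective batch size is the per-device batch times the gradient accumulation steps times the device count. The single-GPU setting is an assumption based on the hardware line, not something the card states outright.

```python
import math

# Values copied from the hyperparameter table.
RANK = 64
ALPHA = 128
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 8
NUM_DEVICES = 1  # assumption: a single GPU


def lora_scaling(alpha: float, rank: int, use_rslora: bool) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    return alpha / math.sqrt(rank) if use_rslora else alpha / rank


effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

print(lora_scaling(ALPHA, RANK, use_rslora=True))   # 16.0 (rsLoRA, as used here)
print(lora_scaling(ALPHA, RANK, use_rslora=False))  # 2.0 (what plain LoRA would give)
print(effective_batch)                              # 8
```

With rank 64, rsLoRA therefore applies an eight times larger scaling factor than plain LoRA would at the same alpha.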

The training code and the hyperparameter search code are in WizWix/model-finetuner.

Evaluation results

Evaluated on the validation split of Himedia-AI-01/kor-pest-detection-webp (all 1,535 samples).

| Metric | Value |
|---|---|
| Accuracy | 99.48% |
| Precision (macro) | 99.07% |
| Precision (weighted) | 99.49% |
| Recall (macro) | 98.79% |
| Recall (weighted) | 99.48% |
| F1 (macro) | 98.91% |
| F1 (weighted) | 99.48% |
| Validation samples | 1,535 |

W&B Report

Confusion matrix

Per-class performance

| Class | Precision | Recall | F1 | Samples |
|---|---|---|---|---|
| 검거세미밤나방 | 0.9286 | 0.9286 | 0.9286 | 14 |
| 꽃노랑총채벌레 | 1.0000 | 1.0000 | 1.0000 | 44 |
| 담배가루이 | 1.0000 | 1.0000 | 1.0000 | 41 |
| 담배거세미나방 | 1.0000 | 1.0000 | 1.0000 | 46 |
| 담배나방 | 1.0000 | 1.0000 | 1.0000 | 69 |
| 도둑나방 | 1.0000 | 0.9231 | 0.9600 | 13 |
| 먹노린재 | 0.9894 | 1.0000 | 0.9947 | 93 |
| 목화바둑명나방 | 1.0000 | 1.0000 | 1.0000 | 34 |
| 무잎벌 | 1.0000 | 1.0000 | 1.0000 | 22 |
| 배추좀나방 | 0.9733 | 1.0000 | 0.9865 | 73 |
| 배추흰나비 | 1.0000 | 1.0000 | 1.0000 | 116 |
| 벼룩잎벌레 | 1.0000 | 1.0000 | 1.0000 | 203 |
| 복숭아혹진딧물 | 1.0000 | 1.0000 | 1.0000 | 35 |
| 썩덩나무노린재 | 1.0000 | 0.9917 | 0.9958 | 120 |
| 열대거세미나방 | 0.9423 | 0.9800 | 0.9608 | 50 |
| 정상 | 1.0000 | 0.9932 | 0.9966 | 147 |
| 큰28점박이무당벌레 | 1.0000 | 1.0000 | 1.0000 | 133 |
| 톱다리개미허리노린재 | 1.0000 | 1.0000 | 1.0000 | 75 |
| 파밤나방 | 0.9796 | 0.9412 | 0.9600 | 51 |
| 홍비단노린재 | 1.0000 | 1.0000 | 1.0000 | 156 |
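
The macro and weighted averages in the summary metrics can be recomputed from the per-class rows. The sketch below assumes the standard definitions (macro: unweighted mean over classes; weighted: mean weighted by each class's sample count) and uses the four-decimal values as printed, so the results agree with the reported figures only up to rounding.

```python
# (label, precision, recall, f1, samples) copied from the per-class table.
ROWS = [
    ("검거세미밤나방", 0.9286, 0.9286, 0.9286, 14),
    ("꽃노랑총채벌레", 1.0000, 1.0000, 1.0000, 44),
    ("담배가루이", 1.0000, 1.0000, 1.0000, 41),
    ("담배거세미나방", 1.0000, 1.0000, 1.0000, 46),
    ("담배나방", 1.0000, 1.0000, 1.0000, 69),
    ("도둑나방", 1.0000, 0.9231, 0.9600, 13),
    ("먹노린재", 0.9894, 1.0000, 0.9947, 93),
    ("목화바둑명나방", 1.0000, 1.0000, 1.0000, 34),
    ("무잎벌", 1.0000, 1.0000, 1.0000, 22),
    ("배추좀나방", 0.9733, 1.0000, 0.9865, 73),
    ("배추흰나비", 1.0000, 1.0000, 1.0000, 116),
    ("벼룩잎벌레", 1.0000, 1.0000, 1.0000, 203),
    ("복숭아혹진딧물", 1.0000, 1.0000, 1.0000, 35),
    ("썩덩나무노린재", 1.0000, 0.9917, 0.9958, 120),
    ("열대거세미나방", 0.9423, 0.9800, 0.9608, 50),
    ("정상", 1.0000, 0.9932, 0.9966, 147),
    ("큰28점박이무당벌레", 1.0000, 1.0000, 1.0000, 133),
    ("톱다리개미허리노린재", 1.0000, 1.0000, 1.0000, 75),
    ("파밤나방", 0.9796, 0.9412, 0.9600, 51),
    ("홍비단노린재", 1.0000, 1.0000, 1.0000, 156),
]

PRECISION, RECALL, F1, SAMPLES = 1, 2, 3, 4  # column indices into each row


def macro(col: int) -> float:
    """Unweighted mean of a metric over all classes."""
    return sum(row[col] for row in ROWS) / len(ROWS)


def weighted(col: int) -> float:
    """Mean of a metric weighted by each class's sample count."""
    total = sum(row[SAMPLES] for row in ROWS)
    return sum(row[col] * row[SAMPLES] for row in ROWS) / total


for name, col in (("precision", PRECISION), ("recall", RECALL), ("f1", F1)):
    print(f"{name}: macro={macro(col):.4f} weighted={weighted(col):.4f}")
```

The sample counts sum to 1,535, and the weighted recall equals the overall accuracy, which is why those two figures match in the summary.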

With vs. without the LoRA adapter

See the experiment results on a similar dataset.

Usage examples

Quick start (recommended inference path: Unsloth + runtime PEFT)

import torch
from unsloth import FastVisionModel
from peft import PeftModel
from PIL import Image

BASE    = "unsloth/Qwen3.5-9B"
ADAPTER = "WizWix/kor-pest-detector"

# Reuse the system prompt from training verbatim
SYSTEM_MSG = (
    "당신은 작물 해충 식별 전문가입니다. "
    "사진을 보고 해충의 이름만 한국어로 답하세요. "
    '해충이 없으면 "정상"이라고만 답하세요. '
    "부가 설명 없이 이름만 출력하세요."
)

# Load the base model and attach the adapter
model, tokenizer = FastVisionModel.from_pretrained(BASE, load_in_4bit=False)
model = PeftModel.from_pretrained(model, ADAPTER)
FastVisionModel.for_inference(model)
model.eval()

# Prepare the image
image = Image.open("pest.jpg").convert("RGB")

messages = [
    {"role": "system", "content": [{"type": "text", "text": SYSTEM_MSG}]},
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "이 사진에 있는 해충의 이름을 알려주세요."},
        ],
    },
]

text = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(image, text, add_special_tokens=False, return_tensors="pt").to("cuda")

with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=10,
        use_cache=True,
        stop_strings=["\n"],
        tokenizer=tokenizer.tokenizer,
    )

prediction = tokenizer.decode(out[0][inputs["input_ids"].shape[1] :], skip_special_tokens=True).strip()
print(prediction)  # e.g. "배추흰나비"
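
Because the model returns free-form text, a caller may want to confirm that the result is one of the 20 known labels before using it. The snippet below is a minimal sketch of such a check, not part of the original card: the helper name `normalize_prediction` and the choice to return `None` for unknown text are assumptions.

```python
from typing import Optional

# The 20 labels from the class list: "정상" (normal) plus 19 pest names.
CLASS_LABELS = (
    "정상", "검거세미밤나방", "꽃노랑총채벌레", "담배가루이", "담배거세미나방",
    "담배나방", "도둑나방", "먹노린재", "목화바둑명나방", "무잎벌",
    "배추좀나방", "배추흰나비", "벼룩잎벌레", "복숭아혹진딧물", "홍비단노린재",
    "썩덩나무노린재", "열대거세미나방", "큰28점박이무당벌레",
    "톱다리개미허리노린재", "파밤나방",
)


def normalize_prediction(raw: str) -> Optional[str]:
    """Return the matching class label, or None if the text is not a known label."""
    stripped = raw.strip()
    if not stripped:
        return None
    # Keep only the first line, then drop surrounding whitespace and quotes.
    cleaned = stripped.splitlines()[0].strip().strip("\"'")
    return cleaned if cleaned in CLASS_LABELS else None


print(normalize_prediction(" 배추흰나비\n"))  # 배추흰나비
print(normalize_prediction("잘 모르겠습니다"))  # None
```

Returning `None` instead of guessing the nearest label keeps unexpected generations visible to the caller, who can then log or retry them.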

Caution: the merge/GGUF conversion path is not recommended

The path below is likely to conflict with this adapter's target module configuration (in_proj_qkv, in_proj_z, in_proj_a, in_proj_b, out_proj), so it is not recommended.

# Not recommended (do not run)
# model = model.merge_and_unload()
# model.save_pretrained("./merged")
# ...followed by GGUF conversion and llama.cpp/Ollama deployment

License

This adapter follows the licenses of the base model and the dataset. For the detailed terms, see unsloth/Qwen3.5-9B and Himedia-AI-01/kor-pest-detection-webp.
