Qwen2.5-7B Construction Equipment Summarization LoRA (Korean) ✅ Best

Construction equipment maintenance claim summarization LoRA adapter (SFT v2, final recommended model)

Model Description

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Training Method: SFT v2 (Supervised Fine-Tuning)
  • Domain: Construction equipment maintenance & repair claim summarization
  • Task: Korean claim report β†’ Korean summary
  • Framework: Transformers + PEFT
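
This repository ships only a LoRA adapter: the base weights stay frozen and a scaled low-rank update is added to each adapted linear layer. A minimal NumPy sketch of that forward pass (the rank `r`, scaling `alpha`, and shapes below are illustrative, not this adapter's actual configuration):

```python
import numpy as np

def lora_linear(x, W, A, B, alpha=16, r=8):
    """LoRA forward: y = x W^T + (alpha / r) * x A^T B^T.

    W is the frozen base weight (out, in); A (r, in) and B (out, r)
    are the trained low-rank factors. B is zero-initialized, so the
    adapter is a no-op until it is trained.
    """
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

# Illustrative shapes only -- not the real adapter dimensions
x = np.random.randn(2, 64)
W = np.random.randn(32, 64)
A = np.random.randn(8, 64) * 0.01
B = np.zeros((32, 8))  # zero init: output matches the base layer exactly
```

With `B` at its zero initialization, `lora_linear(x, W, A, B)` equals the plain `x @ W.T`, which is why attaching an untrained adapter leaves base-model behavior unchanged.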

Performance (summarization_test.json, 100 samples)

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| Qwen Base | 32.94 | 14.57 | 32.94 |
| Qwen SFT v2 (this) | 34.64 | 14.66 | 34.39 |
| Qwen DPO v2 | 33.20 | 12.07 | 33.20 |
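
For reference, ROUGE-1 measures unigram overlap between the generated and reference summaries. A minimal pure-Python sketch of the F1 variant, assuming simple whitespace tokenization (a Korean morpheme tokenizer would change the absolute numbers):

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    # Clipped overlap: each reference token can be matched at most once
    overlap = sum((Counter(ref) & Counter(cand)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 is the same computation over bigrams, and ROUGE-L uses the longest common subsequence instead of n-gram counts.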

Comparison with GPT-OSS-20B

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| Qwen SFT v2 (7B, this) | 34.64 | 14.66 | 34.39 |
| GPT-OSS SFT v2 (20B) | 34.41 | 14.74 | 34.16 |

Nearly identical performance, despite Qwen's 7B parameters being roughly a third of GPT-OSS's 20B.

Key Findings

  • Qwen Base already has strong Korean summarization ability (ROUGE-1 32.94)
  • SFT provides modest improvement (+1.70 ROUGE-1)
  • DPO hurts performance (-1.44 ROUGE-1 vs SFT)
  • 100% of samples produce valid Korean summaries

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter on top
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct", torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, "madokalif/qwen2.5-7b-construction-summarization-ko-lora")

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

messages = [
    # system: "Read the construction equipment claim report and concisely summarize the key points in Korean."
    {"role": "system", "content": "건설장비 클레임 보고서를 읽고 핵심 내용을 간결하게 한국어로 요약하세요."},
    # user: "Summarize the following claim report: Symptom: coolant is leaking from the coolant hose..."
    {"role": "user", "content": "다음 클레임 보고서를 요약하세요:\n\n현상: 냉각수 호스에서 냉각수가 새고 있습니다..."}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.3, top_p=0.9, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
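
The `generate` call above uses nucleus (top-p) sampling with `top_p=0.9`: at each step, sampling is restricted to the smallest set of highest-probability tokens whose cumulative mass reaches `p`, and the rest of the distribution is discarded. A minimal sketch of that filtering step on a toy distribution:

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest prefix of tokens (by descending probability)
    whose cumulative mass reaches p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Toy 4-token distribution: tokens 0-2 cover 0.95 >= 0.9, so token 3 is dropped
filtered = top_p_filter([0.5, 0.3, 0.15, 0.05], p=0.9)
```

Combined with the low `temperature=0.3`, this keeps summaries close to greedy decoding while still allowing minor variation in wording.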