---
license: mit
language:
- en
base_model: Qwen/Qwen3-0.6B
tags:
- lora
- merged
- qwen3
- chatbot
- university
pipeline_tag: text-generation
---
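
# UTN-Qwen3-0.6B-LoRA-merged

Qwen3-0.6B fine-tuned with LoRA (r=64, alpha=128) on domain data about the University of Technology Nuremberg (UTN), with the adapter then merged into the base weights. The result is a standalone model that loads with `transformers` alone; PEFT is not needed for inference.
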
## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "saeedbenadeeb/UTN-Qwen3-0.6B-LoRA-merged"

# The LoRA weights are already merged, so the checkpoint loads like any causal LM.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant for the University of Technology Nuremberg (UTN)."},
    {"role": "user", "content": "What are the admission requirements for AI & Robotics?"},
]

# Render the chat as a prompt string; enable_thinking=False turns off Qwen3's thinking mode.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512, temperature=0.3, top_p=0.9, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

## Training

- **Base**: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
- **Method**: LoRA (r=64, alpha=128, dropout=0.05) applied to all linear layers, then merged into the base weights
- **Data**: 1,289 UTN Q&A pairs
- **Schedule**: 5 epochs, learning rate 3e-4
- **Hardware**: NVIDIA A40
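
To make "merged" concrete, the toy sketch below folds a LoRA update into a weight matrix and checks that the merged matrix gives the same output as the base-plus-adapter path. It assumes the standard LoRA scaling `alpha / r` (which would be 128 / 64 = 2.0 for this model); the card does not say whether a variant such as rank-stabilized scaling was used. The matrices, shapes, and helper names (`matmul`, `merge_lora`, and so on) are hypothetical and written in dependency-free Python for readability, not taken from the training code.

```python
# Toy illustration of merging a LoRA update into a base weight matrix.
# All values and helper names are illustrative assumptions; r=2 and alpha=4
# are chosen so that alpha / r equals the 2.0 scaling implied by r=64, alpha=128.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(m, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[s * x for x in row] for row in m]

def merge_lora(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the standard LoRA merge."""
    return add(W, scale(matmul(B, A), alpha / r))

r, alpha = 2, 4
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # base weight, shape (d_out=2, d_in=3)
A = [[0.5, 0.0, 1.0], [0.0, 0.25, 0.0]]  # LoRA down-projection, shape (r, d_in)
B = [[1.0, 0.0], [2.0, 4.0]]             # LoRA up-projection, shape (d_out, r)
x = [[1.0], [2.0], [3.0]]                # one input column vector

W_merged = merge_lora(W, A, B, alpha, r)

# Adapter path: base output plus the scaled low-rank correction.
y_adapter = add(matmul(W, x), scale(matmul(B, matmul(A, x)), alpha / r))
# Merged path: a single matrix multiply, no adapter needed at inference time.
y_merged = matmul(W_merged, x)

print(W_merged)
print(y_adapter == y_merged)
```

With real checkpoints this step is usually delegated to a library; in PEFT, for example, `merge_and_unload()` performs this kind of merge. Which tooling produced this checkpoint is not stated here.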

## Evaluation (Validation Set, 17 examples)

| Metric | Score |
|--------|-------|
| ROUGE-1 | 0.5924 |
| ROUGE-2 | 0.4967 |
| ROUGE-L | 0.5687 |
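
For readers unfamiliar with the metric, the sketch below shows how a ROUGE-N score can be computed from n-gram overlap between a reference answer and a generated answer. It is an illustration only: it assumes the F1 variant with lowercase whitespace tokenization and no stemming, which may differ from the scorer behind the table above, and the two sentences are made-up examples rather than validation data. ROUGE-L is not covered here; it is based on the longest common subsequence instead of fixed-size n-grams.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(reference, candidate, n=1):
    """ROUGE-N F1 with lowercase whitespace tokenization (an assumed setup)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical sentences, not drawn from the validation set.
reference = "applicants need a bachelor degree in a related field"
candidate = "applicants need a bachelor degree"

score_1 = rouge_n(reference, candidate, n=1)
score_2 = rouge_n(reference, candidate, n=2)
print(f"ROUGE-1: {score_1:.4f}, ROUGE-2: {score_2:.4f}")
```

A corpus-level figure like those in the table is typically the average of such per-example scores over the validation set.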