---
license: mit
language:
- en
base_model: Qwen/Qwen3-0.6B
tags:
- lora
- merged
- qwen3
- chatbot
- university
pipeline_tag: text-generation
---
# UTN-Qwen3-0.6B-LoRA-merged
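Qwen3-0.6B fine-tuned with LoRA (r=64, alpha=128) on domain data from the University of Technology Nuremberg (UTN), with the adapter weights then merged into the base model to produce a standalone checkpoint. The model loads directly with `transformers` for inference; PEFT is not required.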
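To illustrate what "merged" means here, the following is a minimal sketch of the standard LoRA merge arithmetic, `W' = W + (alpha / r) * B @ A`, assuming the usual `alpha / r` scaling (the card does not say whether a variant such as rank-stabilized scaling was used). The function name `merge_lora` and the toy matrices are hypothetical. With r=64 and alpha=128 the scaling factor is 2.0; the toy example uses rank 1 for brevity but keeps the same factor.

```python
def merge_lora(weight, lora_a, lora_b, scaling):
    """Return W + scaling * (B @ A) for matrices given as nested lists.

    weight: d_out x d_in, lora_a: r x d_in, lora_b: d_out x r.
    """
    rank = len(lora_a)
    merged = [row[:] for row in weight]  # copy so the base weights stay untouched
    for i in range(len(weight)):
        for j in range(len(weight[0])):
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scaling * delta
    return merged


# Scaling factor implied by the card's hyperparameters (r=64, alpha=128).
scaling = 128 / 64

# Hypothetical toy weights: a 2x2 base matrix with a rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # r x d_in
lora_b = [[0.5], [0.0]]    # d_out x r

merged = merge_lora(base, lora_a, lora_b, scaling)
print(merged)
```

After this step the adapter matrices are no longer needed at inference time, which is why the merged checkpoint can be loaded like any ordinary causal language model.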
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "saeedbenadeeb/UTN-Qwen3-0.6B-LoRA-merged"

# Load the merged checkpoint directly; no PEFT adapter is needed.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant for the University of Technology Nuremberg (UTN)."},
    {"role": "user", "content": "What are the admission requirements for AI & Robotics?"},
]

# Build the chat prompt with Qwen3's thinking mode disabled.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512, temperature=0.3, top_p=0.9, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
## Training
- **Base**: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
- **Method**: LoRA (r=64, alpha=128, dropout=0.05, applied to all linear layers)
- **Data**: 1,289 UTN Q&A pairs
- **Hyperparameters**: 5 epochs, learning rate 3e-4
- **Hardware**: NVIDIA A40
## Evaluation (Validation Set, 17 examples)
| Metric | Score |
|--------|-------|
| ROUGE-1 | 0.5924 |
| ROUGE-2 | 0.4967 |
| ROUGE-L | 0.5687 |
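
For readers unfamiliar with these metrics, the sketch below shows how ROUGE-N and ROUGE-L F-scores can be computed for a single prediction and reference pair. It is an illustration only: the card does not state which ROUGE implementation, tokenization, or stemming was used, nor whether the reported scores are F-measures. The function names and example strings are hypothetical, and lowercase whitespace tokenization is assumed.

```python
from collections import Counter


def _f1(overlap, n_candidate, n_reference):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / n_candidate
    recall = overlap / n_reference
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F-score from clipped n-gram overlap."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # Counter intersection clips counts
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def rouge_l(candidate, reference):
    """ROUGE-L F-score from the longest common subsequence of tokens."""
    a, b = candidate.lower().split(), reference.lower().split()
    prev = [0] * (len(b) + 1)
    for tok in a:
        curr = [0]
        for j, other in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if tok == other else max(prev[j], curr[j - 1]))
        prev = curr
    return _f1(prev[-1], len(a), len(b))


# Hypothetical example pair, not taken from the validation set.
prediction = "the cat sat on the mat"
reference = "the cat is on the mat"
print(rouge_n(prediction, reference, n=1))
print(rouge_n(prediction, reference, n=2))
print(rouge_l(prediction, reference))
```

Corpus-level scores like those in the table are usually the mean of per-example scores. With only 17 validation examples, small differences between runs should be interpreted cautiously.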