---
license: mit
language:
- en
library_name: peft
base_model: Qwen/Qwen3-0.6B
tags:
- lora
- vera
- peft
- sft
- chatbot
- rag
- qwen3
- university
pipeline_tag: text-generation
---
# UTN Student Chatbot — Finetuned Qwen3-0.6B
A domain-adapted chatbot for the **University of Technology Nuremberg (UTN)**, built by finetuning [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) on curated UTN-specific Q&A data using parameter-efficient methods.
## Available Adapters
| Adapter | Method | Trainable Params | Path |
|---------|--------|-----------------|------|
| **LoRA** (recommended) | Low-Rank Adaptation (r=64, alpha=128) | 161M (21.4%) | `models/utn-qwen3-lora` |
| VeRA | Vector-based Random Matrix Adaptation (r=256) | 8M (1.1%) | `models/utn-qwen3-vera` |
## Evaluation Results
### Validation Set (17 examples)
| Metric | LoRA |
|--------|------|
| ROUGE-1 | 0.5924 |
| ROUGE-2 | 0.4967 |
| ROUGE-L | 0.5687 |
### FAQ Benchmark (34 questions, evaluated through the CRAG pipeline)
| Metric | LoRA + CRAG |
|--------|-------------|
| ROUGE-1 | 0.7096 |
| ROUGE-2 | 0.6124 |
| ROUGE-L | 0.6815 |
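For reference, ROUGE-N is an n-gram overlap F1 score between a generated answer and a reference answer (the scores above were presumably computed with a standard library such as `rouge_score`). A minimal pure-Python sketch of the metric:

```python
from collections import Counter

def rouge_n_f1(prediction: str, reference: str, n: int = 1) -> float:
    """Minimal ROUGE-N F1: clipped n-gram overlap between prediction and reference."""
    def ngrams(text: str, n: int) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction, n), ngrams(reference, n)
    overlap = sum((pred & ref).values())  # clipped match count
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Production ROUGE implementations add stemming and tokenization rules on top of this, so exact numbers will differ slightly.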
## Quick Start — LoRA (Recommended)
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "Qwen/Qwen3-0.6B"
adapter_repo = "saeedbenadeeb/UTN_LLMs_Chatbot"
adapter_path = "models/utn-qwen3-lora"

# Load the base model in bfloat16, then attach the LoRA adapter from the Hub subfolder.
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(
    model,
    adapter_repo,
    subfolder=adapter_path,
)
model.eval()

# Build the chat prompt (thinking mode disabled for direct answers).
messages = [
    {"role": "system", "content": "You are a helpful assistant for the University of Technology Nuremberg (UTN)."},
    {"role": "user", "content": "What are the admission requirements for AI & Robotics?"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.3,
        top_p=0.9,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
## Quick Start — VeRA
```python
# Load the tokenizer and a fresh copy of the base model as above,
# then attach the VeRA adapter instead of the LoRA one:
adapter_path = "models/utn-qwen3-vera"
model = PeftModel.from_pretrained(
model,
adapter_repo,
subfolder=adapter_path,
)
```
## Training Details
- **Base model**: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
- **Training data**: 1,289 curated UTN Q&A pairs (scraped from utn.de, FAQs, module handbooks)
- **Validation data**: 17 held-out examples
- **Trainer**: TRL SFTTrainer
- **Hardware**: NVIDIA A40 (48 GB)
- **LoRA config**: r=64, alpha=128, dropout=0.05, target=all linear layers, lr=3e-4, 5 epochs
- **VeRA config**: r=256, d_initial=0.1, prng_key=42, target=all linear layers, lr=5e-4, 5 epochs
- **Framework**: PEFT 0.18.1, Transformers 5.2.0, PyTorch 2.6.0
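The hyperparameters above map onto `peft` config classes roughly as follows. This is a sketch, not the exact training script: the `task_type` and the `"all-linear"` target-module spelling are assumptions based on the card's "target=all linear layers".

```python
from peft import LoraConfig, VeraConfig

# LoRA: r=64, alpha=128, dropout=0.05, all linear layers (as listed above).
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",  # assumption: exact module list not recorded here
    task_type="CAUSAL_LM",
)

# VeRA: r=256, d_initial=0.1, fixed PRNG key 42 for the shared random projections.
vera_config = VeraConfig(
    r=256,
    d_initial=0.1,
    projection_prng_key=42,
    target_modules="all-linear",  # assumption, as above
    task_type="CAUSAL_LM",
)
```

Either config would then be passed to TRL's `SFTTrainer` via its `peft_config` argument.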
## Architecture
The full system uses a **Corrective RAG (CRAG)** pipeline:
1. **Hybrid retrieval**: FAISS dense search (BGE-small-en-v1.5) + BM25 sparse search, merged via Reciprocal Rank Fusion
2. **Relevance grading**: Score-based heuristic to verify retrieved documents answer the question
3. **Query rewriting**: If documents are irrelevant, the query is rewritten and retrieval retried
4. **Generation**: The finetuned Qwen3-0.6B + LoRA generates grounded answers from retrieved context
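The Reciprocal Rank Fusion step in (1) can be sketched in a few lines. The doc IDs and the conventional `k=60` smoothing constant are illustrative, not taken from the actual pipeline:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of doc IDs: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Documents ranked highly by both the dense (FAISS) and sparse (BM25) lists rise to the top.
dense = ["a", "b", "c"]   # hypothetical dense-retrieval ranking
sparse = ["b", "c", "d"]  # hypothetical BM25 ranking
fused = reciprocal_rank_fusion([dense, sparse])
```

Because RRF only uses ranks, it needs no score normalization between the dense and sparse retrievers, which is why it is a common choice for hybrid retrieval.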
## Citation
```bibtex
@misc{utn-chatbot-2026,
  title={UTN Student Chatbot: Domain-Adapted Qwen3-0.6B with CRAG},
  author={Saeed Adeeb},
  year={2026},
  url={https://huggingface.co/saeedbenadeeb/UTN_LLMs_Chatbot}
}
```