saeedbenadeeb committed ea7db3b (verified) · 1 parent: 494a9a5

Upload README.md with huggingface_hub

Files changed (1): README.md (+138 −3)
---
license: mit
language:
- en
library_name: peft
base_model: Qwen/Qwen3-0.6B
tags:
- lora
- vera
- peft
- sft
- chatbot
- rag
- qwen3
- university
pipeline_tag: text-generation
---

# UTN Student Chatbot — Finetuned Qwen3-0.6B

A domain-adapted chatbot for the **University of Technology Nuremberg (UTN)**, built by finetuning [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) on curated UTN-specific Q&A data using parameter-efficient methods.

## Available Adapters

| Adapter | Method | Trainable Params | Path |
|---------|--------|------------------|------|
| **LoRA** (recommended) | Low-Rank Adaptation (r=64, alpha=128) | 161M (21.4%) | `models/utn-qwen3-lora` |
| VeRA | Vector-based Random Matrix Adaptation (r=256) | 8M (1.1%) | `models/utn-qwen3-vera` |

## Evaluation Results

### Validation Set (17 examples)

| Metric | LoRA |
|--------|------|
| ROUGE-1 | 0.5924 |
| ROUGE-2 | 0.4967 |
| ROUGE-L | 0.5687 |

### FAQ Benchmark (34 questions, with CRAG RAG pipeline)

| Metric | LoRA + CRAG |
|--------|-------------|
| ROUGE-1 | 0.7096 |
| ROUGE-2 | 0.6124 |
| ROUGE-L | 0.6815 |
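For orientation, ROUGE-1 is the F-measure over unigram overlap between a generated answer and the reference. The scores above were presumably computed with a standard ROUGE package (e.g. `rouge_score`); the toy sketch below, which skips stemming and proper tokenization, just illustrates what the metric measures:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Illustrative unigram-overlap ROUGE-1 F1 (no stemming, whitespace tokens)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    # Clipped unigram overlap: each reference token can be matched at most once.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("UTN is in Nuremberg", "UTN is located in Nuremberg"), 3))  # → 0.889
```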
## Quick Start — LoRA (Recommended)

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "Qwen/Qwen3-0.6B"
adapter_repo = "saeedbenadeeb/UTN_LLMs_Chatbot"
adapter_path = "models/utn-qwen3-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(
    model,
    adapter_repo,
    subfolder=adapter_path,
)
model.eval()

messages = [
    {"role": "system", "content": "You are a helpful assistant for the University of Technology Nuremberg (UTN)."},
    {"role": "user", "content": "What are the admission requirements for AI & Robotics?"},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.3,
        top_p=0.9,
        do_sample=True,
    )

response = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```

## Quick Start — VeRA

```python
# Same as above, but change the adapter path:
adapter_path = "models/utn-qwen3-vera"

model = PeftModel.from_pretrained(
    model,
    adapter_repo,
    subfolder=adapter_path,
)
```
## Training Details

- **Base model**: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
- **Training data**: 1,289 curated UTN Q&A pairs (scraped from utn.de, FAQs, module handbooks)
- **Validation data**: 17 held-out examples
- **Trainer**: TRL SFTTrainer
- **Hardware**: NVIDIA A40 (48 GB)
- **LoRA config**: r=64, alpha=128, dropout=0.05, target=all linear layers, lr=3e-4, 5 epochs
- **VeRA config**: r=256, d_initial=0.1, prng_key=42, target=all linear layers, lr=5e-4, 5 epochs
- **Framework**: PEFT 0.18.1, Transformers 5.2.0, PyTorch 2.6.0
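The LoRA hyperparameters listed above could be expressed with a PEFT config roughly like the sketch below. This is a non-authoritative reconstruction, not the actual training script: `target_modules="all-linear"` is assumed as the mechanism for targeting all linear layers, and the optimizer/epoch settings live in the trainer, not here.

```python
from peft import LoraConfig

# Sketch of the LoRA setup listed above (r=64, alpha=128, dropout=0.05,
# all linear layers). The repo's real training code may differ in detail.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
```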
## Architecture

The full system uses a **Corrective RAG (CRAG)** pipeline:

1. **Hybrid retrieval**: FAISS dense search (BGE-small-en-v1.5) + BM25 sparse search, merged via Reciprocal Rank Fusion
2. **Relevance grading**: Score-based heuristic to verify retrieved documents answer the question
3. **Query rewriting**: If documents are irrelevant, the query is rewritten and retrieval retried
4. **Generation**: The finetuned Qwen3-0.6B + LoRA generates grounded answers from retrieved context
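The Reciprocal Rank Fusion step in (1) can be sketched in a few lines. The doc ids below are hypothetical, and `k=60` is the constant commonly used in RRF, not necessarily what this pipeline sets:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids: score(d) = sum over lists of 1/(k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; docs found by both retrievers rise to the top.
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc_a", "doc_b", "doc_c"]   # e.g. FAISS (BGE embeddings) results
sparse = ["doc_b", "doc_d", "doc_a"]  # e.g. BM25 results
print(reciprocal_rank_fusion([dense, sparse]))  # → ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF only uses ranks, it needs no score normalization between the dense and sparse retrievers, which is why it is a common choice for hybrid retrieval.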
## Citation

```bibtex
@misc{utn-chatbot-2026,
  title={UTN Student Chatbot: Domain-Adapted Qwen3-0.6B with CRAG},
  author={Saeed Adeeb},
  year={2026},
  url={https://huggingface.co/saeedbenadeeb/UTN_LLMs_Chatbot}
}
```