---
language:
- it
- en
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen2.5-32B-Instruct
tags:
- lora
- fine-tuned
- banking
- regtech
- compliance
- rag
- tool-calling
- italian
- qwen2.5
pipeline_tag: text-generation
---
# RegTech-32B-Instruct
> **Fine-tuned for RAG-powered banking compliance, not general knowledge.**
A specialized [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) model fine-tuned to excel within a **Retrieval-Augmented Generation (RAG) pipeline** for Italian banking regulatory compliance.
This model doesn't try to memorize regulations. It is trained to **work with retrieved context**: follow instructions precisely, produce structured outputs, call compliance tools, and maintain the right tone and terminology when grounded on regulatory documents.
---
## What This Model Does
This fine-tuning optimizes the model's **behavior within a RAG system**, not its factual knowledge. Specifically:
| Task | Description |
|---|---|
| **RAG Q&A** | Answer regulatory questions grounded on retrieved documents |
| **Tool Calling** | KYC verification, risk scoring, PEP checks, SOS (suspicious transaction) reporting |
| **Query Expansion** | Rewrite user queries with regulatory terminology for better retrieval |
| **Intent Detection** | Classify whether a message needs document search or is conversational |
| **Document Reranking** | Score candidate documents by relevance |
| **Structured JSON** | Topic extraction, metadata, and impact analysis in JSON format |
| **Impact Analysis** | Cross-reference external regulations against internal bank procedures |
---
## Evaluation: LLM-as-Judge
Evaluated by **Claude Opus 4.6** (Anthropic) across 11 blind test scenarios. The judge compared base vs fine-tuned model outputs without knowing which was which.
### Head-to-Head
```
Tuned wins   7/11 (68.2%)
Base wins    3/11 (31.8%)
Ties         1/11
```
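The percentages above do not equal the raw win fractions (7/11 is about 63.6%). They are consistent with counting the single tie as half a win for each side. A minimal sketch of that arithmetic, where the half-point tie rule is an assumption inferred from the numbers and not something the evaluation states:

```python
# Head-to-head counts as reported above.
tuned_wins, base_wins, ties = 7, 3, 1
total = tuned_wins + base_wins + ties

# Assumption: a tie contributes half a win to each model.
tuned_rate = (tuned_wins + 0.5 * ties) / total
base_rate = (base_wins + 0.5 * ties) / total

print(f"Tuned: {tuned_rate:.1%}  Base: {base_rate:.1%}")
```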
### Quality Scores (1-5)
| Criterion | Base | Tuned | Delta |
|---|:---:|:---:|:---:|
| Instruction Following | 4.00 | **4.82** | +0.82 |
| Context Adherence | 4.36 | **4.91** | +0.55 |
| Accuracy | 4.27 | **4.73** | +0.46 |
| Format | 4.36 | **4.55** | +0.19 |
| Tone | 4.82 | **5.00** | +0.18 |
| **Overall** | **4.36** | **4.80** | **+0.44** |
> The biggest gains are in **instruction following** (+0.82) and **context adherence** (+0.55), exactly what matters when the model must follow retrieved regulatory context faithfully. Tone reaches a perfect 5.00.
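The deltas and the overall row can be re-derived from the per-criterion scores. The sketch below assumes the overall score is the unweighted mean of the five criteria, which matches the reported 4.36 and 4.80 but is not stated explicitly:

```python
# (base, tuned) scores per criterion, copied from the table above.
scores = {
    "Instruction Following": (4.00, 4.82),
    "Context Adherence": (4.36, 4.91),
    "Accuracy": (4.27, 4.73),
    "Format": (4.36, 4.55),
    "Tone": (4.82, 5.00),
}

# Per-criterion improvement, rounded to the table's two decimals.
deltas = {name: round(tuned - base, 2) for name, (base, tuned) in scores.items()}

# Assumption: "Overall" is the plain average across criteria.
base_mean = sum(base for base, _ in scores.values()) / len(scores)
tuned_mean = sum(tuned for _, tuned in scores.values()) / len(scores)

for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<22} +{delta:.2f}")
print(f"Overall: {base_mean:.2f} -> {tuned_mean:.2f}")
```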
### Results by Category
| Category | Base | Tuned | Tie |
|---|:---:|:---:|:---:|
| Refusal Handling | 0 | **1** | 1 |
| Style & Tone | 0 | **1** | 0 |
| Data Extraction | 0 | **1** | 0 |
| Edge Cases | 0 | **1** | 0 |
| JSON Output | 1 | 1 | 0 |
| RAG Q&A | 1 | 1 | 0 |
| Tool Use | 1 | 1 | 0 |
### Comparison with RegTech-4B-Instruct
| Metric | 4B | 32B |
|---|:---:|:---:|
| Base score (pre-tuning) | 4.11 | **4.36** |
| Tuned score | 4.68 | **4.80** |
| Best eval loss | 1.191 | **0.813** |
| Token accuracy | ~73% | **~81%** |
| Train/eval gap | 0.050 | **0.030** |
---
## Usage Examples
### RAG Q&A: Answering from Retrieved Context
The model is designed to receive **retrieved regulatory documents as context** and answer based on them:
```python
# Prompts are in Italian, the model's target language.
# System: "You are a banking compliance assistant. Answer ONLY based on the provided context."
# User: "What are the minimum capital requirements under the CRR?"
messages = [
    {
        "role": "system",
        "content": """Sei un assistente per la compliance bancaria.
Rispondi SOLO basandoti sul contesto fornito.
<contesto_recuperato>
Art. 92 CRR - Gli enti soddisfano in qualsiasi momento i seguenti
requisiti: a) CET1 del 4,5%; b) Tier 1 del 6%; c) capitale totale dell'8%.
Il coefficiente è calcolato come rapporto tra i fondi propri e
l'importo complessivo dell'esposizione al rischio.
</contesto_recuperato>""",
    },
    {
        "role": "user",
        "content": "Quali sono i requisiti minimi di capitale secondo il CRR?",
    },
]
```
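In a live pipeline the context block is filled from the retriever and not written by hand. A minimal sketch of that assembly step, reusing the system prompt wording and the `<contesto_recuperato>` tags from the example above; `build_rag_messages` is a hypothetical helper, not part of the model's tooling:

```python
def build_rag_messages(chunks: list[str], question: str) -> list[dict]:
    """Wrap retrieved passages in <contesto_recuperato> tags and build chat messages."""
    # Join passages with a blank line so the model can tell them apart.
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    system = (
        "Sei un assistente per la compliance bancaria.\n"
        "Rispondi SOLO basandoti sul contesto fornito.\n"
        f"<contesto_recuperato>\n{context}\n</contesto_recuperato>"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]


messages = build_rag_messages(
    ["Art. 92 CRR - Gli enti soddisfano in qualsiasi momento i seguenti requisiti: ..."],
    "Quali sono i requisiti minimi di capitale secondo il CRR?",
)
```

How many passages to include, and in what order, is left to the retriever and reranker; the helper only handles formatting.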
### Query Expansion: Improving RAG Retrieval
```python
# System: "Rewrite the user's query into a richer version to improve document retrieval (RAG).
# Add technical terms and regulatory references. Reply ONLY with the requested JSON."
# User: "ORIGINAL QUERY: [suspicious transaction reporting obligations]"
messages = [
    {
        "role": "system",
        "content": "Riscrivi la query dell'utente in una versione più ricca per migliorare il recupero documentale (RAG). Aggiungi termini tecnici e riferimenti normativi. Rispondi SOLO con il JSON richiesto.",
    },
    {
        "role": "user",
        "content": "## QUERY ORIGINALE: [obblighi segnalazione operazioni sospette]",
    },
]
# Expected output:
# {"query": "obblighi segnalazione operazioni sospette SOS UIF D.Lgs. 231/2007
# art. 35 riciclaggio finanziamento terrorismo portale RADAR tempistiche
# invio indicatori anomalia"}
```
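Because the expanded query feeds straight into retrieval, it helps to guard against malformed output. The sketch below parses the `{"query": ...}` object shown above and falls back to the user's original query when parsing fails; `parse_expanded_query` is a hypothetical helper name:

```python
import json


def parse_expanded_query(raw: str, fallback: str) -> str:
    """Return the expanded query from the model output, or the fallback if unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    # Accept only a JSON object carrying a non-empty string under "query".
    query = data.get("query") if isinstance(data, dict) else None
    if isinstance(query, str) and query.strip():
        return query.strip()
    return fallback


original = "obblighi segnalazione operazioni sospette"
print(parse_expanded_query('{"query": "obblighi segnalazione operazioni sospette SOS UIF"}', original))
print(parse_expanded_query("not json", original))
```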
### Tool Calling: Compliance Workflows
```python
# System: "You are an operational compliance assistant." Retrieved context: "Procedure AML-003:
# enhanced due diligence (EDD) must be applied to PEPs, high-risk countries and profiles scoring > 60."
# User: "I need to open an account for a company based in Dubai. The legal representative is Mr. Al-Rashid."
messages = [
    {
        "role": "system",
        "content": """Sei un assistente operativo per la compliance.
<tools>
{"name": "calcola_scoring_rischio", "parameters": {...}}
{"name": "controlla_liste_pep", "parameters": {...}}
{"name": "verifica_kyc", "parameters": {...}}
</tools>
<contesto_recuperato>
Procedura AML-003: L'adeguata verifica rafforzata (EDD) deve essere
applicata per PEP, paesi ad alto rischio e profili con scoring > 60.
</contesto_recuperato>""",
    },
    {
        "role": "user",
        "content": "Devo aprire un conto per una società con sede a Dubai. Il legale rappresentante è il sig. Al-Rashid.",
    },
]
# Expected behavior:
# 1. Call controlla_liste_pep for the representative
# 2. Call calcola_scoring_rischio based on risk factors
# 3. Recommend EDD procedure per AML-003, grounded on retrieved policy
```
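On the application side, the emitted tool calls have to be extracted and routed to real functions. The sketch below assumes the model keeps the base Qwen2.5 convention of wrapping each call in `<tool_call>...</tool_call>` tags around a JSON object with `name` and `arguments`; this card does not confirm that format. The `nominativo` argument and the stub handler are hypothetical placeholders for the schemas used in training.

```python
import json
import re

# Assumption: calls look like <tool_call>{"name": ..., "arguments": {...}}</tool_call>.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Collect every well-formed tool call found in the model output."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            call = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of failing the whole turn
        if isinstance(call, dict) and "name" in call:
            calls.append(call)
    return calls


def run_tool_calls(text: str, registry: dict) -> list[dict]:
    """Dispatch each extracted call to its handler; unknown tools are reported, not raised."""
    results = []
    for call in extract_tool_calls(text):
        handler = registry.get(call["name"])
        if handler is None:
            results.append({"name": call["name"], "error": "unknown tool"})
            continue
        arguments = call.get("arguments") or {}
        results.append({"name": call["name"], "result": handler(**arguments)})
    return results


# Stub handler for illustration only; a real one would query the bank's PEP lists.
registry = {"controlla_liste_pep": lambda nominativo: {"checked": nominativo}}
sample = '<tool_call>\n{"name": "controlla_liste_pep", "arguments": {"nominativo": "Al-Rashid"}}\n</tool_call>'
print(run_tool_calls(sample, registry))
```

Reporting unknown tool names as data keeps a single bad call from aborting a multi-step compliance workflow.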
### Document Reranking
```python
# System: "Rate each candidate's relevance to the query. Return only the relevant
# candidates with a score of 0-100. Reply ONLY with the requested JSON."
messages = [
    {
        "role": "system",
        "content": "Valuta la rilevanza di ciascun candidato rispetto alla query. Restituisci solo i candidati rilevanti con score 0-100. Rispondi SOLO con il JSON richiesto.",
    },
    {
        "role": "user",
        "content": '{"query": "requisiti CET1 fondi propri", "candidates": [{"id": "doc_001", "title": "Art. 92 CRR", "content": "..."}, {"id": "doc_002", "title": "DORA Art. 5", "content": "..."}]}',
    },
]
# Expected: {"matches": [{"id": "doc_001", "relevance": 95}]}
```
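The reranker output can then be turned into an ordered list of document ids for the answer step. A minimal sketch, assuming the `{"matches": [{"id", "relevance"}]}` shape shown above; the `min_relevance` cutoff of 50 and the helper name `select_matches` are illustrative choices, not values from training:

```python
import json


def select_matches(raw: str, min_relevance: float = 50) -> list[str]:
    """Return document ids sorted by descending relevance, dropping low scores."""
    try:
        matches = json.loads(raw).get("matches", [])
    except (json.JSONDecodeError, AttributeError):
        return []  # not JSON, or JSON that is not an object
    if not isinstance(matches, list):
        return []
    kept = [
        m for m in matches
        if isinstance(m, dict)
        and "id" in m
        and isinstance(m.get("relevance"), (int, float))
        and m["relevance"] >= min_relevance
    ]
    kept.sort(key=lambda m: m["relevance"], reverse=True)
    return [m["id"] for m in kept]


raw = '{"matches": [{"id": "doc_002", "relevance": 40}, {"id": "doc_001", "relevance": 95}]}'
print(select_matches(raw))
```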
---
## Training Details
| | |
|---|---|
| **Method** | LoRA in bf16 (no quantization) |
| **Base Model** | Qwen2.5-32B-Instruct |
| **Dataset** | 923 train / 102 eval samples |
| **Duration** | 40.0 minutes |
### Training Metrics
| Metric | Value |
|---|---|
| Final Train Loss | 0.843 |
| Best Eval Loss | 0.813 (step 640/693) |
| Train/Eval Gap | 0.030 |
> A train/eval loss gap of 0.030 (0.843 vs 0.813) suggests **stable training with no sign of overfitting** on the held-out split.
---
## Dataset Coverage
The training data covers the full lifecycle of a RAG-based compliance assistant:
| Category | Purpose |
|---|---|
| Title Generation | Generate conversation titles from user queries |
| Query Expansion | Enrich queries with regulatory terms for better retrieval |
| Intent Classification | Route queries to RAG vs conversational responses |
| Document Reranking | Score retrieved documents by relevance |
| Topic Extraction | Extract main topics from regulatory text pages |
| Document Summarization | Summarize multi-page regulatory documents |
| Relevance Filtering | Filter regulatory text relevant to banks |
| Metadata Extraction | Find application dates, issuing authorities |
| Impact Analysis | Cross-reference regulations vs internal procedures |
| RAG Q&A + Tool Calling | Multi-turn compliance conversations with tools |
**Regulatory sources covered:** CRR/CRR3, DORA (EU 2022/2554), D.Lgs. 231/2007 (AML), D.Lgs. 385/1993 (TUB), Circolare 285, PSD2, MiFID II/MiFIR, D.P.R. 180/1950, and related Banca d'Italia provisions.
---
## Deployment
### With vLLM
```bash
vllm serve ./models/RegTech-32B-Instruct --dtype bfloat16
```
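Once the server is running, it can be queried over vLLM's OpenAI-compatible chat endpoint. The sketch below uses only the standard library and assumes the default address `http://localhost:8000/v1` and that the served model name equals the path passed to `vllm serve` (the default when no `--served-model-name` is given); adjust both for your deployment.

```python
import json
import urllib.request


def build_chat_request(
    messages: list[dict],
    model: str = "./models/RegTech-32B-Instruct",  # assumed served model name
    base_url: str = "http://localhost:8000/v1",    # assumed default vLLM address
    max_tokens: int = 512,
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /chat/completions endpoint."""
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(messages: list[dict], **kwargs) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(messages, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `chat(messages)` with any of the message lists from the usage examples returns the model's reply as a string, provided the server is reachable.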
### With Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("YOUR_REPO_ID", torch_dtype="bfloat16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("YOUR_REPO_ID")

# `messages` is a chat list such as the ones in the usage examples above.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
---
## Important Notes
- **RAG-optimized**: trained to work with retrieved context, not to memorize regulations. Always provide relevant documents in the system prompt.
- **Domain-specific**: optimized for Italian banking compliance. General capabilities may differ from the base model.
- **Not legal advice**: a tool to assist compliance professionals, not a substitute for regulatory expertise.
- **Tool schemas**: tool calling works best with the specific function signatures used during training.
---
<p align="center">
Built with ❤️ for banking RAG<br>
<em>Fine-tuned with LoRA • Evaluated by Claude Opus 4.6 • Powered by Qwen2.5</em><br>
<em>Contact for commercial use: https://landing.2sophia.ai</em>
</p>