| 1 |
+
---
|
| 2 |
+
language:
|
| 3 |
+
- it
|
| 4 |
+
- en
|
| 5 |
+
license: apache-2.0
|
| 6 |
+
library_name: transformers
|
| 7 |
+
base_model: Qwen/Qwen2.5-14B-Instruct
|
| 8 |
+
tags:
|
| 9 |
+
- lora
|
| 10 |
+
- fine-tuned
|
| 11 |
+
- banking
|
| 12 |
+
- regtech
|
| 13 |
+
- compliance
|
| 14 |
+
- rag
|
| 15 |
+
- tool-calling
|
| 16 |
+
- italian
|
| 17 |
+
- qwen2.5
|
| 18 |
+
pipeline_tag: text-generation
|
| 19 |
+
---
|

# RegTech-14B-Instruct

> **Fine-tuned for RAG-powered banking compliance, not general knowledge.**

A specialized [Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) model fine-tuned to excel within a **Retrieval-Augmented Generation (RAG) pipeline** for Italian banking regulatory compliance.

This model does not try to memorize regulations. It is trained to **work with retrieved context**: follow instructions precisely, produce structured outputs, call compliance tools, and maintain the right tone and terminology when grounded on regulatory documents.

---

## What This Model Does

This fine-tuning optimizes the model's **behavior within a RAG system**, not its factual knowledge. Specifically:

| Task | Description |
|---|---|
| **RAG Q&A** | Answer regulatory questions grounded on retrieved documents |
| **Tool Calling** | KYC verification, risk scoring, PEP checks, SOS reporting |
| **Query Expansion** | Rewrite user queries with regulatory terminology for better retrieval |
| **Intent Detection** | Classify whether a message needs document search or is conversational |
| **Document Reranking** | Score candidate documents by relevance |
| **Structured JSON** | Topic extraction, metadata, and impact analysis in JSON format |
| **Impact Analysis** | Cross-reference external regulations against internal bank procedures |
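As a concrete illustration of the intent-detection task, a classification prompt might take this shape (a hypothetical sketch: the instruction wording and the `rag`/`conversazionale` label set are illustrative assumptions, not the exact training-time schema):

```python
# Hypothetical intent-detection prompt. The instruction wording and the
# label set ("rag" / "conversazionale") are illustrative assumptions,
# not the exact schema used during training.
messages = [
    {
        "role": "system",
        "content": (
            "Classifica il messaggio dell'utente. Rispondi SOLO con JSON: "
            '{"intent": "rag"} se serve una ricerca documentale, '
            '{"intent": "conversazionale"} altrimenti.'
        ),
    },
    {"role": "user", "content": "Quali sanzioni prevede il D.Lgs. 231/2007?"},
]
```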

---

## Evaluation – LLM-as-Judge

Evaluated by **Claude Opus 4.6** (Anthropic) across 11 blind test scenarios. The judge compared base vs. fine-tuned model outputs without knowing which was which.

### Head-to-Head

```
┌──────────────────────────────────┐
│ Tuned Wins   8/11   (77.3%)      │
│ Base Wins    2/11   (22.7%)      │
│ Ties         1/11                │
└──────────────────────────────────┘
```

### Quality Scores (1–5)

| Criterion | Base | Tuned | Delta |
|---|:---:|:---:|:---:|
| Instruction Following | 3.55 | **4.64** | +1.09 |
| Context Adherence | 3.82 | **4.82** | +1.00 |
| Accuracy | 4.00 | **4.73** | +0.73 |
| Format | 4.18 | **4.45** | +0.27 |
| Tone | 4.73 | **4.82** | +0.09 |
| **Overall** | **4.06** | **4.69** | **+0.64** |

> Highest win rate across all model sizes at 77.3%. Instruction following jumps +1.09 and context adherence +1.00: the fine-tuning dramatically improves the model's ability to stay grounded on retrieved regulatory context.

### Results by Category

| Category | Base | Tuned | Tie |
|---|:---:|:---:|:---:|
| RAG Q&A | 0 | **2** | 0 |
| Refusal Handling | 0 | **2** | 0 |
| Edge Cases | 0 | **1** | 0 |
| Style & Tone | 0 | **1** | 0 |
| Data Extraction | 0 | 0 | 1 |
| JSON Output | 1 | 1 | 0 |
| Tool Use | 1 | 1 | 0 |

### Comparison Across Model Sizes

| Metric | 4B | 7B | 14B | 32B |
|---|:---:|:---:|:---:|:---:|
| Base score (pre-tuning) | 4.11 | 3.84 | 4.06 | **4.36** |
| Tuned score | 4.68 | 4.78 | 4.69 | **4.80** |
| Delta (improvement) | +0.57 | +0.95 | +0.64 | +0.44 |
| Win rate | 68.2% | 68.2% | **77.3%** | 68.2% |
| Best eval loss | 1.191 | 1.330 | 1.225 | **0.813** |
| Token accuracy | ~73% | ~72% | ~72% | **~81%** |

---

## Usage Examples

### RAG Q&A – Answering from Retrieved Context

The model is designed to receive **retrieved regulatory documents as context** and answer based on them:

```python
messages = [
    {
        "role": "system",
        "content": """Sei un assistente per la compliance bancaria.
Rispondi SOLO basandoti sul contesto fornito.

<contesto_recuperato>
Art. 92 CRR - Gli enti soddisfano in qualsiasi momento i seguenti
requisiti: a) CET1 del 4,5%; b) Tier 1 del 6%; c) capitale totale dell'8%.
Il coefficiente è calcolato come rapporto tra i fondi propri e
l'importo complessivo dell'esposizione al rischio.
</contesto_recuperato>"""
    },
    {
        "role": "user",
        "content": "Quali sono i requisiti minimi di capitale secondo il CRR?"
    }
]
```

### Query Expansion – Improving RAG Retrieval

```python
messages = [
    {
        "role": "system",
        "content": "Riscrivi la query dell'utente in una versione più ricca per migliorare il recupero documentale (RAG). Aggiungi termini tecnici e riferimenti normativi. Rispondi SOLO con il JSON richiesto."
    },
    {
        "role": "user",
        "content": "## QUERY ORIGINALE: [obblighi segnalazione operazioni sospette]"
    }
]

# Expected output:
# {"query": "obblighi segnalazione operazioni sospette SOS UIF D.Lgs. 231/2007
#  art. 35 riciclaggio finanziamento terrorismo portale RADAR tempistiche
#  invio indicatori anomalia"}
```
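In a pipeline, the expanded query comes back as a JSON string, so a thin parsing layer with a fallback to the original query keeps retrieval robust against malformed generations (a sketch; `parse_expanded_query` is an illustrative helper, not part of this model card):

```python
import json

def parse_expanded_query(raw_output: str, original_query: str) -> str:
    """Extract the enriched query from the model's JSON reply.

    Falls back to the original query when the output is not valid JSON,
    so the retrieval step never breaks on a malformed generation.
    """
    try:
        payload = json.loads(raw_output.strip())
        expanded = payload.get("query", "").strip()
        return expanded or original_query
    except (json.JSONDecodeError, AttributeError):
        return original_query

# With an output shaped like the expected one above:
raw = '{"query": "obblighi segnalazione operazioni sospette SOS UIF D.Lgs. 231/2007"}'
print(parse_expanded_query(raw, "obblighi segnalazione operazioni sospette"))
```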

### Tool Calling – Compliance Workflows

```python
messages = [
    {
        "role": "system",
        "content": """Sei un assistente operativo per la compliance.

<tools>
{"name": "calcola_scoring_rischio", "parameters": {...}}
{"name": "controlla_liste_pep", "parameters": {...}}
{"name": "verifica_kyc", "parameters": {...}}
</tools>

<contesto_recuperato>
Procedura AML-003: L'adeguata verifica rafforzata (EDD) deve essere
applicata per PEP, paesi ad alto rischio e profili con scoring > 60.
</contesto_recuperato>"""
    },
    {
        "role": "user",
        "content": "Devo aprire un conto per una società con sede a Dubai. Il legale rappresentante è il sig. Al-Rashid."
    }
]

# The model will:
# 1. Call controlla_liste_pep for the representative
# 2. Call calcola_scoring_rischio based on the risk factors
# 3. Recommend the EDD procedure per AML-003, grounded on the retrieved policy
```
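Qwen2.5 chat templates typically emit function calls as a JSON object wrapped in `<tool_call>…</tool_call>` tags. A minimal parser for that convention could look like this (a sketch under that assumption: verify the exact format against your tokenizer's chat template, and note the example output string below is invented for illustration):

```python
import json
import re

# Qwen2.5 chat templates typically wrap function calls in <tool_call> tags
# containing one JSON object per call; verify the exact format against your
# tokenizer's chat template before relying on this parser.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    """Return every name/arguments object found in the model output."""
    calls = []
    for blob in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(blob))
        except json.JSONDecodeError:
            continue  # skip malformed calls instead of crashing the pipeline
    return calls

# Invented example output, shaped like the scenario above:
output = (
    '<tool_call>\n{"name": "controlla_liste_pep", '
    '"arguments": {"nome_completo": "Al-Rashid"}}\n</tool_call>'
)
print(extract_tool_calls(output))
```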

### Document Reranking

```python
messages = [
    {
        "role": "system",
        "content": "Valuta la rilevanza di ciascun candidato rispetto alla query. Restituisci solo i candidati rilevanti con score 0-100. Rispondi SOLO con il JSON richiesto."
    },
    {
        "role": "user",
        "content": '{"query": "requisiti CET1 fondi propri", "candidates": [{"id": "doc_001", "title": "Art. 92 CRR", "content": "..."}, {"id": "doc_002", "title": "DORA Art. 5", "content": "..."}]}'
    }
]

# Expected: {"matches": [{"id": "doc_001", "relevance": 95}]}
```
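Downstream, the reranker's JSON can be turned into an ordered shortlist of documents. A small helper along these lines might do it (a sketch; `top_documents`, the score threshold, and `k` are illustrative choices, not part of the model card):

```python
def top_documents(rerank_output: dict, candidates: list[dict],
                  min_score: int = 50, k: int = 3) -> list[dict]:
    """Keep the k best candidates whose judged relevance clears min_score."""
    scores = {m["id"]: m["relevance"] for m in rerank_output.get("matches", [])}
    ranked = sorted(
        (c for c in candidates if scores.get(c["id"], 0) >= min_score),
        key=lambda c: scores[c["id"]],
        reverse=True,
    )
    return ranked[:k]

# With the expected output shown above:
rerank = {"matches": [{"id": "doc_001", "relevance": 95}]}
docs = [
    {"id": "doc_001", "title": "Art. 92 CRR"},
    {"id": "doc_002", "title": "DORA Art. 5"},
]
print(top_documents(rerank, docs))
```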

---

## Training Details

| | |
|---|---|
| **Method** | LoRA (bf16 full precision, no quantization) |
| **Base Model** | Qwen2.5-14B-Instruct |
| **Dataset** | 923 train / 102 eval samples |
| **Duration** | 23.5 minutes |
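The card does not publish the adapter hyperparameters, but a LoRA setup of this general shape is conventional with `peft` (a sketch: rank, alpha, dropout, and target modules below are assumptions, not the values used for this model):

```python
from peft import LoraConfig

# Illustrative adapter configuration. r, lora_alpha, lora_dropout and
# target_modules are assumptions -- the card does not state the actual values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```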

### Training Metrics

| Metric | Value |
|---|---|
| Final Train Loss | 1.127 |
| Best Eval Loss | 1.225 (step 640/693) |
| Train/Eval Gap | 0.098 |

> A train/eval gap of 0.098 indicates **stable training with no overfitting**.

---

## Dataset Coverage

The training data covers the full lifecycle of a RAG-based compliance assistant:

| Category | Purpose |
|---|---|
| Title Generation | Generate conversation titles from user queries |
| Query Expansion | Enrich queries with regulatory terms for better retrieval |
| Intent Classification | Route queries to RAG vs. conversational responses |
| Document Reranking | Score retrieved documents by relevance |
| Topic Extraction | Extract main topics from regulatory text pages |
| Document Summarization | Summarize multi-page regulatory documents |
| Relevance Filtering | Filter regulatory text relevant to banks |
| Metadata Extraction | Find application dates and issuing authorities |
| Impact Analysis | Cross-reference regulations against internal procedures |
| RAG Q&A + Tool Calling | Multi-turn compliance conversations with tools |

**Regulatory sources covered:** CRR/CRR3, DORA (UE 2022/2554), D.Lgs. 231/2007 (AML), D.Lgs. 385/1993 (TUB), Circolare 285, PSD2, MiFID II/MiFIR, D.P.R. 180/1950 and related Banca d'Italia provisions.

---

## Deployment

### With vLLM

```bash
vllm serve ./models/RegTech-14B-Instruct --dtype bfloat16
```
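Once serving, vLLM exposes an OpenAI-compatible endpoint, so a client call might look like this (a sketch: host, port, and model name follow the defaults of the command above, the `openai` package is assumed, and a running server is required):

```python
from openai import OpenAI

# Assumes the vLLM server started above is running locally on its default port.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="./models/RegTech-14B-Instruct",
    messages=[
        {"role": "system", "content": "Sei un assistente per la compliance bancaria."},
        {"role": "user", "content": "Cosa prevede l'art. 92 CRR?"},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```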

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "YOUR_REPO_ID", torch_dtype="bfloat16", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("YOUR_REPO_ID")

# `messages` is a chat list such as the ones in the Usage Examples above
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Important Notes

- **RAG-optimized:** trained to work with retrieved context, not to memorize regulations. Always provide the relevant documents in the system prompt.
- **Domain-specific:** optimized for Italian banking compliance. General capabilities may differ from the base model.
- **Not legal advice:** a tool to assist compliance professionals, not a substitute for regulatory expertise.
- **Tool schemas:** tool calling works best with the specific function signatures used during training.

---

<p align="center">
Built with ❤️ for banking RAG<br>
<em>Fine-tuned with LoRA • Evaluated by Claude Opus 4.6 • Powered by Qwen2.5</em><br>
<em>Contact for commercial use: https://landing.2sophia.ai</em>
</p>