---
language:
- it
- en
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- lora
- fine-tuned
- banking
- regtech
- compliance
- rag
- tool-calling
- italian
- qwen2.5
pipeline_tag: text-generation
---

# 🏦 RegTech-7B-Instruct

> **Fine-tuned for RAG-powered banking compliance — not general knowledge.**

A specialized [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) model fine-tuned to excel within a **Retrieval-Augmented Generation (RAG) pipeline** for Italian banking regulatory compliance.

This model doesn't try to memorize regulations — it's trained to **work with retrieved context**: follow instructions precisely, produce structured outputs, call compliance tools, and maintain the right tone and terminology when grounded on regulatory documents.

---

## 🎯 What This Model Does

This fine-tuning optimizes the model's **behavior within a RAG system**, not its factual knowledge. Specifically:

| Task | Description |
|---|---|
| 📋 **RAG Q&A** | Answer regulatory questions grounded on retrieved documents |
| 🔧 **Tool Calling** | KYC verification, risk scoring, PEP checks, SOS reporting |
| 🔍 **Query Expansion** | Rewrite user queries with regulatory terminology for better retrieval |
| 🧠 **Intent Detection** | Classify whether a message needs document search or is conversational |
| 📊 **Document Reranking** | Score candidate documents by relevance |
| 📝 **Structured JSON** | Topic extraction, metadata, impact analysis in JSON format |
| ⚖️ **Impact Analysis** | Cross-reference external regulations against internal bank procedures |

---

## 📈 Evaluation — LLM-as-Judge

Evaluated by **Claude Opus 4.6** (Anthropic) across 11 blind test scenarios. The judge compared base vs. fine-tuned model outputs without knowing which was which.

### 🏆 Head-to-Head

```
┌────────────────────────────────────────────┐
│  🟢 Tuned Wins   7/11  (63.6%)             │
│  🔴 Base Wins    3/11  (27.3%)             │
│  ⚪ Ties         1/11   (9.1%)             │
└────────────────────────────────────────────┘
```

### 📊 Quality Scores (1–5)

| Criterion | Base | Tuned | Delta | |
|---|:---:|:---:|:---:|---|
| 🎯 Instruction Following | 3.27 | **4.82** | +1.55 | 🟢🟢🟢 |
| 📎 Context Adherence | 3.64 | **5.00** | +1.36 | 🟢🟢🟢 |
| ✅ Accuracy | 3.73 | **4.73** | +1.00 | 🟢🟢 |
| 📝 Format | 4.09 | **4.64** | +0.55 | 🟢 |
| 🗣️ Tone | 4.45 | **4.73** | +0.28 | 🟢 |
| **📊 Overall** | **3.84** | **4.78** | **+0.95** | **🟢🟢** |

> **Largest improvement across all model sizes.** Instruction following jumps +1.55 and context adherence reaches a perfect 5.00 — the fine-tuning transforms this model's ability to follow retrieved regulatory context.

### 📂 Results by Category

| Category | Base | Tuned | Tie |
|---|:---:|:---:|:---:|
| 🚫 Refusal Handling | 0 | **2** | 0 |
| ⚠️ Edge Cases | 0 | **1** | 0 |
| 🎨 Style & Tone | 0 | **1** | 0 |
| 📀 Data Extraction | 0 | 0 | 1 |
| 📋 JSON Output | 1 | 1 | 0 |
| 📖 RAG Q&A | 1 | 1 | 0 |
| 🔧 Tool Use | 1 | 1 | 0 |

### 🔄 Comparison Across Model Sizes

| Metric | 4B | 7B | 32B |
|---|:---:|:---:|:---:|
| Base score (pre-tuning) | 4.11 | 3.84 | **4.36** |
| Tuned score | 4.68 | **4.78** | **4.80** |
| Delta (improvement) | +0.57 | **+0.95** | +0.44 |
| Best eval loss | 1.191 | 1.330 | **0.813** |
| Token accuracy | ~73% | ~72% | **~81%** |
| Train/eval gap | 0.050 | 0.083 | **0.030** |

> The 7B shows the **highest delta** (+0.95) — it benefits the most from fine-tuning, reaching near-parity with the tuned 32B (4.78 vs. 4.80).

---

## 💡 Usage Examples

### 📋 RAG Q&A — Answering from Retrieved Context

The model is designed to receive **retrieved regulatory documents as context** and answer based on them:

```python
messages = [
    {
        "role": "system",
        "content": """Sei un assistente per la compliance bancaria.
Rispondi SOLO basandoti sul contesto fornito.

<contesto_recuperato>
Art. 92 CRR - Gli enti soddisfano in qualsiasi momento i seguenti
requisiti: a) CET1 del 4,5%; b) Tier 1 del 6%; c) capitale totale dell'8%.
Il coefficiente è calcolato come rapporto tra i fondi propri e
l'importo complessivo dell'esposizione al rischio.
</contesto_recuperato>"""
    },
    {
        "role": "user",
        "content": "Quali sono i requisiti minimi di capitale secondo il CRR?"
    }
]
```

### 🔍 Query Expansion — Improving RAG Retrieval

```python
messages = [
    {
        "role": "system",
        "content": "Riscrivi la query dell'utente in una versione più ricca per migliorare il recupero documentale (RAG). Aggiungi termini tecnici e riferimenti normativi. Rispondi SOLO con il JSON richiesto."
    },
    {
        "role": "user",
        "content": "## QUERY ORIGINALE: [obblighi segnalazione operazioni sospette]"
    }
]

# Expected output:
# {"query": "obblighi segnalazione operazioni sospette SOS UIF D.Lgs. 231/2007
#  art. 35 riciclaggio finanziamento terrorismo portale RADAR tempistiche
#  invio indicatori anomalia"}
```

### 🔧 Tool Calling — Compliance Workflows

```python
messages = [
    {
        "role": "system",
        "content": """Sei un assistente operativo per la compliance.

<tools>
{"name": "calcola_scoring_rischio", "parameters": {...}}
{"name": "controlla_liste_pep", "parameters": {...}}
{"name": "verifica_kyc", "parameters": {...}}
</tools>

<contesto_recuperato>
Procedura AML-003: L'adeguata verifica rafforzata (EDD) deve essere
applicata per PEP, paesi ad alto rischio e profili con scoring > 60.
</contesto_recuperato>"""
    },
    {
        "role": "user",
        "content": "Devo aprire un conto per una società con sede a Dubai. Il legale rappresentante è il sig. Al-Rashid."
    }
]

# The model will:
# 1. Call controlla_liste_pep for the representative
# 2. Call calcola_scoring_rischio based on risk factors
# 3. Recommend the EDD procedure per AML-003, grounded on the retrieved policy
```

### 📊 Document Reranking

```python
messages = [
    {
        "role": "system",
        "content": "Valuta la rilevanza di ciascun candidato rispetto alla query. Restituisci solo i candidati rilevanti con score 0-100. Rispondi SOLO con il JSON richiesto."
    },
    {
        "role": "user",
        "content": '{"query": "requisiti CET1 fondi propri", "candidates": [{"id": "doc_001", "title": "Art. 92 CRR", "content": "..."}, {"id": "doc_002", "title": "DORA Art. 5", "content": "..."}]}'
    }
]

# Expected: {"matches": [{"id": "doc_001", "relevance": 95}]}
```
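
### 🧠 Intent Detection — Routing Queries

The intent-detection task follows the same pattern as the examples above. The exact system prompt and label set used in training are not documented in this card, so the prompt wording and the `rag`/`conversational` labels below are illustrative assumptions:

```python
messages = [
    {
        "role": "system",
        # Illustrative prompt -- the exact wording and label set used in
        # training are not documented here.
        "content": "Classifica il messaggio dell'utente: 'rag' se richiede una ricerca documentale, 'conversational' se è semplice conversazione. Rispondi SOLO con il JSON richiesto."
    },
    {
        "role": "user",
        "content": "Quali sanzioni prevede il D.Lgs. 231/2007 per omessa segnalazione?"
    }
]

# Plausible output for a query that needs document retrieval:
# {"intent": "rag"}
```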

---

## ⚙️ Training Details

| | |
|---|---|
| 🧬 **Method** | LoRA — bf16 full precision (no quantization) |
| 🏗️ **Base Model** | Qwen2.5-7B-Instruct |
| 📦 **Dataset** | 923 train / 102 eval samples |
| ⏱️ **Duration** | 13.2 minutes |

### Hyperparameters

| Parameter | Value |
|---|---|
| LoRA Rank / Alpha | 16 / 32 |
| LoRA Dropout | 0.10 |
| Target Modules | q, k, v, o, gate, up, down proj |
| Learning Rate | 5e-6 (cosine scheduler) |
| Epochs | 3 |
| Effective Batch Size | 4 (2 × 2 accum.) |
| Max Sequence Length | 4096 |
| NEFTune Alpha | 5.0 |
| Warmup Ratio | 0.05 |
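
For convenience, the table above restated as code: a sketch of how these values would map onto a `peft`-style LoRA configuration. The dict is illustrative only; the actual training script is not part of this card.

```python
# Hyperparameters from the table above, restated as a peft-style dict.
# Illustrative only -- not the exact training configuration file.
lora_settings = {
    "r": 16,                  # LoRA rank
    "lora_alpha": 32,         # scaling factor (alpha / r = 2.0)
    "lora_dropout": 0.10,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# Effective batch size = per-device batch * gradient accumulation steps
effective_batch_size = 2 * 2  # 4
```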

### 📉 Training Metrics

| Metric | Value |
|---|---|
| Final Train Loss | 1.247 |
| Best Eval Loss | 1.330 (step 680/693) |
| Train/Eval Gap | 0.083 ✅ |

> A gap of 0.083 indicates **stable training with no overfitting**.
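
The quoted gap is simply the difference between the two losses in the table:

```python
final_train_loss = 1.247
best_eval_loss = 1.330

# Train/eval gap used as a quick overfitting check
gap = round(best_eval_loss - final_train_loss, 3)
print(gap)  # 0.083
```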

---

## 📚 Dataset Coverage

The training data covers the full lifecycle of a RAG-based compliance assistant:

| Category | Purpose |
|---|---|
| 🏷️ Title Generation | Generate conversation titles from user queries |
| 🔍 Query Expansion | Enrich queries with regulatory terms for better retrieval |
| 🧠 Intent Classification | Route queries to RAG vs. conversational responses |
| 📊 Document Reranking | Score retrieved documents by relevance |
| 📝 Topic Extraction | Extract main topics from regulatory text pages |
| 📖 Document Summarization | Summarize multi-page regulatory documents |
| ⚖️ Relevance Filtering | Filter regulatory text relevant to banks |
| 📅 Metadata Extraction | Find application dates and issuing authorities |
| 🔧 Impact Analysis | Cross-reference regulations vs. internal procedures |
| 💬 RAG Q&A + Tool Calling | Multi-turn compliance conversations with tools |

**Regulatory sources covered:** CRR/CRR3, DORA (EU 2022/2554), D.Lgs. 231/2007 (AML), D.Lgs. 385/1993 (TUB), Circolare 285, PSD2, MiFID II/MiFIR, D.P.R. 180/1950, and related Banca d'Italia provisions.

---

## 🚀 Deployment

### With vLLM
```bash
vllm serve ./models/RegTech-7B-Instruct --dtype bfloat16
```
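
Once the server is up, vLLM exposes an OpenAI-compatible endpoint at `/v1/chat/completions`. A minimal stdlib-only sketch of a request, assuming the default port and the model path from the `vllm serve` command above:

```python
import json
import urllib.request

# Build an OpenAI-compatible chat request for the vLLM server.
payload = {
    "model": "./models/RegTech-7B-Instruct",  # must match the served model path
    "messages": [
        {"role": "system", "content": "Sei un assistente per la compliance bancaria."},
        {"role": "user", "content": "Quali sono i requisiti minimi di capitale secondo il CRR?"},
    ],
    "max_tokens": 512,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",  # vLLM default port is 8000
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# With the server running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```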

### With Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "YOUR_REPO_ID", torch_dtype="bfloat16", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("YOUR_REPO_ID")

# `messages` is a chat list as in the usage examples above
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## ⚠️ Important Notes

- 🎯 **RAG-optimized** — trained to work with retrieved context, not to memorize regulations. Always provide relevant documents in the system prompt.
- 🏦 **Domain-specific** — optimized for Italian banking compliance. General capabilities may differ from the base model.
- ⚖️ **Not legal advice** — a tool to assist compliance professionals, not a substitute for regulatory expertise.
- 🔧 **Tool schemas** — tool calling works best with the specific function signatures used during training.
- 🏆 **Best cost/performance ratio** — shows the largest improvement from fine-tuning (+0.95 delta) while reaching near-parity with the 32B model.

---

<p align="center">
Built with ❤️ for banking RAG<br>
<em>Fine-tuned with LoRA • Evaluated by Claude Opus 4.6 • Powered by Qwen2.5</em><br>
<em>Contact for commercial use: https://landing.2sophia.ai</em>
</p>