	---
	language:
	- it
	- en
	license: apache-2.0
	library_name: transformers
	base_model: Qwen/Qwen2.5-32B-Instruct
	tags:
	- lora
	- fine-tuned
	- banking
	- regtech
	- compliance
	- rag
	- tool-calling
	- italian
	- qwen2.5
	pipeline_tag: text-generation
	---

	# 🏦 RegTech-32B-Instruct

	> Fine-tuned for RAG-powered banking compliance — not general knowledge.

	A specialized [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) model fine-tuned to excel within a Retrieval-Augmented Generation (RAG) pipeline for Italian banking regulatory compliance.

	This model doesn't try to memorize regulations — it's trained to work with retrieved context: follow instructions precisely, produce structured outputs, call compliance tools, and maintain the right tone and terminology when grounded on regulatory documents.

	---

	## 🎯 What This Model Does

	This fine-tuning optimizes the model's behavior within a RAG system, not its factual knowledge. Specifically:

	| Task | Description |
	|---|---|
	| 📋 RAG Q&A | Answer regulatory questions grounded on retrieved documents |
	| 🔧 Tool Calling | KYC verification, risk scoring, PEP checks, SOS reporting |
	| 🔍 Query Expansion | Rewrite user queries with regulatory terminology for better retrieval |
	| 🧠 Intent Detection | Classify if a message needs document search or is conversational |
	| 📊 Document Reranking | Score candidate documents by relevance |
	| 📝 Structured JSON | Topic extraction, metadata, impact analysis in JSON format |
	| ⚖️ Impact Analysis | Cross-reference external regulations against internal bank procedures |

	---

	## 📈 Evaluation — LLM-as-Judge

	Evaluated by Claude Opus 4.6 (Anthropic) across 11 blind test scenarios. The judge compared base vs fine-tuned model outputs without knowing which was which.

	### 🏆 Head-to-Head

	```
	┌──────────────────────────────┐
	│ 🟢 Tuned Wins  7/11 (63.6%)  │
	│ 🔴 Base Wins   3/11 (27.3%)  │
	│ ⚪ Ties        1/11  (9.1%)  │
	└──────────────────────────────┘
	```
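
The 7/3/1 head-to-head counts can be expressed as percentages in two ways: as a raw share of the 11 scenarios, or with each tie counted as half a win for both sides. The sketch below computes both. The helper name `win_rates` is illustrative and not part of the evaluation harness. The raw convention gives 63.6% vs 27.3%; the tie-split convention gives 68.2% vs 31.8%.

```python
def win_rates(wins, losses, ties):
    """Return (raw %, tie-split %) for one side of a head-to-head comparison."""
    total = wins + losses + ties
    raw = 100 * wins / total
    # Tie-split convention: each tie counts as half a win.
    tie_split = 100 * (wins + 0.5 * ties) / total
    return round(raw, 1), round(tie_split, 1)


tuned_raw, tuned_split = win_rates(wins=7, losses=3, ties=1)
base_raw, base_split = win_rates(wins=3, losses=7, ties=1)
print(f"tuned: raw {tuned_raw}%, tie-split {tuned_split}%")
print(f"base:  raw {base_raw}%, tie-split {base_split}%")
```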

	### 📊 Quality Scores (1–5)

	| Criterion | Base | Tuned | Delta | Gain |
	|---|:---:|:---:|:---:|---|
	| 🎯 Instruction Following | 4.00 | 4.82 | +0.82 | 🟢🟢🟢 |
	| 📎 Context Adherence | 4.36 | 4.91 | +0.55 | 🟢🟢 |
	| ✅ Accuracy | 4.27 | 4.73 | +0.46 | 🟢 |
	| 📐 Format | 4.36 | 4.55 | +0.19 | ➖ |
	| 🗣️ Tone | 4.82 | 5.00 | +0.18 | ➖ |
	| 📊 Overall | 4.36 | 4.80 | +0.44 | 🟢 |

	> The biggest gains are in instruction following (+0.82) and context adherence (+0.55) — exactly what matters when the model must follow retrieved regulatory context faithfully. Tone reaches a perfect 5.00.
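
The Overall row is consistent with a plain mean of the five criterion scores. The card does not say how it was computed, so the following is only a quick check under that assumption, using the values from the table:

```python
# Scores copied from the quality table: (base, tuned) per criterion.
scores = {
    "instruction_following": (4.00, 4.82),
    "context_adherence": (4.36, 4.91),
    "accuracy": (4.27, 4.73),
    "format": (4.36, 4.55),
    "tone": (4.82, 5.00),
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Assumption: "Overall" is the unweighted mean of the five criteria.
base_overall = round(mean(base for base, _ in scores.values()), 2)
tuned_overall = round(mean(tuned for _, tuned in scores.values()), 2)
deltas = {name: round(tuned - base, 2) for name, (base, tuned) in scores.items()}

print(base_overall, tuned_overall)
print(deltas)
```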

	### 📂 Results by Category

	| Category | Base | Tuned | Tie |
	|---|:---:|:---:|:---:|
	| 🚫 Refusal Handling | 0 | 1 | 1 |
	| 🎨 Style & Tone | 0 | 1 | 0 |
	| 📤 Data Extraction | 0 | 1 | 0 |
	| ⚠️ Edge Cases | 0 | 1 | 0 |
	| 📋 JSON Output | 1 | 1 | 0 |
	| 📖 RAG Q&A | 1 | 1 | 0 |
	| 🔧 Tool Use | 1 | 1 | 0 |

	### 🔄 Comparison with RegTech-4B-Instruct

	| Metric | 4B | 32B |
	|---|:---:|:---:|
	| Base score (pre-tuning) | 4.11 | 4.36 |
	| Tuned score | 4.68 | 4.80 |
	| Best eval loss | 1.191 | 0.813 |
	| Token accuracy | ~73% | ~81% |
	| Train/eval gap | 0.050 | 0.030 |

	---

	## 💡 Usage Examples

	### 📋 RAG Q&A — Answering from Retrieved Context

	The model is designed to receive retrieved regulatory documents as context and answer based on them:

	```python
	messages = [
	    {
	        "role": "system",
	        "content": """Sei un assistente per la compliance bancaria.
	Rispondi SOLO basandoti sul contesto fornito.

	<contesto_recuperato>
	Art. 92 CRR - Gli enti soddisfano in qualsiasi momento i seguenti
	requisiti: a) CET1 del 4,5%; b) Tier 1 del 6%; c) capitale totale dell'8%.
	Il coefficiente è calcolato come rapporto tra i fondi propri e
	l'importo complessivo dell'esposizione al rischio.
	</contesto_recuperato>""",
	    },
	    {
	        "role": "user",
	        "content": "Quali sono i requisiti minimi di capitale secondo il CRR?",
	    },
	]
	```
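
In a live pipeline the context block is filled in by the retriever at request time. A minimal sketch of that assembly step follows, assuming the `<contesto_recuperato>` tag layout shown above. The function name `build_rag_messages` and its parameters are illustrative and not part of any released tooling.

```python
DEFAULT_INSTRUCTIONS = (
    "Sei un assistente per la compliance bancaria.\n"
    "Rispondi SOLO basandoti sul contesto fornito."
)


def build_rag_messages(chunks, question, instructions=DEFAULT_INSTRUCTIONS):
    """Wrap retrieved chunks in <contesto_recuperato> and build a chat message list."""
    # Drop empty chunks and separate the rest with a blank line.
    context = "\n\n".join(chunk.strip() for chunk in chunks if chunk.strip())
    system = (
        f"{instructions}\n\n"
        f"<contesto_recuperato>\n{context}\n</contesto_recuperato>"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]


messages = build_rag_messages(
    chunks=["Art. 92 CRR - Gli enti soddisfano in qualsiasi momento i seguenti requisiti: ..."],
    question="Quali sono i requisiti minimi di capitale secondo il CRR?",
)
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` or to an OpenAI-compatible chat endpoint.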

	### 🔍 Query Expansion — Improving RAG Retrieval

	```python
	messages = [
	    {
	        "role": "system",
	        "content": "Riscrivi la query dell'utente in una versione più ricca per migliorare il recupero documentale (RAG). Aggiungi termini tecnici e riferimenti normativi. Rispondi SOLO con il JSON richiesto.",
	    },
	    {
	        "role": "user",
	        "content": "## QUERY ORIGINALE: [obblighi segnalazione operazioni sospette]",
	    },
	]

	# Expected output:
	# {"query": "obblighi segnalazione operazioni sospette SOS UIF D.Lgs. 231/2007
	# art. 35 riciclaggio finanziamento terrorismo portale RADAR tempistiche
	# invio indicatori anomalia"}
	```
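
Since the expanded query feeds straight into retrieval, it helps to validate the JSON and fall back to the original query if parsing fails. A minimal sketch, assuming the single-key `{"query": ...}` shape shown above (the helper name `parse_expanded_query` is illustrative):

```python
import json


def parse_expanded_query(raw_output, fallback):
    """Return the expanded query, or `fallback` if the output is not usable JSON."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return fallback
    query = data.get("query") if isinstance(data, dict) else None
    if isinstance(query, str) and query.strip():
        return query.strip()
    return fallback


original = "obblighi segnalazione operazioni sospette"
raw = '{"query": "obblighi segnalazione operazioni sospette SOS UIF D.Lgs. 231/2007"}'
search_query = parse_expanded_query(raw, fallback=original)
```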

	### 🔧 Tool Calling — Compliance Workflows

	```python
	messages = [
	    {
	        "role": "system",
	        "content": """Sei un assistente operativo per la compliance.

	<tools>
	{"name": "calcola_scoring_rischio", "parameters": {...}}
	{"name": "controlla_liste_pep", "parameters": {...}}
	{"name": "verifica_kyc", "parameters": {...}}
	</tools>

	<contesto_recuperato>
	Procedura AML-003: L'adeguata verifica rafforzata (EDD) deve essere
	applicata per PEP, paesi ad alto rischio e profili con scoring > 60.
	</contesto_recuperato>""",
	    },
	    {
	        "role": "user",
	        "content": "Devo aprire un conto per una società con sede a Dubai. Il legale rappresentante è il sig. Al-Rashid.",
	    },
	]

	# The model is expected to:
	# 1. Call controlla_liste_pep for the representative
	# 2. Call calcola_scoring_rischio based on risk factors
	# 3. Recommend EDD procedure per AML-003, grounded on retrieved policy
	```
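
On the application side, the emitted calls need to be parsed and routed to real functions. The sketch below assumes the fine-tuned model keeps the base Qwen2.5 convention of wrapping each call in `<tool_call>...</tool_call>` around a JSON object with `name` and `arguments`; this card does not confirm the output format. The helpers `extract_tool_calls` and `dispatch`, the stub handler, and the argument name `nominativo` are all hypothetical, since the real parameter schemas are elided above.

```python
import json
import re

# Assumption: Qwen2.5-style tool-call tags around a JSON object.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return the well-formed tool calls found in a model response."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            call = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed JSON rather than crash the workflow
        if isinstance(call, dict) and "name" in call:
            calls.append({"name": call["name"], "arguments": call.get("arguments", {})})
    return calls


def dispatch(calls, registry):
    """Run each call against a {tool_name: callable} registry."""
    results = []
    for call in calls:
        handler = registry.get(call["name"])
        if handler is None:
            results.append({"name": call["name"], "error": "unknown tool"})
        else:
            results.append({"name": call["name"], "result": handler(**call["arguments"])})
    return results


# Hypothetical response and stub handler, for illustration only.
response = (
    '<tool_call>\n{"name": "controlla_liste_pep", '
    '"arguments": {"nominativo": "Al-Rashid"}}\n</tool_call>'
)
registry = {"controlla_liste_pep": lambda nominativo: {"pep": False, "checked": nominativo}}
results = dispatch(extract_tool_calls(response), registry)
```

Each result would typically be appended to the conversation as a tool message before asking the model for its final recommendation.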

	### 📊 Document Reranking

	```python
	messages = [
	    {
	        "role": "system",
	        "content": "Valuta la rilevanza di ciascun candidato rispetto alla query. Restituisci solo i candidati rilevanti con score 0-100. Rispondi SOLO con il JSON richiesto.",
	    },
	    {
	        "role": "user",
	        "content": '{"query": "requisiti CET1 fondi propri", "candidates": [{"id": "doc_001", "title": "Art. 92 CRR", "content": "..."}, {"id": "doc_002", "title": "DORA Art. 5", "content": "..."}]}',
	    },
	]

	# Expected: {"matches": [{"id": "doc_001", "relevance": 95}]}
	```
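
The reranker output then has to be mapped back onto the candidate list. A minimal sketch, assuming the `{"matches": [{"id", "relevance"}]}` shape shown above. The function name `apply_rerank` and the `min_relevance` cutoff of 50 are illustrative choices, not values from the training setup.

```python
import json


def apply_rerank(raw_output, candidates, min_relevance=50):
    """Return candidates the model marked relevant, highest relevance first."""
    matches = json.loads(raw_output).get("matches", [])
    by_id = {candidate["id"]: candidate for candidate in candidates}
    kept = [
        (match["relevance"], by_id[match["id"]])
        for match in matches
        # Ignore IDs the model invented and scores below the cutoff.
        if match.get("id") in by_id and match.get("relevance", 0) >= min_relevance
    ]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in kept]


candidates = [
    {"id": "doc_001", "title": "Art. 92 CRR", "content": "..."},
    {"id": "doc_002", "title": "DORA Art. 5", "content": "..."},
]
ranked = apply_rerank('{"matches": [{"id": "doc_001", "relevance": 95}]}', candidates)
```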

	---

	## ⚙️ Training Details

	| Setting | Value |
	|---|---|
	| 🧬 Method | LoRA in bf16 (no quantization) |
	| 🏗️ Base Model | Qwen2.5-32B-Instruct |
	| 📦 Dataset | 923 train / 102 eval samples |
	| ⏱️ Duration | 40.0 minutes |

	### 📉 Training Metrics

	| Metric | Value |
	|---|---|
	| Final Train Loss | 0.843 |
	| Best Eval Loss | 0.813 (step 640/693) |
	| Train/Eval Gap | 0.030 ✅ |

	> A train/eval gap of 0.030, with eval loss slightly below train loss, suggests stable training without overfitting on this eval split.

	---

	## 📚 Dataset Coverage

	The training data covers the full lifecycle of a RAG-based compliance assistant:

	| Category | Purpose |
	|---|---|
	| 🏷️ Title Generation | Generate conversation titles from user queries |
	| 🔍 Query Expansion | Enrich queries with regulatory terms for better retrieval |
	| 🧠 Intent Classification | Route queries to RAG vs conversational responses |
	| 📊 Document Reranking | Score retrieved documents by relevance |
	| 📝 Topic Extraction | Extract main topics from regulatory text pages |
	| 📖 Document Summarization | Summarize multi-page regulatory documents |
	| ⚖️ Relevance Filtering | Filter regulatory text relevant to banks |
	| 📅 Metadata Extraction | Find application dates, issuing authorities |
	| 🔧 Impact Analysis | Cross-reference regulations vs internal procedures |
	| 💬 RAG Q&A + Tool Calling | Multi-turn compliance conversations with tools |

	Regulatory sources covered: CRR/CRR3, DORA (Regulation (EU) 2022/2554), D.Lgs. 231/2007 (AML), D.Lgs. 385/1993 (TUB), Circolare 285, PSD2, MiFID II/MiFIR, D.P.R. 180/1950, and related Banca d'Italia provisions.

	---

	## 🚀 Deployment

	### With vLLM
	```bash
	vllm serve ./models/RegTech-32B-Instruct --dtype bfloat16
	```
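
Once the server is up, it can be queried through vLLM's OpenAI-compatible chat endpoint. The sketch below uses only the standard library and rests on two assumptions to adjust for your deployment: the default port (8000), and a served model name equal to the path passed to `vllm serve`. The helper names `build_payload` and `chat` are illustrative.

```python
import json
import urllib.request

# Assumptions: default vLLM port, and model name equal to the served path.
BASE_URL = "http://localhost:8000/v1"
MODEL_NAME = "./models/RegTech-32B-Instruct"


def build_payload(messages, max_tokens=512, temperature=0.0):
    """Build the JSON body for a /chat/completions request."""
    return {
        "model": MODEL_NAME,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def chat(messages, base_url=BASE_URL, timeout=120):
    """POST the messages to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(messages)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `chat(messages)` with any of the message lists from the usage examples returns the model's reply as a string.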

	### With Transformers
	```python
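	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Replace YOUR_REPO_ID with the Hub repo ID or a local path to the model.
	model = AutoModelForCausalLM.from_pretrained("YOUR_REPO_ID", torch_dtype="bfloat16", device_map="auto")
	tokenizer = AutoTokenizer.from_pretrained("YOUR_REPO_ID")

	# `messages` is a chat list like the ones in the usage examples above.
	text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	inputs = tokenizer(text, return_tensors="pt").to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=512)

	# Decode only the newly generated tokens, skipping the echoed prompt.
	new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
	print(tokenizer.decode(new_tokens, skip_special_tokens=True))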
	```

	---

	## ⚠️ Important Notes

	- 🎯 RAG-optimized — trained to work with retrieved context, not to memorize regulations. Always provide relevant documents in the system prompt.
	- 🏦 Domain-specific — optimized for Italian banking compliance. General capabilities may differ from the base model.
	- ⚖️ Not legal advice — a tool to assist compliance professionals, not a substitute for regulatory expertise.
	- 🔧 Tool schemas — tool calling works best with the specific function signatures used during training.

	---

	<p align="center">
	Built with ❤️ for banking RAG<br>
	<em>Fine-tuned with LoRA • Evaluated by Claude Opus 4.6 • Powered by Qwen2.5</em><br>
	<em>Contact for commercial use: https://landing.2sophia.ai</em>
	</p>