---
language:
- it
- en
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen3-4B-Instruct-2507
tags:
- lora
- fine-tuned
- banking
- regtech
- compliance
- rag
- tool-calling
- italian
- qwen3
pipeline_tag: text-generation
---

# 🏦 RegTech-4B-Instruct

> **Fine-tuned for RAG-powered banking compliance — not general knowledge.**

A specialized [Qwen3-4B-Instruct](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) model fine-tuned to excel within a **Retrieval-Augmented Generation (RAG) pipeline** for Italian banking regulatory compliance.

This model doesn't try to memorize regulations — it is trained to **work with retrieved context**: follow instructions precisely, produce structured outputs, call compliance tools, and maintain the right tone and terminology when grounded in regulatory documents.

---

## 🎯 What This Model Does

This fine-tuning optimizes the model's **behavior within a RAG system**, not its factual knowledge. Specifically:

| Task | Description |
|---|---|
| 📋 **RAG Q&A** | Answer regulatory questions grounded in retrieved documents |
| 🔧 **Tool Calling** | KYC verification, risk scoring, PEP checks, SOS reporting |
| 🔍 **Query Expansion** | Rewrite user queries with regulatory terminology for better retrieval |
| 🧠 **Intent Detection** | Classify whether a message needs document search or is conversational |
| 📊 **Document Reranking** | Score candidate documents by relevance |
| 📝 **Structured JSON** | Topic extraction, metadata, impact analysis in JSON format |
| ⚖️ **Impact Analysis** | Cross-reference external regulations against internal bank procedures |

---

## 📈 Evaluation — LLM-as-Judge

Evaluated by **Claude Opus 4.6** (Anthropic) across 11 blind test scenarios. The judge compared base and fine-tuned model outputs without knowing which was which.

### 🏆 Head-to-Head

```
┌──────────────────────────────────────┐
│  🟢 Tuned Wins   7/11  (63.6%)       │
│  🔴 Base Wins    3/11  (27.3%)       │
│  ⚪ Ties          1/11   (9.1%)       │
└──────────────────────────────────────┘
```

### 📊 Quality Scores (1–5)

| Criterion | Base | Tuned | Delta | |
|---|:---:|:---:|:---:|---|
| 🎯 Instruction Following | 3.64 | **4.55** | +0.91 | 🟢🟢🟢 |
| 📎 Context Adherence | 4.09 | **4.82** | +0.73 | 🟢🟢 |
| ✅ Accuracy | 4.18 | **4.64** | +0.46 | 🟢 |
| 📝 Format | 4.09 | **4.55** | +0.46 | 🟢 |
| 🗣️ Tone | 4.55 | **4.82** | +0.27 | 🟢 |
| **📊 Overall** | **4.11** | **4.68** | **+0.57** | **🟢** |

> The biggest gains are in **instruction following** (+0.91) and **context adherence** (+0.73) — exactly what matters when the model must follow retrieved regulatory context faithfully.

### 📂 Results by Category

| Category | Base | Tuned | Tie |
|---|:---:|:---:|:---:|
| 🔧 Tool Use | 0 | **2** | 0 |
| 🚫 Refusal Handling | 0 | **1** | 1 |
| 🎨 Style & Tone | 0 | **1** | 0 |
| 📤 Data Extraction | 0 | **1** | 0 |
| 📋 JSON Output | 1 | 1 | 0 |
| 📖 RAG Q&A | 1 | 1 | 0 |
| ⚠️ Edge Cases | 1 | 0 | 0 |

---

## 💡 Usage Examples

### 📋 RAG Q&A — Answering from Retrieved Context

The model is designed to receive **retrieved regulatory documents as context** and answer based on them:

```python
messages = [
    {
        "role": "system",
        "content": """Sei un assistente per la compliance bancaria.
Rispondi SOLO basandoti sul contesto fornito.

<contesto_recuperato>
Art. 92 CRR - Gli enti soddisfano in qualsiasi momento i seguenti
requisiti: a) CET1 del 4,5%; b) Tier 1 del 6%; c) capitale totale dell'8%.
Il coefficiente è calcolato come rapporto tra i fondi propri e
l'importo complessivo dell'esposizione al rischio.
</contesto_recuperato>"""
    },
    {
        "role": "user",
        "content": "Quali sono i requisiti minimi di capitale secondo il CRR?"
    }
]
```

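Assembling that system prompt from retriever output can be automated. A minimal sketch, assuming retrieved chunks arrive as plain strings — the `build_system_prompt` helper is illustrative, not part of the model's API:

```python
# Illustrative helper (an assumption, not shipped with the model):
# wraps retrieved chunks in the <contesto_recuperato> tags used above.
def build_system_prompt(chunks):
    context = "\n\n".join(chunks)
    return (
        "Sei un assistente per la compliance bancaria.\n"
        "Rispondi SOLO basandoti sul contesto fornito.\n\n"
        f"<contesto_recuperato>\n{context}\n</contesto_recuperato>"
    )

prompt = build_system_prompt(["Art. 92 CRR - Gli enti soddisfano ..."])
messages = [
    {"role": "system", "content": prompt},
    {"role": "user", "content": "Quali sono i requisiti minimi di capitale?"},
]
```

Keeping the tag format identical to the training prompts should help the model stay grounded in the provided context.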
### 🔍 Query Expansion — Improving RAG Retrieval

```python
messages = [
    {
        "role": "system",
        "content": "Riscrivi la query dell'utente in una versione più ricca per migliorare il recupero documentale (RAG). Aggiungi termini tecnici e riferimenti normativi. Rispondi SOLO con il JSON richiesto."
    },
    {
        "role": "user",
        "content": "## QUERY ORIGINALE: [obblighi segnalazione operazioni sospette]"
    }
]

# Expected output:
# {"query": "obblighi segnalazione operazioni sospette SOS UIF D.Lgs. 231/2007
#  art. 35 riciclaggio finanziamento terrorismo portale RADAR tempistiche
#  invio indicatori anomalia"}
```

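The model is instructed to reply with JSON only, but small models occasionally wrap the JSON in extra prose, so a defensive parser is worth having. A hedged sketch (the helper name is an assumption):

```python
import json
import re

def extract_json(text):
    """Parse a model reply as JSON, falling back to the first {...}
    span if the model wrapped the JSON in surrounding text."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match:
            return json.loads(match.group(0))
        raise

# Works whether the reply is bare JSON or JSON embedded in prose:
reply = 'Ecco il JSON: {"query": "obblighi segnalazione operazioni sospette SOS UIF"}'
expanded = extract_json(reply)["query"]
```

The expanded query string then replaces the original user query in the retrieval step.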
### 🔧 Tool Calling — Compliance Workflows

```python
messages = [
    {
        "role": "system",
        "content": """Sei un assistente operativo per la compliance.

<tools>
{"name": "calcola_scoring_rischio", "parameters": {...}}
{"name": "controlla_liste_pep", "parameters": {...}}
{"name": "verifica_kyc", "parameters": {...}}
</tools>

<contesto_recuperato>
Procedura AML-003: L'adeguata verifica rafforzata (EDD) deve essere
applicata per PEP, paesi ad alto rischio e profili con scoring > 60.
</contesto_recuperato>"""
    },
    {
        "role": "user",
        "content": "Devo aprire un conto per una società con sede a Dubai. Il legale rappresentante è il sig. Al-Rashid."
    }
]

# The model will:
# 1. Call controlla_liste_pep for the representative
# 2. Call calcola_scoring_rischio based on risk factors
# 3. Recommend the EDD procedure per AML-003, grounded in the retrieved policy
```

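Qwen3-family models typically emit tool invocations as JSON wrapped in `<tool_call>` tags. A hedged sketch of extracting them on the application side (verify the exact format against your chat template before relying on it):

```python
import json
import re

def parse_tool_calls(text):
    """Extract Qwen-style <tool_call>{...}</tool_call> blocks from a reply."""
    pattern = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
    return [json.loads(m) for m in pattern.findall(text)]

# Example reply in the format Qwen3 models usually produce:
reply = (
    '<tool_call>\n{"name": "controlla_liste_pep", '
    '"arguments": {"nominativo": "Al-Rashid"}}\n</tool_call>'
)
calls = parse_tool_calls(reply)
```

Each parsed call can then be dispatched to the matching backend function, with the result fed back to the model as a tool message.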
### 📊 Document Reranking

```python
messages = [
    {
        "role": "system",
        "content": "Valuta la rilevanza di ciascun candidato rispetto alla query. Restituisci solo i candidati rilevanti con score 0-100. Rispondi SOLO con il JSON richiesto."
    },
    {
        "role": "user",
        "content": '{"query": "requisiti CET1 fondi propri", "candidates": [{"id": "doc_001", "title": "Art. 92 CRR", "content": "..."}, {"id": "doc_002", "title": "DORA Art. 5", "content": "..."}]}'
    }
]

# Expected: {"matches": [{"id": "doc_001", "relevance": 95}]}
```

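Downstream, the reranker's `matches` can be used to filter and reorder the original candidate list. A minimal sketch (the function name and the 50-point cutoff are assumptions, not part of the model's output contract):

```python
# Illustrative post-processing: keep only documents the reranker scored
# above a threshold, ordered by descending relevance.
def apply_reranking(candidates, matches, threshold=50):
    scores = {m["id"]: m["relevance"] for m in matches}
    kept = [c for c in candidates if scores.get(c["id"], 0) >= threshold]
    return sorted(kept, key=lambda c: scores[c["id"]], reverse=True)

candidates = [
    {"id": "doc_001", "title": "Art. 92 CRR"},
    {"id": "doc_002", "title": "DORA Art. 5"},
]
reranked = apply_reranking(candidates, [{"id": "doc_001", "relevance": 95}])
# → [{"id": "doc_001", "title": "Art. 92 CRR"}]
```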
---

## ⚙️ Training Details

| | |
|---|---|
| 🧬 **Method** | LoRA — bf16 full precision (no quantization) |
| 🏗️ **Base Model** | Qwen3-4B-Instruct-2507 |
| 📦 **Dataset** | 923 train / 102 eval samples |
| ⏱️ **Duration** | 11.9 minutes |

### Hyperparameters

| Parameter | Value |
|---|---|
| LoRA Rank / Alpha | 16 / 32 |
| LoRA Dropout | 0.10 |
| Target Modules | q, k, v, o, gate, up, down proj |
| Learning Rate | 5e-6 (cosine scheduler) |
| Epochs | 3 |
| Effective Batch Size | 4 (2 × 2 accumulation) |
| Max Sequence Length | 4096 |
| NEFTune Alpha | 5.0 |
| Warmup Ratio | 0.05 |

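The LoRA rows of this table map directly onto a peft `LoraConfig`. A hedged sketch for reproducing the setup — this is an assumption about the original training code, and the trainer-side settings (learning rate, scheduler, epochs, batch size, NEFTune) live in the trainer configuration (e.g. TRL's `SFTConfig`), not here:

```python
# Sketch only (assumes `pip install peft`); reproduces the LoRA rows above.
# Trainer-side settings from the table — lr 5e-6 cosine, 3 epochs,
# batch 2 × grad-accum 2, max_seq_length 4096, neftune_noise_alpha 5.0,
# warmup_ratio 0.05 — belong in the trainer config (e.g. TRL SFTConfig).
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.10,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```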
### 📉 Training Metrics

| Metric | Value |
|---|---|
| Final Train Loss | 1.241 |
| Best Eval Loss | 1.191 (step 680/693) |
| Train/Eval Gap | 0.050 ✅ |

> A train/eval gap of 0.050 suggests **stable training with no sign of overfitting**.

---

## 📚 Dataset Coverage

The training data covers the full lifecycle of a RAG-based compliance assistant:

| Category | Purpose |
|---|---|
| 🏷️ Title Generation | Generate conversation titles from user queries |
| 🔍 Query Expansion | Enrich queries with regulatory terms for better retrieval |
| 🧠 Intent Classification | Route queries to RAG vs. conversational responses |
| 📊 Document Reranking | Score retrieved documents by relevance |
| 📝 Topic Extraction | Extract main topics from regulatory text pages |
| 📖 Document Summarization | Summarize multi-page regulatory documents |
| ⚖️ Relevance Filtering | Filter regulatory text relevant to banks |
| 📅 Metadata Extraction | Find application dates and issuing authorities |
| 🔧 Impact Analysis | Cross-reference regulations against internal procedures |
| 💬 RAG Q&A + Tool Calling | Multi-turn compliance conversations with tools |

**Regulatory sources covered:** CRR/CRR3, DORA (EU 2022/2554), D.Lgs. 231/2007 (AML), D.Lgs. 385/1993 (TUB), Circolare 285, PSD2, MiFID II/MiFIR, D.P.R. 180/1950, and related Banca d'Italia provisions.

---

## 🚀 Deployment

### With vLLM
```bash
vllm serve ./models/RegTech-4B-Instruct --dtype bfloat16
```

### With Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("YOUR_REPO_ID", torch_dtype="bfloat16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("YOUR_REPO_ID")

# `messages` is any of the chat message lists shown in the examples above
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

---

## ⚠️ Important Notes

- 🎯 **RAG-optimized** — trained to work with retrieved context, not to memorize regulations. Always provide relevant documents in the system prompt.
- 🏦 **Domain-specific** — optimized for Italian banking compliance. General capabilities may differ from the base model.
- ⚖️ **Not legal advice** — a tool to assist compliance professionals, not a substitute for regulatory expertise.
- 🔧 **Tool schemas** — tool calling works best with the specific function signatures used during training.

---

<p align="center">
Built with ❤️ for banking RAG<br>
<em>Fine-tuned with LoRA • Evaluated by Claude Opus 4.6 • Powered by Qwen3</em><br>
<em>Contact for commercial use: https://landing.2sophia.ai</em>
</p>