FCA Financial Advice vs Guidance Classifier
A RoBERTa-base model fine-tuned to classify financial communications into three regulatory categories under UK FCA regulations:
| Label | Description | Key Signals |
|---|---|---|
| guidance | Generic, educational financial information | No named products, no individual assessment, no "suitable for you" |
| targeted_support | Segment-personalised communications (FCA CP23/24) | Limited personal data, "people like you" framing, action categories |
| advice | Personal recommendations requiring FCA authorisation | Named specific products, individual circumstances, suitability assertions |
Performance
| Metric | Validation | Test |
|---|---|---|
| Accuracy | 100% | 100% |
| F1 Macro | 1.000 | 1.000 |
| F1 (guidance) | 1.000 | 1.000 |
| F1 (targeted_support) | 1.000 | 1.000 |
| F1 (advice) | 1.000 | 1.000 |
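For reference, macro F1 is the unweighted mean of the per-class F1 scores. The sketch below computes it in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions, not the repository's evaluation code or real model output.

```python
LABELS = ["guidance", "targeted_support", "advice"]

def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy data (not model output): one targeted_support example predicted as advice.
y_true = ["guidance", "targeted_support", "advice", "targeted_support"]
y_pred = ["guidance", "advice", "advice", "targeted_support"]
toy_score = macro_f1(y_true, y_pred)
```

A single confusion between two classes lowers the F1 of both, which is why macro F1 is a useful companion to accuracy on small, balanced splits.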
Usage
from transformers import pipeline
classifier = pipeline("text-classification", model="djordjebatic/fca-financial-classifier")
# Guidance example
result = classifier("ISAs allow you to save up to £20,000 per year tax-free. There are several types including Cash ISAs and Stocks and Shares ISAs.")
# → [{'label': 'guidance', 'score': 0.99}]
# Targeted Support example
result = classifier("You're approaching 55 and your pension balance is £95,000. Many people in your situation explore their drawdown options.")
# → [{'label': 'targeted_support', 'score': 0.99}]
# Advice example
result = classifier("Based on your risk profile, I recommend investing £30,000 in the Vanguard LifeStrategy 80% Equity Fund. This is suitable for your circumstances.")
# → [{'label': 'advice', 'score': 0.99}]
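Each call returns a list of dicts with `label` and `score` keys, as in the examples above. One way to use the score is to route low-confidence predictions to a human reviewer. The helper below is a minimal sketch: the name `needs_review` and the 0.90 threshold are assumptions chosen for illustration, and the sample dicts are hand-written rather than real model output.

```python
def needs_review(prediction, threshold=0.90):
    """Return True when the top label's score falls below the threshold."""
    return prediction["score"] < threshold

# Hand-written dicts in the pipeline's output shape (not real model output).
sample_outputs = [
    {"label": "guidance", "score": 0.99},
    {"label": "targeted_support", "score": 0.62},
]

# One flag per prediction; True means "send to a human reviewer".
flags = [needs_review(p) for p in sample_outputs]
```

A suitable threshold would have to be calibrated on real communications, since scores from a model trained on synthetic data may be overconfident.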
Training Data
Trained on djordjebatic/fca-financial-classification, a synthetic dataset generated using Qwen/Qwen2.5-7B-Instruct as the teacher model.
Synthetic Data Generation Pipeline
The dataset was generated using a multi-stage pipeline:
- Seed Data: 22 expert-crafted examples grounded in FCA PERG 8 (PERG 8.17G-8.37G), RAO Article 53, and FCA CP23/24 (Targeted Support framework)
- Diversity Grid: 10 financial domains × 14 channels × 10 personas = 1,400 unique combinations
- Teacher Generation: Qwen2.5-7B-Instruct generated 5 examples per prompt with temperature=0.85
- LLM-as-Judge: Same model verified each example's label accuracy (PASS/FAIL)
- Post-processing: Length filtering, deduplication, class balancing
The pipeline is compatible with sdg_hub; a custom flow.yaml is included.
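The post-processing stage (length filtering, deduplication, class balancing) could look roughly like the sketch below. The function name `postprocess`, the character limits, and the case-insensitive exact-match deduplication are all assumptions made for illustration; the repository's scripts may use different rules.

```python
import random
from collections import defaultdict

def postprocess(examples, min_chars=50, max_chars=2000, seed=0):
    """Length-filter, deduplicate, then downsample each class to the smallest class size."""
    seen = set()
    by_label = defaultdict(list)
    for ex in examples:
        text = " ".join(ex["text"].split())  # normalise whitespace
        key = text.lower()                   # case-insensitive duplicate key (assumption)
        if not (min_chars <= len(text) <= max_chars) or key in seen:
            continue
        seen.add(key)
        by_label[ex["label"]].append({"text": text, "label": ex["label"]})
    if not by_label:
        return []
    # Balance by downsampling every class to the size of the rarest one.
    target = min(len(items) for items in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], target))
    return balanced
```

Downsampling to the rarest class gives an exactly balanced result, at the cost of discarding some examples that passed the judge.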
Dataset Statistics
| Split | Size | Distribution |
|---|---|---|
| Train | 458 | 153/152/153 (guidance/targeted_support/advice) |
| Validation | 57 | 19/19/19 |
| Test | 58 | 19/20/19 |
- Generated: 750 raw examples
- Quality pass rate: 90.9% (682 passed LLM judge)
- After balancing: 573 examples (191 per class)
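The figures above are internally consistent, which a few lines of arithmetic can confirm. All numbers in this check are copied from the table and list; only the variable names are new.

```python
# Per-split class counts in the order guidance / targeted_support / advice.
splits = {
    "train": (153, 152, 153),
    "validation": (19, 19, 19),
    "test": (19, 20, 19),
}

split_sizes = {name: sum(counts) for name, counts in splits.items()}
per_class = tuple(sum(col) for col in zip(*splits.values()))
total = sum(split_sizes.values())

# Share of the 750 raw examples that passed the LLM judge, in percent.
pass_rate = round(100 * 682 / 750, 1)
```

The split sizes come to 458, 57, and 58, each class totals 191, and the overall total is 573, so 109 of the 682 judge-approved examples were dropped during post-processing.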
Diversity Dimensions
- Domains: investments, pensions, savings, mortgages, insurance, equity_release, tax_planning, retirement_income, estate_planning, debt_management
- Channels: website_faq, email, app_notification, phone_transcript, suitability_letter, platform_message, newsletter, chatbot, letter, video_call_notes, robo_adviser, social_media, brochure, annual_review
- Personas: young professional, family with children, pre-retiree, retiree, HNW, first-time buyer, self-employed, recently divorced, inheritor, low-income saver
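The diversity grid is the Cartesian product of the three lists above. The sketch below builds it and draws a random subset of combinations. The sampling step is an assumption: the card does not say how combinations were selected, and the figure of 150 simply follows from 750 raw examples at 5 examples per prompt.

```python
import random
from itertools import product

DOMAINS = [
    "investments", "pensions", "savings", "mortgages", "insurance",
    "equity_release", "tax_planning", "retirement_income",
    "estate_planning", "debt_management",
]
CHANNELS = [
    "website_faq", "email", "app_notification", "phone_transcript",
    "suitability_letter", "platform_message", "newsletter", "chatbot",
    "letter", "video_call_notes", "robo_adviser", "social_media",
    "brochure", "annual_review",
]
PERSONAS = [
    "young professional", "family with children", "pre-retiree", "retiree",
    "HNW", "first-time buyer", "self-employed", "recently divorced",
    "inheritor", "low-income saver",
]

# Every (domain, channel, persona) combination: 10 x 14 x 10 = 1,400.
grid = list(product(DOMAINS, CHANNELS, PERSONAS))

# Hypothetical selection step: pick 150 combinations without replacement.
sampled = random.Random(0).sample(grid, 150)
```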
Regulatory Sources
- FCA PERG 8: Perimeter Guidance Manual Ch.8 (PERG 8.17G-8.37G)
- RAO Article 53: Regulated Activities Order 2001
- UK MiFID Article 9: Personal recommendation definition
- FCA CP23/24 (Dec 2023): Advice/Guidance Boundary Review: Targeted Support
- FCA CP24/7 (Jul 2024): Targeted Support and Simplified Advice
- FCA FG17/8 (2017): Finalised Guidance for automated investment services
The Three-Class Decision Framework
Does it name a specific product/provider?
├─ NO → Does it use individual's data to suggest action?
│   ├─ NO → GUIDANCE
│   └─ YES → TARGETED SUPPORT
└─ YES → Is it presented as suitable for this individual?
    ├─ NO → TARGETED SUPPORT or GUIDANCE
    └─ YES → REGULATED ADVICE
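The decision framework above can be written as a small rule function, which is handy for sanity-checking labels by hand. This is a sketch, not the trained model's logic. The name `classify_by_rules` and its boolean parameters are hypothetical, and the ambiguous "TARGETED SUPPORT or GUIDANCE" leaf is resolved here by falling back to the individual-data question, which is an assumption.

```python
def classify_by_rules(names_product, uses_individual_data, presented_as_suitable):
    """Map three yes/no answers from the decision framework to a label."""
    if names_product and presented_as_suitable:
        return "advice"
    # Covers the "no product named" branch and, by assumption, the ambiguous
    # "TARGETED SUPPORT or GUIDANCE" leaf for named-but-not-suitable cases.
    return "targeted_support" if uses_individual_data else "guidance"

# A generic explainer: no product named, no personal data used.
generic = classify_by_rules(False, False, False)
# A segment-based nudge: no product named, but personal data drives the suggestion.
nudge = classify_by_rules(False, True, False)
# A named fund recommended as suitable for the individual.
recommendation = classify_by_rules(True, True, True)
```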
Model Details
- Base model: FacebookAI/roberta-base (125M params)
- Training: 5 epochs, lr=2e-5, batch_size=16, max_length=512
- Optimizer: AdamW, weight_decay=0.01, warmup_ratio=0.1
- Early stopping: patience=3, metric=f1_macro
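To give a sense of scale, the hyperparameters above imply a very short training run. The arithmetic below assumes a single device, no gradient accumulation, a kept final partial batch, warmup steps rounded up, and early stopping not triggering. None of these are stated in the card.

```python
import math

# Train split size from Dataset Statistics; the rest from Model Details.
train_size, batch_size, epochs, warmup_ratio = 458, 16, 5, 0.1

steps_per_epoch = math.ceil(train_size / batch_size)  # last batch is partial
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)
```

Under these assumptions the run is 29 steps per epoch and 145 optimizer steps in total, of which about 15 are warmup.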
Limitations
- Trained on synthetic data, so it may not capture all real-world edge cases
- The 100% accuracy on the 58-example test set likely reflects synthetic data homogeneity (the same teacher model generated and judged every example) rather than perfect generalisation
- Targeted Support is a proposed regulatory category (FCA CP23/24) with boundaries still being refined
- Edge cases at class boundaries (e.g., guidance with "you" language, targeted support naming product categories) need real-world validation
- UK-specific regulatory framework; not applicable to other jurisdictions
Files
- seeds/seed_examples.json: Expert-crafted seed examples with PERG 8 reasoning
- seeds/fca_regulatory_context.md: Regulatory context document
- flows/fca_classification/flow.yaml: sdg_hub flow definition
- flows/fca_classification/prompts/: Prompt templates for generation and quality judging
- scripts/generate_self_contained.py: Self-contained data generation script
- scripts/train_classifier.py: Training script