---
language:
- en
license: apache-2.0
tags:
- insurance
- fraud-detection
- xgboost
- isolation-forest
- uk-insurance
- tabular-classification
- bytical
library_name: xgboost
pipeline_tag: tabular-classification
datasets:
- piyushptiwari/insureos-training-data
model-index:
- name: InsureFraudNet
  results:
  - task:
      type: tabular-classification
      name: Insurance Fraud Detection
    metrics:
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Motor)
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Property)
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Liability)
---

# InsureFraudNet: Insurance Fraud Detection

**Created by [Bytical AI](https://bytical.ai)**: AI agents that run insurance operations.

## Model Description

InsureFraudNet is a fraud detection system for UK insurance claims that covers three lines of business (LoB): Motor, Property, and Liability. Each line has its own pair of models, an XGBoost classifier and an Isolation Forest anomaly detector.

### Architecture

Each line of business has:
- **XGBoost Classifier**: supervised gradient-boosted trees that output a fraud probability
- **Isolation Forest**: unsupervised anomaly detector intended to flag novel fraud patterns
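
The card does not state how the two signals are combined downstream. As a minimal sketch, assuming a simple rule in which either a high classifier score or an anomaly flag routes a claim to manual review, the pairing might be used like this. The function name `triage_claim`, the returned labels, and the 0.5 threshold are hypothetical and not part of the released models.

```python
def triage_claim(fraud_prob: float, is_anomaly: bool, threshold: float = 0.5) -> str:
    """Combine the supervised score and the anomaly flag into one label.

    Hypothetical rule for illustration only; the threshold is an assumption.
    """
    if not 0.0 <= fraud_prob <= 1.0:
        raise ValueError("fraud_prob must be in [0, 1]")
    if fraud_prob >= threshold:
        return "refer_fraud"      # classifier recognises a known fraud pattern
    if is_anomaly:
        return "refer_anomaly"    # unusual claim that the classifier did not flag
    return "auto_process"


print(triage_claim(0.92, False))
print(triage_claim(0.03, True))
print(triage_claim(0.03, False))
```

Checking the classifier first keeps the rule easy to audit: the anomaly flag only matters for claims the supervised model scores as low risk.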

### Lines of Business

| | LoB | Training Claims | Fraud Rate | Features | AUC-ROC | F1 | |
| |-----|----------------|------------|----------|---------|-----| |
| | **Motor** | 25,000 | 8% | 23 | **1.000** | **1.000** | |
| | **Property** | 15,000 | 8% | 20 | **1.000** | **1.000** | |
| | **Liability** | 10,000 | 8% | 14 | **1.000** | **1.000** | |

### Top Fraud Indicators by LoB

**Motor:**

| | Feature | Importance | |
| |---------|-----------| |
| | claim_reserve_ratio | 48.9% | |
| | days_to_report | 43.7% | |
| | policy_age_days | 5.7% | |
| | previous_claims_3y | 1.4% | |

**Property:**

| | Feature | Importance | |
| |---------|-----------| |
| | days_to_report | 40.9% | |
| | policy_age_days | 37.6% | |
| | claim_reserve_ratio | 20.0% | |
| | previous_claims_3y | 1.4% | |

**Liability:**

| | Feature | Importance | |
| |---------|-----------| |
| | previous_claims_3y | 56.1% | |
| | days_to_report | 43.9% | |

### Files

| | File | Description | |
| |------|-------------| |
| | `xgb_motor.json` | XGBoost model for motor fraud | |
| | `xgb_property.json` | XGBoost model for property fraud | |
| | `xgb_liability.json` | XGBoost model for liability fraud | |
| | `iforest_motor.pkl` | Isolation Forest for motor anomalies | |
| | `iforest_property.pkl` | Isolation Forest for property anomalies | |
| | `iforest_liability.pkl` | Isolation Forest for liability anomalies | |
| | `training_results.json` | Full training metrics and feature importance | |
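
Because the file names follow a regular pattern, the paths for a given line of business can be derived rather than hard-coded. The sketch below shows one way to do that; `artifact_paths` and its `model_dir` parameter are hypothetical helpers, not part of the repository.

```python
from pathlib import Path

# Lines of business for which model files are listed in the table above
LINES_OF_BUSINESS = ("motor", "property", "liability")


def artifact_paths(lob: str, model_dir: str = ".") -> dict:
    """Return the classifier and anomaly-detector file paths for one LoB."""
    key = lob.strip().lower()
    if key not in LINES_OF_BUSINESS:
        raise ValueError(f"unknown line of business: {lob!r}")
    base = Path(model_dir)
    return {
        "xgb": base / f"xgb_{key}.json",
        "iforest": base / f"iforest_{key}.pkl",
    }


paths = artifact_paths("Property")
print(paths["xgb"].name, paths["iforest"].name)
```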

## How to Use

```python
import pickle

import numpy as np
import xgboost as xgb

# Load the motor fraud classifier
model = xgb.XGBClassifier()
model.load_model("xgb_motor.json")

# Load the motor Isolation Forest.
# The pickle is assumed to hold a scikit-learn estimator, so scikit-learn must be installed.
with open("iforest_motor.pkl", "rb") as f:
    iforest = pickle.load(f)

# Example claim: one row with the 23 motor features
claim = np.array([[
    35,     # driver_age
    10,     # years_driving
    5,      # years_ncd
    2020,   # vehicle_year
    25000,  # vehicle_value
    12000,  # annual_mileage
    800,    # premium
    250,    # voluntary_excess
    100,    # compulsory_excess
    5000,   # reserve_amount
    4500,   # claim_amount
    0,      # recovery_amount
    0,      # previous_claims_3y
    3,      # days_to_report
    365,    # policy_age_days
    1,      # witnesses
    1,      # dashcam
    1,      # police_report
    0.9,    # claim_reserve_ratio
    5.625,  # claim_premium_ratio
    0,      # new_policy
    0,      # late_report
    4,      # vehicle_age
]])

# Fraud probability: column 1 of predict_proba is the positive (fraud) class
fraud_prob = model.predict_proba(claim)[0, 1]
# predict() returns -1 for anomalies and 1 for inliers
is_anomaly = iforest.predict(claim)[0] == -1

print(f"Fraud probability: {fraud_prob:.2%}")
print(f"Anomaly detected: {is_anomaly}")
```
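
Five of the 23 motor inputs in the example (`claim_reserve_ratio`, `claim_premium_ratio`, `new_policy`, `late_report`, `vehicle_age`) look like they are derived from the raw fields. The sketch below shows one way to compute them. The two ratios reproduce the example values (4500 / 5000 = 0.9 and 4500 / 800 = 5.625). The 90-day and 30-day cut-offs for the binary flags and the 2024 reference year are assumptions chosen to be consistent with the example, not documented thresholds, and `derive_motor_features` is a hypothetical helper.

```python
def derive_motor_features(
    raw: dict,
    reference_year: int = 2024,   # assumption: gives vehicle_age = 4 for a 2020 vehicle
    new_policy_days: int = 90,    # assumption: cut-off for the new_policy flag
    late_report_days: int = 30,   # assumption: cut-off for the late_report flag
) -> dict:
    """Compute the engineered motor features from raw claim fields.

    Raises ZeroDivisionError if reserve_amount or premium is zero.
    """
    return {
        "claim_reserve_ratio": raw["claim_amount"] / raw["reserve_amount"],
        "claim_premium_ratio": raw["claim_amount"] / raw["premium"],
        "new_policy": int(raw["policy_age_days"] < new_policy_days),
        "late_report": int(raw["days_to_report"] > late_report_days),
        "vehicle_age": reference_year - raw["vehicle_year"],
    }


raw_claim = {
    "claim_amount": 4500,
    "reserve_amount": 5000,
    "premium": 800,
    "policy_age_days": 365,
    "days_to_report": 3,
    "vehicle_year": 2020,
}
print(derive_motor_features(raw_claim))
```

Whatever thresholds are used at inference time should match the ones used to build the training data, since a mismatch would silently change the inputs the models see.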

## Part of the INSUREOS Model Suite

This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:

| | Model | Task | Metric | |
| |-------|------|--------| |
| | [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 | |
| | [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 | |
| | [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 | |
| | **InsureFraudNet** (this model) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 | |
| | [InsurePricing](https://huggingface.co/piyushptiwari/InsurePricing) | Insurance pricing (GLM + EBM) | MAE: £11,132 | |

## Citation

```bibtex
@misc{bytical2026insurefraudnet,
  title={InsureFraudNet: Multi-LoB Insurance Fraud Detection},
  author={Bytical AI},
  year={2026},
  url={https://huggingface.co/piyushptiwari/InsureFraudNet}
}
```

## About Bytical AI

[Bytical](https://bytical.ai) builds AI agents that run insurance operations: claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.