---
language:
- en
license: apache-2.0
tags:
- insurance
- fraud-detection
- xgboost
- isolation-forest
- uk-insurance
- tabular-classification
- bytical
library_name: xgboost
pipeline_tag: tabular-classification
datasets:
- piyushptiwari/insureos-training-data
model-index:
- name: InsureFraudNet
  results:
  - task:
      type: tabular-classification
      name: Insurance Fraud Detection
    metrics:
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Motor)
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Property)
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Liability)
---
# InsureFraudNet — Insurance Fraud Detection
**Created by [Bytical AI](https://bytical.ai)** — AI agents that run insurance operations.
## Model Description
InsureFraudNet is a multi-line-of-business fraud detection system for UK insurance claims. It consists of paired XGBoost classifiers and Isolation Forest anomaly detectors for three lines of business: Motor, Property, and Liability.
### Architecture
Each line of business has:
- **XGBoost Classifier** — Supervised gradient-boosted tree for fraud probability scoring
- **Isolation Forest** — Unsupervised anomaly detection for novel fraud patterns
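The card does not say how the two outputs are combined downstream. One possible approach, as a minimal sketch: treat the classifier probability as the primary signal and use the Isolation Forest flag to route low-scoring but unusual claims to manual review. The function name `triage_claim`, the routing labels, and the 0.5 threshold are illustrative assumptions, not part of the released model.

```python
def triage_claim(fraud_prob: float, is_anomaly: bool, threshold: float = 0.5) -> str:
    """Combine a supervised fraud score with an unsupervised anomaly flag.

    Hypothetical routing rule for illustration only; the threshold and the
    labels are assumptions and are not taken from the InsureFraudNet release.
    """
    if not 0.0 <= fraud_prob <= 1.0:
        raise ValueError("fraud_prob must be a probability in [0, 1]")
    if fraud_prob >= threshold:
        return "refer_to_fraud_team"  # classifier matches a known fraud pattern
    if is_anomaly:
        return "manual_review"        # low score, but the claim looks unusual
    return "fast_track"


print(triage_claim(0.92, False))
print(triage_claim(0.10, True))
print(triage_claim(0.10, False))
```

Keeping the anomaly flag as a separate branch, rather than blending it into the score, makes it easy to see which detector triggered a referral.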
### Lines of Business
| LoB | Training Claims | Fraud Rate | Features | AUC-ROC | F1 |
|-----|----------------|------------|----------|---------|-----|
| **Motor** | 25,000 | 8% | 23 | **1.000** | **1.000** |
| **Property** | 15,000 | 8% | 20 | **1.000** | **1.000** |
| **Liability** | 10,000 | 8% | 14 | **1.000** | **1.000** |
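An 8% fraud rate means the classes are imbalanced at roughly 1:11.5. The arithmetic is sketched below, assuming the listed rate is exact for each training set (the table gives it only as a rounded percentage).

```python
# Training set sizes and fraud rate as listed in the table above.
TRAINING_CLAIMS = {"motor": 25_000, "property": 15_000, "liability": 10_000}
FRAUD_RATE_PCT = 8  # percent, the same for all three lines of business


def class_counts(n_claims: int, fraud_rate_pct: int = FRAUD_RATE_PCT) -> tuple[int, int]:
    """Return (fraud, legitimate) counts, assuming the listed rate is exact."""
    fraud = n_claims * fraud_rate_pct // 100
    return fraud, n_claims - fraud


for lob, n in TRAINING_CLAIMS.items():
    fraud, legit = class_counts(n)
    print(f"{lob}: {fraud} fraud, {legit} legitimate, ratio 1:{legit / fraud:.1f}")
```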
### Top Fraud Indicators by LoB
**Motor:**
| Feature | Importance |
|---------|-----------|
| claim_reserve_ratio | 48.9% |
| days_to_report | 43.7% |
| policy_age_days | 5.7% |
| previous_claims_3y | 1.4% |
**Property:**
| Feature | Importance |
|---------|-----------|
| days_to_report | 40.9% |
| policy_age_days | 37.6% |
| claim_reserve_ratio | 20.0% |
| previous_claims_3y | 1.4% |
**Liability:**
| Feature | Importance |
|---------|-----------|
| previous_claims_3y | 56.1% |
| days_to_report | 43.9% |
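The listed features account for nearly all of the reported importance in each model. A small sketch that sums the percentages from the tables above makes this explicit; the dictionary layout and the helper name `listed_share` are illustrative only.

```python
# Importances (percent) copied from the tables above.
TOP_FEATURES = {
    "motor": {
        "claim_reserve_ratio": 48.9,
        "days_to_report": 43.7,
        "policy_age_days": 5.7,
        "previous_claims_3y": 1.4,
    },
    "property": {
        "days_to_report": 40.9,
        "policy_age_days": 37.6,
        "claim_reserve_ratio": 20.0,
        "previous_claims_3y": 1.4,
    },
    "liability": {
        "previous_claims_3y": 56.1,
        "days_to_report": 43.9,
    },
}


def listed_share(lob: str) -> float:
    """Total importance (percent) covered by the features listed for one LoB."""
    return round(sum(TOP_FEATURES[lob].values()), 1)


for lob in TOP_FEATURES:
    print(f"{lob}: {listed_share(lob)}% of importance in {len(TOP_FEATURES[lob])} features")
```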
### Files
| File | Description |
|------|-------------|
| `xgb_motor.json` | XGBoost model for motor fraud |
| `xgb_property.json` | XGBoost model for property fraud |
| `xgb_liability.json` | XGBoost model for liability fraud |
| `iforest_motor.pkl` | Isolation Forest for motor anomalies |
| `iforest_property.pkl` | Isolation Forest for property anomalies |
| `iforest_liability.pkl` | Isolation Forest for liability anomalies |
| `training_results.json` | Full training metrics and feature importance |
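The file names follow a regular `xgb_<lob>.json` / `iforest_<lob>.pkl` pattern, so the pair for a given line of business can be resolved programmatically. A minimal sketch follows; the helper name `model_files` and the `root` directory argument are hypothetical, and the files are assumed to have been downloaded locally.

```python
from pathlib import Path

LINES_OF_BUSINESS = ("motor", "property", "liability")


def model_files(lob: str, root: str = ".") -> dict[str, Path]:
    """Return the classifier and anomaly-detector paths for one line of business.

    `root` is a hypothetical local directory holding the downloaded files.
    """
    lob = lob.lower()
    if lob not in LINES_OF_BUSINESS:
        raise ValueError(f"unknown line of business: {lob!r}")
    base = Path(root)
    return {
        "xgboost": base / f"xgb_{lob}.json",
        "isolation_forest": base / f"iforest_{lob}.pkl",
    }


print(model_files("Property"))
```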
## How to Use
```python
import pickle

import numpy as np
import xgboost as xgb

# Load the motor fraud classifier
model = xgb.XGBClassifier()
model.load_model("xgb_motor.json")

# Load the motor Isolation Forest
with open("iforest_motor.pkl", "rb") as f:
    iforest = pickle.load(f)

# Example claim features (23 values for the motor model)
claim = np.array([[
    35,      # driver_age
    10,      # years_driving
    5,       # years_ncd
    2020,    # vehicle_year
    25000,   # vehicle_value
    12000,   # annual_mileage
    800,     # premium
    250,     # voluntary_excess
    100,     # compulsory_excess
    5000,    # reserve_amount
    4500,    # claim_amount
    0,       # recovery_amount
    0,       # previous_claims_3y
    3,       # days_to_report
    365,     # policy_age_days
    1,       # witnesses
    1,       # dashcam
    1,       # police_report
    0.9,     # claim_reserve_ratio
    5.625,   # claim_premium_ratio
    0,       # new_policy
    0,       # late_report
    4,       # vehicle_age
]])

# Predict fraud probability (class 1) and the anomaly flag
fraud_prob = model.predict_proba(claim)[0][1]
is_anomaly = iforest.predict(claim)[0] == -1  # IsolationForest returns -1 for outliers

print(f"Fraud probability: {fraud_prob:.2%}")
print(f"Anomaly detected: {is_anomaly}")
```
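The last five entries of the example vector appear to be derived from earlier fields: 4500 / 5000 = 0.9 and 4500 / 800 = 5.625 match `claim_reserve_ratio` and `claim_premium_ratio`. A minimal sketch of computing them is shown here. The 90-day and 30-day cut-offs for `new_policy` and `late_report`, and the reference year 2024 for `vehicle_age`, are assumptions chosen to reproduce the example values; the card does not state the actual definitions.

```python
def derived_features(
    claim_amount: float,
    reserve_amount: float,
    premium: float,
    policy_age_days: int,
    days_to_report: int,
    vehicle_year: int,
    reference_year: int = 2024,  # assumption: not stated in the model card
    new_policy_days: int = 90,   # assumption: hypothetical cut-off
    late_report_days: int = 30,  # assumption: hypothetical cut-off
) -> dict:
    """Compute the derived motor features from raw claim fields."""
    return {
        "claim_reserve_ratio": claim_amount / reserve_amount,
        "claim_premium_ratio": claim_amount / premium,
        "new_policy": int(policy_age_days < new_policy_days),
        "late_report": int(days_to_report > late_report_days),
        "vehicle_age": reference_year - vehicle_year,
    }


features = derived_features(
    claim_amount=4500, reserve_amount=5000, premium=800,
    policy_age_days=365, days_to_report=3, vehicle_year=2020,
)
print(features)
```

Computing these values in one place helps keep the inputs consistent with each other before they are passed to `predict_proba`.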
## Part of the INSUREOS Model Suite
This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:
| Model | Task | Metric |
|-------|------|--------|
| [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 |
| [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 |
| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
| **InsureFraudNet** (this model) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| [InsurePricing](https://huggingface.co/piyushptiwari/InsurePricing) | Insurance pricing (GLM + EBM) | MAE: £11,132 |
## Citation
```bibtex
@misc{bytical2026insurefraudnet,
title={InsureFraudNet: Multi-LoB Insurance Fraud Detection},
author={Bytical AI},
year={2026},
url={https://huggingface.co/piyushptiwari/InsureFraudNet}
}
```
## About Bytical AI
[Bytical](https://bytical.ai) builds AI agents that run insurance operations — claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.