---
language:
- en
license: apache-2.0
tags:
- insurance
- fraud-detection
- xgboost
- isolation-forest
- uk-insurance
- tabular-classification
- bytical
library_name: xgboost
pipeline_tag: tabular-classification
datasets:
- piyushptiwari/insureos-training-data
model-index:
- name: InsureFraudNet
  results:
  - task:
      type: tabular-classification
      name: Insurance Fraud Detection
    metrics:
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Motor)
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Property)
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Liability)
---

# InsureFraudNet — Insurance Fraud Detection

**Created by [Bytical AI](https://bytical.ai)** — AI agents that run insurance operations.

## Model Description

InsureFraudNet is a fraud detection system for UK insurance claims that covers three lines of business (LoB): Motor, Property, and Liability. For each line it pairs an XGBoost classifier with an Isolation Forest anomaly detector.

### Architecture

Each line of business has:
- **XGBoost Classifier** — Supervised gradient-boosted tree for fraud probability scoring
- **Isolation Forest** — Unsupervised anomaly detection for novel fraud patterns
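
One way the two signals could be combined at scoring time is a simple triage rule. The sketch below is illustrative only: the `triage` function, the 0.5 threshold, and the three outcome labels are assumptions and are not part of the released models.

```python
def triage(fraud_prob: float, is_anomaly: bool, threshold: float = 0.5) -> str:
    """Combine the supervised score and the anomaly flag (hypothetical rule)."""
    if not 0.0 <= fraud_prob <= 1.0:
        raise ValueError("fraud_prob must be a probability in [0, 1]")
    if fraud_prob >= threshold:
        return "refer"   # the classifier recognises a known fraud pattern
    if is_anomaly:
        return "review"  # unusual claim that the classifier did not flag
    return "pass"


print(triage(0.92, False))  # refer
print(triage(0.10, True))   # review
print(triage(0.10, False))  # pass
```

Checking the classifier first means the anomaly flag only changes the outcome for claims the supervised model scored as low risk, which is where novel patterns would be expected to show up.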

### Lines of Business

| LoB | Training Claims | Fraud Rate | Features | AUC-ROC | F1 |
|-----|----------------|------------|----------|---------|-----|
| **Motor** | 25,000 | 8% | 23 | **1.000** | **1.000** |
| **Property** | 15,000 | 8% | 20 | **1.000** | **1.000** |
| **Liability** | 10,000 | 8% | 14 | **1.000** | **1.000** |
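
For readers less familiar with the two metrics in the table, the following dependency-free sketch shows how AUC-ROC and F1 are defined. The labels and scores are made-up toy values, not outputs of InsureFraudNet, and the function names `auc_roc` and `f1` are chosen here for illustration.

```python
def auc_roc(y_true, scores):
    """Probability that a random fraud case outscores a random genuine case."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count pairwise wins; ties count as half a win
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: 1 = fraud, 0 = genuine
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

print(auc_roc(y_true, scores))  # 0.75
print(round(f1(y_true, y_pred), 3))  # 0.667
```

AUC-ROC depends only on how the scores rank claims, while F1 also depends on the cut-off used to turn scores into fraud/genuine labels.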

### Top Fraud Indicators by LoB

**Motor:**
| Feature | Importance |
|---------|-----------|
| claim_reserve_ratio | 48.9% |
| days_to_report | 43.7% |
| policy_age_days | 5.7% |
| previous_claims_3y | 1.4% |

**Property:**
| Feature | Importance |
|---------|-----------|
| days_to_report | 40.9% |
| policy_age_days | 37.6% |
| claim_reserve_ratio | 20.0% |
| previous_claims_3y | 1.4% |

**Liability:**
| Feature | Importance |
|---------|-----------|
| previous_claims_3y | 56.1% |
| days_to_report | 43.9% |

### Files

| File | Description |
|------|-------------|
| `xgb_motor.json` | XGBoost model for motor fraud |
| `xgb_property.json` | XGBoost model for property fraud |
| `xgb_liability.json` | XGBoost model for liability fraud |
| `iforest_motor.pkl` | Isolation Forest for motor anomalies |
| `iforest_property.pkl` | Isolation Forest for property anomalies |
| `iforest_liability.pkl` | Isolation Forest for liability anomalies |
| `training_results.json` | Full training metrics and feature importance |
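
The file names above follow a regular pattern, so a small helper can resolve the pair of artifacts for a given line of business. This is a convenience sketch: `model_files` and `LINES_OF_BUSINESS` are hypothetical names introduced here, not part of the repository.

```python
LINES_OF_BUSINESS = ("motor", "property", "liability")


def model_files(lob: str) -> tuple:
    """Return (XGBoost file, Isolation Forest file) for one line of business."""
    key = lob.strip().lower()
    if key not in LINES_OF_BUSINESS:
        raise ValueError(f"unknown line of business: {lob!r}")
    return f"xgb_{key}.json", f"iforest_{key}.pkl"


for lob in LINES_OF_BUSINESS:
    print(lob, model_files(lob))
```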

## How to Use

```python
import pickle

import numpy as np
import xgboost as xgb

# Load motor fraud model
model = xgb.XGBClassifier()
model.load_model("xgb_motor.json")

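# Load the Isolation Forest (assumed to be a pickled scikit-learn model, so
# scikit-learn must be installed; only unpickle files from a trusted source)
with open("iforest_motor.pkl", "rb") as f:
    iforest = pickle.load(f)

# Example claim: one row with the 23 motor features, in the order listed
claim = np.array([[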
    35,     # driver_age
    10,     # years_driving
    5,      # years_ncd
    2020,   # vehicle_year
    25000,  # vehicle_value
    12000,  # annual_mileage
    800,    # premium
    250,    # voluntary_excess
    100,    # compulsory_excess
    5000,   # reserve_amount
    4500,   # claim_amount
    0,      # recovery_amount
    0,      # previous_claims_3y
    3,      # days_to_report
    365,    # policy_age_days
    1,      # witnesses
    1,      # dashcam
    1,      # police_report
    0.9,    # claim_reserve_ratio
    5.625,  # claim_premium_ratio
    0,      # new_policy
    0,      # late_report
    4       # vehicle_age
]])

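# Predict fraud probability (column 1 of predict_proba is the positive class)
fraud_prob = model.predict_proba(claim)[0][1]
# Isolation Forest convention: predict() returns -1 for anomalies, 1 for inliers
is_anomaly = iforest.predict(claim)[0] == -1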

print(f"Fraud probability: {fraud_prob:.2%}")
print(f"Anomaly detected: {is_anomaly}")
```
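
The two ratio features in the example appear to be derived from the raw amounts: 4500 / 5000 = 0.9 and 4500 / 800 = 5.625. The sketch below reproduces them under that assumption. The formulas are inferred from the example values, not taken from the training code, and `derive_motor_ratios` is a hypothetical helper. The `new_policy` and `late_report` flags are left out because their thresholds are not documented here.

```python
def derive_motor_ratios(claim_amount: float, reserve_amount: float, premium: float) -> dict:
    """Compute the ratio features from raw amounts (inferred formulas)."""
    if reserve_amount <= 0 or premium <= 0:
        raise ValueError("reserve_amount and premium must be positive")
    return {
        "claim_reserve_ratio": claim_amount / reserve_amount,
        "claim_premium_ratio": claim_amount / premium,
    }


ratios = derive_motor_ratios(claim_amount=4500, reserve_amount=5000, premium=800)
print(ratios)  # {'claim_reserve_ratio': 0.9, 'claim_premium_ratio': 5.625}
```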

## Part of the INSUREOS Model Suite

This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:

| Model | Task | Metric |
|-------|------|--------|
| [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 |
| [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 |
| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
| **InsureFraudNet** (this model) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| [InsurePricing](https://huggingface.co/piyushptiwari/InsurePricing) | Insurance pricing (GLM + EBM) | MAE: £11,132 |

## Citation

```bibtex
@misc{bytical2026insurefraudnet,
  title={InsureFraudNet: Multi-LoB Insurance Fraud Detection},
  author={Bytical AI},
  year={2026},
  url={https://huggingface.co/piyushptiwari/InsureFraudNet}
}
```

## About Bytical AI

[Bytical](https://bytical.ai) builds AI agents that run insurance operations — claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.