---
language:
- en
license: apache-2.0
tags:
- insurance
- fraud-detection
- xgboost
- isolation-forest
- uk-insurance
- tabular-classification
- bytical
library_name: xgboost
pipeline_tag: tabular-classification
datasets:
- piyushptiwari/insureos-training-data
model-index:
- name: InsureFraudNet
  results:
  - task:
      type: tabular-classification
      name: Insurance Fraud Detection
    metrics:
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Motor)
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Property)
    - type: roc_auc
      value: 1.0
      name: AUC-ROC (Liability)
---

# InsureFraudNet — Insurance Fraud Detection

**Created by [Bytical AI](https://bytical.ai)** — AI agents that run insurance operations.

## Model Description

InsureFraudNet is a multi-line-of-business fraud detection system for UK insurance claims. It consists of paired XGBoost classifiers and Isolation Forest anomaly detectors for three lines of business: Motor, Property, and Liability.

### Architecture

Each line of business has:
- **XGBoost Classifier** — Supervised gradient-boosted tree for fraud probability scoring
- **Isolation Forest** — Unsupervised anomaly detection for novel fraud patterns
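
The card does not say how the two outputs are combined at decision time. One plausible arrangement, sketched here as an assumption rather than the shipped logic, treats the classifier probability as the primary signal and uses the anomaly flag to route otherwise low-scoring claims to manual review. The function name `triage_claim`, the three labels, and the 0.5 threshold are all hypothetical.

```python
def triage_claim(fraud_prob: float, is_anomaly: bool, threshold: float = 0.5) -> str:
    """Combine a supervised fraud score with an unsupervised anomaly flag.

    Hypothetical routing rule (not taken from the model card):
      - score at or above the threshold  -> "refer_fraud"
      - low score but flagged as anomaly -> "manual_review"
      - otherwise                        -> "fast_track"
    """
    if not 0.0 <= fraud_prob <= 1.0:
        raise ValueError("fraud_prob must be a probability in [0, 1]")
    if fraud_prob >= threshold:
        return "refer_fraud"
    if is_anomaly:
        return "manual_review"
    return "fast_track"


print(triage_claim(0.92, False))  # high classifier score
print(triage_claim(0.03, True))   # unusual claim the classifier did not flag
print(triage_claim(0.03, False))  # nothing suspicious
```

Keeping the anomaly flag as a separate review path, rather than blending it into the score, preserves the distinction between known fraud patterns and merely unusual claims.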

### Lines of Business

| LoB | Training Claims | Fraud Rate | Features | AUC-ROC | F1 |
|-----|----------------|------------|----------|---------|-----|
| **Motor** | 25,000 | 8% | 23 | **1.000** | **1.000** |
| **Property** | 15,000 | 8% | 20 | **1.000** | **1.000** |
| **Liability** | 10,000 | 8% | 14 | **1.000** | **1.000** |
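
An 8% fraud rate means the positive class is a small minority in every line of business. The sketch below derives the approximate fraud counts implied by the table, along with the genuine-to-fraud ratio, which is the value commonly passed to XGBoost's `scale_pos_weight` parameter for imbalanced data. Whether the published models were trained with that setting is not stated in this card, so treat the ratio as an illustration only; `class_balance` is a hypothetical helper name.

```python
# Training-set sizes and fraud rate as listed in the table above.
TRAINING_CLAIMS = {"motor": 25_000, "property": 15_000, "liability": 10_000}
FRAUD_RATE = 0.08


def class_balance(n_claims: int, fraud_rate: float) -> tuple[int, int, float]:
    """Return (fraud_count, genuine_count, genuine/fraud ratio)."""
    fraud = round(n_claims * fraud_rate)
    genuine = n_claims - fraud
    return fraud, genuine, genuine / fraud


for lob, n in TRAINING_CLAIMS.items():
    fraud, genuine, ratio = class_balance(n, FRAUD_RATE)
    print(f"{lob}: {fraud} fraud, {genuine} genuine, ratio {ratio:.1f}")
```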

### Top Fraud Indicators by LoB

**Motor:**

| Feature | Importance |
|---------|-----------|
| claim_reserve_ratio | 48.9% |
| days_to_report | 43.7% |
| policy_age_days | 5.7% |
| previous_claims_3y | 1.4% |

**Property:**

| Feature | Importance |
|---------|-----------|
| days_to_report | 40.9% |
| policy_age_days | 37.6% |
| claim_reserve_ratio | 20.0% |
| previous_claims_3y | 1.4% |

**Liability:**

| Feature | Importance |
|---------|-----------|
| previous_claims_3y | 56.1% |
| days_to_report | 43.9% |
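
The card does not state which importance measure these percentages use. As a sketch, assuming raw per-feature scores such as the dictionary returned by XGBoost's `Booster.get_score(importance_type="gain")`, the helper below normalises them to percentages and sorts them from highest to lowest. The function name `importance_percentages` and the raw scores in the example are made up for illustration.

```python
def importance_percentages(raw_scores: dict[str, float]) -> list[tuple[str, float]]:
    """Normalise raw importance scores to percentages, highest first."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    ranked = sorted(raw_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, round(100.0 * score / total, 1)) for name, score in ranked]


# Illustrative raw scores only; not values from the published models.
raw = {"days_to_report": 30.0, "previous_claims_3y": 50.0, "policy_age_days": 20.0}
for name, pct in importance_percentages(raw):
    print(f"{name}: {pct}%")
```

With a loaded `XGBClassifier`, the raw dictionary could come from `model.get_booster().get_score(importance_type="gain")`; other importance types will generally give a different ranking.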

### Files

| File | Description |
|------|-------------|
| `xgb_motor.json` | XGBoost model for motor fraud |
| `xgb_property.json` | XGBoost model for property fraud |
| `xgb_liability.json` | XGBoost model for liability fraud |
| `iforest_motor.pkl` | Isolation Forest for motor anomalies |
| `iforest_property.pkl` | Isolation Forest for property anomalies |
| `iforest_liability.pkl` | Isolation Forest for liability anomalies |
| `training_results.json` | Full training metrics and feature importance |
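
The model files follow a `<prefix>_<lob>.<ext>` naming pattern, so the pair of files for a given line of business can be resolved programmatically. The helper below is a small sketch: `artifact_names` and `fetch_artifacts` are hypothetical names, the repository ID is taken from the citation URL in this card, and downloading assumes the `huggingface_hub` package is installed.

```python
LINES_OF_BUSINESS = ("motor", "property", "liability")
REPO_ID = "piyushptiwari/InsureFraudNet"  # assumed from the citation URL in this card


def artifact_names(lob: str) -> dict[str, str]:
    """Map a line of business to its classifier and anomaly-detector files."""
    lob = lob.lower()
    if lob not in LINES_OF_BUSINESS:
        raise ValueError(f"unknown line of business: {lob!r}")
    return {"classifier": f"xgb_{lob}.json", "anomaly": f"iforest_{lob}.pkl"}


def fetch_artifacts(lob: str) -> dict[str, str]:
    """Download both files and return their local paths (needs network access)."""
    from huggingface_hub import hf_hub_download  # lazy import: optional dependency

    return {
        role: hf_hub_download(repo_id=REPO_ID, filename=name)
        for role, name in artifact_names(lob).items()
    }


print(artifact_names("property"))
```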

## How to Use

```python
import xgboost as xgb
import pickle
import numpy as np

# Load motor fraud model
model = xgb.XGBClassifier()
model.load_model("xgb_motor.json")

# Load isolation forest (unpickling requires scikit-learn; only load pickle files you trust)
with open("iforest_motor.pkl", "rb") as f:
    iforest = pickle.load(f)

# Example claim features
claim = np.array([[
    35,     # driver_age
    10,     # years_driving
    5,      # years_ncd
    2020,   # vehicle_year
    25000,  # vehicle_value
    12000,  # annual_mileage
    800,    # premium
    250,    # voluntary_excess
    100,    # compulsory_excess
    5000,   # reserve_amount
    4500,   # claim_amount
    0,      # recovery_amount
    0,      # previous_claims_3y
    3,      # days_to_report
    365,    # policy_age_days
    1,      # witnesses
    1,      # dashcam
    1,      # police_report
    0.9,    # claim_reserve_ratio
    5.625,  # claim_premium_ratio
    0,      # new_policy
    0,      # late_report
    4       # vehicle_age
]])

# Predict fraud probability
fraud_prob = model.predict_proba(claim)[0][1]
is_anomaly = iforest.predict(claim)[0] == -1

print(f"Fraud probability: {fraud_prob:.2%}")
print(f"Anomaly detected: {is_anomaly}")
```
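
Several inputs in the example are derived from other fields: `claim_reserve_ratio` matches 4500 / 5000 = 0.9 and `claim_premium_ratio` matches 4500 / 800 = 5.625. The sketch below computes the derived columns from raw values. The ratio formulas are inferred from the example numbers; the reference year for `vehicle_age` and the day thresholds behind `new_policy` and `late_report` are not given in this card, so they are explicit parameters here and the values passed in are placeholders. The function name `derive_motor_features` is hypothetical.

```python
def derive_motor_features(
    claim_amount: float,
    reserve_amount: float,
    premium: float,
    vehicle_year: int,
    policy_age_days: int,
    days_to_report: int,
    *,
    reference_year: int,
    new_policy_days: int,
    late_report_days: int,
) -> dict[str, float]:
    """Compute the derived motor features from raw claim fields.

    reference_year, new_policy_days and late_report_days are assumptions;
    the card does not document how the training pipeline defined them.
    """
    return {
        "claim_reserve_ratio": claim_amount / reserve_amount,
        "claim_premium_ratio": claim_amount / premium,
        "new_policy": int(policy_age_days < new_policy_days),
        "late_report": int(days_to_report > late_report_days),
        "vehicle_age": reference_year - vehicle_year,
    }


derived = derive_motor_features(
    claim_amount=4500, reserve_amount=5000, premium=800,
    vehicle_year=2020, policy_age_days=365, days_to_report=3,
    reference_year=2024,   # placeholder: reproduces vehicle_age = 4
    new_policy_days=30,    # placeholder threshold
    late_report_days=14,   # placeholder threshold
)
print(derived)
```

Computing these columns in one place reduces the risk of a ratio drifting out of sync with the raw amounts it is built from.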

## Part of the INSUREOS Model Suite

This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:

| Model | Task | Metric |
|-------|------|--------|
| [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 |
| [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 |
| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
| **InsureFraudNet** (this model) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| [InsurePricing](https://huggingface.co/piyushptiwari/InsurePricing) | Insurance pricing (GLM + EBM) | MAE: £11,132 |

## Citation

```bibtex
@misc{bytical2026insurefraudnet,
  title={InsureFraudNet: Multi-LoB Insurance Fraud Detection},
  author={Bytical AI},
  year={2026},
  url={https://huggingface.co/piyushptiwari/InsureFraudNet}
}
```

## About Bytical AI

[Bytical](https://bytical.ai) builds AI agents that run insurance operations — claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.