	---
	language:
	- en
	license: apache-2.0
	tags:
	- insurance
	- fraud-detection
	- xgboost
	- isolation-forest
	- uk-insurance
	- tabular-classification
	- bytical
	library_name: xgboost
	pipeline_tag: tabular-classification
	datasets:
	- piyushptiwari/insureos-training-data
	model-index:
	- name: InsureFraudNet
	  results:
	  - task:
	      type: tabular-classification
	      name: Insurance Fraud Detection
	    metrics:
	    - type: roc_auc
	      value: 1.0
	      name: AUC-ROC (Motor)
	    - type: roc_auc
	      value: 1.0
	      name: AUC-ROC (Property)
	    - type: roc_auc
	      value: 1.0
	      name: AUC-ROC (Liability)
	---

	# InsureFraudNet — Insurance Fraud Detection

	Created by [Bytical AI](https://bytical.ai) — AI agents that run insurance operations.

	## Model Description

	InsureFraudNet is a multi-line-of-business fraud detection system for UK insurance claims. For each of three lines of business (Motor, Property, and Liability), it pairs an XGBoost classifier with an Isolation Forest anomaly detector.

	### Architecture

	Each line of business has:
	- XGBoost Classifier — Supervised gradient-boosted tree for fraud probability scoring
	- Isolation Forest — Unsupervised anomaly detection for novel fraud patterns
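The model card does not say how the two outputs are combined downstream. As a minimal sketch, assuming a simple rule where a high supervised score takes priority and an anomaly flag alone triggers a lighter review, the pairing could be consumed like this. The `FraudSignal` class, the `triage` function, the 0.5 threshold, and the queue names are all hypothetical and not part of the released artifacts.

```python
from dataclasses import dataclass


@dataclass
class FraudSignal:
    fraud_prob: float  # supervised score from the XGBoost classifier, in [0, 1]
    is_anomaly: bool   # True when the Isolation Forest labels the claim as -1


def triage(signal: FraudSignal, prob_threshold: float = 0.5) -> str:
    """Map the two model outputs to a hypothetical review queue."""
    if signal.fraud_prob >= prob_threshold:
        # Known fraud pattern: the supervised model is confident.
        return "refer_to_investigator"
    if signal.is_anomaly:
        # Unusual claim that the classifier did not flag.
        return "manual_review"
    return "fast_track"


print(triage(FraudSignal(fraud_prob=0.92, is_anomaly=False)))
print(triage(FraudSignal(fraud_prob=0.08, is_anomaly=True)))
print(triage(FraudSignal(fraud_prob=0.08, is_anomaly=False)))
```

In practice the threshold would be tuned per line of business against the cost of a missed fraud versus a needless investigation.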

	### Lines of Business

	| LoB | Training Claims | Fraud Rate | Features | AUC-ROC | F1 |
	|-----|-----------------|------------|----------|---------|-----|
	| Motor | 25,000 | 8% | 23 | 1.000 | 1.000 |
	| Property | 15,000 | 8% | 20 | 1.000 | 1.000 |
	| Liability | 10,000 | 8% | 14 | 1.000 | 1.000 |

	### Top Fraud Indicators by LoB

	Motor:

	| Feature | Importance |
	|---------|------------|
	| claim_reserve_ratio | 48.9% |
	| days_to_report | 43.7% |
	| policy_age_days | 5.7% |
	| previous_claims_3y | 1.4% |

	Property:

	| Feature | Importance |
	|---------|------------|
	| days_to_report | 40.9% |
	| policy_age_days | 37.6% |
	| claim_reserve_ratio | 20.0% |
	| previous_claims_3y | 1.4% |

	Liability:

	| Feature | Importance |
	|---------|------------|
	| previous_claims_3y | 56.1% |
	| days_to_report | 43.9% |

	### Files

	| File | Description |
	|------|-------------|
	| `xgb_motor.json` | XGBoost model for motor fraud |
	| `xgb_property.json` | XGBoost model for property fraud |
	| `xgb_liability.json` | XGBoost model for liability fraud |
	| `iforest_motor.pkl` | Isolation Forest for motor anomalies |
	| `iforest_property.pkl` | Isolation Forest for property anomalies |
	| `iforest_liability.pkl` | Isolation Forest for liability anomalies |
	| `training_results.json` | Full training metrics and feature importance |
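The file names follow a regular `xgb_<lob>.json` / `iforest_<lob>.pkl` pattern, so the pair for a given line of business can be resolved programmatically. This is a minimal sketch, assuming the files have already been downloaded into a local directory; `artifact_paths` and `model_dir` are hypothetical names introduced for illustration.

```python
from pathlib import Path

LINES_OF_BUSINESS = ("motor", "property", "liability")


def artifact_paths(lob: str, model_dir: str = ".") -> dict:
    """Return the classifier and anomaly-detector paths for one line of business."""
    lob = lob.lower()
    if lob not in LINES_OF_BUSINESS:
        raise ValueError(f"unknown line of business: {lob!r}")
    base = Path(model_dir)
    return {
        "xgboost": base / f"xgb_{lob}.json",
        "isolation_forest": base / f"iforest_{lob}.pkl",
    }


for lob in LINES_OF_BUSINESS:
    paths = artifact_paths(lob, model_dir="models")
    print(lob, paths["xgboost"].name, paths["isolation_forest"].name)
```

Rejecting unknown names up front avoids a less obvious file-not-found error later when the model is loaded.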

	## How to Use

	```python
	import xgboost as xgb
	import pickle
	import numpy as np

	# Load motor fraud model
	model = xgb.XGBClassifier()
	model.load_model("xgb_motor.json")

	# Load isolation forest
	with open("iforest_motor.pkl", "rb") as f:
	    iforest = pickle.load(f)

	# Example claim features
	claim = np.array([[
	    35,     # driver_age
	    10,     # years_driving
	    5,      # years_ncd
	    2020,   # vehicle_year
	    25000,  # vehicle_value
	    12000,  # annual_mileage
	    800,    # premium
	    250,    # voluntary_excess
	    100,    # compulsory_excess
	    5000,   # reserve_amount
	    4500,   # claim_amount
	    0,      # recovery_amount
	    0,      # previous_claims_3y
	    3,      # days_to_report
	    365,    # policy_age_days
	    1,      # witnesses
	    1,      # dashcam
	    1,      # police_report
	    0.9,    # claim_reserve_ratio
	    5.625,  # claim_premium_ratio
	    0,      # new_policy
	    0,      # late_report
	    4,      # vehicle_age
	]])

	# Predict fraud probability
	fraud_prob = model.predict_proba(claim)[0][1]
	is_anomaly = iforest.predict(claim)[0] == -1

	print(f"Fraud probability: {fraud_prob:.2%}")
	print(f"Anomaly detected: {is_anomaly}")
	```
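Several inputs in the example above are derived from raw fields: the two ratios match `claim_amount / reserve_amount` (4500 / 5000 = 0.9) and `claim_amount / premium` (4500 / 800 = 5.625). The sketch below reproduces them. The model card does not define `new_policy`, `late_report`, or the reference year for `vehicle_age`, so the thresholds (`new_policy_days`, `late_report_days`) and `reference_year` are assumptions chosen only to match the example values; check them against the training pipeline before relying on them.

```python
def derive_features(
    claim_amount: float,
    reserve_amount: float,
    premium: float,
    policy_age_days: int,
    days_to_report: int,
    vehicle_year: int,
    reference_year: int,          # assumption: year used to compute vehicle_age
    new_policy_days: int = 90,    # assumption: cutoff is not documented
    late_report_days: int = 30,   # assumption: cutoff is not documented
) -> dict:
    """Compute the derived motor features from raw claim fields."""
    if reserve_amount <= 0 or premium <= 0:
        raise ValueError("reserve_amount and premium must be positive")
    return {
        "claim_reserve_ratio": claim_amount / reserve_amount,
        "claim_premium_ratio": claim_amount / premium,
        "new_policy": int(policy_age_days < new_policy_days),
        "late_report": int(days_to_report > late_report_days),
        "vehicle_age": reference_year - vehicle_year,
    }


derived = derive_features(
    claim_amount=4500,
    reserve_amount=5000,
    premium=800,
    policy_age_days=365,
    days_to_report=3,
    vehicle_year=2020,
    reference_year=2024,
)
print(derived)
```

Computing these values in one place keeps the derived columns consistent with the raw ones that are passed to the model alongside them.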

	## Part of the INSUREOS Model Suite

	This model is part of INSUREOS, a complete AI/ML suite for insurance operations built by Bytical AI:

	| Model | Task | Metric |
	|-------|------|--------|
	| [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 |
	| [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 |
	| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
	| InsureFraudNet (this model) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
	| [InsurePricing](https://huggingface.co/piyushptiwari/InsurePricing) | Insurance pricing (GLM + EBM) | MAE: £11,132 |

	## Citation

	```bibtex
	@misc{bytical2026insurefraudnet,
	  title={InsureFraudNet: Multi-LoB Insurance Fraud Detection},
	  author={Bytical AI},
	  year={2026},
	  url={https://huggingface.co/piyushptiwari/InsureFraudNet}
	}
	```

	## About Bytical AI

	[Bytical](https://bytical.ai) builds AI agents that run insurance operations: claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.