---
language:
- en
license: apache-2.0
tags:
- insurance
- pricing
- glm
- ebm
- explainable-ai
- uk-insurance
- tabular-regression
- actuarial
- bytical
pipeline_tag: tabular-regression
datasets:
- piyushptiwari/insureos-training-data
model-index:
- name: InsurePricing
  results:
  - task:
      type: tabular-regression
      name: Insurance Premium Pricing
    metrics:
    - type: mae
      value: 11132
      name: MAE — EBM (£)
    - type: mae
      value: 12245
      name: MAE — GLM (£)
---
# InsurePricing — Insurance Premium Pricing Models
**Created by [Bytical AI](https://bytical.ai)** — AI agents that run insurance operations.
## Model Description
InsurePricing provides two complementary pricing models for UK motor insurance premiums, designed for actuarial and underwriting workflows:
1. **Tweedie GLM** — Generalized Linear Model with Tweedie distribution (power=1.5), the industry-standard approach for insurance pricing
2. **Explainable Boosting Machine (EBM)** — Interpretable glass-box model from Microsoft Research (InterpretML) that provides per-feature explanations
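
For readers unfamiliar with the Tweedie family, the sketch below computes the Tweedie unit deviance for a variance power between 1 and 2, the range that includes the `power=1.5` named above. In that range the distribution places positive probability on exact zeros, so the deviance stays finite when the observed value is zero. The function names are illustrative, and whether `tweedie_glm.pkl` was fitted with exactly this objective or with a particular link function is not stated in this card.

```python
def tweedie_unit_deviance(y, mu, power=1.5):
    """Tweedie unit deviance for 1 < power < 2.

    y  : observed value (>= 0, exact zeros allowed)
    mu : predicted mean (> 0)
    """
    if not 1 < power < 2:
        raise ValueError("this sketch only covers 1 < power < 2")
    if y < 0 or mu <= 0:
        raise ValueError("need y >= 0 and mu > 0")
    a = y ** (2 - power) / ((1 - power) * (2 - power))
    b = y * mu ** (1 - power) / (1 - power)
    c = mu ** (2 - power) / (2 - power)
    return 2 * (a - b + c)


def mean_tweedie_deviance(y_true, y_pred, power=1.5):
    """Average unit deviance over paired observations and predictions."""
    pairs = list(zip(y_true, y_pred))
    return sum(tweedie_unit_deviance(y, m, power) for y, m in pairs) / len(pairs)


# Illustrative values only, not taken from the training data.
print(tweedie_unit_deviance(0.0, 100.0))    # zero observed, positive mean
print(tweedie_unit_deviance(100.0, 100.0))  # perfect prediction
```

The deviance is zero for a perfect prediction and grows as the prediction moves away from the observation in either direction.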
### Model Comparison
| Model | MAE (£) | RMSE (£) | MAPE (%) | Interpretable |
|-------|---------|----------|----------|---------------|
| **EBM** | **£11,132** | £14,787 | 177.6% | Yes — per-feature shape functions |
| **Tweedie GLM** | £12,245 | £17,615 | 198.8% | Yes — coefficients |
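
The three error metrics in the table can be computed as in the sketch below. It uses the standard textbook definitions; the exact definitions used to produce the reported figures are not stated in this card, so treat the match as an assumption. The toy values are illustrative and show how MAPE can exceed 100% when a small true value is over-predicted.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and MAPE (in percent) for paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {"mae": mae, "rmse": rmse, "mape": mape}


# Toy values (not from the test set): one small premium over-predicted by 300%.
metrics = regression_metrics([100.0, 1000.0], [400.0, 900.0])
print(metrics)
```

Because MAPE divides each error by the true value, a handful of low-premium policies with large absolute errors can dominate it even when MAE looks moderate.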
### Risk Factors (17 Features)
| Feature | Type | Description |
|---------|------|-------------|
| driver_age | Numeric | Age of primary driver |
| years_driving | Numeric | Years of driving experience |
| years_ncd | Numeric | No-claims discount years |
| vehicle_year | Numeric | Year of vehicle manufacture |
| vehicle_value | Numeric | Vehicle market value (£) |
| annual_mileage | Numeric | Estimated annual miles |
| voluntary_excess | Numeric | Voluntary excess amount (£) |
| compulsory_excess | Numeric | Compulsory excess amount (£) |
| previous_claims_3y | Numeric | Claims in last 3 years |
| policy_age_days | Numeric | Days since policy inception |
| vehicle_age | Derived | Current year minus vehicle_year |
| driver_experience_ratio | Derived | years_driving / driver_age |
| ncd_ratio | Derived | years_ncd / years_driving |
| vehicle_make_enc | Encoded | Vehicle manufacturer |
| fuel_type_enc | Encoded | Fuel type |
| occupation_enc | Encoded | Driver occupation |
| region_enc | Encoded | UK region |
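
The derived columns in the table can be reproduced from the raw inputs as sketched below. The helper name `build_feature_row`, the choice of reference year, and the handling of `years_driving == 0` are assumptions made for illustration; the card does not say how the training pipeline treated those cases.

```python
FEATURE_ORDER = [
    "driver_age", "years_driving", "years_ncd", "vehicle_year",
    "vehicle_value", "annual_mileage", "voluntary_excess",
    "compulsory_excess", "previous_claims_3y", "policy_age_days",
    "vehicle_age", "driver_experience_ratio", "ncd_ratio",
    "vehicle_make_enc", "fuel_type_enc", "occupation_enc", "region_enc",
]


def build_feature_row(policy, current_year):
    """Add the three derived features and return values in table order."""
    row = dict(policy)
    row["vehicle_age"] = current_year - policy["vehicle_year"]
    row["driver_experience_ratio"] = policy["years_driving"] / policy["driver_age"]
    # Assumption: a driver with no experience gets an NCD ratio of 0.
    if policy["years_driving"]:
        row["ncd_ratio"] = policy["years_ncd"] / policy["years_driving"]
    else:
        row["ncd_ratio"] = 0.0
    return [row[name] for name in FEATURE_ORDER]


# Raw inputs for one illustrative policy; the *_enc values are already encoded.
example_policy = {
    "driver_age": 30, "years_driving": 8, "years_ncd": 4,
    "vehicle_year": 2022, "vehicle_value": 20000, "annual_mileage": 10000,
    "voluntary_excess": 200, "compulsory_excess": 100,
    "previous_claims_3y": 0, "policy_age_days": 180,
    "vehicle_make_enc": 3, "fuel_type_enc": 1,
    "occupation_enc": 5, "region_enc": 7,
}
row = build_feature_row(example_policy, current_year=2026)
print(row)
```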
### EBM Top Feature Importances
| Feature | Importance |
|---------|-----------|
| previous_claims_3y | 3,259 |
| policy_age_days | 2,684 |
| previous_claims_3y × policy_age_days | 1,608 |
| region_enc | 221 |
| vehicle_make_enc | 173 |
| annual_mileage | 172 |
| compulsory_excess | 165 |
| voluntary_excess | 163 |
| ncd_ratio | 153 |
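
To put these raw scores in proportion, the sketch below converts the listed importances into percentage shares. The shares are relative to the nine terms shown here only; terms not listed in the table are excluded, so the true share of each term across the full model would be somewhat lower.

```python
# Importance scores copied from the table above.
importances = {
    "previous_claims_3y": 3259,
    "policy_age_days": 2684,
    "previous_claims_3y × policy_age_days": 1608,
    "region_enc": 221,
    "vehicle_make_enc": 173,
    "annual_mileage": 172,
    "compulsory_excess": 165,
    "voluntary_excess": 163,
    "ncd_ratio": 153,
}

total = sum(importances.values())
shares = {name: 100 * value / total for name, value in importances.items()}

# Combined share of the three largest terms among those listed.
top3_share = sum(sorted(shares.values(), reverse=True)[:3])

for name, share in shares.items():
    print(f"{name:40s} {share:5.1f}%")
print(f"Top three terms: {top3_share:.1f}% of the listed total")
```

Among the listed terms, claims history, policy age, and their interaction account for close to nine tenths of the total importance.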
### Training Data
- 25,000 synthetic UK motor insurance policies (20K train / 5K test)
- Features include driver demographics, vehicle attributes, claim history, and policy details
### Files
| File | Description |
|------|-------------|
| `tweedie_glm.pkl` | Scikit-learn Tweedie GLM pipeline |
| `pricing_ebm.pkl` | InterpretML EBM model |
| `label_encoders.pkl` | Fitted label encoders for categorical features |
| `training_results.json` | Full training metrics and feature coefficients |
## How to Use
```python
import pickle

import numpy as np

# Load the EBM model
with open("pricing_ebm.pkl", "rb") as f:
    ebm = pickle.load(f)

# Load the fitted label encoders (for encoding raw categorical values;
# the example below passes values that are already encoded)
with open("label_encoders.pkl", "rb") as f:
    encoders = pickle.load(f)

# Example: price a motor policy.
# Features are listed in the same order as the Risk Factors table.
features = np.array([[
    30,      # driver_age
    8,       # years_driving
    4,       # years_ncd
    2022,    # vehicle_year
    20000,   # vehicle_value
    10000,   # annual_mileage
    200,     # voluntary_excess
    100,     # compulsory_excess
    0,       # previous_claims_3y
    180,     # policy_age_days
    4,       # vehicle_age
    0.267,   # driver_experience_ratio
    0.5,     # ncd_ratio
    3,       # vehicle_make_enc
    1,       # fuel_type_enc
    5,       # occupation_enc
    7,       # region_enc
]])

premium = ebm.predict(features)[0]
print(f"Predicted premium: £{premium:,.2f}")

# Get per-feature explanations (EBM glass-box)
explanations = ebm.explain_local(features)
```
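
The example above passes category codes directly. A minimal sketch of producing those codes from raw strings follows. It assumes that `label_encoders.pkl` holds a dict keyed by raw column name (for example `"region"`), with each value exposing a scikit-learn style `transform` method; that layout is a guess, not something this card documents, and the helper name `encode_categoricals` is hypothetical.

```python
def encode_categoricals(raw, encoders, suffix="_enc"):
    """Map raw category strings to integer codes.

    raw      : dict of column name -> raw string value
    encoders : dict of column name -> fitted encoder with .transform()
               (assumed layout of label_encoders.pkl)
    """
    encoded = {}
    for column, value in raw.items():
        encoder = encoders[column]  # KeyError if the column has no encoder
        # transform() takes a 1-D sequence and returns one code per item.
        # A scikit-learn LabelEncoder raises ValueError for unseen labels.
        encoded[column + suffix] = int(encoder.transform([value])[0])
    return encoded


# Hypothetical usage, once `encoders` has been loaded as shown above:
# codes = encode_categoricals({"region": "London", "fuel_type": "Petrol"}, encoders)
```

Inspecting the loaded object first (for instance with `type(encoders)` and, if it is a dict, `encoders.keys()`) is a quick way to confirm the layout before relying on this helper.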
## Part of the INSUREOS Model Suite
This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:
| Model | Task | Metric |
|-------|------|--------|
| [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 |
| [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 |
| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
| [InsureFraudNet](https://huggingface.co/piyushptiwari/InsureFraudNet) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| **InsurePricing** (this model) | Insurance pricing (GLM + EBM) | MAE: £11,132 |
## Citation
```bibtex
@misc{bytical2026insurepricing,
title={InsurePricing: Explainable Insurance Premium Pricing Models},
author={Bytical AI},
year={2026},
url={https://huggingface.co/piyushptiwari/InsurePricing}
}
```
## About Bytical AI
[Bytical](https://bytical.ai) builds AI agents that run insurance operations — claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.