---
language:
- en
license: apache-2.0
tags:
- insurance
- pricing
- glm
- ebm
- explainable-ai
- uk-insurance
- tabular-regression
- actuarial
- bytical
pipeline_tag: tabular-regression
datasets:
- piyushptiwari/insureos-training-data
model-index:
- name: InsurePricing
  results:
  - task:
      type: tabular-regression
      name: Insurance Premium Pricing
    metrics:
    - type: mae
      value: 11132
      name: MAE (EBM, £)
    - type: mae
      value: 12245
      name: MAE (GLM, £)
---

# InsurePricing: Insurance Premium Pricing Models

**Created by [Bytical AI](https://bytical.ai)**: AI agents that run insurance operations.

## Model Description

InsurePricing provides two complementary pricing models for UK motor insurance premiums, designed for actuarial and underwriting workflows:

1. **Tweedie GLM**: a Generalized Linear Model with a Tweedie distribution (power=1.5), the industry-standard approach for insurance pricing
2. **Explainable Boosting Machine (EBM)**: an interpretable glass-box model from Microsoft Research (InterpretML) that provides per-feature explanations

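To make the Tweedie power parameter more concrete, here is a minimal sketch of the Tweedie unit deviance for a power between 1 and 2. This is the quantity a Tweedie GLM fit aims to minimize, up to constants and any regularization. The function name `tweedie_unit_deviance` and the example values are illustrative assumptions, not taken from this model's training code.

```python
def tweedie_unit_deviance(y: float, mu: float, power: float = 1.5) -> float:
    """Unit deviance of the Tweedie family for 1 < power < 2.

    y is the observed value (must be >= 0), mu the predicted mean (must be > 0).
    """
    if not 1.0 < power < 2.0:
        raise ValueError("this sketch only covers 1 < power < 2")
    if y < 0 or mu <= 0:
        raise ValueError("requires y >= 0 and mu > 0")
    p = power
    return 2.0 * (
        y ** (2 - p) / ((1 - p) * (2 - p))
        - y * mu ** (1 - p) / (1 - p)
        + mu ** (2 - p) / (2 - p)
    )


# Illustrative values only (not model outputs)
perfect_fit = tweedie_unit_deviance(400.0, 400.0)   # prediction equals observation
zero_observed = tweedie_unit_deviance(0.0, 100.0)   # exact zeros are allowed for y
over_observed = tweedie_unit_deviance(500.0, 400.0)
print(perfect_fit, zero_observed, over_observed)
```

With a power strictly between 1 and 2 the distribution admits exact zeros alongside positive, right-skewed values, which is why this family is commonly used for insurance amounts.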

### Model Comparison

| Model | MAE (£) | RMSE (£) | MAPE (%) | Interpretable |
|-------|---------|----------|----------|---------------|
| **EBM** | **11,132** | 14,787 | 177.6 | Yes (per-feature shape functions) |
| **Tweedie GLM** | 12,245 | 17,615 | 198.8 | Yes (coefficients) |

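For readers who want to recompute these metrics on their own predictions, the sketch below shows one common way to define MAE, RMSE, and MAPE. The helper name `regression_metrics` and the toy premiums are assumptions made for illustration; the exact definitions used to produce the table are not stated in this card.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE and RMSE (same units as y) and MAPE (in percent)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # MAPE is undefined where the true value is zero, so those rows are skipped
    pct = [abs(e) / abs(t) for e, t in zip(errors, y_true) if t != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")
    return {"mae": mae, "rmse": rmse, "mape": mape}


# Toy premiums in £ (made-up numbers, not model outputs)
actual = [200.0, 1000.0, 5000.0]
predicted = [800.0, 1200.0, 4400.0]
metrics = regression_metrics(actual, predicted)
print({name: round(value, 2) for name, value in metrics.items()})
```

In this toy case the single small premium (£200 predicted as £800) contributes a 300% error on its own, which illustrates how MAPE can exceed 100% even when MAE looks moderate. Whether the same effect explains the MAPE values in the table is not established by this card.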

### Risk Factors (17 Features)

| Feature | Type | Description |
|---------|------|-------------|
| driver_age | Numeric | Age of primary driver |
| years_driving | Numeric | Years of driving experience |
| years_ncd | Numeric | No-claims discount years |
| vehicle_year | Numeric | Year of vehicle manufacture |
| vehicle_value | Numeric | Vehicle market value (£) |
| annual_mileage | Numeric | Estimated annual miles |
| voluntary_excess | Numeric | Voluntary excess amount (£) |
| compulsory_excess | Numeric | Compulsory excess amount (£) |
| previous_claims_3y | Numeric | Claims in last 3 years |
| policy_age_days | Numeric | Days since policy inception |
| vehicle_age | Derived | Current year minus vehicle_year |
| driver_experience_ratio | Derived | years_driving / driver_age |
| ncd_ratio | Derived | years_ncd / years_driving |
| vehicle_make_enc | Encoded | Vehicle manufacturer |
| fuel_type_enc | Encoded | Fuel type |
| occupation_enc | Encoded | Driver occupation |
| region_enc | Encoded | UK region |

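The three derived features follow directly from the definitions in the table. The sketch below computes them from raw inputs; the helper name `add_derived_features`, the explicit `reference_year` argument, and the fallback of 0.0 when a denominator is zero are assumptions, since the card does not say how the training pipeline handles those cases.

```python
def add_derived_features(policy: dict, reference_year: int) -> dict:
    """Return a copy of `policy` with the three derived features added."""
    out = dict(policy)
    out["vehicle_age"] = reference_year - policy["vehicle_year"]
    # Fallback to 0.0 on a zero denominator is an assumption of this sketch
    out["driver_experience_ratio"] = (
        policy["years_driving"] / policy["driver_age"]
        if policy["driver_age"] > 0
        else 0.0
    )
    out["ncd_ratio"] = (
        policy["years_ncd"] / policy["years_driving"]
        if policy["years_driving"] > 0
        else 0.0
    )
    return out


policy = {"driver_age": 30, "years_driving": 8, "years_ncd": 4, "vehicle_year": 2022}
derived = add_derived_features(policy, reference_year=2026)
print(derived["vehicle_age"], round(derived["driver_experience_ratio"], 3), derived["ncd_ratio"])
```

These inputs mirror the values used in the How to Use example, where a 2022 vehicle has a `vehicle_age` of 4.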

### EBM Top Feature Importances

| Feature | Importance |
|---------|-----------|
| previous_claims_3y | 3,259 |
| policy_age_days | 2,684 |
| previous_claims_3y × policy_age_days | 1,608 |
| region_enc | 221 |
| vehicle_make_enc | 173 |
| annual_mileage | 172 |
| compulsory_excess | 165 |
| voluntary_excess | 163 |
| ncd_ratio | 153 |

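As a quick way to read the table, the sketch below converts the listed importances into shares of their combined total. The table shows only the top terms, so these shares are relative to the listed terms and not to the full model; the variable names are illustrative.

```python
# Importance values copied from the table above
importances = {
    "previous_claims_3y": 3259,
    "policy_age_days": 2684,
    "previous_claims_3y × policy_age_days": 1608,
    "region_enc": 221,
    "vehicle_make_enc": 173,
    "annual_mileage": 172,
    "compulsory_excess": 165,
    "voluntary_excess": 163,
    "ncd_ratio": 153,
}

total = sum(importances.values())
shares = {name: value / total for name, value in importances.items()}

# Combined share of the three largest terms among those listed
top3_share = sum(sorted(shares.values(), reverse=True)[:3])

for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:40s} {share:6.1%}")
print(f"Top three terms combined: {top3_share:.1%}")
```

Among the listed terms, claim history, policy age, and their interaction account for most of the total importance.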

### Training Data

- 25,000 synthetic UK motor insurance policies (20,000 train / 5,000 test)
- Features include driver demographics, vehicle attributes, claim history, and policy details

### Files

| File | Description |
|------|-------------|
| `tweedie_glm.pkl` | Scikit-learn Tweedie GLM pipeline |
| `pricing_ebm.pkl` | InterpretML EBM model |
| `label_encoders.pkl` | Fitted label encoders for categorical features |
| `training_results.json` | Full training metrics and feature coefficients |

## How to Use

```python
import pickle

import numpy as np

# Unpickling needs the `interpret` package installed, because the file
# stores an InterpretML EBM object. Only unpickle files you trust.
with open("pricing_ebm.pkl", "rb") as f:
    ebm = pickle.load(f)

# Fitted label encoders for the categorical (*_enc) features.
# The example below uses values that are already encoded.
with open("label_encoders.pkl", "rb") as f:
    encoders = pickle.load(f)

# Example: price a motor policy (one row, 17 features in this order)
features = np.array([[
    30,      # driver_age
    8,       # years_driving
    4,       # years_ncd
    2022,    # vehicle_year
    20000,   # vehicle_value (£)
    10000,   # annual_mileage
    200,     # voluntary_excess (£)
    100,     # compulsory_excess (£)
    0,       # previous_claims_3y
    180,     # policy_age_days
    4,       # vehicle_age (current year minus vehicle_year)
    0.267,   # driver_experience_ratio (8 / 30)
    0.5,     # ncd_ratio (4 / 8)
    3,       # vehicle_make_enc
    1,       # fuel_type_enc
    5,       # occupation_enc
    7,       # region_enc
]])

premium = ebm.predict(features)[0]
print(f"Predicted premium: £{premium:,.2f}")

# Get per-feature explanations (EBM glass-box).
# To visualize them: from interpret import show; show(explanations)
explanations = ebm.explain_local(features)
```

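Positional arrays like the one above are easy to get wrong when a feature is added or reordered. The sketch below builds the row from a dictionary keyed by feature name and checks it against the expected order. `FEATURE_ORDER` and `build_feature_row` are hypothetical names, and the order is assumed to match the example above and the Risk Factors table; confirm it against the training code before relying on it.

```python
# Assumed column order, taken from the example above and the Risk Factors table
FEATURE_ORDER = [
    "driver_age", "years_driving", "years_ncd", "vehicle_year",
    "vehicle_value", "annual_mileage", "voluntary_excess",
    "compulsory_excess", "previous_claims_3y", "policy_age_days",
    "vehicle_age", "driver_experience_ratio", "ncd_ratio",
    "vehicle_make_enc", "fuel_type_enc", "occupation_enc", "region_enc",
]


def build_feature_row(values: dict) -> list:
    """Return feature values as floats in FEATURE_ORDER, rejecting bad keys."""
    missing = [name for name in FEATURE_ORDER if name not in values]
    unexpected = [name for name in values if name not in FEATURE_ORDER]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    return [float(values[name]) for name in FEATURE_ORDER]


row = build_feature_row({
    "driver_age": 30, "years_driving": 8, "years_ncd": 4,
    "vehicle_year": 2022, "vehicle_value": 20000, "annual_mileage": 10000,
    "voluntary_excess": 200, "compulsory_excess": 100,
    "previous_claims_3y": 0, "policy_age_days": 180, "vehicle_age": 4,
    "driver_experience_ratio": 0.267, "ncd_ratio": 0.5,
    "vehicle_make_enc": 3, "fuel_type_enc": 1, "occupation_enc": 5,
    "region_enc": 7,
})
print(len(row), row[:3])
```

The resulting list can be wrapped as a single-row 2D array, for example `np.array([row])`, and passed to `ebm.predict` once the model is loaded.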

## Part of the INSUREOS Model Suite

This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:

| Model | Task | Metric |
|-------|------|--------|
| [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 |
| [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 |
| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
| [InsureFraudNet](https://huggingface.co/piyushptiwari/InsureFraudNet) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| **InsurePricing** (this model) | Insurance pricing (GLM + EBM) | MAE: £11,132 |

## Citation

```bibtex
@misc{bytical2026insurepricing,
  title={InsurePricing: Explainable Insurance Premium Pricing Models},
  author={Bytical AI},
  year={2026},
  url={https://huggingface.co/piyushptiwari/InsurePricing}
}
```

## About Bytical AI

[Bytical](https://bytical.ai) builds AI agents that run insurance operations: claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.