---
language:
- en
license: apache-2.0
tags:
- insurance
- pricing
- glm
- ebm
- explainable-ai
- uk-insurance
- tabular-regression
- actuarial
- bytical
pipeline_tag: tabular-regression
datasets:
- piyushptiwari/insureos-training-data
model-index:
- name: InsurePricing
  results:
  - task:
      type: tabular-regression
      name: Insurance Premium Pricing
    metrics:
    - type: mae
      value: 11132
      name: MAE — EBM (£)
    - type: mae
      value: 12245
      name: MAE — GLM (£)
---

# InsurePricing: Insurance Premium Pricing Models

**Created by [Bytical AI](https://bytical.ai)**, which builds AI agents that run insurance operations.

## Model Description

InsurePricing provides two complementary pricing models for UK motor insurance premiums, designed for actuarial and underwriting workflows:

1. **Tweedie GLM**: a generalized linear model with a Tweedie distribution (power=1.5), the industry-standard approach to insurance pricing
2. **Explainable Boosting Machine (EBM)**: an interpretable glass-box model from Microsoft Research (InterpretML) that provides per-feature explanations

### Model Comparison

| Model | MAE (£) | RMSE (£) | MAPE (%) | Interpretable |
|-------|---------|----------|----------|---------------|
| **EBM** | **11,132** | 14,787 | 177.6 | Yes (per-feature shape functions) |
| **Tweedie GLM** | 12,245 | 17,615 | 198.8 | Yes (coefficients) |

### Risk Factors (17 Features)

| Feature | Type | Description |
|---------|------|-------------|
| driver_age | Numeric | Age of primary driver |
| years_driving | Numeric | Years of driving experience |
| years_ncd | Numeric | No-claims discount years |
| vehicle_year | Numeric | Year of vehicle manufacture |
| vehicle_value | Numeric | Vehicle market value (£) |
| annual_mileage | Numeric | Estimated annual miles |
| voluntary_excess | Numeric | Voluntary excess amount (£) |
| compulsory_excess | Numeric | Compulsory excess amount (£) |
| previous_claims_3y | Numeric | Claims in last 3 years |
| policy_age_days | Numeric | Days since policy inception |
| vehicle_age | Derived | Current year minus vehicle_year |
| driver_experience_ratio | Derived | years_driving / driver_age |
| ncd_ratio | Derived | years_ncd / years_driving |
| vehicle_make_enc | Encoded | Vehicle manufacturer |
| fuel_type_enc | Encoded | Fuel type |
| occupation_enc | Encoded | Driver occupation |
| region_enc | Encoded | UK region |

### EBM Top Feature Importances

| Feature | Importance |
|---------|-----------|
| previous_claims_3y | 3,259 |
| policy_age_days | 2,684 |
| previous_claims_3y × policy_age_days | 1,608 |
| region_enc | 221 |
| vehicle_make_enc | 173 |
| annual_mileage | 172 |
| compulsory_excess | 165 |
| voluntary_excess | 163 |
| ncd_ratio | 153 |

### Training Data

- 25,000 synthetic UK motor insurance policies (20K train / 5K test)
- Features include driver demographics, vehicle attributes, claim history, and policy details

### Files

| File | Description |
|------|-------------|
| `tweedie_glm.pkl` | Scikit-learn Tweedie GLM pipeline |
| `pricing_ebm.pkl` | InterpretML EBM model |
| `label_encoders.pkl` | Fitted label encoders for categorical features |
| `training_results.json` | Full training metrics and feature coefficients |

## How to Use

```python
import pickle

import numpy as np

# Load the EBM model
with open("pricing_ebm.pkl", "rb") as f:
    ebm = pickle.load(f)

# Load the fitted label encoders for the categorical (*_enc) features
with open("label_encoders.pkl", "rb") as f:
    encoders = pickle.load(f)

# Example: price a motor policy.
# Values follow the order of the Risk Factors table; the *_enc values
# below are illustrative integer codes.
features = np.array([[
    30,      # driver_age
    8,       # years_driving
    4,       # years_ncd
    2022,    # vehicle_year
    20000,   # vehicle_value
    10000,   # annual_mileage
    200,     # voluntary_excess
    100,     # compulsory_excess
    0,       # previous_claims_3y
    180,     # policy_age_days
    4,       # vehicle_age (current year minus vehicle_year)
    0.267,   # driver_experience_ratio (years_driving / driver_age)
    0.5,     # ncd_ratio (years_ncd / years_driving)
    3,       # vehicle_make_enc
    1,       # fuel_type_enc
    5,       # occupation_enc
    7,       # region_enc
]])

premium = ebm.predict(features)[0]
print(f"Predicted premium: £{premium:,.2f}")

# Get per-feature explanations (EBM glass-box)
explanations = ebm.explain_local(features)
```

## Part of the INSUREOS Model Suite

This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:

| Model | Task | Metric |
|-------|------|--------|
| [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 |
| [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 |
| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
| [InsureFraudNet](https://huggingface.co/piyushptiwari/InsureFraudNet) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| **InsurePricing** (this model) | Insurance pricing (GLM + EBM) | MAE: £11,132 |

## Citation

```bibtex
@misc{bytical2026insurepricing,
  title={InsurePricing: Explainable Insurance Premium Pricing Models},
  author={Bytical AI},
  year={2026},
  url={https://huggingface.co/piyushptiwari/InsurePricing}
}
```

## About Bytical AI

[Bytical](https://bytical.ai) builds AI agents that run insurance operations: claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe.

Microsoft AI Partner | NVIDIA | Salesforce.