metadata
language:
  - en
license: apache-2.0
tags:
  - insurance
  - document-classification
  - modernbert
  - uk-insurance
  - text-classification
  - bytical
library_name: transformers
pipeline_tag: text-classification
base_model: answerdotai/ModernBERT-base
datasets:
  - piyushptiwari/insureos-training-data
model-index:
  - name: InsureDocClassifier
    results:
      - task:
          type: text-classification
          name: Insurance Document Classification
        metrics:
          - type: f1
            value: 1
            name: F1 (macro)
          - type: accuracy
            value: 1
            name: Accuracy

InsureDocClassifier — Insurance Document Classification

Created by Bytical AI — AI agents that run insurance operations.

Model Description

InsureDocClassifier is a 12-class insurance document classifier built on ModernBERT-base. It assigns each input document to one of the 12 document types listed below, which enables automated document routing, indexing, and processing in insurance operations.
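To make the routing use case concrete, here is a minimal sketch of how a predicted document type might be mapped to a downstream queue. The queue names, the `route_document` helper, and the 0.9 confidence threshold are all hypothetical and not part of the model or of INSUREOS; they only illustrate the idea of sending low-confidence predictions to manual review.

```python
# Hypothetical routing table: queue names are illustrative assumptions,
# not something the model or its repository defines.
ROUTES = {
    "Claim Form": "claims-intake",
    "FNOL Report": "claims-intake",
    "Loss Adjuster Report": "claims-handling",
    "Subrogation Notice": "recoveries",
    "Endorsement": "policy-admin",
    "Renewal Notice": "policy-admin",
}
DEFAULT_QUEUE = "general-indexing"  # fallback for types with no explicit route
REVIEW_QUEUE = "manual-review"      # low-confidence predictions go to a human


def route_document(label: str, confidence: float, threshold: float = 0.9) -> str:
    """Pick a downstream queue from a predicted label and its confidence."""
    if confidence < threshold:
        return REVIEW_QUEUE
    return ROUTES.get(label, DEFAULT_QUEUE)


print(route_document("FNOL Report", 0.98))     # claims-intake
print(route_document("Policy Wording", 0.97))  # general-indexing
print(route_document("Claim Form", 0.55))      # manual-review
```

The threshold is a tunable assumption. Because the model was trained on synthetic documents, a suitable value would need to be chosen on real documents before relying on fully automated routing.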

Document Classes (12)

| ID | Document Type | Description |
|----|---------------|-------------|
| 0 | Policy Schedule | Policy details and coverage summary |
| 1 | Certificate of Insurance | Proof of insurance document |
| 2 | Claim Form | Insurance claim submission form |
| 3 | Loss Adjuster Report | Assessment report from loss adjuster |
| 4 | Bordereaux — Premium | Premium transaction records |
| 5 | Bordereaux — Claims | Claims transaction records |
| 6 | Endorsement | Policy amendment document |
| 7 | Renewal Notice | Policy renewal notification |
| 8 | Statement of Fact | Declaration of material facts |
| 9 | FNOL Report | First Notification of Loss report |
| 10 | Subrogation Notice | Recovery rights notification |
| 11 | Policy Wording | Full policy terms and conditions |

Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | answerdotai/ModernBERT-base |
| Training Samples | 10,000 synthetic insurance documents |
| Epochs | 5 |
| Eval Loss | 4.17e-06 |
| GPU | NVIDIA Tesla T4 16 GB |

Evaluation Results

| Metric | Value |
|--------|-------|
| Accuracy | 1.0 |
| F1 (macro) | 1.0 |
| F1 (weighted) | 1.0 |
| Eval Samples/sec | 32.96 |
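For readers unfamiliar with the two F1 variants reported above, the sketch below shows how macro and weighted F1 are commonly computed from true and predicted class IDs. It is a plain-Python illustration of the standard definitions, not the evaluation script used for this model, and the `f1_scores` helper and the toy labels are assumptions made for the example.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, weighted_f1) for two equal-length label sequences."""
    classes = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    per_class = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class[c] = 2 * tp / denom if denom else 0.0
    # Macro: unweighted mean over classes, so rare classes count equally.
    macro = sum(per_class.values()) / len(classes)
    # Weighted: mean over classes weighted by each class's true support.
    weighted = sum(per_class[c] * support[c] for c in classes) / len(y_true)
    return macro, weighted


# Toy labels (not model output): one class-0 document is misread as class 1.
macro, weighted = f1_scores([0, 0, 0, 1], [0, 0, 1, 1])
print(f"macro={macro:.3f} weighted={weighted:.3f}")
```

Both variants equal 1.0 only when every evaluation document is classified correctly, which is what the table reports.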

How to Use

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("piyushptiwari/InsureDocClassifier")
tokenizer = AutoTokenizer.from_pretrained("piyushptiwari/InsureDocClassifier")

text = "We hereby confirm that the above-named insured holds a valid policy of insurance..."
# Inputs longer than 512 tokens are truncated
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
# Index of the highest-scoring class
predicted_class = outputs.logits.argmax(-1).item()

# Class IDs as listed under "Document Classes"
labels = {
    0: "Policy Schedule", 1: "Certificate of Insurance", 2: "Claim Form",
    3: "Loss Adjuster Report", 4: "Bordereaux — Premium", 5: "Bordereaux — Claims",
    6: "Endorsement", 7: "Renewal Notice", 8: "Statement of Fact",
    9: "FNOL Report", 10: "Subrogation Notice", 11: "Policy Wording"
}
print(f"Document type: {labels[predicted_class]}")
```
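The snippet above returns only the top class. When a confidence score or a ranked shortlist is useful, the logits can be passed through a softmax. The sketch below does this in plain Python so that it runs without the model; the `softmax` and `top_k` helpers, the three-label subset, and the logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(labels[i], p) for i, p in ranked[:k]]


# Subset of the label map and made-up logits, not real model output.
labels = {0: "Policy Schedule", 1: "Certificate of Insurance", 2: "Claim Form"}
example_logits = [0.2, 4.1, -1.3]
for name, prob in top_k(example_logits, labels, k=2):
    print(f"{name}: {prob:.3f}")
```

With the real model, the same helpers should work on `outputs.logits[0].tolist()` together with the full 12-entry label map.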

Part of the INSUREOS Model Suite

This model is part of INSUREOS, a complete AI/ML suite for insurance operations built by Bytical AI:

| Model | Task | Metric |
|-------|------|--------|
| InsureLLM-4B | Insurance domain LLM | ROUGE-1: 0.384 |
| InsureDocClassifier (this model) | 12-class document classification | F1: 1.0 |
| InsureNER | 13-entity Named Entity Recognition | F1: 1.0 |
| InsureFraudNet | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| InsurePricing | Insurance pricing (GLM + EBM) | MAE: £11,132 |

Citation

@misc{bytical2026insuredocclassifier,
  title={InsureDocClassifier: Insurance Document Classification with ModernBERT},
  author={Bytical AI},
  year={2026},
  url={https://huggingface.co/piyushptiwari/InsureDocClassifier}
}

About Bytical AI

Bytical builds AI agents that run insurance operations — claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.