InsureDocClassifier: Insurance Document Classification
Created by Bytical AI: AI agents that run insurance operations.
Model Description
InsureDocClassifier is a 12-class insurance document classifier built on ModernBERT-base. It automatically categorizes insurance documents into their correct type, enabling automated document routing, indexing, and processing in insurance operations.
Document Classes (12)
| ID | Document Type | Description |
|---|---|---|
| 0 | Policy Schedule | Policy details and coverage summary |
| 1 | Certificate of Insurance | Proof of insurance document |
| 2 | Claim Form | Insurance claim submission form |
| 3 | Loss Adjuster Report | Assessment report from loss adjuster |
| 4 | Bordereaux - Premium | Premium transaction records |
| 5 | Bordereaux - Claims | Claims transaction records |
| 6 | Endorsement | Policy amendment document |
| 7 | Renewal Notice | Policy renewal notification |
| 8 | Statement of Fact | Declaration of material facts |
| 9 | FNOL Report | First Notification of Loss report |
| 10 | Subrogation Notice | Recovery rights notification |
| 11 | Policy Wording | Full policy terms and conditions |
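The class IDs in this table can drive a simple routing step. The sketch below is illustrative only: the queue names, the catch-all queue, and the 0.9 confidence threshold are assumptions made for the example and are not part of the model or of any Bytical system.

```python
# Hypothetical routing table: queue names are illustrative, not part of the model.
ROUTES = {
    2: "claims-intake",       # Claim Form
    3: "claims-assessment",   # Loss Adjuster Report
    9: "claims-intake",       # FNOL Report
    10: "recoveries",         # Subrogation Notice
    4: "finance-bordereaux",  # Bordereaux - Premium
    5: "finance-bordereaux",  # Bordereaux - Claims
}
DEFAULT_QUEUE = "policy-admin"  # assumed catch-all for the remaining classes
REVIEW_QUEUE = "manual-review"  # assumed fallback for low-confidence predictions


def route_document(class_id: int, confidence: float, threshold: float = 0.9) -> str:
    """Pick a destination queue from a predicted class ID and its confidence."""
    if not 0 <= class_id <= 11:
        raise ValueError(f"class_id must be in 0..11, got {class_id}")
    if confidence < threshold:
        # Uncertain predictions go to a human instead of being auto-routed
        return REVIEW_QUEUE
    return ROUTES.get(class_id, DEFAULT_QUEUE)
```

Sending low-confidence predictions to a review queue keeps a misclassified document from silently landing in the wrong workflow; the right threshold depends on the deployment.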
Training Details
| Parameter | Value |
|---|---|
| Base Model | answerdotai/ModernBERT-base |
| Training Samples | 10,000 synthetic insurance documents |
| Epochs | 5 |
| Eval Loss | 4.17e-06 |
| GPU | NVIDIA Tesla T4 16GB |
Evaluation Results
| Metric | Score |
|---|---|
| Accuracy | 1.0 |
| F1 (macro) | 1.0 |
| F1 (weighted) | 1.0 |
| Eval Samples/sec | 32.96 |
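For readers who want to recompute these metrics on their own labelled data, the following dependency-free sketch shows how accuracy, macro F1, and weighted F1 are defined. It is not the authors' evaluation script; the function name `classification_scores` is hypothetical, and in practice a library such as scikit-learn would normally be used instead.

```python
from collections import Counter


def classification_scores(y_true, y_pred):
    """Return accuracy, macro F1 and weighted F1 for integer class labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    classes = sorted(set(y_true) | set(y_pred))

    f1 = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1[c] = 2 * tp / denom if denom else 0.0

    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    macro_f1 = sum(f1.values()) / len(classes)            # every class counts equally
    weighted_f1 = sum(f1[c] * support[c] for c in classes) / n  # weighted by support
    return {"accuracy": accuracy, "f1_macro": macro_f1, "f1_weighted": weighted_f1}
```

Macro F1 treats all 12 classes equally, while weighted F1 scales each class by how often it appears, so the two diverge when the class distribution is imbalanced.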
How to Use
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("piyushptiwari/InsureDocClassifier")
tokenizer = AutoTokenizer.from_pretrained("piyushptiwari/InsureDocClassifier")

text = "We hereby confirm that the above-named insured holds a valid policy of insurance..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Gradients are not needed for prediction
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(-1).item()

# Map class IDs to document types (same order as the Document Classes table)
labels = {
    0: "Policy Schedule", 1: "Certificate of Insurance", 2: "Claim Form",
    3: "Loss Adjuster Report", 4: "Bordereaux - Premium", 5: "Bordereaux - Claims",
    6: "Endorsement", 7: "Renewal Notice", 8: "Statement of Fact",
    9: "FNOL Report", 10: "Subrogation Notice", 11: "Policy Wording"
}
print(f"Document type: {labels[predicted_class]}")
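The usage code reports only the single most likely class. When a confidence score or a ranked shortlist is useful, the raw logits can be converted to probabilities with a softmax. The sketch below does this in plain Python for one row of logits; `top_k_predictions` and `LABELS` are hypothetical names introduced for this example, and the label order is assumed to match the class IDs listed in this card.

```python
import math

# Label order assumed to follow class IDs 0..11 from this model card
LABELS = [
    "Policy Schedule", "Certificate of Insurance", "Claim Form",
    "Loss Adjuster Report", "Bordereaux - Premium", "Bordereaux - Claims",
    "Endorsement", "Renewal Notice", "Statement of Fact",
    "FNOL Report", "Subrogation Notice", "Policy Wording",
]


def top_k_predictions(logits, k=3):
    """Convert one row of raw logits into the k most likely (label, probability) pairs."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(LABELS[i], probs[i]) for i in ranked[:k]]
```

With the usage code above, the input would be `outputs.logits[0].tolist()`. Note that softmax probabilities from a fine-tuned classifier are not guaranteed to be well calibrated, so any threshold applied to them should be checked on held-out data.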
Part of the INSUREOS Model Suite
This model is part of INSUREOS, a complete AI/ML suite for insurance operations built by Bytical AI:
| Model | Task | Metric |
|---|---|---|
| InsureLLM-4B | Insurance domain LLM | ROUGE-1: 0.384 |
| InsureDocClassifier (this model) | 12-class document classification | F1: 1.0 |
| InsureNER | 13-entity Named Entity Recognition | F1: 1.0 |
| InsureFraudNet | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| InsurePricing | Insurance pricing (GLM + EBM) | MAE: £11,132 |
Citation
@misc{bytical2026insuredocclassifier,
title={InsureDocClassifier: Insurance Document Classification with ModernBERT},
author={Bytical AI},
year={2026},
url={https://huggingface.co/piyushptiwari/InsureDocClassifier}
}
About Bytical AI
Bytical builds AI agents that run insurance operations: claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.