---
language:
- en
license: apache-2.0
tags:
- insurance
- document-classification
- modernbert
- uk-insurance
- text-classification
- bytical
library_name: transformers
pipeline_tag: text-classification
base_model: answerdotai/ModernBERT-base
datasets:
- piyushptiwari/insureos-training-data
model-index:
- name: InsureDocClassifier
  results:
  - task:
      type: text-classification
      name: Insurance Document Classification
    metrics:
    - type: f1
      value: 1.0
      name: F1 (macro)
    - type: accuracy
      value: 1.0
      name: Accuracy
---

# InsureDocClassifier: Insurance Document Classification

**Created by [Bytical AI](https://bytical.ai)**, which builds AI agents that run insurance operations.

## Model Description

InsureDocClassifier is a 12-class insurance document classifier built on ModernBERT-base. It assigns each insurance document to one of the types listed below, which enables automated document routing, indexing, and processing in insurance operations.

### Document Classes (12)

| ID | Document Type | Description |
|----|--------------|-------------|
| 0 | Policy Schedule | Policy details and coverage summary |
| 1 | Certificate of Insurance | Proof of insurance document |
| 2 | Claim Form | Insurance claim submission form |
| 3 | Loss Adjuster Report | Assessment report from loss adjuster |
| 4 | Bordereaux — Premium | Premium transaction records |
| 5 | Bordereaux — Claims | Claims transaction records |
| 6 | Endorsement | Policy amendment document |
| 7 | Renewal Notice | Policy renewal notification |
| 8 | Statement of Fact | Declaration of material facts |
| 9 | FNOL Report | First Notification of Loss report |
| 10 | Subrogation Notice | Recovery rights notification |
| 11 | Policy Wording | Full policy terms and conditions |

### Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | answerdotai/ModernBERT-base |
| Training Samples | 10,000 synthetic insurance documents |
| Epochs | 5 |
| Eval Loss | 4.17e-06 |
| GPU | NVIDIA Tesla T4 16GB |

### Evaluation Results

| Metric | Score |
|--------|-------|
| **Accuracy** | **1.0** |
| **F1 (macro)** | **1.0** |
| **F1 (weighted)** | **1.0** |
| Eval Samples/sec | 32.96 |

## How to Use

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned classifier and its tokenizer from the Hub
model = AutoModelForSequenceClassification.from_pretrained("piyushptiwari/InsureDocClassifier")
tokenizer = AutoTokenizer.from_pretrained("piyushptiwari/InsureDocClassifier")

text = "We hereby confirm that the above-named insured holds a valid policy of insurance..."

# Tokenize, truncating long documents to 512 tokens
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs)

# Index of the highest-scoring class
predicted_class = outputs.logits.argmax(-1).item()

# Class ID to document type, matching the Document Classes table
labels = {
    0: "Policy Schedule",
    1: "Certificate of Insurance",
    2: "Claim Form",
    3: "Loss Adjuster Report",
    4: "Bordereaux — Premium",
    5: "Bordereaux — Claims",
    6: "Endorsement",
    7: "Renewal Notice",
    8: "Statement of Fact",
    9: "FNOL Report",
    10: "Subrogation Notice",
    11: "Policy Wording",
}

print(f"Document type: {labels[predicted_class]}")
```

## Part of the INSUREOS Model Suite

This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:

| Model | Task | Metric |
|-------|------|--------|
| [InsureLLM-4B](https://huggingface.co/piyushptiwari/InsureLLM-4B) | Insurance domain LLM | ROUGE-1: 0.384 |
| **InsureDocClassifier** (this model) | 12-class document classification | F1: 1.0 |
| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
| [InsureFraudNet](https://huggingface.co/piyushptiwari/InsureFraudNet) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| [InsurePricing](https://huggingface.co/piyushptiwari/InsurePricing) | Insurance pricing (GLM + EBM) | MAE: £11,132 |

## Citation

```bibtex
@misc{bytical2026insuredocclassifier,
  title={InsureDocClassifier: Insurance Document Classification with ModernBERT},
  author={Bytical AI},
  year={2026},
  url={https://huggingface.co/piyushptiwari/InsureDocClassifier}
}
```

## About Bytical AI

[Bytical](https://bytical.ai) builds AI agents that run insurance operations: claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe.

Microsoft AI Partner | NVIDIA | Salesforce.
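
## Example: Routing on Prediction Confidence

The How to Use snippet returns only the highest-scoring class. For the document-routing use case named in the Model Description, it may help to also look at the softmax probability of that class and send low-confidence documents to manual review. The following is a minimal sketch under stated assumptions, not part of the released model: the function name `route_document` and the `0.9` threshold are hypothetical, and the logits are assumed to arrive as a plain list of 12 floats (for example from `outputs.logits[0].tolist()`). The example logits are made up for illustration.

```python
import math

# Label mapping copied from the How to Use section of this card.
LABELS = {
    0: "Policy Schedule", 1: "Certificate of Insurance", 2: "Claim Form",
    3: "Loss Adjuster Report", 4: "Bordereaux — Premium",
    5: "Bordereaux — Claims", 6: "Endorsement", 7: "Renewal Notice",
    8: "Statement of Fact", 9: "FNOL Report", 10: "Subrogation Notice",
    11: "Policy Wording",
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_document(logits, threshold=0.9):
    """Return (label, confidence, needs_review) for one document.

    `threshold` is a hypothetical cut-off; tune it on real data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    return LABELS[best], confidence, confidence < threshold


# Made-up logits in which class 1 clearly dominates.
example_logits = [0.1, 6.0, 0.2] + [0.0] * 9
label, confidence, needs_review = route_document(example_logits)
print(f"{label} (p={confidence:.3f}, manual review: {needs_review})")
```

Keeping the threshold as a parameter lets each deployment choose its own trade-off between automation rate and manual review volume.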