📄 Insurance Document OCR

Extract structured data from insurance documents

Model Description

Specialized OCR model for extracting information from insurance-related documents including claims forms, policy documents, ID cards, and damage photos.

Supported Documents

| Document Type | Fields Extracted |
| --- | --- |
| Claims Form | Claim #, Date, Amount, Description |
| Policy Document | Policy #, Coverage, Limits, Deductible |
| Driver's License | Name, DOB, License #, Address |
| Vehicle Registration | VIN, Make, Model, Year, Plate |
| Medical Bills | Provider, Date, Charges, Diagnosis |
| Repair Estimates | Shop, Parts, Labor, Total |
| Police Reports | Report #, Date, Officers, Description |
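
The field lists above can double as a completeness check on each result. The following is a minimal sketch, assuming the result is a dict with `document_type` and `extracted_fields` keys as in the Output Format section. `EXPECTED_FIELDS` and `missing_fields` are hypothetical names, and the `policy_document` key names are guesses, since only the `claims_form` keys appear in this card's sample output.

```python
# Assumed key names: only the claims_form keys appear in this card's sample
# output; the policy_document keys are hypothetical.
EXPECTED_FIELDS = {
    "claims_form": {"claim_number", "incident_date", "claim_amount", "description"},
    "policy_document": {"policy_number", "coverage", "limits", "deductible"},
}


def missing_fields(result: dict) -> set[str]:
    """Return expected field names that are absent or empty in an OCR result."""
    expected = EXPECTED_FIELDS.get(result.get("document_type"), set())
    extracted = result.get("extracted_fields", {})
    return {name for name in expected if extracted.get(name) in (None, "")}


sample = {
    "document_type": "claims_form",
    "extracted_fields": {"claim_number": "CLM-2024-78432", "claim_amount": 2450.00},
}
print(sorted(missing_fields(sample)))
```

Unknown document types return an empty set here, so a caller that needs strict checking would have to treat them separately.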

Output Format

{
  "document_type": "claims_form",
  "confidence": 0.96,
  "extracted_fields": {
    "claim_number": "CLM-2024-78432",
    "incident_date": "2024-01-15",
    "claim_amount": 2450.00,
    "description": "Rear-end collision at intersection",
    "policy_number": "POL-AUTO-12345"
  },
  "raw_text": "...",
  "bounding_boxes": [...]
}
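
Downstream code will usually want to parse this payload and decide whether the extraction can be trusted. The following is a minimal sketch, assuming the output arrives as a JSON string with the keys shown above. `OcrResult`, `parse_result`, and the 0.90 review threshold are illustrative choices, not part of the model.

```python
import json
from dataclasses import dataclass


@dataclass
class OcrResult:
    document_type: str
    confidence: float
    extracted_fields: dict


def parse_result(payload: str, review_threshold: float = 0.90) -> tuple[OcrResult, bool]:
    """Parse the JSON payload and flag it for manual review on low confidence."""
    data = json.loads(payload)
    result = OcrResult(
        document_type=data["document_type"],
        confidence=float(data["confidence"]),
        extracted_fields=data.get("extracted_fields", {}),
    )
    # The threshold is a hypothetical cut-off; tune it on your own documents.
    needs_review = result.confidence < review_threshold
    return result, needs_review


payload = """{
  "document_type": "claims_form",
  "confidence": 0.96,
  "extracted_fields": {"claim_number": "CLM-2024-78432", "claim_amount": 2450.00}
}"""
result, needs_review = parse_result(payload)
print(result.extracted_fields["claim_number"], needs_review)
```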

Performance

| Metric | Score |
| --- | --- |
| Character Accuracy | 98.7% |
| Field Extraction | 95.2% |
| Document Classification | 97.8% |
| Processing Time | 1.2s/page |

Usage

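import json

from transformers import pipeline

# Load the OCR model through the image-to-text pipeline.
ocr = pipeline("image-to-text", model="gcc-insurance-ml-models/document-ocr-insurance")

# The pipeline returns a list with one dict per generated sequence.
result = ocr("claim_form.jpg")
# Assumption: the generated text is the JSON document shown under Output Format.
parsed = json.loads(result[0]["generated_text"])
print(parsed["extracted_fields"])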

Integration

Document Upload
     ↓
[Document OCR] → Structured Data
     ↓
Auto-populate claim form
     ↓
Validate against policy
     ↓
Route to triage
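
The flow above could be wired together roughly as follows. This is a sketch under stated assumptions, not the actual integration: every function here is a hypothetical stand-in, `run_ocr` returns a canned result instead of calling the model, the policy lookup is an in-memory dict, and the `manual_review` queue for failed validation is an addition that the diagram does not show.

```python
# All functions are hypothetical stand-ins for the stages in the diagram.

def run_ocr(image_path: str) -> dict:
    """Stand-in for the OCR call; returns a result shaped like the sample output."""
    return {
        "document_type": "claims_form",
        "confidence": 0.96,
        "extracted_fields": {
            "claim_number": "CLM-2024-78432",
            "claim_amount": 2450.00,
            "policy_number": "POL-AUTO-12345",
        },
    }


def populate_claim(fields: dict) -> dict:
    """Copy extracted fields into a new draft claim record."""
    return {"status": "draft", **fields}


def validate_against_policy(claim: dict, policies: dict) -> list[str]:
    """Return a list of problems; an empty list means the claim passed."""
    policy = policies.get(claim.get("policy_number"))
    if policy is None:
        return ["unknown policy"]
    problems = []
    if claim.get("claim_amount", 0) > policy["limit"]:
        problems.append("amount exceeds limit")
    return problems


def route(claim: dict, problems: list[str]) -> str:
    """Send clean claims to triage and anything with problems to manual review."""
    return "manual_review" if problems else "triage"


policies = {"POL-AUTO-12345": {"limit": 10000.00}}  # assumed policy store
claim = populate_claim(run_ocr("claim_form.jpg")["extracted_fields"])
problems = validate_against_policy(claim, policies)
print(route(claim, problems))
```

Keeping each stage as a plain function that takes and returns dicts makes it straightforward to replace the stand-ins with real services one at a time.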

License

Apache 2.0
