legal_contract_named_entity_recognizer
Overview
This model is a BERT-based token classifier fine-tuned for the legal domain. It extracts key entities from commercial contracts: the parties involved, effective dates, governing jurisdictions, and financial amounts.
Model Architecture
The model uses a BERT-Large backbone with a token-level classification head.
- Tagging Scheme: Follows the BIO (Beginning, Inside, Outside) format.
- Contextual Embeddings: Uses the surrounding context to separate closely related legal terms (e.g., distinguishing a "Notice Date" from an "Effective Date").
- Fine-tuning: Trained on the CUAD (Contract Understanding Atticus Dataset) and proprietary legal corpora.
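To illustrate how BIO output is typically consumed, here is a minimal sketch that groups per-token tags into entity spans. The label names (`PARTY`, `EFFECTIVE_DATE`) and the function `bio_to_spans` are assumptions made for this example; the model's actual label set is not listed in this card and may differ.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entity spans.

    Label names are hypothetical; check the model's config for the real ones.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span, closing any open one first.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag extends the open span only if the label matches.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close the open span and treat this token as outside.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = None
            current_tokens = []

    # Flush a span that runs to the end of the sequence.
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


tokens = ["between", "Acme", "Corp", "and", "Beta", "LLC",
          "effective", "January", "1", ",", "2024"]
tags = ["O", "B-PARTY", "I-PARTY", "O", "B-PARTY", "I-PARTY",
        "O", "B-EFFECTIVE_DATE", "I-EFFECTIVE_DATE",
        "I-EFFECTIVE_DATE", "I-EFFECTIVE_DATE"]
print(bio_to_spans(tokens, tags))
```

This sketch discards an `I-` tag that has no matching open span. Some pipelines instead promote such a tag to the start of a new entity; either choice is a post-processing decision, not part of the model.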
Intended Use
- Contract Lifecycle Management (CLM): Automating the extraction of metadata for digital repositories.
- Due Diligence: Rapidly identifying governing laws and liability amounts across thousands of merger documents.
- Regulatory Compliance: Checking for the presence of specific mandatory parties or dates in financial agreements.
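The compliance use case can be sketched as a presence check over the extracted entities. The example assumes the model output has already been reduced to `(label, text)` pairs. The label names and the helper `missing_entity_types` are hypothetical and should be adapted to the model's real label set.

```python
from typing import Iterable, List, Set, Tuple


def missing_entity_types(
    entities: Iterable[Tuple[str, str]], required: Set[str]
) -> List[str]:
    """Return the required entity labels that were not found, sorted by name."""
    found = {label for label, _ in entities}
    return sorted(required - found)


# Hypothetical extraction result for one agreement.
extracted = [("PARTY", "Acme Corp"), ("PARTY", "Beta LLC"), ("AMOUNT", "$5,000")]
required_labels = {"PARTY", "EFFECTIVE_DATE", "JURISDICTION"}

print(missing_entity_types(extracted, required_labels))
```

An empty result only means every required label was predicted at least once. It does not confirm that the extracted values are correct, so flagged and unflagged documents may both still need human review.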
Limitations
- Legalese Variation: Older or highly non-standard contract formats may result in lower entity recall.
- Nested Entities: Does not support hierarchical or overlapping entities (e.g., an "Amount" inside a "Payment Clause").
- OCR Errors: Accuracy depends heavily on the quality of the input text; poorly scanned PDFs with OCR noise will degrade it.
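One way to reduce the impact of OCR noise is to lightly normalize text before inference. The sketch below is an assumption about what such a step might look like, not a preprocessing routine shipped with this model; `normalize_ocr_text` is a hypothetical helper that uses only the Python standard library.

```python
import re
import unicodedata


def normalize_ocr_text(text: str) -> str:
    """Apply conservative cleanup to OCR output before tokenization."""
    # Fold compatibility characters, e.g. the ligature "ﬀ" becomes "ff".
    text = unicodedata.normalize("NFKC", text)
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse runs of whitespace, including newlines, into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


raw = "The Eﬀective  Date of this Agree-\nment is  January 1, 2024."
print(normalize_ocr_text(raw))
```

The hyphen rule also merges genuine compounds that happen to break across lines (for example "non-" followed by "standard" on the next line), so it trades a small risk of altered terms for fewer split tokens.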