legal_contract_named_entity_recognizer

Overview

This model is a BERT-based token classifier fine-tuned for the legal domain. It extracts key entities from commercial contracts, including the parties involved, effective dates, governing jurisdictions, and financial amounts.

Model Architecture

The model uses a BERT-Large backbone with a token-level classification head.

  • Tagging Scheme: Follows the BIO (Beginning, Inside, Outside) format.
  • Contextual Embeddings: Uses the surrounding context to tell apart closely related legal terms (e.g., a "Notice Date" versus an "Effective Date").
  • Fine-tuning: Fine-tuned on CUAD (Contract Understanding Atticus Dataset) and proprietary legal corpora.
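
To illustrate the BIO scheme, the sketch below groups per-token tags into entity spans. It is a minimal example, not the model's own post-processing: the function name bio_to_spans and the label names PARTY and EFFECTIVE_DATE are assumptions made for illustration, since this card does not list the model's actual label set.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entity spans."""
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label continues the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and drop the stray token.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = None
            current_tokens = []

    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hypothetical labels; the real label set is not documented in this card.
tokens = ["between", "Acme", "Corp", "and", "Beta", "LLC", "effective", "January", "2024"]
tags = ["O", "B-PARTY", "I-PARTY", "O", "B-PARTY", "I-PARTY", "O",
        "B-EFFECTIVE_DATE", "I-EFFECTIVE_DATE"]
print(bio_to_spans(tokens, tags))
# [('PARTY', 'Acme Corp'), ('PARTY', 'Beta LLC'), ('EFFECTIVE_DATE', 'January 2024')]
```

Because each token carries exactly one tag, this flat scheme cannot represent one entity inside another.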

Intended Use

  • Contract Lifecycle Management (CLM): Automating the extraction of metadata for digital repositories.
  • Due Diligence: Rapidly identifying governing laws and liability amounts across thousands of merger documents.
  • Regulatory Compliance: Checking for the presence of specific mandatory parties or dates in financial agreements.
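
As a rough illustration of the compliance use case, the sketch below checks whether a set of required entity types appears among the entities extracted from one agreement. The function name missing_entity_types, the dictionary layout, and the label names are all hypothetical; they stand in for whatever output format the extraction step produces.

```python
from typing import Dict, Iterable, List, Set


def missing_entity_types(entities: Iterable[Dict[str, str]], required: Set[str]) -> List[str]:
    """Return the required entity labels that were not found, sorted by name."""
    found = {entity["label"] for entity in entities}
    return sorted(required - found)


# Hypothetical extraction output for a single agreement.
extracted = [
    {"label": "PARTY", "text": "Acme Corp"},
    {"label": "PARTY", "text": "Beta LLC"},
    {"label": "GOVERNING_LAW", "text": "State of Delaware"},
]

missing = missing_entity_types(extracted, {"PARTY", "EFFECTIVE_DATE", "GOVERNING_LAW"})
print(missing)  # ['EFFECTIVE_DATE']
```

A missing label only shows that the model found no such entity, so flagged agreements would still need human review.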

Limitations

  • Legalese Variation: Older or highly non-standard contract formats may result in lower entity recall.
  • Nested Entities: Does not support hierarchical or overlapping entities (e.g., an "Amount" inside a "Payment Clause").
  • OCR Errors: Performance is highly dependent on the quality of the text; poorly scanned PDFs with OCR noise will degrade accuracy.
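
One way to reduce the impact of OCR noise is to clean the text lightly before inference. The sketch below is an assumption about what such a step might look like, not part of this model's pipeline; the function name normalize_ocr_text is hypothetical, and only Python standard-library calls are used.

```python
import re
import unicodedata


def normalize_ocr_text(text: str) -> str:
    """Apply light, conservative cleanup to OCR output before tagging."""
    # NFKC folds typographic ligatures (e.g. U+FB00 "ff") into plain letters.
    text = unicodedata.normalize("NFKC", text)
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse runs of whitespace, including line breaks, into one space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


raw = "The E\ufb00ective Date of this agree-\nment   is\nJanuary 1."
print(normalize_ocr_text(raw))
# The Effective Date of this agreement is January 1.
```

The de-hyphenation step also joins genuine compounds that happen to break at a line end (for example "non-" followed by "compete"), so it trades a small risk of over-joining for fewer split tokens.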