Shoriful025 committed on
Commit 0be6568 · verified · 1 Parent(s): 3f73c3c

Create README.md

Files changed (1)
  1. README.md +33 -0
README.md ADDED
@@ -0,0 +1,33 @@
+ ---
+ language: en
+ license: mit
+ tags:
+ - ner
+ - legal-nlp
+ - token-classification
+ - bert
+ ---
+
+ # legal_contract_named_entity_recognizer
+
+ ## Overview
+ This model is a BERT-based token classifier fine-tuned for the legal domain. It extracts key entities from commercial contracts, including the parties involved, effective dates, governing jurisdictions, and financial amounts.
+
+ ## Model Architecture
+ The model uses a **BERT-Large** backbone with a token-level classification head.
+
+
+
+ - **Tagging Scheme:** Uses BIO (Beginning, Inside, Outside) tags: `B-` marks the first token of an entity, `I-` marks each following token of the same entity, and `O` marks tokens outside any entity.
+ - **Contextual Embeddings:** Uses surrounding context to distinguish closely related legal terms (e.g., a "Notice Date" versus an "Effective Date").
+ - **Fine-tuning:** Fine-tuned on CUAD (Contract Understanding Atticus Dataset) and proprietary legal corpora.
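
To make the BIO tagging scheme concrete, here is a minimal sketch of how per-token tags can be grouped back into entity spans. The function name `bio_to_spans` and the label names `PARTY` and `EFFECTIVE_DATE` are assumptions for illustration only; the README does not list the model's actual label set.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) spans.

    Hypothetical helper: label names are whatever follows "B-" / "I-".
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the open span, if any, and reset the accumulator.
        nonlocal current_label, current_tokens
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag extends the open entity of the same label.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # treated as outside any entity in this sketch.
            flush()
    flush()
    return spans


tokens = ["between", "Acme", "Corp", "and", "Beta", "LLC",
          "effective", "January", "2024"]
tags = ["O", "B-PARTY", "I-PARTY", "O", "B-PARTY", "I-PARTY",
        "O", "B-EFFECTIVE_DATE", "I-EFFECTIVE_DATE"]
print(bio_to_spans(tokens, tags))
```

Because each token carries exactly one tag, two adjacent entities of the same type are separated only by the second entity's `B-` tag, which is why the sketch closes the open span whenever a `B-` tag appears.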
+
+ ## Intended Use
+ - **Contract Lifecycle Management (CLM):** Automating the extraction of metadata for digital repositories.
+ - **Due Diligence:** Rapidly identifying governing laws and liability amounts across thousands of merger documents.
+ - **Regulatory Compliance:** Checking for the presence of specific mandatory parties or dates in financial agreements.
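
The compliance use case above can be sketched as a post-processing step over extracted entities. This example assumes each entity is a dict with `entity_group`, `word`, and `score` keys, the shape the Hugging Face token-classification pipeline returns when an aggregation strategy is set. The helper names, the label names, and the `0.5` threshold are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def group_entities(entities: Iterable[dict],
                   min_score: float = 0.5) -> Dict[str, List[str]]:
    """Collect entity texts per label, dropping low-confidence predictions."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


def missing_labels(grouped: Dict[str, List[str]],
                   required: Set[str]) -> Set[str]:
    """Return the required labels for which no entity was found."""
    return required - set(grouped)


# Hand-written sample predictions; not real model output.
sample = [
    {"entity_group": "PARTY", "word": "Acme Corp", "score": 0.98},
    {"entity_group": "PARTY", "word": "Beta LLC", "score": 0.97},
    {"entity_group": "JURISDICTION", "word": "Delaware", "score": 0.31},
]

grouped = group_entities(sample)
print(grouped)
print(missing_labels(grouped, {"PARTY", "EFFECTIVE_DATE", "JURISDICTION"}))
```

Filtering by score before the presence check means a low-confidence prediction counts as missing, which errs toward flagging a contract for human review.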
+
+ ## Limitations
+ - **Legalese Variation:** Older or highly non-standard contract formats may reduce entity recall.
+ - **Nested Entities:** Does not support hierarchical or overlapping entities (e.g., an "Amount" inside a "Payment Clause").
+ - **OCR Errors:** Accuracy depends heavily on the quality of the input text; OCR noise from poorly scanned PDFs will degrade it.
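
As an illustration of the OCR limitation, a light text cleanup pass before inference might look like the sketch below. It is not part of the model and does not fix misrecognized characters. It only normalizes Unicode ligatures, rejoins words hyphenated across line breaks, and collapses whitespace. The function name `clean_ocr_text` is hypothetical.

```python
import re
import unicodedata


def clean_ocr_text(text: str) -> str:
    """Apply conservative cleanup to OCR output (illustrative sketch)."""
    # NFKC folds compatibility characters such as the "ff" ligature
    # (U+FB00) into their plain-letter equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Rejoin words split by a hyphen at a line break: "de-\nfined" -> "defined".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse runs of whitespace, including newlines, into single spaces.
    return re.sub(r"\s+", " ", text).strip()


raw = "The E\ufb00ective Date\nis de-\nfined   below."
print(clean_ocr_text(raw))
```

One caveat of this approach: rejoining at line breaks also merges genuinely hyphenated compounds that happen to wrap (for example "non-" followed by "standard" on the next line), so the step trades some fidelity for cleaner tokens.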