theatticusproject/cuad
LegalBERT_Finetuned is a domain-specific transformer model fine-tuned for legal clause classification and tier-based contract review.
This model forms the backbone of the NLP Contract Summarization & Tier-wise Clause Review project, an AI-driven framework that automates clause identification, categorization, and prioritization for large-scale legal documents.
Base model: nlpaueb/legal-bert-base-uncased
This model is designed for:
The model was fine-tuned on the Contract Understanding Atticus Dataset (CUAD).
Each clause was mapped to one of five review tiers:
| Tier | Description | Example Clauses |
|---|---|---|
| 1 - Critical | Core clauses with major legal implications | Termination, Liability, Governing Law |
| 2 - Important | Major obligations & constraints | Indemnification, Insurance, Non-compete |
| 3 - Moderate | Common operational clauses | Warranty Duration, Renewal Terms |
| 4 - Low | Procedural / administrative | Notice Periods, Third-Party Beneficiary |
| 5 - Trivial | Boilerplate or metadata | Effective Date, Parties, Document Name |
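The tier table can be read as a lookup from clause category to review priority. The sketch below is illustrative only: it covers just the example clauses listed in the table, and the names `EXAMPLE_CLAUSE_TIERS`, `tier_for`, and `order_for_review` are hypothetical. The project's full category-to-tier mapping is not given in this card.

```python
# Hypothetical lookup built only from the example clauses in the tier table;
# the full mapping used by the project is not shown in this card.
TIER_NAMES = {1: "Critical", 2: "Important", 3: "Moderate", 4: "Low", 5: "Trivial"}

EXAMPLE_CLAUSE_TIERS = {
    "Termination": 1, "Liability": 1, "Governing Law": 1,
    "Indemnification": 2, "Insurance": 2, "Non-compete": 2,
    "Warranty Duration": 3, "Renewal Terms": 3,
    "Notice Periods": 4, "Third-Party Beneficiary": 4,
    "Effective Date": 5, "Parties": 5, "Document Name": 5,
}


def tier_for(category):
    """Return (tier number, tier name), or None if the category is not in the examples."""
    tier = EXAMPLE_CLAUSE_TIERS.get(category)
    return None if tier is None else (tier, TIER_NAMES[tier])


def order_for_review(categories):
    """Sort categories so the most critical tiers come first; unknown ones go last."""
    return sorted(categories, key=lambda c: EXAMPLE_CLAUSE_TIERS.get(c, 6))


print(tier_for("Insurance"))
print(order_for_review(["Parties", "Renewal Terms", "Termination"]))
```

Sorting by tier number gives a reviewer the critical clauses first, which is one plausible way to use the tiers for prioritization.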
Base model: nlpaueb/legal-bert-base-uncased

| Metric | Score |
|---|---|
| Accuracy | ~93% |
| F1-Score | ~0.91 |
| Macro-F1 | ~0.88 |
The model demonstrated strong generalization across unseen clause structures and consistent labeling across the 41 categories.
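For readers who want to see how metrics like those in the table differ, here is a minimal pure-Python sketch of accuracy, macro-averaged F1, and support-weighted F1. The card does not say which averaging the "F1-Score" row uses, so treating it as weighted F1 is an assumption. The toy labels below are made up and are not the model's evaluation data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """F1 per label, computed as 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # true label t was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by how often each class occurs in y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy labels for illustration only
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred), weighted_f1(y_true, y_pred))
```

Because macro-F1 gives rare classes the same weight as common ones, it tends to be lower than accuracy on imbalanced label sets, which is consistent with the ordering of the scores reported above.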
Clause Classification
Summarization (Downstream)
GPT-4.1-mini for abstractive contract summaries.
pdfplumber and python-docx.
combined_clauses.csv.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("bhargav-07-bidkar/LegalBERT_Finetuned")
model = AutoModelForSequenceClassification.from_pretrained("bhargav-07-bidkar/LegalBERT_Finetuned")

text = "The Company shall indemnify and hold the Client harmless from any damages or liabilities."

# Tokenize; truncation guards against clauses longer than the model's input limit
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Forward pass without gradient tracking, since this is inference only
with torch.no_grad():
    outputs = model(**inputs)

# Index of the highest-scoring class
predicted_label = torch.argmax(outputs.logits, dim=1).item()
print(predicted_label)
```
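The snippet above prints only the index of the winning class. If a confidence score is also useful, the raw logits can be passed through a softmax. The following is a minimal pure-Python sketch. The helper names are hypothetical, and the example logits are made up, including the assumption of five output classes. With the real model, the input would be something like `outputs.logits[0].tolist()`.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits):
    """Return (index of the most probable class, its probability)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Made-up logits for illustration; not produced by the model
example_logits = [0.2, 2.9, -1.1, 0.4, -0.3]
label_id, confidence = top_prediction(example_logits)
print(label_id, round(confidence, 3))
```

A low top probability can be used to flag clauses for manual review rather than trusting the predicted label outright.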
Base model
nlpaueb/legal-bert-base-uncased