GLiNER ContractNER Multi - Fine-Grained Legal Entity Extraction

Model Name: gliner-contractner-multi-v2.1 (Agile Lab fine-tune)
Base Architecture: GLiNER Multi v2.1 (backbone: microsoft/mdeberta-v3-base)

Model Description

GLiNER ContractNER Multi is a multilingual span-based Named Entity Recognition (NER) model fine-tuned by Agile Lab on the ContractNER dataset. It is designed to extract fine-grained entities from legal contracts with high precision.

Built on the GLiNER Multi v2.1 architecture, this model achieves 80%+ F1 score on contract-specific entity extraction, significantly outperforming general-purpose LLMs and domain-specific legal models in our benchmarks.

Key Highlights

  • Contract-Specialized: Fine-tuned on 3,240+ annotated contract chunks from SEC EDGAR filings.
  • Granular Extraction: Capable of identifying 18 specific entity types including parties, dates, financial terms (salaries, shares), and regulatory references.
  • Open-Vocabulary NER: Supports promptable entity extraction: you can provide custom label names at inference time without retraining.
  • Multilingual Capability: Inherits multilingual behavior from GLiNER Multi v2.1 and mDeBERTa-v3-base, though optimized primarily for English contracts (performance may degrade on low-resource languages).
  • Production-Ready: A recommended threshold of 0.8–0.9 balances high precision with acceptable recall, minimizing costly false positives in legal review workflows.
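The recommended threshold can also be applied after inference by filtering predicted spans on their confidence score. A minimal sketch, using hand-made illustrative predictions in GLiNER's output format (the spans and scores below are invented for the example, not real model output):

```python
# Illustrative, hand-made predictions in GLiNER's output format:
# a list of dicts with "text", "label", and "score" keys.
predictions = [
    {"text": "Jane Doe", "label": "Parties", "score": 0.97},
    {"text": "January 1, 2026", "label": "EffectiveDate", "score": 0.91},
    {"text": "the Executive", "label": "Parties", "score": 0.42},  # likely noise
]

THRESHOLD = 0.85  # within the recommended 0.8-0.9 range for legal review

# Keep only high-confidence spans to minimize false positives.
high_precision = [e for e in predictions if e["score"] >= THRESHOLD]
for e in high_precision:
    print(f"{e['text']} => {e['label']} ({e['score']:.2f})")
```

Raising the threshold trades recall for precision; in legal review workflows, where a false positive triggers costly manual checks, the higher end of the range is usually preferable.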

🚀 How to Use

To use this model, you need to install the gliner library.

Installation

pip install gliner

Inference Code

from gliner import GLiNER

# Load the fine-tuned model from the Hugging Face Hub
model = GLiNER.from_pretrained("AgileLab/gliner-contractner-multi-v2.1")

# Example contract text
text = """
This EMPLOYMENT AGREEMENT is made effective as of January 1, 2026,
by and between Tech Solutions Inc. ("Company") and Jane Doe ("Executive").
The Executive shall serve as Chief Technology Officer.
The Company agrees to pay the Executive an annual base salary of $250,000.00.
"""

# Define the entities you want to extract (Open Vocabulary)
labels = [
    "Parties", "EffectiveDate", "Role", "Salary", "TerminationDate"
]

# Predict (0.5 shown for demonstration; 0.8-0.9 is recommended
# for high-precision legal review, as noted above)
entities = model.predict_entities(text, labels, threshold=0.5)

# Print results
for entity in entities:
    print(f"{entity['text']} => {entity['label']} (Score: {entity['score']:.2f})")

📊 Evaluation & Benchmarks

In our comprehensive evaluation on the validation set, this model achieved an overall F1 score of 80.0%, demonstrating a strong balance between precision and recall.

Performance vs. Other Models

  • GLiNER ContractNER (This Model): 80.0% F1
  • General Purpose LLMs (Qwen, Gemma): < 35% F1
  • Standalone DeBERTa Models: 46% – 78% F1
  • Legal-Specific Models (LegalBERT, ContractBERT): < 10% F1

Within these benchmarks, the GLiNER span+query architecture makes this model the strongest option we evaluated for contract entity extraction.

Detailed Metrics by Entity (Validation Split)

| Entity          | Precision | Recall | F1    | Support |
|-----------------|----------:|-------:|------:|--------:|
| Act             | 81.16     | 74.67  | 77.78 | 75      |
| Address         | 68.00     | 77.27  | 72.34 | 22      |
| Court           | 80.00     | 80.00  | 80.00 | 20      |
| EffectiveDate   | 62.50     | 96.15  | 75.76 | 26      |
| PII_Ref         | 77.27     | 100.00 | 87.18 | 17      |
| Parties         | 70.13     | 85.71  | 77.14 | 63      |
| Percentage      | 59.46     | 91.67  | 72.13 | 24      |
| Price           | 42.50     | 94.44  | 58.62 | 18      |
| Principal       | 36.25     | 90.62  | 51.79 | 32      |
| Ratio           | 29.79     | 73.68  | 42.42 | 19      |
| Regulation      | 66.67     | 88.37  | 76.00 | 43      |
| RenewalTerm     | 42.86     | 75.00  | 54.55 | 12      |
| Rent            | 30.00     | 75.00  | 42.86 | 8       |
| Role            | 90.32     | 80.00  | 84.85 | 35      |
| Salary          | 42.86     | 100.00 | 60.00 | 18      |
| Shares          | 40.48     | 89.47  | 55.74 | 19      |
| TerminationDate | 23.64     | 72.22  | 35.62 | 18      |
| Title           | 70.11     | 73.49  | 71.76 | 83      |

Supported Entity Schema

The model was trained on the ContractNER schema. While you can use custom labels, performance is best with categories semantically similar to:

Document Metadata

  • EffectiveDate: Contract start date (e.g., "January 1, 2026").
  • TerminationDate: Contract end or expiration date.
  • RenewalTerm: Renewal periods or conditions.
  • Title: Official document title.

Actors & Roles

  • Parties: Legal entities entering the agreement (companies, individuals).
  • Role: Professional titles and positions (e.g., "Chief Executive Officer").

Contact Information

  • Address: Physical addresses.
  • PII_Ref: Personal identifiable information references (phone, email, fax).

Financial Values

  • Salary: Compensation amounts (always with currency symbol, e.g., "$225,000.00").
  • Price: Goods/services prices.
  • Principal: Loan principal amounts.
  • Shares: Stock or equity quantities.
  • Percentage: Percentage values (e.g., "50%").
  • Ratio: Financial ratios.
  • Rent: Lease or rental amounts.

Legal and Regulatory

  • Court: Judicial bodies and tribunals (e.g., "State of Texas").
  • Act: Legislative acts and laws.
  • Regulation: Regulatory references (e.g., "Rule 10b5-1").
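Because GLiNER takes label names as plain strings at inference time, the full 18-label schema above can be passed directly to `predict_entities`. A sketch collecting the labels (strings taken verbatim from this card; the `model`/`text` usage in the comment mirrors the inference example earlier):

```python
# The 18 ContractNER labels, grouped as in the schema above.
CONTRACTNER_LABELS = [
    # Document metadata
    "EffectiveDate", "TerminationDate", "RenewalTerm", "Title",
    # Actors & roles
    "Parties", "Role",
    # Contact information
    "Address", "PII_Ref",
    # Financial values
    "Salary", "Price", "Principal", "Shares", "Percentage", "Ratio", "Rent",
    # Legal and regulatory
    "Court", "Act", "Regulation",
]

print(len(CONTRACTNER_LABELS))  # 18

# Usage with the model loaded earlier:
# entities = model.predict_entities(text, CONTRACTNER_LABELS, threshold=0.85)
```

Passing the full schema is useful for broad extraction passes; for targeted workflows (e.g., compensation review), a subset of labels reduces spurious matches.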

Training Details

Data Source & Preprocessing

  • Dataset: ContractNER corpus (Adibhatla et al., 2023) β€” Real contracts from SEC EDGAR (U.S. Securities and Exchange Commission filings).
  • Original Size: ~5,000 annotated contract segments.
  • Consolidated Dataset: ~3,240 chunks after stratified reduction and class consolidation.
  • Adjustments:
    • Removed the RevolvingCredit class (too rare and ambiguous).
    • Rebalanced the dataset to ensure minimum representation per class.
  • Split: 80% training / 20% validation (random split).
  • Methodology: Human-in-the-loop iterative labeling.
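The 80/20 random split can be reproduced with a simple seeded shuffle. A minimal sketch (the actual split code and seed were not published, so the function name, seed, and placeholder chunks below are illustrative):

```python
import random

def train_val_split(chunks, val_frac=0.2, seed=42):
    """Randomly split annotated chunks into train/validation sets."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    idx = list(range(len(chunks)))
    rng.shuffle(idx)
    n_val = int(len(idx) * val_frac)   # 20% held out for validation
    val = [chunks[i] for i in idx[:n_val]]
    train = [chunks[i] for i in idx[n_val:]]
    return train, val

# Placeholder stand-ins for the ~3,240 annotated chunks.
chunks = [{"id": i} for i in range(3240)]
train, val = train_val_split(chunks)
print(len(train), len(val))  # 2592 648
```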

Architecture & Configuration

  • Base Model: GLiNER Multi v2.1 (209M parameters).
  • Encoder Backbone: microsoft/mdeberta-v3-base (86M backbone + 190M embedding parameters).
  • Architecture Type: Span-based NER with entity-query matching.
  • Hardware: NVIDIA L4 GPU.
  • Training Time: ~30 minutes per fine-tuning run.

Visualizations

Loss Curves

[Figure: training loss curve]

Model Comparison

[Figure: model comparison chart]


License

Apache 2.0
