Amicus NER v2 - Nigerian Legal Named Entity Recognition

amicus-ner-v2 is a production-ready Named Entity Recognition model for Nigerian legal text. It is a LoRA fine-tuned version of WhiteRoomProdigy/amicus-ner-v1, which is based on nlpaueb/legal-bert-base-uncased.

The model identifies 8 legal entity types in Nigerian court judgments, briefs, and other legal documents.


Entity Labels

| Label | Description | Example |
|---|---|---|
| CASE_NAME | Party names in litigation | Amusa v. INEC |
| CITATION | Law report references (NWLR, LPELR, SCNJ, FWLR) | (2023) 14 NWLR (Pt.637) 70 |
| STATUTE | Legislation, sections, constitutional provisions | Section 137(1)(b) of CFRN 1999 |
| COURT | Nigerian courts and tribunals | Supreme Court of Nigeria |
| DATE | Judgment and filing dates | 15th March 2022 |
| JUDGE | Judicial officers with designations | Justice Bello JSC |
| RATIO | Ratio decidendi passages | - |
| HELD | Court holding / decision text | - |
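
The eight entity types above are tagged at token level with a BIO scheme, which gives the 17 labels listed under Model Details. The following is a minimal sketch of how such a label set can be built; the helper name `build_bio_labels` and the id ordering are assumptions for illustration, and the `id2label` mapping in the model's own configuration remains the authority.

```python
# Entity types copied from the table above.
ENTITY_TYPES = [
    "CASE_NAME", "CITATION", "STATUTE", "COURT",
    "DATE", "JUDGE", "RATIO", "HELD",
]


def build_bio_labels(entity_types):
    """Return ["O", "B-X", "I-X", ...] for the given entity types."""
    labels = ["O"]  # "O" marks tokens outside any entity
    for name in entity_types:
        labels.append(f"B-{name}")  # first token of an entity span
        labels.append(f"I-{name}")  # continuation token of the same span
    return labels


LABELS = build_bio_labels(ENTITY_TYPES)

# Assumed id order for illustration only; the published model may order labels differently.
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

print(len(LABELS))  # 17
```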

What's New in v2

| Improvement | v1 | v2 |
|---|---|---|
| Training method | Full fine-tune | LoRA (r=16, ~0.8% params trained) |
| Class imbalance | Untreated | Weighted CrossEntropy (O-weight = 0.05) |
| Training data | Base legal-bert weights | Distant supervision + 600 synthetic examples |
| Synthetic data | None | 600 Gemini-generated entity-rich sentences |
| Export | PyTorch only | PyTorch + ONNX INT8 quantized |
| Inference speed | Baseline | ~3-4x faster (ONNX INT8 on CPU) |

Model Details

| Property | Value |
|---|---|
| Architecture | BERT-base (nlpaueb/legal-bert-base-uncased) |
| Fine-tuning method | PEFT LoRA - rank 16, alpha 32 |
| Target modules | query, value (attention projection layers) |
| Training epochs | 8 |
| Batch size | 16 |
| Learning rate | 3e-4 |
| Loss function | Weighted CrossEntropyLoss (entity = 1.0, O = 0.05) |
| Dataset | Distant supervision from LawPavilion + Legalpedia + 600 synthetic examples |
| Labels | 17 (O + B/I for each of 8 entity types) |
| Max sequence length | 512 tokens |
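
To make the loss weighting concrete, here is a small pure-Python sketch of a class-weighted cross-entropy with weight 1.0 for entity labels and 0.05 for O. It is an illustration, not the training code: it assumes O has label id 0 and follows the weighted-mean reduction used by PyTorch's `CrossEntropyLoss` when a `weight` tensor is supplied. The toy logits are invented.

```python
import math


def weighted_cross_entropy(logits, targets, class_weights):
    """Weighted mean cross-entropy over tokens.

    The weighted per-token losses are summed and divided by the sum of
    the weights of the target classes.
    """
    total_loss = 0.0
    total_weight = 0.0
    for scores, target in zip(logits, targets):
        # Numerically stable log-sum-exp over the class scores.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        token_loss = log_norm - scores[target]  # negative log-probability of the target
        weight = class_weights[target]
        total_loss += weight * token_loss
        total_weight += weight
    return total_loss / total_weight


NUM_LABELS = 17
O_ID = 0  # assumption: "O" is label id 0

weights = [1.0] * NUM_LABELS
weights[O_ID] = 0.05  # down-weight the dominant "O" class

# Toy batch: both tokens are confidently predicted as "O".
# The first really is "O"; the second is an entity token (label id 1) that was missed.
logits = [[4.0] + [0.0] * 16, [4.0] + [0.0] * 16]
targets = [O_ID, 1]

unweighted = weighted_cross_entropy(logits, targets, [1.0] * NUM_LABELS)
weighted = weighted_cross_entropy(logits, targets, weights)
print(f"unweighted={unweighted:.3f} weighted={weighted:.3f}")
```

With the O class down-weighted, the missed entity token dominates the batch loss, which is the intended effect when most tokens in a judgment carry the O label.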

How to Use

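from transformers import pipeline

# Load the model as a token-classification pipeline.
# aggregation_strategy="simple" groups adjacent tokens with the same label into entity spans.
ner = pipeline(
    "token-classification",
    model="WhiteRoomProdigy/amicus-ner-v2",
    aggregation_strategy="simple",
)

text = "As held in Amusa v. INEC (2023) 14 NWLR (Pt.637) 70, the Supreme Court found no merit."
results = ner(text)

# Each result is a dict holding the entity type, a confidence score, and the matched text.
for entity in results:
    print(entity["entity_group"], "|", f"{entity['score']:.3f}", "|", entity["word"])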
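
Because the base model is uncased, the `word` field returned by the pipeline is typically lower-cased. One way to recover the original casing is to slice the input text with the `start` and `end` character offsets that aggregated results carry. The sketch below groups entities by type this way. The function name `group_entities`, the `min_score` threshold, and the sample results are hypothetical; the sample is hand-written in the pipeline's output shape and is not real model output.

```python
from collections import defaultdict


def group_entities(text, results, min_score=0.5):
    """Group aggregated pipeline results by entity type.

    Entity text is sliced from the original string so casing and
    punctuation are preserved. `min_score` is an illustrative cutoff,
    not a recommended value.
    """
    grouped = defaultdict(list)
    for entity in results:
        if entity["score"] >= min_score:
            span = text[entity["start"]:entity["end"]]
            grouped[entity["entity_group"]].append(span)
    return dict(grouped)


text = "As held in Amusa v. INEC (2023) 14 NWLR (Pt.637) 70, the Supreme Court found no merit."

# Hand-written sample in the pipeline's output shape; NOT real model output.
sample_results = [
    {"entity_group": "CASE_NAME", "score": 0.98, "word": "amusa v. inec", "start": 11, "end": 24},
    {"entity_group": "CITATION", "score": 0.95, "word": "(2023) 14 nwlr (pt.637) 70", "start": 25, "end": 51},
    {"entity_group": "COURT", "score": 0.41, "word": "supreme court", "start": 57, "end": 70},
]

grouped = group_entities(text, sample_results)
print(grouped)
# {'CASE_NAME': ['Amusa v. INEC'], 'CITATION': ['(2023) 14 NWLR (Pt.637) 70']}
```

The low-scoring COURT span is dropped by the threshold in this sample; passing `min_score=0.0` keeps every span.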

Training Data

Trained on a combination of:

  1. Distant supervision - judgments from the LawPavilion and Legalpedia Nigerian judgment databases, auto-annotated with a hand-crafted regex engine (NWLR/LPELR citation patterns, court name patterns, judge designation patterns)

  2. Synthetic augmentation - 600 Gemini-generated, entity-rich sentences covering all 8 entity types

All training data is derived from publicly available Nigerian court judgments.
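
The regex-based auto-annotation described above can be illustrated with a short sketch. The patterns below are hypothetical stand-ins written for this example (an NWLR-style citation and a few court names); they are not the project's actual regex engine, and whitespace tokenization is a simplifying assumption.

```python
import re

# Hypothetical patterns for illustration only; not the project's actual rules.
PATTERNS = {
    "CITATION": re.compile(
        r"\(\d{4}\)\s+\d+\s+(?:NWLR|SCNJ|FWLR)\s+\(Pt\.\s?\d+\)\s+\d+"
    ),
    "COURT": re.compile(
        r"Supreme Court(?: of Nigeria)?|Court of Appeal|Federal High Court"
    ),
}


def annotate(text):
    """Return (token, BIO tag) pairs from regex matches over whitespace tokens."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))

    tagged = []
    for token in re.finditer(r"\S+", text):
        tag = "O"
        for start, end, label in spans:
            # A token belongs to a span if it begins inside the matched range.
            if start <= token.start() < end:
                prefix = "B-" if token.start() == start else "I-"
                tag = prefix + label
                break
        tagged.append((token.group(), tag))
    return tagged


text = "As held in Amusa v. INEC (2023) 14 NWLR (Pt.637) 70, the Supreme Court found no merit."
tagged = annotate(text)
for token, tag in tagged:
    if tag != "O":
        print(token, tag)
```

Whitespace tokens keep trailing punctuation (the final citation token here is `70,`), so a real annotator would normally align regex spans to the model tokenizer's offsets instead.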


Citation

@misc{amicus-ner-v2,
  title        = {amicus-ner-v2: Nigerian Legal Named Entity Recognition},
  author       = {WhiteRoomProdigy},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/WhiteRoomProdigy/amicus-ner-v2}},
  note         = {LoRA fine-tune of amicus-ner-v1 for Nigerian legal NER}
}

License

Apache 2.0. Built by the Dockase team for the Nigerian legal technology ecosystem.
