---
license: apache-2.0
language:
- en
tags:
- token-classification
- ner
- legal
- legal-bert
- nigerian-law
- lora
- peft
- onnx
base_model: WhiteRoomProdigy/amicus-ner-v1
pipeline_tag: token-classification
library_name: transformers
metrics:
- precision
- recall
- f1
---

# Amicus NER v2 - Nigerian Legal Named Entity Recognition

**amicus-ner-v2** is a production-ready Named Entity Recognition model for **Nigerian legal text**.
It is a LoRA fine-tuned version of [WhiteRoomProdigy/amicus-ner-v1](https://huggingface.co/WhiteRoomProdigy/amicus-ner-v1),
which is based on `nlpaueb/legal-bert-base-uncased`.

This model identifies **8 legal entity types** in Nigerian court judgments, briefs, and legal documents.

---

## Entity Labels

| Label | Description | Example |
|---|---|---|
| `CASE_NAME` | Party names in litigation | *Amusa v. INEC* |
| `CITATION` | Law report references (NWLR, LPELR, SCNJ, FWLR) | *(2023) 14 NWLR (Pt.637) 70* |
| `STATUTE` | Legislation, sections, constitutional provisions | *Section 137(1)(b) of CFRN 1999* |
| `COURT` | Nigerian courts and tribunals | *Supreme Court of Nigeria* |
| `DATE` | Judgment and filing dates | *15th March 2022* |
| `JUDGE` | Judicial officers with designations | *Justice Bello JSC* |
| `RATIO` | Ratio decidendi passages | - |
| `HELD` | Court holding / decision text | - |
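
The eight entity types above expand to the 17 labels listed under Model Details once each type gets a `B-` (begin) and `I-` (inside) tag alongside `O`. A minimal sketch of that expansion follows. The label order and the names `build_bio_labels`, `ID2LABEL`, and `LABEL2ID` are illustrative assumptions; the mapping the model actually uses is the one stored in its configuration.

```python
# The eight entity types listed in the table above.
ENTITY_TYPES = [
    "CASE_NAME", "CITATION", "STATUTE", "COURT",
    "DATE", "JUDGE", "RATIO", "HELD",
]


def build_bio_labels(entity_types):
    """Return ["O", "B-X", "I-X", ...] for the given entity types."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


LABELS = build_bio_labels(ENTITY_TYPES)

# Assumption: this ordering is illustrative only, not the model's real mapping.
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

print(len(LABELS))
```

When reading raw model logits rather than pipeline output, use the `id2label` mapping shipped with the model instead of a hand-built one like this.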

---

## What's New in v2

| Improvement | v1 | v2 |
|---|---|---|
| Training method | Full fine-tune | LoRA (r=16, ~0.8% params trained) |
| Class imbalance | Untreated | Weighted CrossEntropy (O-weight = 0.05) |
| Training data | Base legal-bert weights | Distant supervision + 600 synthetic examples |
| Synthetic data | None | 600 Gemini-generated entity-rich sentences |
| Export | PyTorch only | PyTorch + ONNX INT8 quantized |
| Inference speed | Baseline | ~3-4x faster (ONNX INT8 on CPU) |
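
To illustrate the class-imbalance row, here is a small pure-Python sketch of a weighted cross-entropy in which the `O` class is down-weighted to 0.05 and entity classes keep weight 1.0. It is not the training code for this model. The three-class toy setup, the function name `weighted_cross_entropy`, and the example logits are all assumptions made for illustration. The normalisation (dividing by the summed weights of the gold classes) mirrors what PyTorch's `torch.nn.CrossEntropyLoss` does with a `weight` argument and mean reduction.

```python
import math


def weighted_cross_entropy(logits, targets, class_weights):
    """Weighted mean of per-token cross-entropy.

    logits: list of per-token score lists; targets: list of gold class ids.
    Each token's loss is scaled by the weight of its gold class, and the
    total is divided by the sum of those weights.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for scores, target in zip(logits, targets):
        # Numerically stable log-sum-exp over the class scores.
        max_score = max(scores)
        log_norm = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
        token_loss = log_norm - scores[target]
        weighted_sum += class_weights[target] * token_loss
        weight_total += class_weights[target]
    return weighted_sum / weight_total


# Hypothetical 3-class toy problem: class 0 plays the role of "O".
class_weights = [0.05, 1.0, 1.0]
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # two tokens
targets = [0, 1]                             # first is "O", second an entity

loss = weighted_cross_entropy(logits, targets, class_weights)
print(loss)
```

Because `O` tokens vastly outnumber entity tokens in legal text, the low weight keeps the mostly correct `O` predictions from dominating the average, so the entity tokens contribute most of the loss.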

---

## Model Details

| Property | Value |
|---|---|
| **Architecture** | BERT-base (nlpaueb/legal-bert-base-uncased) |
| **Fine-tuning method** | PEFT LoRA - rank 16, alpha 32 |
| **Target modules** | `query`, `value` (attention projection layers) |
| **Training epochs** | 8 |
| **Batch size** | 16 |
| **Learning rate** | 3e-4 |
| **Loss function** | Weighted CrossEntropyLoss (entity = 1.0, O = 0.05) |
| **Dataset** | Distant supervision from LawPavilion + Legalpedia + 600 synthetic examples |
| **Labels** | 17 (O + B/I for each of 8 entity types) |
| **Max sequence length** | 512 tokens |
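
As a rough illustration of what rank 16 and alpha 32 imply, the sketch below works out the LoRA scaling factor and the size of the adapter matrices. It assumes standard BERT-base dimensions (hidden size 768, 12 encoder layers) and the usual LoRA scaling of alpha divided by rank. It counts only the adapter matrices on `query` and `value`, not the classification head or any other weights that may have been trained, so it is not a reproduction of the trainable-parameter figure quoted in this card.

```python
# Values from the table above.
RANK = 16
ALPHA = 32
TARGET_MODULES = ["query", "value"]

# Assumption: standard BERT-base dimensions.
HIDDEN_SIZE = 768
NUM_LAYERS = 12

# In standard LoRA the low-rank update B @ A is multiplied by alpha / rank.
scaling = ALPHA / RANK

# Each adapted projection gets A (rank x hidden) and B (hidden x rank).
params_per_module = RANK * HIDDEN_SIZE + HIDDEN_SIZE * RANK
adapter_params = params_per_module * len(TARGET_MODULES) * NUM_LAYERS

print(f"scaling = {scaling}")
print(f"parameters per adapted projection = {params_per_module:,}")
print(f"adapter parameters across all layers = {adapter_params:,}")
```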

---

## How to Use

```python
from transformers import pipeline

# aggregation_strategy="simple" merges word pieces into whole entity spans.
ner = pipeline(
    "token-classification",
    model="WhiteRoomProdigy/amicus-ner-v2",
    aggregation_strategy="simple",
)

text = "As held in Amusa v. INEC (2023) 14 NWLR (Pt.637) 70, the Supreme Court found no merit."
results = ner(text)

# Each result is a dict with the entity type, confidence score, and matched text.
for entity in results:
    print(entity["entity_group"], "|", entity["score"], "|", entity["word"])
```
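
The pipeline returns a flat list of dictionaries, so a common next step is to group entities by type and drop low-confidence ones. The sketch below shows one way to do that. The helper `group_entities`, the 0.5 threshold, and the sample results (including their scores) are hypothetical, made up to match the shape of the aggregated pipeline output rather than taken from a real run of this model.

```python
from collections import defaultdict


def group_entities(results, min_score=0.5):
    """Group aggregated pipeline output by entity type, dropping low scores."""
    grouped = defaultdict(list)
    for entity in results:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Hypothetical output shaped like the pipeline's aggregated results.
# Scores are invented for illustration only.
sample_results = [
    {"entity_group": "CASE_NAME", "score": 0.98, "word": "amusa v. inec", "start": 11, "end": 24},
    {"entity_group": "CITATION", "score": 0.95, "word": "(2023) 14 nwlr (pt.637) 70", "start": 25, "end": 51},
    {"entity_group": "COURT", "score": 0.42, "word": "supreme court", "start": 57, "end": 70},
]

print(group_entities(sample_results))
```

Because the base model is uncased, the `word` field is typically lowercased. To recover the original casing, slice the input with the `start` and `end` offsets, for example `text[entity["start"]:entity["end"]]`.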

---

## Training Data

Trained on a combination of:

1. **Distant supervision** from LawPavilion and Legalpedia Nigerian judgment databases,
   auto-annotated using a hand-crafted regex engine (NWLR/LPELR citation patterns,
   court name patterns, judge designation patterns)
2. **Synthetic augmentation** - 600 Gemini-generated, entity-rich sentences covering all 8 entity types

All training data is derived from publicly available Nigerian court judgments.
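
The distant-supervision step can be pictured with a small sketch: a regular expression finds citation spans, and the tokens inside each span receive `B-`/`I-` tags. The pattern and the `bio_tag_citations` helper below are hypothetical and cover only one NWLR citation shape. They are not the regex engine used to build this dataset.

```python
import re

# Hypothetical pattern for citations such as "(2023) 14 NWLR (Pt.637) 70".
# This is NOT the authors' regex engine, only an illustration of the idea.
NWLR_CITATION = re.compile(r"\(\d{4}\)\s+\d+\s+NWLR\s+\(Pt\.\s*\d+\)\s+\d+")


def bio_tag_citations(text):
    """Whitespace-tokenize text and tag tokens that start inside a citation."""
    spans = [match.span() for match in NWLR_CITATION.finditer(text)]
    tagged = []
    for token_match in re.finditer(r"\S+", text):
        start = token_match.start()
        label = "O"
        for span_start, span_end in spans:
            if span_start <= start < span_end:
                label = "B-CITATION" if start == span_start else "I-CITATION"
                break
        tagged.append((token_match.group(), label))
    return tagged


sentence = "As held in Amusa v. INEC (2023) 14 NWLR (Pt.637) 70, the Supreme Court found no merit."
for token, label in bio_tag_citations(sentence):
    print(f"{token}\t{label}")
```

Whitespace tokenization leaves trailing punctuation attached to the final token (`70,` here). A real annotation pipeline would need to align regex spans to the model tokenizer's offsets instead.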

---

## Citation

```bibtex
@misc{amicus-ner-v2,
  title = {amicus-ner-v2: Nigerian Legal Named Entity Recognition},
  author = {WhiteRoomProdigy},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/WhiteRoomProdigy/amicus-ner-v2}},
  note = {LoRA fine-tune of amicus-ner-v1 for Nigerian legal NER}
}
```

---

## License

Apache 2.0. Built by the [Dockase](https://dockase.com) team for the Nigerian legal technology ecosystem.