LinguoNER: Yambeta Named Entity Recognition Model
Model Description
This repository contains LinguoNER, a Transformer-based Named Entity Recognition (NER) model fine-tuned for Yambeta (yat), a low-resource Bantu language spoken in Cameroon.
The model is trained on a silver-standard, automatically annotated corpus derived from the Yambeta New Testament. Both the annotations and the model output were validated through expert-in-the-loop review of sampled subsets.
- Task: Named Entity Recognition (NER)
- Language: Yambeta (yat)
- Entity Types: PER, LOC, ORG
- Model Architecture: Token classification head on top of a Transformer encoder
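The entity types listed above determine the size of the token-classification head. The following is a minimal sketch, assuming the BIO scheme named in the Training Data section is applied to all three types. The label ordering is an assumption for illustration and may not match the `id2label` mapping stored in the model's configuration.

```python
# Assumption: the tag set is the BIO expansion of the three listed entity
# types. The ordering here is illustrative, not the model's actual id2label.
ENTITY_TYPES = ["PER", "LOC", "ORG"]


def build_bio_labels(entity_types):
    """Return ["O", "B-X", "I-X", ...] for the given entity types."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


labels = build_bio_labels(ENTITY_TYPES)
id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}

print(labels)
print(len(labels))  # number of output classes under this assumption
```

Under this assumption the head has seven output classes: one `O` tag plus a `B-`/`I-` pair per entity type.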
Base Model
- Base checkpoint: bert-base-cased
- Tokenizer: DS4H-ICTU/yat-bert-tokenizer (WordPiece, Yambeta-specific)
The token-classification head was randomly initialized and fine-tuned jointly with the encoder, following standard practice for NER adaptation.
Training Data
- Dataset: DS4H-ICTU/yat-ner-dataset
- Annotation type: Silver (dictionary-based BIO tagging)
- Gold validation: 500 sentences (annotation logs) + 200 sentences (NER output), reviewed by a domain expert
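Dictionary-based BIO tagging can be sketched as a greedy longest-match lookup against a gazetteer. This is an illustration of the general technique, not the project's actual annotation script. The `GAZETTEER` entries and the `bio_tag` function are hypothetical, and the names are English placeholders, not Yambeta forms from the dataset.

```python
# Hypothetical gazetteer mapping token tuples to entity types.
# Entries are English placeholders, not taken from the actual dataset.
GAZETTEER = {
    ("Jerusalem",): "LOC",
    ("Simon", "Peter"): "PER",
    ("Sanhedrin",): "ORG",
}


def bio_tag(tokens, gazetteer):
    """Tag tokens with BIO labels using greedy longest-match lookup."""
    max_len = max((len(key) for key in gazetteer), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-token names win.
        for span_len in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + span_len])
            entity_type = gazetteer.get(candidate)
            if entity_type is not None:
                tags[i] = f"B-{entity_type}"
                for j in range(i + 1, i + span_len):
                    tags[j] = f"I-{entity_type}"
                i += span_len
                matched = True
                break
        if not matched:
            i += 1
    return tags


sentence = "Simon Peter went up to Jerusalem".split()
tags = bio_tag(sentence, GAZETTEER)
for token, tag in zip(sentence, tags):
    print(f"{token}\t{tag}")
```

Because the lookup is exact-match, any name missing from the dictionary or spelled differently is tagged `O`, which is one way the label bias noted under Limitations can arise.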
Dataset Split
- Train: 6317 sentences
- Validation: 790 sentences
- Test: 790 sentences
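As a quick sanity check on the split sizes, the short snippet below sums the counts and prints each split's share of the corpus. The sentence counts come from the list above; the variable names are illustrative.

```python
# Sentence counts as reported in the Dataset Split list.
splits = {"train": 6317, "validation": 790, "test": 790}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

print(f"total: {total} sentences")
for name, count in splits.items():
    print(f"{name}: {count} sentences ({fractions[name]:.1%})")
```

The counts sum to 7,897 sentences, which corresponds to an approximately 80/10/10 split.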
Evaluation Results (Test Set)
- precision: 0.9885
- recall: 0.9810
- f1: 0.9847
- accuracy: 0.9997
All metrics are computed at the token level using the standard definitions of precision, recall, and F1.
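The sketch below shows one common way to compute token-level metrics, counting only non-`O` tags as positives and micro-averaging over all tokens. The exact evaluation script used for this model is not stated here, so this definition is an assumption, and `token_level_prf` is a hypothetical helper. The last lines check that the reported F1 is the harmonic mean of the reported precision and recall.

```python
def token_level_prf(gold, pred, outside="O"):
    """Micro-averaged token-level precision, recall, F1, and accuracy.

    Assumption: a token is a true positive when the predicted tag is
    not `outside` and exactly equals the gold tag.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    true_pos = sum(1 for g, p in zip(gold, pred) if p != outside and p == g)
    pred_pos = sum(1 for p in pred if p != outside)
    gold_pos = sum(1 for g in gold if g != outside)
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    accuracy = correct / len(gold) if gold else 0.0
    return precision, recall, f1, accuracy


# Toy example with made-up tags, for illustration only.
gold = ["B-PER", "I-PER", "O", "O", "B-LOC", "O"]
pred = ["B-PER", "O", "O", "B-LOC", "B-LOC", "O"]
precision, recall, f1, accuracy = token_level_prf(gold, pred)
print(precision, recall, f1, accuracy)

# Consistency check on the reported test-set figures.
reported_p, reported_r = 0.9885, 0.9810
reported_f1 = 2 * reported_p * reported_r / (reported_p + reported_r)
print(round(reported_f1, 4))
```

Token-level accuracy also counts correctly predicted `O` tokens, so it is usually higher than F1 on corpora where most tokens are not part of an entity.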
Intended Use
- Proof-of-concept NER for extremely low-resource African languages
- Baseline for further expert annotation and domain expansion
- Demonstration of Hugging Face workflows under data scarcity
Limitations
- The corpus is Bible-derived, with repetitive narrative structure and limited entity inventory.
- Results should be interpreted as restricted-domain proof-of-concept performance.
- Dictionary-driven annotation may introduce label bias; expert validation mitigates but does not eliminate this.
Citation
If you use this model, please cite:
@misc{linguoner_yambeta_ner,
title = {LinguoNER: Yambeta Named Entity Recognition Model},
author = {DS4H-ICTU Research Group},
year = {2026},
publisher = {Hugging Face},
howpublished = {Model},
url = {https://huggingface.co/DS4H-ICTU/yat-linguoner-ner-model}
}