A Named Entity Recognition (NER) model for the Wolof language, fine-tuned from urchade/gliner_multi_pii-v1 on the MasakhaNER dataset.
This model can identify the following entity types in Wolof text: PER (person), ORG (organization), LOC (location), and DATE (date).
```python
from gliner import GLiNER

# Load the model
model = GLiNER.from_pretrained("Lahad/gliner_wolof_NER")

# Define entity types
labels = ["PER", "ORG", "LOC", "DATE"]

# Predict entities
text = "Ousmane Sonko jàngae na ci Daaray Cheikh Anta Diop ci Dakar."
entities = model.predict_entities(text, labels, threshold=0.5)

for entity in entities:
    print(f"{entity['text']} => {entity['label']} (score: {entity['score']:.2f})")
```

Output:

```
Ousmane Sonko => PER (score: 0.95)
Daaray Cheikh Anta Diop => ORG (score: 0.89)
Dakar => LOC (score: 0.97)
```
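The list returned by `predict_entities` can be post-processed with plain Python. The sketch below groups predictions by label. `group_by_label` and its `min_score` parameter are hypothetical helpers, not part of GLiNER, and the sample list simply mirrors the example output above rather than a fresh model run. Only the `text`, `label`, and `score` keys used in the printing loop are assumed.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Group entity texts by label, keeping entities with score >= min_score.

    Hypothetical helper for illustration; not part of the GLiNER API.
    """
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Sample predictions copied from the example output (not a fresh model run).
sample_entities = [
    {"text": "Ousmane Sonko", "label": "PER", "score": 0.95},
    {"text": "Daaray Cheikh Anta Diop", "label": "ORG", "score": 0.89},
    {"text": "Dakar", "label": "LOC", "score": 0.97},
]

# Keep only predictions scoring at least 0.9, then group them by label.
grouped = group_by_label(sample_entities, min_score=0.9)
print(grouped)
```

With `min_score=0.9`, the ORG prediction (score 0.89) is dropped, so only the PER and LOC entries remain.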
This project uses the MasakhaNER dataset, which provides high-quality NER annotations for 10 African languages including Wolof (wol).
Dataset Split:
Entity Types: PER, ORG, LOC, DATE
Evaluation on the test set:
| Entity Type | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| DATE | 30.77% | 22.86% | 26.23% | 70 |
| LOC | 76.75% | 84.95% | 80.65% | 206 |
| ORG | 41.89% | 56.36% | 48.06% | 55 |
| PER | 53.02% | 70.69% | 60.59% | 174 |
| GLOBAL | 58.87% | 68.32% | 63.24% | 505 |
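As a sanity check on how the rows relate, the sketch below recomputes F1 from precision and recall and rebuilds approximate entity counts from the rounded percentages. The model card does not say how the GLOBAL row is averaged, so treating it as a micro-average is an assumption, and the recovered counts are inferred from the table, not taken from the evaluation logs.

```python
# (precision %, recall %, support) per entity type, copied from the table.
REPORTED = {
    "DATE": (30.77, 22.86, 70),
    "LOC": (76.75, 84.95, 206),
    "ORG": (41.89, 56.36, 55),
    "PER": (53.02, 70.69, 174),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (same unit in, same unit out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


tp_total = pred_total = gold_total = 0
for label, (p, r, support) in REPORTED.items():
    # Assumption: recover integer counts from the rounded percentages.
    tp = round(r / 100 * support)      # true positives = recall * gold count
    predicted = round(tp / (p / 100))  # predictions = true positives / precision
    tp_total += tp
    pred_total += predicted
    gold_total += support
    print(f"{label}: F1 recomputed from rounded P/R = {f1(p, r):.2f}%")

# Micro-average: pool the counts across entity types before dividing.
micro_p = 100 * tp_total / pred_total
micro_r = 100 * tp_total / gold_total
micro_f1 = f1(micro_p, micro_r)
print(f"micro: P={micro_p:.2f}% R={micro_r:.2f}% F1={micro_f1:.2f}%")
```

Under these assumptions the pooled counts reproduce the GLOBAL row (58.87% / 68.32% / 63.24%), which suggests it is a micro-average. Per-type F1 values recomputed from the rounded percentages can differ from the table in the last digit.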
The model was fine-tuned on a relatively small dataset (the Wolof portion of MasakhaNER). Current performance reflects this constraint, particularly for the DATE and ORG entity types, which have fewer training examples and the lowest F1 scores (26.23% and 48.06%).
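One possible way to work around the weaker entity types at inference time, not something the model card prescribes, is to apply stricter score cutoffs to the low-precision labels after calling `predict_entities`. In the sketch below, `filter_by_label_threshold`, the cutoff values, and the sample predictions are all hypothetical; real cutoffs would need tuning on held-out data.

```python
# Hypothetical per-label cutoffs for the low-precision types; tune on held-out data.
LABEL_THRESHOLDS = {"DATE": 0.8, "ORG": 0.7}
DEFAULT_THRESHOLD = 0.5  # same value as the threshold in the usage example


def filter_by_label_threshold(entities, thresholds=None, default=DEFAULT_THRESHOLD):
    """Keep entities whose score meets the cutoff for their label.

    Hypothetical helper for illustration; not part of the GLiNER API.
    """
    if thresholds is None:
        thresholds = LABEL_THRESHOLDS
    return [e for e in entities if e["score"] >= thresholds.get(e["label"], default)]


# Made-up predictions in the same shape as the usage example's output.
sample_predictions = [
    {"text": "Dakar", "label": "LOC", "score": 0.97},
    {"text": "2021", "label": "DATE", "score": 0.62},
    {"text": "Daaray Cheikh Anta Diop", "label": "ORG", "score": 0.89},
]

kept = filter_by_label_threshold(sample_predictions)
for entity in kept:
    print(f"{entity['text']} => {entity['label']} (score: {entity['score']:.2f})")
```

Raising a cutoff trades recall for precision, and DATE recall is already low (22.86%), so this kind of filtering only makes sense when false positives are more costly than missed entities.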
Future Improvements:
With more annotated data, we expect to significantly improve the model's performance.
License: MIT

Base model: urchade/gliner_multi_pii-v1