DiacNetYorDB is a lightweight dot-below diacritics restorer for Yoruba (yo) text. It restores only dot-below marks (ọ, ẹ, ṣ) using a character-level k-NN backoff classifier.
File: yoruba_diacritizer_dot_below.json (language: yo)
The model is loaded and used via the unified olaverse SDK wrapper:
from olaverse.nlp.diacritizer import Diacritizer
diacritizer = Diacritizer(model="diacnet-yor-db")
text = "Ojo lo si oja lana"
print(diacritizer.restore(text))
# Output: "Ọjọ lo si ọja lana"
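
To make the idea of a character-level backoff classifier more concrete, here is a minimal, self-contained sketch. It is not the DiacNet implementation and does not read the model's JSON file. It assumes one plausible reading of "k-NN backoff": each candidate letter (o, e, s) is looked up by its surrounding character window, starting with the widest window and backing off to narrower ones until a context seen in training is found. The class `BackoffRestorer`, its `max_width` parameter, and the helper functions are hypothetical names used only for illustration.

```python
import unicodedata
from collections import Counter, defaultdict

# Dot-below letters mapped to their undotted base letters (keys forced to NFC).
DOTTED_TO_PLAIN = {
    unicodedata.normalize("NFC", dotted): plain
    for dotted, plain in [("ọ", "o"), ("ẹ", "e"), ("ṣ", "s"),
                          ("Ọ", "O"), ("Ẹ", "E"), ("Ṣ", "S")]
}
PLAIN_TO_DOTTED = {plain: dotted for dotted, plain in DOTTED_TO_PLAIN.items()}


def strip_dots(text):
    """NFC-normalize, then replace each dot-below letter with its base letter."""
    text = unicodedata.normalize("NFC", text)
    return "".join(DOTTED_TO_PLAIN.get(ch, ch) for ch in text)


def context(plain, i, width):
    """Return the target character plus `width` characters on each side."""
    padded = "#" * width + plain + "#" * width
    return padded[i : i + 2 * width + 1]


class BackoffRestorer:
    """Hypothetical memory-based restorer: widest matching context wins."""

    def __init__(self, max_width=3):
        self.max_width = max_width
        # tables[w][context] counts how often the centre letter was dotted.
        self.tables = {w: defaultdict(Counter) for w in range(max_width + 1)}

    def fit(self, sentences):
        for sentence in sentences:
            gold = unicodedata.normalize("NFC", sentence)
            plain = strip_dots(gold)  # same length as gold
            for i, ch in enumerate(plain):
                if ch in PLAIN_TO_DOTTED:
                    is_dotted = gold[i] != ch
                    for w in range(self.max_width + 1):
                        self.tables[w][context(plain, i, w)][is_dotted] += 1
        return self

    def _is_dotted(self, plain, i):
        # Back off from the widest window to the bare letter (width 0).
        for w in range(self.max_width, -1, -1):
            counts = self.tables[w].get(context(plain, i, w))
            if counts:
                return counts[True] > counts[False]
        return False  # letter never seen in training: leave it undotted

    def restore(self, text):
        plain = strip_dots(text)
        return "".join(
            PLAIN_TO_DOTTED[ch]
            if ch in PLAIN_TO_DOTTED and self._is_dotted(plain, i)
            else ch
            for i, ch in enumerate(plain)
        )


if __name__ == "__main__":
    # Toy training data: the single example sentence from this card.
    restorer = BackoffRestorer().fit(["Ọjọ lo si ọja lana"])
    print(restorer.restore("Ojo lo si oja lana"))
```

With a single training sentence the sketch only memorizes that sentence. A real model would be fitted on a large diacritized corpus, and the window width and tie-breaking rule would need tuning.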