How to use olaverse/diacnet-yor-x with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("olaverse/diacnet-yor-x", dtype="auto")

DiacNetYorX is a state-of-the-art transformer-based sequence classifier fine-tuned on top of castorini/afriberta_large for Yoruba tonal diacritization.
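Instead of classifying over the global vocabulary, it predicts, for each plain (undiacritized) word, the index (0 to 7) of the correct entry in that word's candidate list. Restricting the output to at most eight candidates per word narrows the search space, helps limit overfitting, and handles rare tokens gracefully.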
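The candidate-index idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `CANDIDATES` map, its entries, the `pick_candidate` helper, and the score values are hypothetical and do not reflect the actual contents of the vocabulary file or the model's internals. The sketch only shows how an 8-way score vector can be restricted to the candidates that exist for a given word.

```python
# Hypothetical candidate map: plain word -> list of diacritized forms.
# The real mapping ships with the model; these entries are illustrative only.
CANDIDATES = {
    "oja": ["ọjà", "ọ̀já"],
    "ojo": ["ọjọ́", "òjò", "ojo"],
}
MAX_CANDIDATES = 8  # the classifier predicts an index in the range 0..7


def pick_candidate(plain_word, scores, candidates=CANDIDATES):
    """Return the candidate whose index has the highest score.

    `scores` stands in for the classifier's per-index outputs (assumed shape:
    up to MAX_CANDIDATES values). Indices beyond the word's candidate list
    are ignored, and words without candidates are returned unchanged.
    """
    options = candidates.get(plain_word.lower())
    if not options:
        return plain_word
    valid = scores[:min(len(options), MAX_CANDIDATES)]
    if not valid:
        return plain_word
    best = max(range(len(valid)), key=valid.__getitem__)
    return options[best]


# Index 3 has the highest raw score, but "ojo" has only three candidates,
# so the choice is made among indices 0..2.
example_scores = [0.1, 2.0, 0.3, 9.0, 0.0, 0.0, 0.0, 0.0]
print(pick_candidate("ojo", example_scores))
```

Because the output layer has a fixed, small size, a word never seen during training can still be handled as long as it has an entry in the candidate map; a word with no entry passes through unchanged in this sketch.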
Base model: castorini/afriberta_large (125M parameters)
Weights file: diacnet_yor_x.pt

The model is loaded and used via the unified olaverse SDK wrapper, which automatically downloads the weights and loads the Transformer model in the background:
from olaverse.nlp.diacritizer import Diacritizer
diacritizer = Diacritizer(model="diacnet-yor-x")
text = "Ojo lo si oja lana"
print(diacritizer.restore(text))
# Output: "Ọjọ́ ló sí ọjà lànà"

diacnet_yor_x.pt: PyTorch model weights.
diacnet_yor_x_vocab.json: The word candidate list mapping.