Oolel-lit-gemma
Oolel-lit-gemma is a fine-tuned version of Gemma-3-270m-it for the Wolof language. It is part of our Oolel family of compact, on-device Wolof language models.
The model was trained using supervised fine-tuning (SFT) on synthetic data distilled from our larger Oolel-7B models via Oolel-translator.
Usage
Quick start with pipeline
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="soynade-research/oolel-lit-gemma",
    device="cuda",
)
messages = [{"role": "user", "content": "Translate to Wolof: The president is 45 years old."}]
output = generator(messages, max_new_tokens=256, return_full_text=False)
# The pipeline returns a list with one result dict per generated sequence.
print(output[0]["generated_text"])
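The quick-start call hard-codes `device="cuda"`, which typically fails on machines without a CUDA GPU. A minimal sketch of a fallback, assuming PyTorch is the backend; `pick_device` is a hypothetical helper and not part of transformers:

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to query.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The result can be passed as `device=pick_device()` in the `pipeline(...)` call.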
With AutoModel for more control
from transformers import AutoTokenizer, Gemma3ForCausalLM
import torch

model_id = "soynade-research/oolel-lit-gemma"
model = Gemma3ForCausalLM.from_pretrained(
    model_id
).eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)

# A batch containing one conversation: a system prompt and a user turn.
messages = [
    [
        {
            "role": "system",
            "content": [{"type": "text", "text": "You're a Wolof AI assistant. Please always provide detailed and useful answers to the user queries."},]
        },
        {
            "role": "user",
            "content": [{"type": "text", "text": "Translate to Wolof: The president is 45 years old."},]
        },
    ],
]

# Token IDs must stay integers, so the inputs are only moved to the model's
# device; they are not cast to bfloat16.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

# The decoded strings contain the prompt followed by the generated reply.
outputs = tokenizer.batch_decode(outputs)
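`model.generate` returns each prompt followed by the newly generated tokens, so decoding the full output repeats the prompt. A minimal sketch of trimming the prompt before decoding, shown on plain Python lists so it runs without a model; `strip_prompt` is a hypothetical helper and the token IDs are made up for illustration:

```python
from typing import List, Sequence


def strip_prompt(sequences: Sequence[Sequence[int]], prompt_length: int) -> List[List[int]]:
    """Drop the first `prompt_length` token IDs from every generated sequence."""
    return [list(seq[prompt_length:]) for seq in sequences]


# Made-up IDs: a 3-token prompt followed by 2 generated tokens.
generated = [[2, 105, 2364, 9871, 1]]
new_tokens = strip_prompt(generated, prompt_length=3)
print(new_tokens)  # [[9871, 1]]
```

With the tensors from the example above, the equivalent slice is `outputs[:, inputs["input_ids"].shape[-1]:]`, applied to the output of `model.generate` before it is passed to `tokenizer.batch_decode`.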
Training
The training code and configuration are available at soynade-research/oolel-trainer.
Limitations
- Primarily optimized for Wolof; performance on other languages may vary
- As a 270M parameter model, it may struggle with complex tasks
- Outputs should be verified by a native Wolof speaker for critical applications