YORI-LLaMA-Quantized

YORI-LLaMA-Quantized is a Yoruba large language model specialized in text generation and AI-based assistance in Yoruba. It was quantized from its base model, Jacaranda/yorubaLLaMA, to achieve lighter memory usage and faster inference while maintaining strong linguistic performance.


💬 Description

YORI-LLaMA-Quantized is trained to generate and understand natural Yoruba text with contextual fluency and syntactic awareness. It can be used for:

  • Yoruba AI assistants
  • Chatbot fine-tuning
  • Text completion, storytelling, and creative generation (see the sketch after this list)
  • Language preservation and computational linguistics research
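
For example, text completion can be driven through the transformers pipeline API. This is a minimal sketch rather than an official recipe: it assumes the quantized checkpoint loads directly with Hugging Face transformers, and the model id simply mirrors this repository; the prompt is illustrative.

```python
# Minimal text-completion sketch using the Hugging Face pipeline API.
# Assumptions: the quantized checkpoint loads via transformers, and the
# model id below (taken from this repository) resolves correctly.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Lasisi/YORI-Llama-Quantized",
    torch_dtype=torch.float16,  # FP16, per the Inference Notice below
    device_map="auto",
)

# Yoruba prompt: "Once upon a time, in the town of Oyo..."
result = generator("Ní ìgbà kan, ní ìlú Ọ̀yọ́...", max_new_tokens=64)
print(result[0]["generated_text"])
```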


⚠️ Limitations

  • Code-switching weakness: The model struggles when Yoruba and English are mixed, leading to misinterpretations or incorrect tokens.
  • Numerical inaccuracies: Occasionally produces factual errors, e.g., reporting mẹ́ẹ̀dógún (15) instead of mẹ́rìndínlógójì (36) for the number of Nigerian states.
  • Ambiguous prompts: May output irrelevant or nonsensical responses when the query lacks clarity.

These limitations suggest the need for more diverse and updated training data across dialects and domains.


⚙️ Inference Notice

A key consideration when running inference is model precision. YORI was quantized for efficiency, but inference should be performed in FP16 precision for stability and performance.

Example FP16 snippet (a minimal sketch, assuming the quantized weights load through transformers' `AutoModelForCausalLM`; the model id mirrors this repository and the prompt is illustrative):
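
```python
# Minimal FP16 inference sketch with Hugging Face transformers.
# Assumptions: the quantized weights load via AutoModelForCausalLM, and
# the model id below (taken from this repository) resolves correctly.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Lasisi/YORI-Llama-Quantized"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # run inference in FP16, as recommended above
    device_map="auto",
)

# Yoruba prompt: "Briefly explain the history of Ibadan."
prompt = "Ṣàlàyé ìtàn ìlú Ìbàdàn ní ṣókí."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```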


🧩 Intended Use

This model is intended for:

  • Academic and research purposes
  • Educational AI assistants
  • Yoruba language technology development

Do not use this model for:

  • Disinformation or impersonation
  • Generating offensive or harmful content
  • Applications violating user consent or privacy

Model size: 8B parameters · Format: Safetensors · Tensor types: I32, BF16