YORI-LLaMA-Quantized

YORI-LLaMA-Quantized is a Yoruba large language model specialized in text generation and AI-based assistance in Yoruba. It was quantized from its base model, Jacaranda/yorubaLLaMA, to achieve lighter memory usage and faster inference while maintaining strong linguistic performance.


💬 Description

YORI-LLaMA-Quantized is trained to generate and understand natural Yoruba text with contextual fluency and syntactic awareness. It can be used for:

  • Yoruba AI assistants
  • Chatbot fine-tuning
  • Text completion, storytelling, and creative generation (see the sketch after this list)
  • Language preservation and computational linguistics research
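
For example, text completion can be driven through the transformers pipeline API. This is a minimal sketch rather than an official recipe: it assumes the quantized checkpoint loads directly with Hugging Face transformers, and the model id simply mirrors this repository; the prompt is illustrative.

```python
# Minimal text-completion sketch using the Hugging Face pipeline API.
# Assumptions: the quantized checkpoint loads via transformers, and the
# model id below (taken from this repository) resolves correctly.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Lasisi/YORI-Llama-Quantized",
    torch_dtype=torch.float16,  # FP16, per the Inference Notice below
    device_map="auto",
)

# Yoruba prompt: "Once upon a time, in the town of Oyo..."
result = generator("Ní ìgbà kan, ní ìlú Ọ̀yọ́...", max_new_tokens=64)
print(result[0]["generated_text"])
```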


⚠️ Limitations

  • Code-switching weakness: The model struggles when Yoruba and English are mixed, leading to misinterpretations or incorrect tokens.
  • Numerical inaccuracies: Occasionally produces factual errors, e.g., reporting mẹ́ẹ̀dógún (15) instead of mẹ́rìndínlógójì (36) for the number of Nigerian states.
  • Ambiguous prompts: May output irrelevant or nonsensical responses when the query lacks clarity.

These limitations suggest the need for more diverse and updated training data across dialects and domains.


⚙️ Inference Notice

A key consideration when running inference is model precision. YORI was quantized for efficiency, but inference should be performed in FP16 precision for stability and performance.

Example FP16 snippet (a minimal sketch, assuming the quantized weights load through transformers' `AutoModelForCausalLM`; the model id mirrors this repository and the prompt is illustrative):
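
```python
# Minimal FP16 inference sketch with Hugging Face transformers.
# Assumptions: the quantized weights load via AutoModelForCausalLM, and
# the model id below (taken from this repository) resolves correctly.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Lasisi/YORI-Llama-Quantized"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # run inference in FP16, as recommended above
    device_map="auto",
)

# Yoruba prompt: "Briefly explain the history of Ibadan."
prompt = "Ṣàlàyé ìtàn ìlú Ìbàdàn ní ṣókí."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```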


🧩 Intended Use

This model is intended for:

  • Academic and research purposes
  • Educational AI assistants
  • Yoruba language technology development

Do not use this model for:

  • Disinformation or impersonation
  • Generating offensive or harmful content
  • Applications violating user consent or privacy

Model size: 8B parameters · Format: Safetensors · Tensor types: I32, BF16