Kikuyu-LoRA-Llama-3.2-3B-Instruct
A Llama 3.2 3B Instruct model fine-tuned with LoRA (Low-Rank Adaptation) for Kikuyu text generation. It was trained on 120,000 monolingual Kikuyu sentences to understand and generate text in Kikuyu (Gĩkũyũ).
Model Details
- Base model: meta-llama/Llama-3.2-3B-Instruct
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Language: Kikuyu (Gĩkũyũ)
- Dataset: 120k Kikuyu monolingual sentences
- Training epochs: 10
- Custom tokenizer: Optimized BPE tokenizer for Kikuyu text
Features
- Specialized Kikuyu language understanding
- Memory-efficient LoRA architecture
- Custom tokenizer handling Kikuyu diacritics (ũ, ĩ, etc.); see the quick check after this list
- Optimized for Kikuyu text generation tasks
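To sanity-check the diacritic handling mentioned above, you can tokenize a sample phrase and round-trip it. A minimal sketch: the exact subword splits depend on the trained BPE merges, so the printed tokens are illustrative rather than verified output.

from transformers import AutoTokenizer

# Tokenizer repo name taken from the Usage section below
tokenizer = AutoTokenizer.from_pretrained("thirtyninetythree/kikuyu-bpe-tokenizer")
print(tokenizer.tokenize("Mũndũ ũrĩa"))  # subword splits vary with the learned merges
ids = tokenizer.encode("Mũndũ ũrĩa", add_special_tokens=False)
print(tokenizer.decode(ids))  # should round-trip with diacritics intact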
Installation
pip install torch transformers peft accelerate
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
# Load Kikuyu tokenizer
tokenizer = AutoTokenizer.from_pretrained("thirtyninetythree/kikuyu-bpe-tokenizer")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
# Load base model and resize embeddings
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto"
)
base_model.resize_token_embeddings(len(tokenizer))
# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "thirtyninetythree/kikuyu-lora-llama-3.2-3b-instruct"
)
# Generation function (temperature and top_p are parameters so the
# Generation Parameters examples below work as written)
def generate_kikuyu(prompt, max_length=100, temperature=0.8, top_p=0.9):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    inputs.pop("token_type_ids", None)  # remove unused token types
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        temperature=temperature,
        do_sample=True,
        top_p=top_p,
        repetition_penalty=1.2,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
# Example usage
examples = [
    "Githeri ni",
    "Mũndũ ũrĩa",
    "Ngai ni",
    "We mwega"
]
for prompt in examples:
    result = generate_kikuyu(prompt)
    print(f"Input: {prompt}")
    print(f"Output: {result}")
    print("-" * 50)
Generation Parameters
You can adjust generation parameters for different outputs:
# More creative (higher temperature)
result = generate_kikuyu("Githeri ni", temperature=1.0, top_p=0.95)
# More focused (lower temperature)
result = generate_kikuyu("Githeri ni", temperature=0.5, top_p=0.8)
# Longer output
result = generate_kikuyu("Githeri ni", max_length=200)
Requirements
- Python 3.8+
- PyTorch 2.0+ (recommended)
- Transformers 4.45+ (version that adds Llama 3.2 support)
- PEFT library
- CUDA-capable GPU (recommended)
Hardware Requirements
- Minimum: 8GB GPU memory
- Recommended: 16GB+ GPU memory
- CPU: 4+ cores
- RAM: 16GB+
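On an 8 GB card, float16 weights for a 3B model plus the KV cache can be tight. One option is to load the base model in 4-bit with bitsandbytes before attaching the adapter. A minimal sketch, assuming bitsandbytes is installed; the rest of the Usage section is unchanged.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantized base model to fit small GPUs (assumes `pip install bitsandbytes`)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model.resize_token_embeddings(len(tokenizer))  # tokenizer from the Usage section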
Limitations
- Trained primarily on monolingual Kikuyu text
- Performance may vary on mixed-language inputs
- Best results with culturally relevant Kikuyu contexts
- May require prompt engineering for optimal outputs
Model Architecture
- LoRA rank: 16
- LoRA alpha: 32
- Target modules: All attention and MLP layers
- Dropout: 0.1
- Vocabulary size: ~32,000 tokens (Kikuyu-optimized)
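For reference, the hyperparameters above correspond to a PEFT configuration roughly like the following. This is a reconstruction, not the exact training script; the target_modules names assume the standard Llama projection layers.

from peft import LoraConfig

# Reconstructed from the hyperparameters listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
    task_type="CAUSAL_LM",
)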
Citation
If you use this model, please cite:
@misc{kikuyu-lora-llama-3.2-3b,
  title={Kikuyu-LoRA-Llama-3.2-3B-Instruct},
  author={thirtyninetythree},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/thirtyninetythree/kikuyu-lora-llama-3.2-3b-instruct}
}
License
This model inherits the license from the base Llama 3.2 model. Please refer to Meta's Llama 3.2 Community License for usage terms.