
Kikuyu-LoRA-Llama-3.2-3B-Instruct

A Llama 3.2 3B Instruct model fine-tuned for Kikuyu language generation using LoRA (Low-Rank Adaptation). The model was trained on 120,000 monolingual Kikuyu sentences to understand and generate Kikuyu text.

Model Details

  • Base model: meta-llama/Llama-3.2-3B-Instruct
  • Fine-tuning method: LoRA (Low-Rank Adaptation)
  • Language: Kikuyu (Gĩkũyũ)
  • Dataset: 120k Kikuyu monolingual sentences
  • Training epochs: 10
  • Custom tokenizer: BPE tokenizer optimized for Kikuyu text

Features

  • Specialized Kikuyu language understanding
  • Memory-efficient LoRA architecture
  • Custom tokenizer handling Kikuyu diacritics (ũ, ĩ, etc.)
  • Optimized for Kikuyu text generation tasks

Installation

pip install torch transformers peft accelerate

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load Kikuyu tokenizer
tokenizer = AutoTokenizer.from_pretrained("thirtyninetythree/kikuyu-bpe-tokenizer")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load base model and resize embeddings
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto"
)
base_model.resize_token_embeddings(len(tokenizer))

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model, 
    "thirtyninetythree/kikuyu-lora-llama-3.2-3b-instruct"
)

# Generation function
def generate_kikuyu(prompt, max_length=100, temperature=0.8, top_p=0.9, **generate_kwargs):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    inputs.pop('token_type_ids', None)  # generate() does not accept token_type_ids
    
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        temperature=temperature,
        do_sample=True,
        top_p=top_p,
        repetition_penalty=1.2,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
        **generate_kwargs
    )
    
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
examples = [
    "Githeri ni",
    "M农nd农 农r末a",
    "Ngai ni",
    "We mwega"
]

for prompt in examples:
    result = generate_kikuyu(prompt)
    print(f"Input: {prompt}")
    print(f"Output: {result}")
    print("-" * 50)

Generation Parameters

You can pass additional generation parameters to generate_kikuyu to shape the output:

# More creative (higher temperature)
result = generate_kikuyu("Githeri ni", temperature=1.0, top_p=0.95)

# More focused (lower temperature)
result = generate_kikuyu("Githeri ni", temperature=0.5, top_p=0.8)

# Longer output
result = generate_kikuyu("Githeri ni", max_length=200)

Requirements

  • Python 3.8+
  • PyTorch 1.12+
  • Transformers 4.21+
  • PEFT library
  • CUDA-capable GPU (recommended)

Hardware Requirements

  • Minimum: 8GB GPU memory (see the 4-bit loading sketch below)
  • Recommended: 16GB+ GPU memory
  • CPU: 4+ cores
  • RAM: 16GB+
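
On GPUs near the 8GB minimum, loading the base model in fp16 plus the adapter can be tight. One option, not covered by this card, is 4-bit quantization via bitsandbytes. A hedged sketch, assuming bitsandbytes is installed (pip install bitsandbytes):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Quantize the base model to 4-bit NF4 to cut GPU memory use substantially
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
# Resize embeddings and attach the LoRA adapter exactly as in the Usage section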

Limitations

  • Trained primarily on monolingual Kikuyu text
  • Performance may vary on mixed-language inputs
  • Best results with culturally relevant Kikuyu contexts
  • May require prompt engineering for optimal outputs

Model Architecture

  • LoRA rank: 16
  • LoRA alpha: 32
  • Target modules: All attention and MLP layers
  • Dropout: 0.1
  • Vocabulary size: ~32,000 tokens (Kikuyu-optimized)
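
For reference, the values above roughly correspond to the following PEFT LoraConfig. This is a reconstruction rather than the original training script; in particular, the target_modules list assumes the standard Llama projection-layer names:

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                  # LoRA rank
    lora_alpha=32,         # scaling factor
    lora_dropout=0.1,
    # Assumed names for Llama attention and MLP projections
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)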

Citation

If you use this model, please cite:

@misc{kikuyu-lora-llama-3.2-3b,
  title={Kikuyu-LoRA-Llama-3.2-3B-Instruct},
  author={thirtyninetythree},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/thirtyninetythree/kikuyu-lora-llama-3.2-3b-instruct}
}

License

This model inherits the license from the base Llama 3.2 model. Please refer to Meta's Llama license for usage terms.
