Africa v2 Translation Model

This is a fine-tuned translation model for 29 African languages, based on Qwen3-4B-Instruct-2507 and trained on an enhanced dataset.

Model Description

Africa v2 is an improved version of the Africa v1 translation model. Key improvements include:

  • System prompts in training data to enforce direct translation behavior
  • Regenerated training dataset with better formatting
  • Available in MLX 4-bit format and LoRA adapters

Note: Training was interrupted at 1,000 of 10,000 planned iterations after the GPU ran out of memory (OOM). This model is therefore a partial training checkpoint.

Supported Languages (29)

African Languages:

  • Afrikaans (af), Akan (ak), Amharic (am), Bambara (bm), Ewe (ee)
  • Fula (ff), Hausa (ha), Igbo (ig), Kinyarwanda (rw), Kirundi (rn)
  • Kongo (kg), Lingala (ln), Luganda (lg), Ndebele (nd), Northern Sotho (nso)
  • Chichewa/Nyanja (ny), Oromo (om), Shona (sn), Somali (so), Swahili (sw)
  • Tigrinya (ti), Tsonga (ts), Tswana (tn), Twi (tw), Venda (ve)
  • Wolof (wo), Xhosa (xh), Yoruba (yo), Zulu (zu)

Plus English (en) for bidirectional translation.
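For programmatic use, the ISO 639-1 codes above can be mapped to language names to build requests in the prompt format shown in the Usage section below. A minimal sketch (only a subset of the codes is shown; extend with the full list above):

```python
# Illustrative subset of the supported ISO 639-1 codes; extend as needed.
LANG_NAMES = {
    "en": "English", "sw": "Swahili", "yo": "Yoruba",
    "ha": "Hausa", "am": "Amharic", "zu": "Zulu",
}

def translation_request(src_code, tgt_code, text):
    # Build the user-turn text in the format the model was trained on.
    src, tgt = LANG_NAMES[src_code], LANG_NAMES[tgt_code]
    return f"Translate from {src} to {tgt}:\n\n{text}"
```

Because translation is bidirectional, the same helper works with "en" as either the source or the target code.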

Training Details

Base Model

  • Model: Qwen3-4B-Instruct-2507 (MLX 4-bit quantized)
  • Parameters: 4 billion
  • Architecture: Transformer-based language model

Fine-tuning

  • Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank: 8
  • LoRA Alpha: 20
  • Target Layers: 16 layers
  • Training Iterations: 1,000 (interrupted, intended 10,000)
  • Learning Rate: 5e-5
  • Batch Size: 1
  • Final Train Loss: 2.116
  • Final Val Loss: 2.512
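The hyperparameters above map onto an mlx_lm LoRA training config roughly as follows. This is a sketch only: the key names are taken from mlx_lm's example lora_config.yaml, they vary between mlx_lm versions, and the mapping of the card's "LoRA Alpha" onto the `scale` key is an assumption; verify against your installed version before use.

```yaml
# Hypothetical mlx_lm LoRA config reproducing the settings above.
model: "Qwen/Qwen3-4B-Instruct-2507"   # base model (4-bit MLX build assumed)
train: true
data: "data"            # directory containing train.jsonl / valid.jsonl
lora_layers: 16         # target layers
batch_size: 1
iters: 10000            # intended run length; this checkpoint stopped at 1,000
learning_rate: 5e-5
lora_parameters:
  rank: 8
  scale: 20.0           # assumed to correspond to the card's "LoRA Alpha: 20"
  dropout: 0.0
```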

Training Data

  • Total Translation Pairs: 283,986
  • Format: Enhanced with system prompts
  • System Message: "You are a translation assistant. Output only the translation without explanation."
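A single training record in this enhanced format might look like the following sketch. The actual dataset schema is not published, so the field names and helper below are illustrative assumptions; only the system message text is taken from the card.

```python
import json

# System message quoted above; used to enforce direct translation output.
SYSTEM_MSG = ("You are a translation assistant. "
              "Output only the translation without explanation.")

def make_example(src_lang, tgt_lang, src_text, tgt_text):
    # Hypothetical chat-format record; field names are assumptions.
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_MSG},
            {"role": "user",
             "content": f"Translate from {src_lang} to {tgt_lang}:\n\n{src_text}"},
            {"role": "assistant", "content": tgt_text},
        ]
    }

# One JSONL line of the (assumed) training file:
line = json.dumps(make_example("English", "Swahili",
                               "Hello, how are you?", "Habari yako?"))
```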

Available Checkpoints

  • Checkpoint at 1,000 iterations (latest)

Improvements Over v1

  1. System Prompts: Training data includes system messages to suppress thinking mode
  2. Better Formatting: Consistent prompt format for all language pairs
  3. Direct Translation: Model trained to output translation directly without explanation

Evaluation Results

Status: Model evaluation pending. Use the v1 evaluation as a baseline for comparison.

Expected improvements over v1:

  • Reduced repetition loops
  • Less hallucination due to system prompts
  • More consistent output format

Usage

MLX with mlx-lm

# Install MLX
pip install mlx-lm

# Download model
huggingface-cli download aoiandroid/africa-v2-translation-model --local-dir africa-v2-mlx --include "mlx-4bit/*"

# Run inference with system prompt
python -m mlx_lm.generate \
  --model africa-v2-mlx/mlx-4bit \
  --prompt "<|im_start|>system
You are a translation assistant. Output only the translation without explanation.<|im_end|>
<|im_start|>user
Translate from English to Swahili:

Hello, how are you?<|im_end|>
<|im_start|>assistant" \
  --max-tokens 256 \
  --temp 0.1

LoRA Adapters

from mlx_lm import load, generate

# Load model with LoRA adapters
model, tokenizer = load("aoiandroid/africa-v2-translation-model", adapter_path="lora")

# Prepare prompt with system message
messages = [
    {"role": "system", "content": "You are a translation assistant. Output only the translation without explanation."},
    {"role": "user", "content": "Translate from English to Swahili:\n\nHello, how are you?"}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate translation (note: newer mlx_lm releases drop the temp= keyword
# in favor of a sampler built with mlx_lm.sample_utils.make_sampler)
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, temp=0.1)
print(response)

GGUF Conversion

Note: GGUF export is not available for this model because:

  • mlx_lm.fuse --export-gguf only supports model_type in ["llama", "mixtral", "mistral"]
  • Qwen3 has model_type: "qwen3", which is not yet supported

To convert to GGUF:

  1. Export MLX model to HuggingFace format (if supported)
  2. Use llama.cpp's convert_hf_to_gguf.py script
  3. Or wait for mlx_lm to add Qwen3 support

Alternatively, use v1's GGUF model as a fallback.

Limitations and Biases

  • Partial Training: Only 1,000 iterations completed (10% of planned training)
  • Needs Evaluation: Translation quality not yet formally evaluated
  • Low-Resource Languages: Limited training data for some African languages
  • Experimental Model: Intended for research and experimentation

Intended Use

  • Research: Studying impact of system prompts on translation quality
  • Experimentation: Testing improved training data formatting
  • Comparison: Baseline for comparing with fully trained models

Recommended: Complete training to 10,000+ iterations before production use.

Future Work

  • Complete training to 10,000+ iterations
  • Increase LoRA rank to 16 or 32
  • Formal evaluation with BLEU, chrF, and TER metrics
  • Compare performance against v1

Citation

@software{africa_v2_translation_model,
  title = {Africa v2 Translation Model},
  author = {TranslateBlue Project},
  year = {2026},
  url = {https://huggingface.co/aoiandroid/africa-v2-translation-model}
}

License

Apache 2.0

Model Card Authors

TranslateBlue Project

Model Card Contact

For questions or issues, please open an issue in the model repository.
