Aura-MT-1B is a 1B-parameter multilingual machine translation model fine-tuned from WakandaAI/Aura-1B. It covers 25 languages: English plus 24 target languages (African languages, Arabic, French, and Portuguese). It supports translation in both directions between English and each of the 24 target languages.

| Model detail | Value |
|---|---|
| Base model | WakandaAI/Aura-1B |
| Parameters | 1.01B (1,013,280,000) |
| Layers | 36 |
| Hidden dim | 1280 |
| Attention heads | 20 (KV heads: 4, GQA) |
| FFN intermediate | 5120 |
| Context length | 1024 tokens |
| Vocab size | 64,000 |
| RoPE theta | 500,000 |
| Architecture | Llama-3 style (RMSNorm, RoPE, GQA, SwiGLU) |

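
The reported parameter count can be cross-checked from the architecture figures in the table. The sketch below assumes a standard Llama-style layout that the card does not spell out: no bias terms, a head dimension of hidden dim / attention heads = 64, and separate (untied) input-embedding and output-projection matrices. The function name `count_params` is made up for illustration. Under those assumptions the total comes out to exactly the 1,013,280,000 listed above.

```python
# Cross-check of the reported parameter count.
# Assumptions (not stated in the card): no bias terms, untied input/output
# embeddings, one RMSNorm weight before attention and one before the FFN.

def count_params(hidden=1280, layers=36, heads=20, kv_heads=4,
                 ffn=5120, vocab=64_000, tied_embeddings=False):
    head_dim = hidden // heads                   # 1280 / 20 = 64
    q_and_o = 2 * hidden * hidden                # query and output projections
    k_and_v = 2 * hidden * kv_heads * head_dim   # GQA: K/V use only kv_heads heads
    swiglu = 3 * hidden * ffn                    # gate, up, and down projections
    norms = 2 * hidden                           # two RMSNorm weights per layer
    per_layer = q_and_o + k_and_v + swiglu + norms
    embeddings = vocab * hidden * (1 if tied_embeddings else 2)
    final_norm = hidden
    return layers * per_layer + embeddings + final_norm

total = count_params()
print(f"{total:,} parameters")                   # 1,013,280,000
print(f"{total * 2 / 1e9:.2f} GB of weights in bfloat16 (2 bytes each)")
```

With grouped-query attention, the key and value projections are five times smaller than the query projection (4 KV heads against 20 query heads), which is why they contribute so little to the per-layer total.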

| Training detail | Value |
|---|---|
| Method | Full SFT (all parameters) |
| Dataset | 4.1M parallel sentence pairs across 25 languages |
| Sources | NLLB, WMT22, LAFAND-MT, translated web data |
| Optimizer | AdamW (lr=2e-5, cosine decay, 200 warmup steps) |
| Precision | bfloat16 |
| Hardware | -- |
| Batch size | 1024 tokens/GPU, packed sequences |
| Training steps | 31,000 |
| DDP backend | Gloo over InfiniBand |

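
The optimizer settings imply a learning-rate curve that rises over the first 200 steps and then follows a cosine down to the end of the 31,000 training steps. A minimal sketch of such a schedule follows. The linear warmup shape and the final learning rate of zero are assumptions, since the card gives only the peak rate, the decay type, and the warmup length; `lr_at` is an illustrative name, not a function from the training code.

```python
import math

PEAK_LR = 2e-5        # from the training table
WARMUP_STEPS = 200    # from the training table
TOTAL_STEPS = 31_000  # from the training table
MIN_LR = 0.0          # assumption: the card does not state a floor

def lr_at(step: int) -> float:
    """Learning rate at a 0-indexed optimizer step."""
    if step < WARMUP_STEPS:
        # Assumed linear warmup that reaches PEAK_LR on the last warmup step.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Cosine decay from PEAK_LR to MIN_LR over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))

for step in (0, 199, 200, 15_600, 30_999):
    print(step, f"{lr_at(step):.3e}")
```

Step 15,600 is the midpoint of the decay phase, so the rate there is half the peak under these assumptions.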

| Code | Language | Code | Language |
|---|---|---|---|
| `afr_Latn` | Afrikaans | `plt_Latn` | Malagasy |
| `amh_Ethi` | Amharic | `por_Latn` | Portuguese |
| `arb_Arab` | Arabic | `sna_Latn` | Shona |
| `bem_Latn` | Bemba | `som_Latn` | Somali |
| `eng_Latn` | English | `sot_Latn` | Sesotho |
| `fon_Latn` | Fon | `swh_Latn` | Swahili |
| `fra_Latn` | French | `tir_Ethi` | Tigrinya |
| `hau_Latn` | Hausa | `tsn_Latn` | Setswana |
| `ibo_Latn` | Igbo | `wol_Latn` | Wolof |
| `kin_Latn` | Kinyarwanda | `xho_Latn` | Xhosa |
| `lin_Latn` | Lingala | `yor_Latn` | Yoruba |
| `lug_Latn` | Luganda | `zul_Latn` | Zulu |
| `nya_Latn` | Chichewa | | |

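
Each code pairs a three-letter ISO 639-3 language code with a four-letter ISO 15924 script code, in the style used by NLLB and FLORES-200. A small lookup like the one below can check a requested direction before the model is called. The helper names are hypothetical and not part of the repository. The rule that one side must be English reflects only what this card documents (translation between English and the 24 other languages); the card does not say how the model behaves on other pairs.

```python
# Language table from this card, keyed by code.
LANGUAGES = {
    "afr_Latn": "Afrikaans", "amh_Ethi": "Amharic", "arb_Arab": "Arabic",
    "bem_Latn": "Bemba", "eng_Latn": "English", "fon_Latn": "Fon",
    "fra_Latn": "French", "hau_Latn": "Hausa", "ibo_Latn": "Igbo",
    "kin_Latn": "Kinyarwanda", "lin_Latn": "Lingala", "lug_Latn": "Luganda",
    "nya_Latn": "Chichewa", "plt_Latn": "Malagasy", "por_Latn": "Portuguese",
    "sna_Latn": "Shona", "som_Latn": "Somali", "sot_Latn": "Sesotho",
    "swh_Latn": "Swahili", "tir_Ethi": "Tigrinya", "tsn_Latn": "Setswana",
    "wol_Latn": "Wolof", "xho_Latn": "Xhosa", "yor_Latn": "Yoruba",
    "zul_Latn": "Zulu",
}

def split_code(code: str) -> tuple[str, str]:
    """Split 'swh_Latn' into its language part and its script part."""
    lang, script = code.split("_")
    return lang, script

def is_documented_direction(src: str, tgt: str) -> bool:
    """True if both codes are listed and exactly one side is English."""
    if src not in LANGUAGES or tgt not in LANGUAGES:
        return False
    return (src == "eng_Latn") != (tgt == "eng_Latn")

print(len(LANGUAGES), "languages")
print(split_code("tir_Ethi"))
print(is_documented_direction("eng_Latn", "hau_Latn"))
print(is_documented_direction("fra_Latn", "swh_Latn"))
```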
```bash
# Download the model repository and install the dependencies
git clone https://huggingface.co/WakandaAI/Aura-MT-1B
cd Aura-MT-1B
pip install torch tokenizers safetensors
```

```bash
# Translate a single sentence (English -> Hausa)
python generate.py --text "The president announced new economic policies." \
    --src eng_Latn --tgt hau_Latn
```

```bash
# Start an interactive session (English -> Yoruba)
python generate.py --interactive --src eng_Latn --tgt yor_Latn
```

```text
[eng_Latn->yor_Latn] >>> Good morning, how are you doing today?
Akoko ti o dara, bawo ni o ṣe n ṣiṣẹ loni?
[eng_Latn->yor_Latn] >>> /set src=arb_Arab tgt=eng_Latn
Direction: arb_Arab -> eng_Latn
[arb_Arab->eng_Latn] >>> صباح الخير، كيف حالك؟
Good morning, how are you?
```

```bash
# Translate a file of sentences and write the results to another file
python generate.py --input sentences.txt --src eng_Latn --tgt swh_Latn --output translations.txt
```
```python
from inference import load_model, translate

# Load the weights, tokenizer, and config from the cloned repository directory
model, tokenizer, config = load_model(".")

# English -> Swahili
result = translate(model, tokenizer,
                   "The president announced new economic policies.",
                   src_lang="eng_Latn", tgt_lang="swh_Latn")
print(result)

# French -> English
result = translate(model, tokenizer,
                   "Bonjour, comment allez-vous?",
                   src_lang="fra_Latn", tgt_lang="eng_Latn")
print(result)

# With sampling instead of beam search
result = translate(model, tokenizer,
                   "Hello world",
                   src_lang="eng_Latn", tgt_lang="yor_Latn",
                   num_beams=1, temperature=0.7, top_p=0.9)
print(result)
```
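
For more than a handful of sentences, the same call can be wrapped in a loop, in the spirit of the `--input`/`--output` mode of the CLI. The wrapper below is a sketch, not part of the repository: `translate_lines` is a hypothetical helper, it assumes `translate` accepts the positional and keyword arguments used in the examples above, and `fake_translate` is a stand-in so the snippet runs without the model.

```python
from typing import Callable, Iterable, List

def translate_lines(translate_fn: Callable[..., str], model, tokenizer,
                    lines: Iterable[str], src_lang: str, tgt_lang: str,
                    **gen_kwargs) -> List[str]:
    """Translate lines one at a time, keeping blank lines in place."""
    out = []
    for line in lines:
        text = line.strip()
        if not text:
            out.append("")  # preserve line alignment with the input
            continue
        out.append(translate_fn(model, tokenizer, text,
                                src_lang=src_lang, tgt_lang=tgt_lang,
                                **gen_kwargs))
    return out

# Stand-in for inference.translate so the sketch runs without the model.
def fake_translate(model, tokenizer, text, src_lang, tgt_lang, **kwargs):
    return f"[{src_lang}->{tgt_lang}] {text}"

print(translate_lines(fake_translate, None, None,
                      ["Hello world", "", "Good morning"],
                      "eng_Latn", "swh_Latn", num_beams=4))
```

With the real model, pass `translate` from `inference` in place of `fake_translate`, along with the `model` and `tokenizer` returned by `load_model`.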
Internally, the model uses instruction-style prompts with a language token prefix:
```text
<s><|tgt_lang|>Translate the following English text into Swahili.
English: Hello, how are you?
Swahili:
```

The `translate()` function handles prompt construction automatically. Six prompt templates are available (selectable via `template_idx`).
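
To make the prompt layout concrete, here is a minimal sketch that rebuilds the example prompt shown above. It is not the repository's code: `build_prompt` is a hypothetical name, `<|tgt_lang|>` is kept as the literal placeholder from the example (the real language token strings live in `special_tokens.json`), and the exact whitespace between lines is an assumption. Only this one template wording appears in the card, so the other five templates are not reproduced.

```python
def build_prompt(text: str, src_name: str, tgt_name: str,
                 lang_token: str = "<|tgt_lang|>", bos: str = "<s>") -> str:
    """Assemble an instruction-style translation prompt.

    lang_token is a placeholder for the target-language token; the real
    token string is an assumption here and should come from the tokenizer.
    """
    return (
        f"{bos}{lang_token}"
        f"Translate the following {src_name} text into {tgt_name}.\n"
        f"{src_name}: {text}\n"
        f"{tgt_name}:"
    )

print(build_prompt("Hello, how are you?", "English", "Swahili"))
```

The prompt ends right after the target language label, so the model's continuation is the translation itself.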

| Parameter | Default | Description |
|---|---|---|
| `num_beams` | 4 | Beam search width (1 = greedy/sampling) |
| `max_new_tokens` | 128 | Maximum output length in tokens |
| `length_penalty` | 1.0 | Beam search length penalty |
| `no_repeat_ngram_size` | 3 | Ban repeated n-grams (0 = off) |
| `temperature` | 0.0 | Sampling temperature (used when > 0 with `num_beams=1`) |
| `top_p` | 0.9 | Nucleus sampling threshold |

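
Two of these parameters are easier to follow with a small example. The sketch below shows textbook versions of nucleus (top-p) filtering with temperature and of n-gram repeat banning, written in plain Python over lists of numbers. These are illustrations of the general techniques, not the code in `inference.py`, and the function names are made up for this example.

```python
import math

def top_p_filter(logits, temperature=0.7, top_p=0.9):
    """Return {token_id: prob} for the smallest top set whose mass reaches top_p."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 for sampling")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]   # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}     # renormalise over the kept set

def banned_next_tokens(generated, n=3):
    """Token ids that would complete an n-gram already present in `generated`."""
    if n <= 0:
        return set()                              # 0 = off, as in the table
    prefix = tuple(generated[len(generated) - (n - 1):]) if n > 1 else ()
    banned = set()
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned

print(top_p_filter([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_p=0.9))
print(banned_next_tokens([5, 6, 7, 5, 6], n=3))
```

In the second call the sequence already contains the trigram 5, 6, 7 and currently ends in 5, 6, so token 7 is the only one banned as the next token.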

| File | Purpose |
|---|---|
| `model.safetensors` | Model weights (preferred format) |
| `model.pt` | Same weights as a torch checkpoint (fallback) |
| `config.json` | Architecture config |
| `tokenizer.json` | ByteLevel BPE tokenizer (64,000 vocab) |
| `tokenizer_config.json` | HuggingFace tokenizer metadata |
| `special_tokens.json` | Language token ID mapping |
| `inference.py` | `load_model()` + `translate()` library |
| `generate.py` | CLI wrapper (single/batch/interactive) |
| `llama3.py` | Transformer model definition |
| `model_factory.py` | Model config builder |
| `kvcache.py` | KV cache for inference |

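
The "preferred" and "fallback" labels on the two weight files suggest a simple selection order when loading. The snippet below sketches that order with the standard library only. `pick_weights_file` is a hypothetical helper, and whether `load_model()` resolves files exactly this way is an assumption. The demo runs in a scratch directory with empty placeholder files, so it needs no real checkpoint.

```python
import tempfile
from pathlib import Path

def pick_weights_file(model_dir: str = ".") -> Path:
    """Return model.safetensors if present, otherwise model.pt."""
    root = Path(model_dir)
    for name in ("model.safetensors", "model.pt"):  # preferred format first
        candidate = root / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no model.safetensors or model.pt in {root}")

# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model.pt").touch()
    print(pick_weights_file(tmp).name)   # only the fallback exists
    (Path(tmp) / "model.safetensors").touch()
    print(pick_weights_file(tmp).name)   # the preferred file now wins
```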
If you use this model, please cite WakandaAI. Citation details are to be announced.

License: Apache 2.0