Aura-2B

A 1.8B-parameter multilingual base LLM covering 37 African languages, plus English, French, Portuguese, and Arabic for transfer. Trained from scratch with a Llama-3-style architecture (RMSNorm, RoPE, grouped-query attention, SwiGLU) on a temperature-balanced FineWeb2 mixture.
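This card does not spell out the balancing recipe. One common form of temperature balancing samples each language with probability proportional to its corpus size raised to 1/T, which flattens the mixture toward low-resource languages as T grows. The sketch below assumes that form; the function name, the corpus sizes, and the temperature value are illustrative placeholders, not the real training statistics.

```python
def temperature_weights(sizes, temperature):
    """Return sampling probabilities proportional to size ** (1 / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {lang: n ** (1.0 / temperature) for lang, n in sizes.items()}
    total = sum(scaled.values())
    return {lang: s / total for lang, s in scaled.items()}


# Hypothetical token counts, NOT the real mixture statistics.
sizes = {"eng_Latn": 1_000_000, "swh_Latn": 50_000, "bem_Latn": 1_000}

proportional = temperature_weights(sizes, temperature=1.0)  # plain size-proportional
balanced = temperature_weights(sizes, temperature=3.0)      # flatter distribution

for lang in sizes:
    print(f"{lang}: {proportional[lang]:.4f} -> {balanced[lang]:.4f}")
```

With T above 1, the smallest language gets a larger share than under proportional sampling, while the largest one gets a smaller share.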

Architecture


Property	Value
Parameters	1.8B (1,772,242,944)
Layers	36
Hidden dim	2048
Attention heads	32 (KV heads: 8, GQA)
FFN intermediate	5120
Context length	1024
Vocab size	64000
RoPE theta	500000
Training step	99,999
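The exact parameter count can be reproduced from the other table entries. The sketch below assumes a standard Llama-style layout with no bias terms and an output head that is not tied to the input embedding; those two assumptions are not stated in this card, but they are the ones under which the arithmetic lands on the listed total.

```python
# Values taken from the architecture table.
vocab_size = 64_000
hidden = 2048
layers = 36
n_heads = 32
n_kv_heads = 8
ffn = 5120

head_dim = hidden // n_heads        # 64
kv_dim = n_kv_heads * head_dim      # 512, shared across query groups (GQA)

# Attention: Q and O are hidden x hidden, K and V are hidden x kv_dim.
attention = 2 * hidden * hidden + 2 * hidden * kv_dim
# SwiGLU feed-forward: gate, up, and down projections.
mlp = 3 * hidden * ffn
# Two RMSNorm weight vectors per layer.
norms = 2 * hidden
per_layer = attention + mlp + norms

embedding = vocab_size * hidden
lm_head = vocab_size * hidden       # assumption: untied output projection
final_norm = hidden

total = embedding + layers * per_layer + final_norm + lm_head
print(f"{total:,}")  # 1,772,242,944
```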

Languages

afr_Latn
amh_Ethi
arb_Arab
bem_Latn
eng_Latn
fon_Latn
fra_Latn
hau_Latn
ibo_Latn
kin_Latn
lin_Latn
lug_Latn
nya_Latn
plt_Latn
por_Latn
sna_Latn
som_Latn
sot_Latn
swh_Latn
tir_Ethi
tsn_Latn
wol_Latn
xho_Latn
yor_Latn
zul_Latn

Quick start

git clone https://huggingface.co/WakandaAI/Aura-2B
cd Aura-2B
pip install torch tokenizers safetensors
python generate.py --prompt "<s><|yor_Latn|>Kaabo, mo jẹ awoṣe ede." -n 4

Or from Python:

from inference import load_model, generate

model, tokenizer, config = load_model(".")
out = generate(
    model, tokenizer,
    prompt="<s><|swh_Latn|>Habari yako rafiki?",
    max_new_tokens=128,
    temperature=0.8,
    top_p=0.9,
)
print(out[0])
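Because the context length is 1024 tokens, the tokenized prompt plus max_new_tokens has to fit in that window. Whether generate() trims long prompts on its own is not documented here, so the following is only a sketch of one way to do it. The helper name fit_prompt is hypothetical, and it assumes the first two token ids are <s> and the language token, each encoded as a single token.

```python
CONTEXT_LENGTH = 1024  # from the architecture table


def fit_prompt(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH, keep_prefix=2):
    """Trim a tokenized prompt so prompt + generation fits in the context window.

    Keeps the first `keep_prefix` ids (assumed to be <s> and the language token)
    and drops the oldest tokens after them when the prompt is too long.
    """
    budget = context_length - max_new_tokens
    if budget <= keep_prefix:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    if len(token_ids) <= budget:
        return list(token_ids)
    prefix = list(token_ids[:keep_prefix])
    tail = list(token_ids[-(budget - keep_prefix):])
    return prefix + tail


# Stand-in ids for a 2000-token prompt; real ids would come from the tokenizer.
trimmed = fit_prompt(list(range(2000)), max_new_tokens=128)
print(len(trimmed))  # 896
```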

Prompt format

Every prompt should start with <s> (BOS) followed by a language token of the form <|{lang}_{Script}|> to condition generation on the target language. See tokenizer.json for the full list of language tokens.

Examples:

<s><|eng_Latn|>The quick brown fox
<s><|hau_Latn|>Sannu, yaya kake?
<s><|amh_Ethi|>ሰላም
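The format above can be wrapped in a small helper. This is a sketch rather than part of the shipped inference.py: language_codes and build_prompt are hypothetical names, and the code assumes the language tokens appear either under "added_tokens" or in the BPE vocabulary of a standard Hugging Face tokenizer.json.

```python
import json
import re

# Matches tokens such as <|swh_Latn|> and captures the language code.
LANG_TOKEN = re.compile(r"^<\|([a-z]{3}_[A-Z][a-z]{3})\|>$")


def language_codes(tokenizer_json):
    """Collect language codes such as 'swh_Latn' from a parsed tokenizer.json."""
    candidates = [t.get("content", "") for t in tokenizer_json.get("added_tokens", [])]
    candidates += list(tokenizer_json.get("model", {}).get("vocab", {}))
    codes = set()
    for token in candidates:
        match = LANG_TOKEN.match(token)
        if match:
            codes.add(match.group(1))
    return sorted(codes)


def build_prompt(lang, text, known_codes=None):
    """Prefix text with BOS and the language token, optionally validating lang."""
    if known_codes is not None and lang not in known_codes:
        raise ValueError(f"unknown language code: {lang}")
    return f"<s><|{lang}|>{text}"


# Small stand-in for the real file; normally use json.load(open("tokenizer.json")).
demo = {"added_tokens": [{"content": "<s>"}, {"content": "<|hau_Latn|>"}],
        "model": {"vocab": {"<|amh_Ethi|>": 5, "hello": 6}}}
codes = language_codes(demo)
print(build_prompt("hau_Latn", "Sannu, yaya kake?", known_codes=codes))
```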

Interactive mode

For exploring the model, run the REPL:

python generate.py --interactive

Then type prompts at the >>> prompt. An empty line or Ctrl-D exits.

Example session:

>>> <s><|swh_Latn|>Nairobi ni mji mkuu wa
[sample 0]
Nairobi ni mji mkuu wa Kenya na kuna wageni wengi ambao hutembelea pia.
Unaweza kupata historia ya utalii wa ndani ya nchi hii...

Adjust sampling on the command line:

python generate.py --interactive --temperature 0.7 --top-p 0.95 --max-new-tokens 256
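To give a feel for what --temperature and --top-p control, here is a generic sketch of temperature scaling followed by nucleus (top-p) filtering in plain Python. It illustrates the general technique only; the sampling code in inference.py may differ in details, and the function names here are hypothetical.

```python
import math
import random


def top_p_distribution(logits, temperature=0.8, top_p=0.9):
    """Return {token_id: prob} after temperature scaling and nucleus filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of most likely tokens whose mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample(logits, temperature=0.8, top_p=0.9, rng=random):
    """Draw one token id from the filtered, renormalized distribution."""
    dist = top_p_distribution(logits, temperature, top_p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


print(top_p_distribution([2.0, 1.0, 0.0], temperature=1.0, top_p=0.5))
```

Lower temperatures sharpen the distribution before filtering, and lower top-p values discard more of the low-probability tail.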

Files

File	Purpose
`model.safetensors`	Model weights (preferred format)
`model.pt`	Same weights as a torch checkpoint (fallback)
`config.json`	Architecture config in plain JSON
`tokenizer.json`	ByteLevel BPE tokenizer (64000 vocab)
`tokenizer_config.json`	HuggingFace tokenizer metadata
`inference.py`	`load_model()` + `generate()` library
`generate.py`	CLI wrapper
`llama3.py`, `model_factory.py`, `kvcache.py`	Model definition
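The weights file can be inspected without loading torch. A .safetensors file starts with an 8-byte little-endian length followed by a JSON header that lists every tensor's dtype and shape, so a few lines of standard-library Python are enough to list tensors and count parameters. The helper names below are hypothetical, and the tensor names inside this particular checkpoint are not documented here.

```python
import json
import os
import struct


def read_safetensors_header(path):
    """Return {tensor_name: {'dtype', 'shape', 'data_offsets'}} from a .safetensors file."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def count_parameters(header):
    """Sum the element counts of all tensors described in the header."""
    total = 0
    for info in header.values():
        n = 1
        for dim in info["shape"]:
            n *= dim
        total += n
    return total


# Only runs when executed from inside the cloned repository.
if os.path.exists("model.safetensors"):
    header = read_safetensors_header("model.safetensors")
    print(f"{len(header)} tensors, {count_parameters(header):,} parameters")
```

If the printed total matches the parameter count in the architecture table, the download is at least structurally complete.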

Limitations

This is a base model, not chat- or instruction-tuned. It will continue text in the style of its training corpus (web text in the prompted language). For instruction following or chat, fine-tune on an instruction dataset.

Quality varies by language: lower-resource languages in the training mixture (e.g. Bemba, Tswana) tend to produce lower-quality continuations than higher-resource ones (e.g. English, Swahili, Yoruba).

Citation

If you use this model, please cite WakandaAI. Details TBA.


