Aura-2B

A 1.8B-parameter multilingual base LLM covering 37 African languages (plus English, French, Portuguese, and Arabic, included for cross-lingual transfer). Trained from scratch with a Llama-3-style architecture (RMSNorm, RoPE, grouped-query attention, SwiGLU) on a temperature-balanced FineWeb2 mixture.
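
This card does not give the balancing formula or the temperature used. As a rough illustration only, the sketch below assumes the common scheme in which each language is sampled with probability proportional to its token count raised to 1/T. The function name and the token counts are hypothetical and are not taken from the real training mixture.

```python
def temperature_balanced_weights(token_counts, temperature):
    """Return sampling weights proportional to count ** (1 / temperature).

    temperature == 1.0 keeps the natural proportions; larger values
    flatten the distribution and upweight low-resource languages.
    """
    exponent = 1.0 / temperature
    scaled = {lang: count ** exponent for lang, count in token_counts.items()}
    total = sum(scaled.values())
    return {lang: value / total for lang, value in scaled.items()}


# Hypothetical token counts, for illustration only (not the real corpus sizes).
counts = {
    "eng_Latn": 1_000_000_000,
    "swh_Latn": 100_000_000,
    "bem_Latn": 1_000_000,
}

for t in (1.0, 3.0):
    weights = temperature_balanced_weights(counts, t)
    print(t, {lang: round(w, 3) for lang, w in weights.items()})
```

Raising the temperature moves probability mass from the largest language toward the smallest ones, which is the usual motivation for balancing a multilingual mixture this way.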

Architecture

Parameters        1.8B (1,772,242,944)
Layers            36
Hidden dim        2048
Attention heads   32 (KV heads: 8, GQA)
FFN intermediate  5120
Context length    1024
Vocab size        64000
RoPE theta        500000
Training step     99,999
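
The exact parameter count can be reproduced from the other architecture values. The sketch below assumes a standard Llama-3-style layout: no bias terms, two RMSNorm weight vectors per layer plus a final one, and an output projection that is not tied to the input embedding. These assumptions are not stated in this card, but together they reproduce the listed total. The function name is illustrative.

```python
def count_parameters(layers=36, hidden=2048, heads=32, kv_heads=8,
                     ffn=5120, vocab=64000, tied_embeddings=False):
    """Count weights for a Llama-3-style decoder (assumed layout, no biases)."""
    head_dim = hidden // heads            # 64
    kv_dim = kv_heads * head_dim          # 512, shared across query groups (GQA)

    # Attention: Q and O are hidden x hidden, K and V are hidden x kv_dim.
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim
    # SwiGLU feed-forward: gate, up, and down projections.
    feed_forward = 3 * hidden * ffn
    # Two RMSNorm weight vectors per layer (pre-attention, pre-FFN).
    norms = 2 * hidden
    per_layer = attention + feed_forward + norms

    embedding = vocab * hidden
    # Assumption: the output head has its own weights unless tied.
    lm_head = 0 if tied_embeddings else vocab * hidden
    final_norm = hidden

    return layers * per_layer + embedding + lm_head + final_norm


total = count_parameters()
print(f"{total:,}")  # 1,772,242,944
```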

Languages

  • afr_Latn
  • amh_Ethi
  • arb_Arab
  • bem_Latn
  • eng_Latn
  • fon_Latn
  • fra_Latn
  • hau_Latn
  • ibo_Latn
  • kin_Latn
  • lin_Latn
  • lug_Latn
  • nya_Latn
  • plt_Latn
  • por_Latn
  • sna_Latn
  • som_Latn
  • sot_Latn
  • swh_Latn
  • tir_Ethi
  • tsn_Latn
  • wol_Latn
  • xho_Latn
  • yor_Latn
  • zul_Latn

Quick start

git clone https://huggingface.co/WakandaAI/Aura-2B
cd Aura-2B
pip install torch tokenizers safetensors
python generate.py --prompt "<s><|yor_Latn|>Kaabo, mo jẹ awoṣe ede." -n 4

Or from Python:

from inference import load_model, generate

model, tokenizer, config = load_model(".")
out = generate(
    model, tokenizer,
    prompt="<s><|swh_Latn|>Habari yako rafiki?",
    max_new_tokens=128,
    temperature=0.8,
    top_p=0.9,
)
print(out[0])

Prompt format

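Every prompt should start with <s> (BOS) followed by a language token of the form <|{lang}_{Script}|>, where {lang} is an ISO 639-3 language code and {Script} is an ISO 15924 script code (for example, <|swh_Latn|> for Swahili in Latin script). The language token conditions generation on the target language. See tokenizer.json for the full list of language tokens.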

Examples:

  • <s><|eng_Latn|>The quick brown fox
  • <s><|hau_Latn|>Sannu, yaya kake?
  • <s><|amh_Ethi|>ሰላም
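
A small helper can make the prompt format harder to get wrong. The sketch below is not part of this repository: build_prompt and language_tokens are hypothetical names, and the regular expression only checks the shape of the language token, not whether the model was trained on it.

```python
import re

BOS = "<s>"
# Shape of a language token: three lowercase letters, underscore, script code.
LANG_TOKEN_RE = re.compile(r"^<\|[a-z]{3}_[A-Z][a-z]{3}\|>$")


def build_prompt(lang, text):
    """Return '<s><|lang|>text' for a code such as 'swh_Latn'."""
    token = f"<|{lang}|>"
    if not LANG_TOKEN_RE.match(token):
        raise ValueError(f"not a valid language code: {lang!r}")
    return f"{BOS}{token}{text}"


def language_tokens(vocab):
    """Filter an iterable of token strings down to language tokens."""
    return sorted(t for t in vocab if LANG_TOKEN_RE.match(t))


print(build_prompt("hau_Latn", "Sannu, yaya kake?"))
# <s><|hau_Latn|>Sannu, yaya kake?
```

To list the tokens the model actually knows, the vocabulary can be read with the tokenizers library, for example language_tokens(Tokenizer.from_file("tokenizer.json").get_vocab()). This assumes each language token is stored as a single vocabulary entry.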

Interactive mode

For exploring the model, run the REPL:

python generate.py --interactive

Then type prompts at the >>> prompt. An empty line or Ctrl-D exits.

Example session:

>>> <s><|swh_Latn|>Nairobi ni mji mkuu wa
[sample 0]
Nairobi ni mji mkuu wa Kenya na kuna wageni wengi ambao hutembelea pia.
Unaweza kupata historia ya utalii wa ndani ya nchi hii...

Adjust sampling on the command line:

python generate.py --interactive --temperature 0.7 --top-p 0.95 --max-new-tokens 256
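
For intuition about what --temperature and --top-p control, here is a minimal pure-Python sketch of temperature scaling followed by nucleus (top-p) sampling over one step's logits. It illustrates the general technique and is not necessarily how inference.py implements it. All function names are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.7, top_p=0.95, rng=random):
    """Sample one token id from temperature-scaled, top-p-filtered logits."""
    probs = softmax_with_temperature(logits, temperature)
    nucleus = top_p_filter(probs, top_p)
    ids = list(nucleus)
    weights = [nucleus[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]
```

Lower temperature and lower top-p both make output more predictable; higher values give more varied continuations at the cost of coherence.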

Files

File                                     Purpose
model.safetensors                        Model weights (preferred format)
model.pt                                 Same weights as a torch checkpoint (fallback)
config.json                              Architecture config in plain JSON
tokenizer.json                           ByteLevel BPE tokenizer (64000 vocab)
tokenizer_config.json                    HuggingFace tokenizer metadata
inference.py                             load_model() + generate() library
generate.py                              CLI wrapper
llama3.py, model_factory.py, kvcache.py  Model definition
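
If you load the weights yourself rather than through load_model(), the preferred/fallback order in the table can be expressed as below. This is a sketch only: pick_weights_file is a hypothetical helper, and it is an assumption that load_model() follows the same order.

```python
from pathlib import Path


def pick_weights_file(repo_dir="."):
    """Return model.safetensors if present, otherwise model.pt."""
    repo = Path(repo_dir)
    # Order mirrors the table: safetensors is preferred, the .pt file is the fallback.
    for name in ("model.safetensors", "model.pt"):
        candidate = repo / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no model weights found in {repo}")
```

The returned path can then be read with safetensors.torch.load_file(path) for the .safetensors file, or torch.load(path, map_location="cpu") for the .pt file. Whether model.pt holds a bare state dict or a wrapper object is not stated here, so inspect it before use.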

Limitations

This is a base model, not chat- or instruction-tuned. It will continue text in the style of its training corpus (web text in the prompted language). For instruction following or chat, fine-tune on an instruction dataset.

Quality varies by language: lower-resource languages in the training mixture (e.g. Bemba, Tswana) are likely to produce lower-quality continuations than higher-resource ones (e.g. English, Swahili, Yoruba).

Citation

If you use this model, please cite WakandaAI. Details TBA.
