AksaraLLM 20B (dense)
Status: architecture and tokenizer published; weights are NOT YET trained. This repository currently holds only the architecture config and the tokenizer. The from-scratch pretraining run is blocked on TRC v5p-128 approval; see the Roadmap below.
AksaraLLM 20B is a from-scratch, Indonesian-first decoder-only transformer
designed to serve Indonesian (id), Malay (ms), Javanese (jv), and Sundanese
(su), with English (en) and source code as secondary targets.
Architecture
| Field | Value |
|---|---|
| Family | LLaMA-3-style decoder-only transformer |
| Parameters | 20,359,673,856 (20.36 B, with tied embeddings) |
| Hidden size | 6,144 |
| FFN inner | 20,480 (SwiGLU) |
| Layers | 42 |
| Attention heads | 48 query / 8 KV (GQA, 6:1) |
| Head dim | 128 |
| Vocab | 131,072 (BPE byte-level) |
| Positional | RoPE, θ = 1,000,000 |
| Context (pretrain) | 8,192 |
| Context (YaRN extend) | 32,768 |
| Context (inference target) | 131,072 |
| Norm | RMSNorm |
| Embeddings | tied |
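The parameter count in the table can be reproduced from the other fields. The sketch below assumes a standard LLaMA-style layout that the card does not spell out: bias-free linear layers, two RMSNorm weight vectors per layer plus one final norm, and a SwiGLU block with gate, up, and down projections. The KV-cache helper additionally assumes 2-byte (bf16/fp16) cache entries and a batch of one sequence; both function names are illustrative.

```python
# Values copied from the Architecture table.
HIDDEN = 6_144
FFN_INNER = 20_480
LAYERS = 42
Q_HEADS = 48
KV_HEADS = 8
HEAD_DIM = 128
VOCAB = 131_072


def count_parameters() -> int:
    """Count weights, assuming bias-free linears and tied embeddings."""
    q_proj = HIDDEN * Q_HEADS * HEAD_DIM
    kv_proj = 2 * HIDDEN * KV_HEADS * HEAD_DIM  # K and V share the GQA width
    o_proj = Q_HEADS * HEAD_DIM * HIDDEN
    attention = q_proj + kv_proj + o_proj
    mlp = 3 * HIDDEN * FFN_INNER  # SwiGLU: gate, up, and down projections
    norms = 2 * HIDDEN  # assumed: one RMSNorm before attention, one before MLP
    embeddings = VOCAB * HIDDEN  # tied, so the output head adds nothing
    final_norm = HIDDEN
    return LAYERS * (attention + mlp + norms) + embeddings + final_norm


def kv_cache_bytes(context_len: int, bytes_per_value: int = 2) -> int:
    """KV-cache size for one sequence; 2 bytes per value is an assumption."""
    per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * bytes_per_value  # K and V
    return per_token * context_len


if __name__ == "__main__":
    print(f"parameters: {count_parameters():,}")
    print(f"KV cache at 131,072 tokens: {kv_cache_bytes(131_072) / 2**30:.1f} GiB")
```

Under these assumptions the count comes to 20,359,673,856, matching the table, and the 8 KV heads keep the cache at 21 GiB for a single 131,072-token sequence.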
Tokenizer
The tokenizer is already published at
Ezekiel999/aksara-tokenizer-20b
and mirrored here.
Fertility (average tokens per word on held-out samples; lower is better):
| Language | Source | tokens/word | Target |
|---|---|---|---|
| English | FineWeb | 1.280 | ≤ 1.40 |
| Indonesian | Wikipedia | 1.357 | ≤ 1.60 |
| Indonesian | CulturaX web | 1.215 | ≤ 1.60 |
| Malay | Wikipedia | 1.368 | ≤ 1.60 |
| Javanese | Wikipedia | 1.657 | ≤ 1.80 |
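Fertility here is the ratio of tokens to words. The card does not say how words were counted, so the sketch below assumes plain whitespace splitting; `fertility` and `toy_tokenize` are hypothetical names, and the toy tokenizer only stands in for the real one so the example runs offline.

```python
from typing import Callable, Iterable, List


def fertility(texts: Iterable[str], tokenize: Callable[[str], List[int]]) -> float:
    """Return total tokens divided by total whitespace-separated words."""
    total_tokens = 0
    total_words = 0
    for text in texts:
        total_words += len(text.split())  # assumed word definition
        total_tokens += len(tokenize(text))
    if total_words == 0:
        raise ValueError("sample contains no words")
    return total_tokens / total_words


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in: one token per 4-character chunk of each word."""
    ids: List[int] = []
    for word in text.split():
        for _ in range(0, len(word), 4):
            ids.append(len(ids))  # dummy ids; only the count matters here
    return ids


if __name__ == "__main__":
    # "Halo" -> 1 chunk, "dunia" -> 2 chunks, so 3 tokens over 2 words.
    print(fertility(["Halo dunia"], toy_tokenize))
```

To measure the published tokenizer instead, pass something like `lambda s: tok(s, add_special_tokens=False).input_ids`, where `tok` is the `AutoTokenizer` loaded in the usage section.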
Roadmap
| Phase | Status | Compute | Target date |
|---|---|---|---|
| 1. Architecture + tokenizer | ✅ Done | CPU | 2026-04 |
| 2. Corpus build (400–600B tokens) | 🔄 in progress | v6e-8 | 2026-05 |
| 3. Pretrain phase 1 (8k context, 400B tokens) | ⏸ blocked on TRC v5p-128 | v5p-128, 4–5 weeks | 2026-06 |
| 4. YaRN context extension (32k) | pending | v5p-128, ~4 days | 2026-07 |
| 5. SFT | pending | v5p-64 or v6e-8 | 2026-07 |
| 6. DPO / ORPO | pending | v5p-64 or v6e-8 | 2026-07 |
| 7. Eval + release (GGUF) | pending | CPU | 2026-08 |
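As a rough sanity check on phase 3, the token budget and wall-clock window imply a sustained throughput, and the common 6 × parameters × tokens rule of thumb gives an order-of-magnitude compute estimate. This is a back-of-the-envelope sketch, not the project's own planning: the 6ND approximation ignores attention FLOPs, and the calculation assumes zero downtime over the 4 to 5 weeks.

```python
PARAMS = 20_359_673_856  # from the Architecture table
TOKENS = 400e9  # phase 3 token budget from the Roadmap
SECONDS_PER_WEEK = 7 * 24 * 3600


def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute using the 6 * N * D rule of thumb."""
    return 6 * params * tokens


def required_tokens_per_second(tokens: float, weeks: float) -> float:
    """Sustained throughput needed to finish in `weeks`, assuming no downtime."""
    return tokens / (weeks * SECONDS_PER_WEEK)


if __name__ == "__main__":
    print(f"approx. training FLOPs: {training_flops(PARAMS, TOKENS):.2e}")
    for weeks in (4, 5):
        rate = required_tokens_per_second(TOKENS, weeks)
        print(f"{weeks} weeks -> {rate:,.0f} tokens/s")
```

Under these assumptions the run needs roughly 4.9e22 FLOPs and a sustained rate of about 132,000 to 165,000 tokens per second across the whole slice.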
Usage (tokenizer only)
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("Ezekiel999/aksara-tokenizer-20b")
print(tok("Halo AksaraLLM", add_special_tokens=False).input_ids)
Weights will be published here once pretraining completes.
Citation
@misc{aksarallm2026,
title = {AksaraLLM 20B: A From-Scratch Indonesian-First Language Model},
author = {AksaraLLM Team},
year = {2026},
url = {https://huggingface.co/AksaraLLM/AksaraLLM-20B}
}
License
Apache-2.0. Pretraining data attribution will be documented with the final weights.