Instructions to use Sandroeth/cali-0.1B with libraries and local apps.

Libraries

How to use Sandroeth/cali-0.1B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Sandroeth/cali-0.1B", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("Sandroeth/cali-0.1B", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use Sandroeth/cali-0.1B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Sandroeth/cali-0.1B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Sandroeth/cali-0.1B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
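
For scripted use, the same request can be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "Sandroeth/cali-0.1B"
# Assumption: the server from the step above is running locally on port 8000.
BASE_URL = "http://localhost:8000"


def build_chat_request(prompt, model=MODEL_ID):
    """Build the JSON body for an OpenAI-compatible chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def send_chat_request(payload, base_url=BASE_URL, timeout=60):
    """POST the payload to /v1/chat/completions and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires the running server):
# reply = send_chat_request(build_chat_request("What is the capital of France?"))
# print(reply["choices"][0]["message"]["content"])
```

The payload mirrors the curl body above; pointing `base_url` at port 30000 should work the same way against the SGLang server described further down.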

Use Docker

docker model run hf.co/Sandroeth/cali-0.1B

SGLang

How to use Sandroeth/cali-0.1B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Sandroeth/cali-0.1B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Sandroeth/cali-0.1B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Sandroeth/cali-0.1B" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use Sandroeth/cali-0.1B with Docker Model Runner:
```
docker model run hf.co/Sandroeth/cali-0.1B
```


	---
	language:
	- id
	- en
	license: apache-2.0
	library_name: transformers
	pipeline_tag: text-generation
	tags:
	- causal-lm
	- transformer
	- indonesian
	- english
	- pytorch
	- custom-architecture
	datasets:
	- custom
	---

	# CALI

	CALI (Computer Assistant Lightweight Intelligence) is an experimental lightweight language model trained from scratch on a limited-scale Indonesian and English dataset.

	The model was built for experiments with lightweight transformer architectures and small-model efficiency, and for research on training language models with limited compute and data.

	This model is NOT a large-scale foundation model and was NOT trained on a huge-scale internet dataset like modern commercial models.

	---

	## Important Notes

	Because the dataset is relatively small, the model can be strongly biased toward the last domain seen, or the most dominant domain, during pretraining. Fine-tuning, alignment, or continued pretraining is strongly recommended, depending on the intended use.

	---

	## Model Details

	| Property | Value |
	|---|---|
	| Parameters | 121M |
	| Layers | 11 |
	| Hidden Size | 768 |
	| Attention Heads | 4 |
	| KV Heads | 1 |
	| Head Dimension | 192 |
	| FFN Dimension | 2304 |
	| Context Length | 1024 |
	| Vocabulary Size | 32000 |
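
As a rough sanity check, the listed parameter count can be compared with an estimate derived from the table. The sketch below assumes a decoder layout that this model card does not confirm: bias-free linear projections, a gated (SwiGLU-style) feed-forward block with three weight matrices, and separate input and output embedding matrices. Normalization weights are ignored.

```python
# Values from the model details table.
LAYERS = 11
HIDDEN = 768
HEADS = 4
KV_HEADS = 1
HEAD_DIM = 192
FFN_DIM = 2304
VOCAB = 32000


def attention_params():
    """Weights of the Q, K, V and output projections in one layer (no biases)."""
    q = HIDDEN * HEADS * HEAD_DIM
    kv = 2 * HIDDEN * KV_HEADS * HEAD_DIM
    out = HEADS * HEAD_DIM * HIDDEN
    return q + kv + out


def ffn_params(gated=True):
    """Feed-forward weights: 3 matrices if gated (assumed), otherwise 2."""
    matrices = 3 if gated else 2
    return matrices * HIDDEN * FFN_DIM


def estimate_params(gated=True, tied_embeddings=False):
    """Total estimate under the stated assumptions."""
    per_layer = attention_params() + ffn_params(gated)
    embeddings = VOCAB * HIDDEN * (1 if tied_embeddings else 2)
    return LAYERS * per_layer + embeddings


print(f"{estimate_params() / 1e6:.1f}M")  # prints 123.8M
```

Under these assumptions the estimate lands near the listed 121M; the remaining gap suggests the actual layout differs from the assumed one in some detail.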

	---

	## Pretraining

	The model was trained from scratch on a dataset selected and filtered to fit the needs of the experiment, rather than to maximize dataset size.

	The dataset includes:
	- English text
	- Indonesian text
	- Wikipedia
	- News
	- General documents
	- Source code

	---

	The table below compares CALI-0.1B with other small language models (SLMs) in the 100M+ parameter tier.

	| Model Name | PIQA | MMLU Math | ARC-Challenge | HellaSwag |
	| :--- | :---: | :---: | :---: | :---: |
	| CALI-0.1B | 54.19% | 28.04% | 24.66% | 27.00% |
	| SmolLM2-135M | 58.50% | 29.90% | 31.10% | 43.20% |
	| GPT-X2-125M | 51.60% | 27.80% | 27.80% | 40.50% |
	| SmolLM-135M | 56.30% | 28.80% | 28.80% | 42.70% |
	| MobileLLM-R1-140M-base | 49.90% | 24.70% | 24.70% | 33.90% |
	| GPT-X-125M | 50.80% | 26.70% | 26.70% | 36.50% |
	| GPT-2 (124M) | 39.50% | 22.60% | 22.60% | 31.50% |
	| GPT-Neo-125M | 39.40% | 22.90% | 22.90% | 30.40% |
	| OPT-125M | 40.20% | 22.90% | 22.90% | 31.40% |

	Note: For CALI-0.1B, the scores represent strict raw accuracies (`acc` / `acc_norm`) extracted directly from the evaluation tracker logs.
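
To put the CALI-0.1B row in context, the scores can be compared with uniform random guessing. The sketch below is illustrative only. It assumes the usual multiple-choice formats (two options for PIQA, four for the other tasks), which this model card does not state, and the helper name `margin_over_chance` is hypothetical.

```python
# CALI-0.1B accuracies (%) from the comparison table.
cali_scores = {
    "PIQA": 54.19,
    "MMLU Math": 28.04,
    "ARC-Challenge": 24.66,
    "HellaSwag": 27.00,
}

# Assumption: number of answer options per question in the standard setup.
num_choices = {"PIQA": 2, "MMLU Math": 4, "ARC-Challenge": 4, "HellaSwag": 4}


def margin_over_chance(score, choices):
    """Percentage points above uniform random guessing."""
    return round(score - 100.0 / choices, 2)


margins = {
    name: margin_over_chance(score, num_choices[name])
    for name, score in cali_scores.items()
}
for name, margin in margins.items():
    print(f"{name}: {margin:+.2f} points vs. chance")
```

Under these assumptions the margins are a few points at most, with ARC-Challenge slightly below chance.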


	## Training Progress

	| Tokens | Step | Final Loss |
	|---|---|---|
	| 250M | 13,564 | 3.53 |
	| 350M | 18,989 | 3.53 |
	| 450M | 24,415 | 4.69 |
	| 614M | 33,356 | 2.71 |
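
The table implies a roughly constant number of tokens per optimizer step, and each loss value can be converted to a perplexity. The sketch below does both. It assumes the token counts are rounded and that the loss is mean cross-entropy in nats, neither of which this model card states.

```python
import math

# (tokens seen, optimizer step, loss) from the training progress table.
checkpoints = [
    (250e6, 13_564, 3.53),
    (350e6, 18_989, 3.53),
    (450e6, 24_415, 4.69),
    (614e6, 33_356, 2.71),
]


def tokens_per_step(tokens, step):
    """Average number of tokens consumed per optimizer step."""
    return tokens / step


def perplexity(loss):
    """Perplexity, assuming the loss is mean cross-entropy in nats."""
    return math.exp(loss)


for tokens, step, loss in checkpoints:
    print(
        f"{tokens / 1e6:.0f}M tokens: "
        f"~{tokens_per_step(tokens, step):,.0f} tokens/step, "
        f"perplexity ~{perplexity(loss):.1f}"
    )
```

Every row works out to about 18.4k tokens per step. That would be consistent with 18 sequences of the 1024-token context length per step (18 × 1024 = 18,432), although the batch configuration is not documented here.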

	---

	## Notes

	- Experimental architecture
	- Requires a custom inference implementation
	- Uses Grouped-Query Attention (GQA)
	- Intended for research and experimentation
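
To illustrate what sharing a single KV head across the four query heads saves at inference time, the sketch below estimates the key/value cache size for one full-length sequence from the figures in the model details table. It assumes 16-bit cache entries and a conventional per-layer key and value cache; the custom inference implementation may store things differently.

```python
# Values from the model details table.
LAYERS = 11
HEADS = 4
KV_HEADS = 1
HEAD_DIM = 192
CONTEXT = 1024
BYTES_PER_VALUE = 2  # Assumption: fp16/bf16 cache entries.


def kv_cache_bytes(kv_heads, seq_len=CONTEXT):
    """Cache size for one sequence; the factor 2 covers keys and values."""
    return 2 * LAYERS * kv_heads * HEAD_DIM * seq_len * BYTES_PER_VALUE


shared = kv_cache_bytes(KV_HEADS)  # one KV head shared by all query heads
full = kv_cache_bytes(HEADS)       # hypothetical: one KV head per query head
print(f"{shared / 2**20:.2f} MiB vs {full / 2**20:.2f} MiB")  # prints 8.25 MiB vs 33.00 MiB
```

With these assumptions the shared-head layout needs a quarter of the cache that a one-KV-head-per-query-head layout would.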

	---

	## Citation

	If you use or reference this model in your research or projects, please cite:

	```bibtex
	@article{cali2026,
	  title = {CALI 0.1B},
	  author = {Sandroeth},
	  year = {2026},
	  publisher = {Hugging Face},
	  url = {https://huggingface.co/Sandroeth/cali-0.1B}
	}
	```

	## Author

	Sandroeth