Instructions to use domofon/Domofon-v1-0.8b-base with libraries and local apps.

Libraries

How to use domofon/Domofon-v1-0.8b-base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="domofon/Domofon-v1-0.8b-base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("domofon/Domofon-v1-0.8b-base")
model = AutoModelForCausalLM.from_pretrained("domofon/Domofon-v1-0.8b-base", device_map="auto")

Local Apps

vLLM

How to use domofon/Domofon-v1-0.8b-base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "domofon/Domofon-v1-0.8b-base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "domofon/Domofon-v1-0.8b-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
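
The same call can be made from Python. The sketch below uses only the standard library and mirrors the curl request above; the helper names `build_completion_request` and `complete` are hypothetical, and the default URL assumes the server is running locally on port 8000 as in the curl example.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    base_url="http://localhost:8000",
    model="domofon/Domofon-v1-0.8b-base",
    max_tokens=512,
    temperature=0.5,
):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the parsed JSON body (needs a running server)."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `complete("Once upon a time,")["choices"][0]["text"]` should return the generated continuation, assuming the server follows the OpenAI completions response layout.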

Use Docker

docker model run hf.co/domofon/Domofon-v1-0.8b-base

SGLang

How to use domofon/Domofon-v1-0.8b-base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "domofon/Domofon-v1-0.8b-base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "domofon/Domofon-v1-0.8b-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "domofon/Domofon-v1-0.8b-base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "domofon/Domofon-v1-0.8b-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
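
The server replies with a JSON body. The sketch below shows one way to pull the generated text out of it, assuming the response follows the OpenAI completions layout (a `choices` list whose items carry `text` and `finish_reason`). The helper name `extract_completions` is hypothetical and the sample body is hand-written for illustration, not real server output.

```python
def extract_completions(body):
    """Return (text, finish_reason) pairs from a completions-style response dict."""
    return [
        (choice.get("text", ""), choice.get("finish_reason"))
        for choice in body.get("choices", [])
    ]


# Hand-written sample in the assumed response shape; not real model output.
sample_body = {
    "object": "text_completion",
    "model": "domofon/Domofon-v1-0.8b-base",
    "choices": [
        {"index": 0, "text": " <generated text>", "finish_reason": "length"},
    ],
}

for text, reason in extract_completions(sample_body):
    print(repr(text), reason)
```

A `finish_reason` of `length` would indicate that generation stopped at `max_tokens` rather than at an end-of-sequence token.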

Docker Model Runner
How to use domofon/Domofon-v1-0.8b-base with Docker Model Runner:
```
docker model run hf.co/domofon/Domofon-v1-0.8b-base
```
Quantized versions of this model are available for use in llama.cpp, Ollama, LM Studio, or any compatible app.

Domofon-v1-0.8b-base

A 0.9B-parameter bilingual (Russian/English) base language model pretrained from scratch on a 660B-token corpus.

Model Details


Architecture	Qwen3 (dense decoder-only transformer)
Parameters	0.9B (883M unique)
Hidden size	1024
Layers	40
Attention heads	16 (8 KV heads, GQA)
Head dim	64
FFN dim	4096
Vocab size	248,072
Context length	32,768
Precision	float16
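
As a sanity check, the figures in the table can be combined into a rough parameter count and memory estimate. The sketch below assumes the usual Qwen3 dense layout (bias-free attention projections, a gated FFN with three weight matrices, RMSNorm weights, and per-head query/key norms) and assumes the input and output embeddings are tied; neither assumption is stated in the card, although under them the total matches the 883M figure. The function names are hypothetical.

```python
def count_parameters(
    vocab=248_072, hidden=1024, layers=40, heads=16,
    kv_heads=8, head_dim=64, ffn=4096, tied_embeddings=True,
):
    """Estimate the parameter count under the assumed Qwen3-style layout."""
    embed = vocab * hidden
    attn = hidden * heads * head_dim          # q_proj
    attn += 2 * hidden * kv_heads * head_dim  # k_proj and v_proj (GQA)
    attn += heads * head_dim * hidden         # o_proj
    qk_norm = 2 * head_dim                    # assumed per-head q/k RMSNorm
    mlp = 3 * hidden * ffn                    # gate, up, and down projections
    norms = 2 * hidden                        # two RMSNorm weights per layer
    per_layer = attn + qk_norm + mlp + norms
    total = embed + layers * per_layer + hidden  # + final RMSNorm
    if not tied_embeddings:
        total += vocab * hidden               # separate output projection
    return total


def kv_cache_bytes(tokens, layers=40, kv_heads=8, head_dim=64, bytes_per_value=2):
    """Size of the key/value cache for one sequence (float16 by default)."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


total = count_parameters()
print(f"{total:,} parameters")                          # 883,259,392
print(f"{total * 2 / 1e9:.2f} GB of float16 weights")   # 1.77
print(f"{kv_cache_bytes(32_768) / 2**30:.2f} GiB KV cache at 32,768 tokens")  # 2.50
```

Under these assumptions the embedding matrix alone accounts for about 254M of the parameters, which is why the count depends noticeably on whether the output projection shares its weights.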

Training

Pretrained from random initialization — no upstream weights were used
Training corpus: 660B tokens, ~50/50 English and Russian
Training framework: MaxText on Google Cloud TPU v5e-64
This is a base model — no SFT, no chat tuning, no RLHF
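
To put the corpus size in perspective, the sketch below works out the tokens-per-parameter ratio and a rough training-compute estimate. The compute figure uses the common 6 x parameters x tokens rule of thumb, which is an approximation introduced here and not a number reported by the authors; the even language split is taken from the "~50/50" statement above.

```python
PARAMS = 883e6   # unique parameters, from the model details
TOKENS = 660e9   # training corpus size in tokens


def tokens_per_parameter(tokens=TOKENS, params=PARAMS):
    """How many training tokens were seen per model parameter."""
    return tokens / params


def approx_training_flops(tokens=TOKENS, params=PARAMS):
    """Rule-of-thumb estimate: about 6 FLOPs per parameter per token."""
    return 6 * params * tokens


print(f"{tokens_per_parameter():.0f} tokens per parameter")       # 747
print(f"{approx_training_flops():.2e} FLOPs (rough estimate)")    # 3.50e+21
print(f"{TOKENS / 2 / 1e9:.0f}B tokens per language (approx.)")   # 330
```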

Intended Use

This is a base pretrained model intended for research and as a foundation for downstream fine-tuning. It is not an instruction-following or chat model.
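
Because the model only continues text, tasks are usually posed as a pattern for it to complete instead of as an instruction. The sketch below builds a few-shot prompt of that kind; the helper name `build_few_shot_prompt`, the labels, and the translation pairs are illustrative assumptions, and how well the model handles such prompts has not been evaluated here.

```python
def build_few_shot_prompt(examples, query, input_label="English", output_label="Russian"):
    """Format (input, output) pairs plus a final query as a completion prompt."""
    lines = []
    for source, target in examples:
        lines.append(f"{input_label}: {source}")
        lines.append(f"{output_label}: {target}")
        lines.append("")  # blank line between examples
    lines.append(f"{input_label}: {query}")
    lines.append(f"{output_label}:")  # the model is expected to continue from here
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    examples=[("Good morning", "Доброе утро"), ("Thank you", "Спасибо")],
    query="Good night",
)
print(prompt)
```

The resulting string can be passed as the `prompt` to any of the serving options above or to `tokenizer(...)` in the Transformers example.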

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Download (or load from the local cache) the weights and tokenizer
model = AutoModelForCausalLM.from_pretrained("domofon/Domofon-v1-0.8b-base")
tokenizer = AutoTokenizer.from_pretrained("domofon/Domofon-v1-0.8b-base")

# Base model: give it text to continue, not an instruction.
# The prompt below is Russian for "Moscow is the capital".
inputs = tokenizer("Москва — столица", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Limitations

Base model only — will not follow instructions or engage in dialogue without fine-tuning
Training data mix is roughly 50/50 English and Russian; performance on other languages has not been evaluated
No safety alignment has been applied
