Instructions to use iFlytekOpenSource/Domux with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use iFlytekOpenSource/Domux with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="iFlytekOpenSource/Domux")
# Domux parses smart-home commands, so send a plain-text chat message
messages = [
    {"role": "user", "content": "Turn on the living room light"},
]
# The text-generation pipeline takes the chat messages as its first positional argument
print(pipe(messages, max_new_tokens=64))

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("iFlytekOpenSource/Domux")
model = AutoModelForMultimodalLM.from_pretrained("iFlytekOpenSource/Domux")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps Settings

vLLM

How to use iFlytekOpenSource/Domux with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "iFlytekOpenSource/Domux"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iFlytekOpenSource/Domux",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/iFlytekOpenSource/Domux

SGLang

How to use iFlytekOpenSource/Domux with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "iFlytekOpenSource/Domux" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iFlytekOpenSource/Domux",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "iFlytekOpenSource/Domux" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iFlytekOpenSource/Domux",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'


Domux / README.md

ankh2454

docs: update Hugging Face model card

6c71a32 verified 4 days ago


	---
	license: gemma
	base_model:
	- google/gemma-4-E2B-it
	language:
	- en
	library_name: transformers
	pipeline_tag: text-generation
	tags:
	- smart-home
	- intent-parsing
	- slot-filling
	- command-understanding
	- gemma
	- iflytek
	- edge
	extra_gated_heading: Access Domux on Hugging Face
	extra_gated_prompt: >-
	  Domux is a derivative of Google's Gemma model and its weights are governed by
	  the Gemma Terms of Use. To access the weights, you must review and agree to
	  the Gemma Terms of Use and Prohibited Use Policy.
	extra_gated_button_content: Acknowledge the Gemma Terms of Use
	---

	<div align="center">
	<h1>Domux</h1>
	<p><b>A lightweight, low-latency command understanding model for smart-home control.</b></p>

	<p>
	<a href="https://github.com/iflytek/domux"><img src="https://img.shields.io/badge/GitHub-Repo-181717?logo=github"></a>
	<a href="https://modelscope.cn/models/iflytek/domux"><img src="https://img.shields.io/badge/🔧%20ModelScope-Model-blue"></a>
	<a href="https://ai.google.dev/gemma/terms"><img src="https://img.shields.io/badge/License-Gemma%20Terms-green"></a>
	<img src="https://img.shields.io/badge/Inference-vLLM%20%7C%20SGLang-orange">
	</p>
	</div>

	---

	Domux (`Domux-Gemma-4-E2B-it`) is a fine-tuned language model built on Gemma-4-E2B-it. It turns natural-language smart-home commands into structured, pipe-delimited slots. Training combines supervised fine-tuning (SFT) with reinforcement learning via Group Relative Policy Optimization (GRPO) and custom reward functions.

	> 📦 Code, training scripts, evaluation suite, full benchmark report and dataset live in the GitHub repository: [github.com/iflytek/domux](https://github.com/iflytek/domux).

	## ✨ Key Features

	- Fast response — Optimized for low-latency inference on edge devices and servers.
	- Structured slot output — Parses free-form commands into a fixed 7-field pipe-delimited schema.
	- High accuracy — 98.37% result accuracy with 100% format compliance, outperforming much larger models.
	- Lightweight base — Built on the compact Gemma-4-E2B-it, suitable for on-device and edge deployment.
	- Multi-action support — Handles compound commands that map to multiple slot lines.
	- Generalizes across devices — Handles arbitrary device names within each category, not a fixed whitelist.

	## 🎬 Output Format

	The model outputs pipe-delimited slots with 7 fields. Use `*` for unspecified or don't-care fields.

	```
	action|device|attribute|value|unit|room|floor
	```
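As a minimal sketch of how a consumer might read this schema, the snippet below splits one slot line into the seven named fields. The `Slot` type and `parse_slot_line` helper are hypothetical names introduced here, not part of the Domux repository, and mapping `*` (or an empty field) to `None` is an assumption about how a caller would want to represent "unspecified".

```python
from typing import NamedTuple, Optional

# Field order taken from the schema line above.
FIELDS = ("action", "device", "attribute", "value", "unit", "room", "floor")


class Slot(NamedTuple):
    """Hypothetical container for one parsed slot line."""
    action: Optional[str]
    device: Optional[str]
    attribute: Optional[str]
    value: Optional[str]
    unit: Optional[str]
    room: Optional[str]
    floor: Optional[str]


def parse_slot_line(line: str) -> Slot:
    """Split one pipe-delimited line into 7 fields; `*` or empty becomes None (assumption)."""
    parts = [p.strip() for p in line.strip().split("|")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}: {line!r}")
    return Slot(*(None if p in ("*", "") else p for p in parts))


slot = parse_slot_line("set|AC|temperature|22|Celsius|Bedroom|*")
print(slot.action, slot.device, slot.value, slot.floor)
```

Raising on a wrong field count, rather than padding or truncating, keeps malformed model output from silently turning into a device command.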

	### Basic Examples

	| Input | Output |
	| --- | --- |
	| Turn on the living room light | `turnOn\|Light\|*\|*\|*\|Living Room\|*` |
	| Set bedroom AC to 22 degrees | `set\|AC\|temperature\|22\|Celsius\|Bedroom\|*` |
	| Close the curtains 20 percent | `adjustDown\|Curtain\|openness\|20\|Percent\|*\|*` |

	### Complex Multi-Attribute Command

	Input:
	```
	Turn on the Master Light in the Master Bedroom on the Second Floor,
	set brightness to 80%, color temperature to 4000K, color to Blue, and mode to Reading.
	```

	Output:
	```
	turnOn|Light|*|*|*|Master Bedroom|Second Floor
	set|Light|brightness|80|Percent|Master Bedroom|Second Floor
	set|Light|colorTemperature|4000|Kelvin|Master Bedroom|Second Floor
	set|Light|color|Blue|*|Master Bedroom|Second Floor
	set|Light|mode|Reading|*|Master Bedroom|Second Floor
	```

	Full specification: [Output Format Documentation](https://github.com/iflytek/domux/blob/main/docs/output-spec.md).
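A compound command yields several lines that usually refer to the same device, so downstream code may want them merged. The following is a sketch under assumptions: `group_by_target` is a hypothetical helper, and grouping by `(device, room, floor)` is one possible design rather than anything prescribed by the output specification.

```python
from collections import defaultdict


def group_by_target(output: str) -> dict:
    """Group slot lines by (device, room, floor); collect bare actions and attribute settings."""
    targets = defaultdict(lambda: {"actions": [], "attributes": {}})
    for line in output.strip().splitlines():
        # Unpacking fails with ValueError if a line does not have exactly 7 fields.
        action, device, attribute, value, unit, room, floor = (
            f.strip() for f in line.split("|")
        )
        entry = targets[(device, room, floor)]
        if attribute in ("*", ""):
            # No attribute: the line is a plain action such as turnOn.
            entry["actions"].append(action)
        else:
            entry["attributes"][attribute] = {"action": action, "value": value, "unit": unit}
    return dict(targets)


raw = """turnOn|Light|*|*|*|Master Bedroom|Second Floor
set|Light|brightness|80|Percent|Master Bedroom|Second Floor
set|Light|colorTemperature|4000|Kelvin|Master Bedroom|Second Floor
set|Light|color|Blue|*|Master Bedroom|Second Floor
set|Light|mode|Reading|*|Master Bedroom|Second Floor"""

for target, entry in group_by_target(raw).items():
    print(target, entry["actions"], sorted(entry["attributes"]))
```

With this grouping, the five lines collapse into one entry for the light in the Master Bedroom, which can then be applied to the device in a single update.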

	## 🏠 Supported Control Capabilities

	Domux does not rely on a fixed device whitelist — it handles diverse device names through semantic understanding.

	| Device Type | Naming Examples | Controllable Attributes | Value Range |
	| --- | --- | --- | --- |
	| Light | Light, Strip Light, Spot Light, Desk Lamp | `brightness` / `color` / `colorTemperature` / `mode` | 0–100% / Blue, Red, Green… / 3000–6500 K / Reading, Romance, Soft… |
	| AC | AC, AC 1 | `temperature` / `mode` / `windSpeed` | 16–30 °C / Cool, Heat, Dry, Fan, Auto / Low, Medium, High |
	| Curtain / Blind | Curtain, Blind, Sheer Curtain | `position` | 0–100% |
	| Scene Mode | Romantic Mode, Party Mode, Sleeping Mode | — | — |
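The value ranges in the table can be applied as a sanity check before a parsed command reaches a device. This is a sketch, not part of Domux: `value_in_range` and both lookup tables are hypothetical, the device keys assume the canonical type names from the table, and the check only makes sense for absolute `set` values. The open-ended lists (colors, light modes) end in "…" in the table, so they are deliberately left unchecked.

```python
# Numeric ranges copied from the table above; the (device, attribute) keying is an assumption.
NUMERIC_RANGES = {
    ("Light", "brightness"): (0, 100),
    ("Light", "colorTemperature"): (3000, 6500),
    ("AC", "temperature"): (16, 30),
    ("Curtain", "position"): (0, 100),
}

# Closed value lists copied from the table above.
ENUM_VALUES = {
    ("AC", "mode"): {"Cool", "Heat", "Dry", "Fan", "Auto"},
    ("AC", "windSpeed"): {"Low", "Medium", "High"},
}


def value_in_range(device: str, attribute: str, value: str) -> bool:
    """Return False only when the table gives a closed range or list and the value violates it."""
    key = (device, attribute)
    if key in NUMERIC_RANGES:
        low, high = NUMERIC_RANGES[key]
        try:
            return low <= float(value) <= high
        except ValueError:
            return False  # a numeric attribute with a non-numeric value
    if key in ENUM_VALUES:
        return value in ENUM_VALUES[key]
    return True  # open-ended or unknown attributes are not checked


print(value_in_range("AC", "temperature", "22"))
print(value_in_range("AC", "temperature", "35"))
```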

	Actions: `turnOn`, `turnOff`, `set`, `adjustUp`, `adjustDown`, `activate`, `deactivate`, `pause`.
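The action list and the 7-field schema together suggest a simple well-formedness check for a single output line. The sketch below assumes the eight actions listed above are exhaustive; `check_line` is a hypothetical helper, not the project's own validator.

```python
# Actions as listed above; treating the list as exhaustive is an assumption.
ACTIONS = {
    "turnOn", "turnOff", "set", "adjustUp",
    "adjustDown", "activate", "deactivate", "pause",
}
FIELD_COUNT = 7


def check_line(line: str) -> list:
    """Return a list of human-readable problems; an empty list means the line looks well-formed."""
    problems = []
    fields = [f.strip() for f in line.strip().split("|")]
    if len(fields) != FIELD_COUNT:
        problems.append(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    if fields[0] not in ACTIONS:
        problems.append(f"unknown action: {fields[0]!r}")
    return problems


print(check_line("turnOn|Light|*|*|*|Living Room|*"))
print(check_line("open|Curtain|*|*|*|*|*"))
```

Returning every problem found, instead of stopping at the first, makes the result more useful when logging rejected model output.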

	Spatial context: rooms (Living Room, Bedroom, Kitchen, Majlis, Prayer Room…) and floors (Ground Floor, Upstairs, Downstairs…), including numbered variants.

	## 📊 Benchmark

	Domux was evaluated on a test set of 4,057 samples spanning four dimensions: single intent, multi-intent, omitted attributes, and non-standard naming. It was compared against 11 mainstream models, including the Qwen3.5 series (2B-27B), the Gemma 4 series, and leading closed-source APIs (DeepSeek-V4, Claude Haiku 4.5, Gemini 3.5 Flash).

	Result accuracy reaches 98.37% with 100% format compliance.

	📄 Full technical report and benchmark charts: [GitHub repository](https://github.com/iflytek/domux#-benchmark).

	The test set and evaluation script are open-sourced under [`eval/`](https://github.com/iflytek/domux/tree/main/eval) so you can reproduce the results or evaluate your own model.
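To make the two reported metrics concrete, here is a rough sketch of how format compliance and result accuracy could be computed over prediction/reference pairs. The definitions used (compliance means every line has 7 fields, accuracy means an exact line-by-line match) are assumptions for illustration. The authoritative definitions are in the evaluation script under `eval/`, and the two sample pairs below are hand-written, not taken from the test set.

```python
from typing import List, Tuple


def normalize(output: str) -> List[str]:
    """Strip whitespace and drop blank lines so trivial differences are ignored."""
    return [ln.strip() for ln in output.strip().splitlines() if ln.strip()]


def is_compliant(output: str) -> bool:
    """Assumed definition: at least one line, and every line has exactly 7 fields."""
    lines = normalize(output)
    return bool(lines) and all(len(ln.split("|")) == 7 for ln in lines)


def score(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (format_compliance, exact_match_accuracy) over (prediction, reference) pairs."""
    if not pairs:
        raise ValueError("no samples to score")
    compliant = sum(is_compliant(pred) for pred, _ in pairs)
    correct = sum(normalize(pred) == normalize(ref) for pred, ref in pairs)
    return compliant / len(pairs), correct / len(pairs)


# Hand-written illustrative pairs: the second prediction has the wrong temperature.
pairs = [
    ("turnOn|Light|*|*|*|Living Room|*", "turnOn|Light|*|*|*|Living Room|*"),
    ("set|AC|temperature|22|Celsius|Bedroom|*", "set|AC|temperature|23|Celsius|Bedroom|*"),
]
compliance, accuracy = score(pairs)
print(f"format compliance: {compliance:.2%}, accuracy: {accuracy:.2%}")
```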

	## 🚀 Quick Start

	### Hardware

	The model runs in BF16 precision and requires 20GB+ of VRAM for single-GPU deployment.

	### Download

	```bash
	# Hugging Face
	git lfs install
	git clone https://huggingface.co/iFlytekOpenSource/Domux
	```

	### Inference with vLLM

	```bash
	pip install "vllm==0.22.0"
	```

	```python
	from vllm import LLM, SamplingParams

	llm = LLM(model="iFlytekOpenSource/Domux", dtype="bfloat16")
	sampling = SamplingParams(temperature=0.0, max_tokens=256)

	prompt = "Turn on the Master Light in the Master Bedroom on the Second Floor, set brightness to 80%, color temperature to 4000K, color to Blue, and mode to Reading."
	output = llm.chat([{"role": "user", "content": prompt}], sampling)
	print(output[0].outputs[0].text)

	# Output:
	# turnOn|Light|*|*|*|Master Bedroom|Second Floor
	# set|Light|brightness|80|Percent|Master Bedroom|Second Floor
	# set|Light|colorTemperature|4000|Kelvin|Master Bedroom|Second Floor
	# set|Light|color|Blue|*|Master Bedroom|Second Floor
	# set|Light|mode|Reading|*|Master Bedroom|Second Floor
	```

	### Serve as an OpenAI-compatible API

	```bash
	# vLLM
	python -m vllm.entrypoints.openai.api_server \
	    --model iFlytekOpenSource/Domux \
	    --served-model-name domux \
	    --host 0.0.0.0 --port 8000 \
	    --dtype bfloat16 --max-model-len 2048 --gpu-memory-utilization 0.9
	```

	```bash
	# SGLang
	pip install "sglang[all]==0.5.12"
	python -m sglang.launch_server \
	    --model-path iFlytekOpenSource/Domux \
	    --host 0.0.0.0 --port 8000 \
	    --dtype bfloat16 --context-length 2048
	```

	```python
	from openai import OpenAI

	client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
	response = client.chat.completions.create(
	    model="domux",
	    messages=[{"role": "user", "content": "Turn on the bedroom light and set brightness to 60%"}],
	    temperature=0.0,
	)
	print(response.choices[0].message.content)
	```
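Once the API returns a response, each slot line still has to be routed to whatever controls the devices. The snippet below is one possible shape for that step, assuming a simple handler table keyed by action. `dispatch`, the handlers, and the `sample` string are all hypothetical: the sample is a hand-written stand-in for a model response, not captured output.

```python
from typing import Callable, Dict, List

FIELDS = ("action", "device", "attribute", "value", "unit", "room", "floor")


def dispatch(model_output: str, handlers: Dict[str, Callable[[dict], None]]) -> List[str]:
    """Call the handler registered for each line's action; return the lines that were skipped."""
    skipped = []
    for line in model_output.strip().splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != len(FIELDS):
            skipped.append(line)  # malformed line: never forward it to a device
            continue
        slot = dict(zip(FIELDS, parts))
        handler = handlers.get(slot["action"])
        if handler is None:
            skipped.append(line)  # no handler registered for this action
        else:
            handler(slot)
    return skipped


# Hypothetical handlers that only record what they would do.
log: List[str] = []
handlers = {
    "turnOn": lambda s: log.append(f"power on {s['device']} in {s['room']}"),
    "set": lambda s: log.append(f"{s['device']}.{s['attribute']} = {s['value']} {s['unit']}"),
}

# Hand-written stand-in for response.choices[0].message.content.
sample = "turnOn|Light|*|*|*|Bedroom|*\nset|Light|brightness|60|Percent|Bedroom|*"
skipped = dispatch(sample, handlers)
print(log)
print(skipped)
```

Collecting skipped lines instead of raising lets the caller decide whether a partially understood command should still be executed.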

	## 📄 License

	The model weights are a derivative of Google's Gemma and are made available under, and your use of them is governed by, the [Gemma Terms of Use](https://ai.google.dev/gemma/terms) and the [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy). "Gemma" is a trademark of Google LLC.

	The accompanying source code (training scripts, reward plugins, evaluation tooling) in the [GitHub repository](https://github.com/iflytek/domux) is licensed under Apache-2.0.

	## 🙏 Acknowledgments

	- Base model: [Gemma](https://ai.google.dev/gemma)
	- Training framework: [ModelScope-Swift](https://github.com/modelscope/swift)
	- Experiment tracking: [SwanLab](https://swanlab.cn/)

	## Citation

	```bibtex
	@misc{domux2026,
	title = {Domux: A Lightweight Low-Latency Command Understanding Model for Smart-Home Control},
	author = {iFLYTEK CO., LTD.},
	year = {2026},
	url = {https://github.com/iflytek/domux}
	}
	```