# Ordis-1.8B-V17-Multilingual
Full model weights (safetensors) of Ordis-1.8B-V17-Multilingual. Powered by Tencent Hunyuan.
Ordis is a 1.8B tool-calling model fine-tuned from Hunyuan-A2B-Pretrain. It is trained to accurately call 8 practical tools (weather, calculator, stock, exchange rate, time, search, translate, knowledge) with minimal training data (~300 multilingual examples + base tool training), proving that small models can learn reliable function calling without massive datasets.
This is NOT a benchmark-optimized model. No training data was specifically created to boost any benchmark score. All results below reflect genuine generalization from practical tool-calling training.
## Files
| File | Size | Description |
|---|---|---|
| `model.safetensors` | ~3.6 GB | Full-precision model weights |
| `config.json` | — | Model configuration |
| `tokenizer.json` | — | Tokenizer |
| `tokenizer_config.json` | — | Tokenizer configuration |
| `special_tokens_map.json` | — | Special tokens mapping |
| `generation_config.json` | — | Generation parameters |
| `chat_template.jinja` | — | Chat template for the Hunyuan format |
For GGUF quantized versions (Ollama / llama.cpp), see: Ordis-1.8B-V17-Multilingual-GGUF
## Quick Start (Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("sugiken/Ordis-1.8B-V17-Multilingual")
tokenizer = AutoTokenizer.from_pretrained("sugiken/Ordis-1.8B-V17-Multilingual")
```
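Once the model emits a tool call, the application has to parse and execute it. Below is a minimal dispatch sketch, assuming the call arrives as a JSON object with `name` and `arguments` fields; the exact output format is defined by `chat_template.jinja`, so verify against your deployment before relying on this shape. The tool implementations here are stubs, not part of this repository.

```python
import json

# Stub implementations for two of the eight trained tools. A real deployment
# would call an actual weather API, a safe math evaluator, etc.
def calculator(expression: str) -> str:
    # eval() is unsafe for untrusted input; restrict to arithmetic in production.
    return str(eval(expression, {"__builtins__": {}}, {}))

def get_weather(location: str) -> str:
    return f"Weather lookup for {location} (stub)"

TOOLS = {"calculator": calculator, "get_weather": get_weather}

def dispatch(raw_call: str) -> str:
    """Parse a model-emitted tool call and run the matching local function."""
    call = json.loads(raw_call)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Example: the kind of string a tool-calling model might emit.
result = dispatch('{"name": "calculator", "arguments": {"expression": "2 * (3 + 4)"}}')
print(result)  # "14"
```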
## Standard Benchmarks
Evaluation: lm-eval v0.4.10, A100-80GB GPU. All benchmarks run under identical conditions.
| Benchmark | Ordis 1.8B V17 | Base Hunyuan-A2B-Pretrain |
|---|---|---|
| MMLU (5-shot) | 61.27% | — |
| GSM8K (5-shot) | 69.07% | — |
| C-Eval (0-shot) | 71.55% | — |
| HellaSwag (0-shot) | 62.37% | — |
| Winogrande (0-shot) | 62.90% | — |
| ARC-Challenge (0-shot) | 44.71% | — |
| TruthfulQA MC2 (0-shot) | 44.52% | — |
## Tool Calling Performance

### tool50 & android50 (Ordis Internal)
tool50 is our custom tool-calling test suite: 50 questions spanning three languages (CN/EN/JP) and covering all 8 trained tools. Each question requires the model to decide whether to call a tool, select the correct one, and extract the right parameters. android50 is a companion suite of 50 questions testing 22 mobile-automation tools across three difficulty levels (L1–L3).
| Evaluation | Score | Details |
|---|---|---|
| tool50 | 94% (47/50) | CN/EN/JP mixed, 8 information tools |
| android50 | 54% (27/50) | L1=56%, L2=72%, L3=31%, 22 Android tools |
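A suite like this typically scores each prediction by exact match on both the tool name and the extracted parameters. A minimal sketch of that kind of check (our illustration, not the actual Ordis harness):

```python
def score_call(predicted: dict, gold: dict) -> bool:
    """Exact-match scoring: correct tool name AND identical parameters."""
    return (predicted.get("name") == gold.get("name")
            and predicted.get("arguments") == gold.get("arguments"))

gold = {"name": "get_exchange_rate",
        "arguments": {"from_currency": "USD", "to_currency": "JPY"}}
pred = {"name": "get_exchange_rate",
        "arguments": {"from_currency": "USD", "to_currency": "JPY"}}
print(score_call(pred, gold))  # True

# Wrong parameter extraction fails even when the right tool was selected.
pred_bad = {"name": "get_exchange_rate",
            "arguments": {"from_currency": "USD", "to_currency": "EUR"}}
print(score_call(pred_bad, gold))  # False
```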
### BFCL (Public Benchmark)
Berkeley Function Calling Leaderboard (BFCL), an industry-standard function-calling benchmark (840 questions).
| Category | Score | Description |
|---|---|---|
| Simple | 54.75% | Single function call |
| Multiple | 41.50% | Choosing the right function among several candidates |
| Irrelevance | 85.42% | Correctly refuses when no tool fits |
| Overall | 60.36% | — |
## Ordis Internal Evaluation (Real-World Deployment Focus)

### 190pt Core (12 Dimensions, Parts A–L)
| Part | Dimension | Score | What it tests |
|---|---|---|---|
| A | Identity | 12/12 | Self-awareness, name, creator, consistency |
| B | Theory of Mind | 6/18 | Understanding user intent and context |
| C | Safety | 16/25 | Harmful request rejection, boundary enforcement |
| D | IDK (Honest Refusal) | 11/11 | Saying "I don't know" instead of hallucinating |
| E | Hard Gates | 12/15 | Capability boundary awareness, not overstepping |
| F | General Knowledge | 4/5 | Basic factual accuracy |
| G | Applied Field Mastery | 8/13 | Domain-specific knowledge application |
| H | Meta-cognition | 12/15 | Self-correction, confidence calibration |
| I | Tool Calling | 14/20 | Correct tool selection and parameter extraction |
| J | Practical Tasks | 14/20 | Multi-step real-world task completion |
| K | System Prompt | 12/15 | Instruction following, prompt adherence |
| L | Adversarial | 16/21 | Resisting jailbreaks, manipulation, gaslighting |
| — | Total | 137/190 (72.1%) | — |
### 225pt Extended (Parts A–M)
| Part | Dimension | Score | What it tests |
|---|---|---|---|
| M | Deployment Readiness | 22/25 | Multi-turn contamination, data leakage, cross-domain pollution, temperature sensitivity, context pressure |
| — | Grand Total | 166/225 (73.8%) | — |
### Cross-Model Comparison (Same Test Suite)
| Model | 190pt | Training | Notes |
|---|---|---|---|
| Hunyuan-A2B-Pretrain | 94/190 (49.5%) | None (base) | Starting point, zero fine-tuning |
| Ordis 1.5B V3.5.5 (Qwen2.5-1.5B) | 51/60 (85%) | LoRA, different architecture | Previous generation, different eval scale* |
| Ordis 1.8B V17 (this model) | 137/190 (72.1%) | Full FT, tool focus | Minimal general reinforcement |
| Hunyuan-A2B-Instruct (Tencent official) | 174/190 (91.6%) | Tencent RLHF | Target to surpass |
## Trained Tools (8 Tools)
| Tool | Description | Parameters |
|---|---|---|
| `get_weather` | Weather lookup | location (string, required) |
| `calculator` | Math calculation | expression (string, required) |
| `get_current_time` | Current time lookup | timezone (string, optional) |
| `web_search` | Web search | query (string, required) |
| `get_stock_price` | Stock price lookup | symbol (string, required) |
| `get_exchange_rate` | Exchange rate lookup | from_currency, to_currency (string, required) |
| `knowledge_search` | Knowledge base retrieval | query (string, required) |
| `translate_text` | Text translation | text, target_lang (string, required) |
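To expose tools like these at inference time, each one needs a machine-readable schema. Below is a sketch of one tool in the widely used OpenAI-style function-calling format; this format is an assumption on our part, so check `chat_template.jinja` for the schema this model actually expects. The description strings are illustrative.

```python
# Hypothetical schema for the trained get_weather tool, in OpenAI-style
# function-calling format. The other seven tools follow the same pattern.
get_weather_schema = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City or region name"},
            },
            "required": ["location"],
        },
    },
}

# Recent transformers versions generally accept such schemas via
# tokenizer.apply_chat_template(messages, tools=[get_weather_schema], ...).
```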
## About This Model
This model is a verification release — it proves that practical tool calling can be trained into a 1.8B pretrained model with minimal data and without specialized benchmark optimization.
What we did:
- Full fine-tuning (not LoRA) on Hunyuan-A2B-Pretrain (1.8B MoE)
- Progressive Identity Training (PIT) + Surgery method for tool-calling injection
- ~300 multilingual examples (CN/EN/JP) for the V17 multilingual layer
- 8 practical tools trained with custom evaluation
What we did NOT do:
- No BFCL-specific training data
- No MMLU/GSM8K/ARC-specific training
- No general knowledge reinforcement
- No benchmark-oriented prompt engineering
Current status:
- Training has progressed to V20 internally, with scores surpassing V17 across the board
- Due to funding constraints, further large language model training is temporarily paused
- This release also validates the practical applicability of our research on progressive identity training and tool-calling surgery methods for small language models
- Future versions will integrate the V3.5.5 (1.5B) personality and safety advantages into the 1.8B architecture
## Model Details
| Property | Value |
|---|---|
| Base Model | tencent/Hunyuan-A2B-Pretrain (1.8B MoE) |
| Parameters | 1.8B (Mixture of Experts) |
| Fine-tuning | Full fine-tuning (NOT LoRA) |
| Training | PIT (Progressive Identity Training) + Tool Surgery |
| Training Hardware | NVIDIA A100-SXM4-80GB |
| Context Length | 32K (base), trained at 2048-4096 |
| Languages | Chinese (primary), English, Japanese |
| License | Apache 2.0 |
Powered by Tencent Hunyuan: this model is built upon Hunyuan-A2B-Pretrain, an open-source foundation model by Tencent.
## Citation
If you use this model, please cite:
```bibtex
@misc{ordis-v17-2026,
  title={Ordis-1.8B-V17-Multilingual: Practical Tool Calling for Small Language Models},
  author={OrdisAI},
  year={2026},
  url={https://huggingface.co/sugiken/Ordis-1.8B-V17-Multilingual}
}
```