Upload README.md with huggingface_hub
README.md
CHANGED
````diff
@@ -53,30 +53,21 @@ model-index:
 
 # OmniCoder-9B
 
+### A 9B coding agent fine-tuned on 425K agentic trajectories.
+
 [](https://opensource.org/licenses/Apache-2.0)
 [](https://huggingface.co/Qwen/Qwen3.5-9B)
 [](https://huggingface.co/Tesslate/OmniCoder-9B-GGUF)
-[](https://tesslate.com)
 
-[Get Started](#quickstart) | [Benchmarks](#benchmarks) | [GGUF Downloads](https://huggingface.co/Tesslate/OmniCoder-9B-GGUF)
+[Get Started](#quickstart) | [Benchmarks](#benchmarks) | [GGUF Downloads](https://huggingface.co/Tesslate/OmniCoder-9B-GGUF)
 
 ---
 
 </div>
 
-## Why OmniCoder?
-
-Most open coding models are trained on synthetic instruction data. OmniCoder is different. It was trained on **425,000+ real agentic coding trajectories** from the best frontier models in the world: Claude Opus 4.6, GPT-5.4, GPT-5.3-Codex, and Gemini 3.1 Pro. It learned how top-tier agents actually write code, recover from errors, use tools, and solve problems end-to-end.
-
-The result: a 9B model that scores **83.8 on GPQA Diamond** (above GPT-OSS-120B's 80.1 and Claude Haiku 4.5's 73), hits **90 on AIME 2025**, and solves **+8 more Terminal-Bench tasks** than its base model (25/89 vs 18/89).
-
-You can run it locally. Right now. On a single GPU. [Jump to Quickstart.](#quickstart)
-
----
-
 ## Overview
 
-**OmniCoder-9B** is built by [Tesslate](https://tesslate.com), fine-tuned on top of [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B)'s hybrid architecture (Gated Delta Networks interleaved with standard attention).
+**OmniCoder-9B** is a 9-billion parameter coding agent model built by [Tesslate](https://tesslate.com), fine-tuned on top of [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B)'s hybrid architecture (Gated Delta Networks interleaved with standard attention). It was trained on **425,000+ curated agentic coding trajectories** spanning real-world software engineering tasks, tool use, terminal operations, and multi-step reasoning.
 
 The training data was specifically built from **Claude Opus 4.6 agentic and coding reasoning traces**, targeting scaffolding patterns from Claude Code, OpenCode, Codex, and Droid. The dataset includes successful trajectories from models like Claude Opus 4.6, GPT-5.4, GPT-5.3-Codex, and Gemini 3.1 Pro.
 
@@ -106,12 +97,9 @@ The model shows strong agentic behavior: it recovers from errors (read-before-wr
 
 </div>
 
-**
-- **
-- **
-- **Terminal-Bench 2.0: 28.1** (25/89 tasks solved). +8.1 points over the Qwen3.5-9B base model (20) and above Claude Haiku 4.5 (27).
-
-> [Try it yourself.](#quickstart) | [Run it locally with GGUF.](https://huggingface.co/Tesslate/OmniCoder-9B-GGUF)
+- **GPQA Diamond pass@1: 83.8** (166/198). +2.1 points over the Qwen3.5-9B base model (81.7). At pass@3: **86.4** (171/198).
+- **AIME 2025 pass@5: 90** (27/30).
+- **Terminal-Bench 2.0: 28.1** (25/89). +8.1 points over the Qwen3.5-9B base model (20).
 
 ---
 
@@ -159,15 +147,11 @@ print(response.choices[0].message.content)
 
 ### llama.cpp (GGUF)
 
-Run it locally on your laptop:
-
 ```bash
 llama-cli --hf-repo Tesslate/OmniCoder-9B-GGUF --hf-file omnicoder-9b-q4_k_m.gguf -p "Your prompt" -c 8192
 ```
 
-
-
-**[Browse all quantizations here.](https://huggingface.co/Tesslate/OmniCoder-9B-GGUF)**
+All quantizations: [Tesslate/OmniCoder-9B-GGUF](https://huggingface.co/Tesslate/OmniCoder-9B-GGUF)
 
 ---
 
@@ -238,6 +222,4 @@ Special thanks to the [Axolotl](https://github.com/axolotl-ai-cloud/axolotl) tea
 
 **Built by [Tesslate](https://tesslate.com)**
 
-[Get the model](https://huggingface.co/Tesslate/OmniCoder-9B) | [GGUF quantizations](https://huggingface.co/Tesslate/OmniCoder-9B-GGUF) | [Website](https://tesslate.com)
-
 </div>
````
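The benchmark figures in this commit are plain solved-over-total percentages (e.g. GPQA Diamond 166/198, Terminal-Bench 25/89), so the quoted scores and deltas can be re-derived directly. A minimal sketch; the `score` helper is ours for illustration, not part of the model card:

```python
def score(solved: int, total: int) -> float:
    """Benchmark score as a percentage of solved tasks, one decimal place."""
    return round(100 * solved / total, 1)

print(score(166, 198))  # GPQA Diamond pass@1 -> 83.8
print(score(27, 30))    # AIME 2025 pass@5   -> 90.0
print(score(25, 89))    # Terminal-Bench 2.0 -> 28.1
```

These match the deltas the new card quotes: 83.8 vs the base model's 81.7 is +2.1 on GPQA Diamond, and 28.1 vs 20 is +8.1 on Terminal-Bench 2.0.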
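The card's quickstart ends with `print(response.choices[0].message.content)`, i.e. an OpenAI-style chat response. If you serve the GGUF with llama.cpp's `llama-server` (a sibling of the `llama-cli` command in the card; it listens on port 8080 by default) the same response shape comes back from its `/v1/chat/completions` endpoint. A minimal stdlib-only sketch that builds the request; the model name and prompt are placeholder values, and nothing is sent unless you uncomment the last lines:

```python
import json
import urllib.request

# OpenAI-style chat request body for llama-server's /v1/chat/completions endpoint.
payload = {
    "model": "omnicoder-9b",  # placeholder; llama-server answers with whatever model it loaded
    "messages": [{"role": "user", "content": "Write a binary search in Python."}],
    "max_tokens": 512,
}
body = json.dumps(payload).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",  # llama-server's default address
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With a running server, send it and print the reply:
# resp = urllib.request.urlopen(req)
# print(json.load(resp)["choices"][0]["message"]["content"])
```

The official `openai` client works against the same endpoint by pointing `base_url` at `http://localhost:8080/v1`.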