---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3.5-9B/blob/main/LICENSE
pipeline_tag: image-text-to-text
base_model:
- Qwen/Qwen3.5-9B
tags:
- code
- instruction-tuned
- software-engineering
- agent
- opencode
- qwen
- python
language:
- en
- zh
---
# Nemotron-9B-OpenCode
A 9B-parameter instruction-tuned model specialized for **autonomous software engineering agents**, fine-tuned from [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) on NVIDIA's Nemotron-SFT-OpenCode-v1 dataset.
## Model Highlights
- **Specialized for Agentic Tasks**: Trained on agent trajectories for the [OpenCode](https://opencode.ai/) CLI framework, enabling autonomous code navigation, multi-step tool use, and software engineering workflows
- **Multi-Capability**: Supports general reasoning, tool calling, bash command execution, and dynamic skill loading
- **Production Ready**: Compatible with Hugging Face Transformers, vLLM, SGLang, and OpenAI-compatible APIs
## Model Description
| Property | Value |
|----------|-------|
| **Base Model** | Qwen3.5-9B |
| **Model Type** | Causal Language Model with Vision Encoder |
| **Parameters** | 9B |
| **Languages** | English, Chinese |
| **License** | Apache 2.0 |
| **Developer** | [Kassadin88](https://huggingface.co/Kassadin88) |
## Training Data
This model was fine-tuned on **[Nemotron-SFT-OpenCode-v1](https://huggingface.co/datasets/nvidia/Nemotron-SFT-OpenCode-v1)**, NVIDIA's agentic instruction-tuning dataset containing **144,468 high-quality samples** derived from 459K total trajectories. The dataset is designed to improve an LLM's ability to operate within autonomous coding environments.
### Dataset Composition
| Subset | Samples | Description |
|--------|---------|-------------|
| `general` | 90K | General agentic CLI questions with/without AGENTS.md context |
| `bash_only_tool` | 97K | Restricted tool set (todo + bash) for foundational agent capabilities |
| `bash_only_tool_skills` | 96K | Bash + skill loading for dynamic capability discovery |
| `question_tool` | 76K | Interactive clarification via user questions during task execution |
| `agent_skills` | 67K | Dynamic skill scanning and loading for task-specific capabilities |
| `agent_skills_question_tool` | 33K | Combined skill loading + user clarification for complex tasks |
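As a quick consistency check, the rounded subset sizes in the table add up to the 459K total trajectories quoted above. The sketch below assumes the table's counts are per-subset trajectory counts in thousands (the card does not say so explicitly) and also works out what fraction of those trajectories the 144,468 samples represent.

```python
# Subset sizes in thousands, copied from the table above.
# Assumption: these are trajectory counts, not final sample counts.
subset_sizes_k = {
    "general": 90,
    "bash_only_tool": 97,
    "bash_only_tool_skills": 96,
    "question_tool": 76,
    "agent_skills": 67,
    "agent_skills_question_tool": 33,
}


def subset_shares(sizes):
    """Return each subset's share of the total as a percentage (one decimal)."""
    total = sum(sizes.values())
    return {name: round(100 * count / total, 1) for name, count in sizes.items()}


total_k = sum(subset_sizes_k.values())
shares = subset_shares(subset_sizes_k)

# Share of all trajectories represented by the 144,468 samples.
retained_pct = round(100 * 144_468 / (total_k * 1000), 1)

print(f"total: {total_k}K trajectories")
for name, pct in shares.items():
    print(f"{name:28s} {pct:5.1f}%")
print(f"samples as a share of trajectories: {retained_pct}%")
```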
### Key Capabilities Trained
- **Code Navigation**: Repository-aware reasoning and codebase traversal
- **Tool Calling**: Structured tool invocation for bash, file operations, and more
- **Skill Loading**: Dynamic discovery and loading of relevant agent skills
- **Interactive Planning**: User clarification when requirements are ambiguous
- **Multi-Step Reasoning**: SWE-Bench style problem decomposition and implementation
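To make the tool-calling capability concrete, here is a minimal sketch of how a harness might declare a bash tool and dispatch a call emitted by the model. The schema follows the common OpenAI function-calling layout; the `BASH_TOOL`, `run_bash`, and `dispatch_tool_call` names and the flat `{"name", "arguments"}` call shape are illustrative assumptions, not the format OpenCode or this model actually uses.

```python
import json
import subprocess

# Hypothetical tool declaration in the OpenAI function-calling style.
BASH_TOOL = {
    "type": "function",
    "function": {
        "name": "bash",
        "description": "Run a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}


def run_bash(command: str, timeout: float = 10.0) -> str:
    """Run a shell command and return stdout followed by stderr."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return result.stdout + result.stderr


# Registry mapping tool names to Python handlers.
HANDLERS = {"bash": run_bash}


def dispatch_tool_call(raw_call: str) -> str:
    """Parse a JSON tool call and route it to the matching handler."""
    call = json.loads(raw_call)
    handler = HANDLERS.get(call["name"])
    if handler is None:
        return f"error: unknown tool {call['name']!r}"
    return handler(**call["arguments"])


print(dispatch_tool_call('{"name": "bash", "arguments": {"command": "echo hello"}}'))
```

In a real agent, model-generated commands would normally run inside a sandbox or container rather than directly on the host.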
## Benchmark Results
The model inherits strong foundational capabilities from Qwen3.5-9B. The scores below are the base model's reported results, not measurements of this fine-tuned model:
### Language Benchmarks
<div style="font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;max-width:1000px;margin:0 auto;padding:16px 0">
<table style="width:100%;border-collapse:collapse;font-size:13px">
<thead><tr>
<th style="padding:10px 7px;text-align:left;font-weight:600;border-bottom:2px solid #7c3aed;color:#7c3aed">Category</th>
<th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed">Benchmark</th>
<th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed">Qwen3.5-9B</th>
</tr></thead>
<tbody>
<tr><td rowspan="5" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Knowledge & STEM</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMLU-Pro</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">82.5</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMLU-Redux</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.1</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">C-Eval</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.2</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">GPQA Diamond</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.7</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Instruction Following</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">IFEval</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.5</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Long Context</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">LongBench v2</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">55.2</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Reasoning & Coding</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">LiveCodeBench v6</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.6</td></tr>
</tbody>
</table>
</div>
### Vision Language Benchmarks
<div style="font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;max-width:1000px;margin:0 auto;padding:16px 0">
<table style="width:100%;border-collapse:collapse;font-size:13px">
<thead><tr>
<th style="padding:10px 7px;text-align:left;font-weight:600;border-bottom:2px solid #7c3aed;color:#7c3aed">Category</th>
<th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed">Benchmark</th>
<th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed">Qwen3.5-9B</th>
</tr></thead>
<tbody>
<tr><td rowspan="4" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">STEM & Puzzle</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMMU</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.4</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MathVision</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.9</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MathVista (mini)</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.7</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Document Understanding</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">OCRBench</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.2</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Video Understanding</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">VideoMME (w/ sub)</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.5</td></tr>
</tbody>
</table>
</div>
> **Note**: For complete benchmark results across all categories, please refer to the [Qwen3.5-9B model card](https://huggingface.co/Qwen/Qwen3.5-9B).
## Quick Start
### Using Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Kassadin88/Nemotron-9B-OpenCode"

# Load the tokenizer and the weights in bfloat16, placed automatically across available devices
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to merge two sorted arrays."}
]

# Render the conversation with the chat template, then tokenize it
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Sample up to 512 new tokens
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
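The example above is text-only. Because the card lists the model as image-text-to-text with a vision encoder, an image-plus-text prompt can be sketched as below. The message layout is the one Transformers chat templates generally accept for multimodal models; the `build_image_message` helper and the placeholder URL are hypothetical, and this layout has not been verified against this particular checkpoint.

```python
def build_image_message(image_url: str, question: str) -> list[dict]:
    """Build a single-turn chat containing one image part and one text part."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


messages = build_image_message(
    "https://example.com/screenshot.png",  # placeholder URL, replace with a real image
    "Which function does this stack trace point to?",
)
print(messages[0]["content"][1]["text"])
```

A list like this would typically be passed to `AutoProcessor.apply_chat_template(messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt")` and the result fed to a model loaded with `AutoModelForImageTextToText`.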
### Using vLLM (Recommended for Production)
```python
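from vllm import LLM, SamplingParams

llm = LLM(
    model="Kassadin88/Nemotron-9B-OpenCode",
    trust_remote_code=True,
    dtype="bfloat16"
)
sampling_params = SamplingParams(
    max_tokens=1024
)

# Example prompt; llm.generate takes raw strings and does not apply the chat template
prompts = ["Write a Python function to merge two sorted arrays."]
outputs = llm.generate(prompts, sampling_params)

# Print the first completion for each prompt
for output in outputs:
    print(output.outputs[0].text)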
```
### Using SGLang
```bash
python -m sglang.launch_server \
    --model-path Kassadin88/Nemotron-9B-OpenCode \
    --port 8000 \
    --tp-size 1
```
### OpenAI-Compatible API
```python
from openai import OpenAI

# Assumes an OpenAI-compatible server (for example, the SGLang command above) is listening on port 8000
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY"
)

response = client.chat.completions.create(
    model="Kassadin88/Nemotron-9B-OpenCode",
    messages=[
        {"role": "user", "content": "Write a quicksort implementation in Python"}
    ],
    max_tokens=512
)
print(response.choices[0].message.content)
```
## Usage Tips
### For Agentic Coding Tasks
```python
messages = [
    {"role": "system", "content": "You are an autonomous coding agent. Use the available tools to complete tasks."},
    {"role": "user", "content": "Fix the bug in src/utils/parser.py that causes incorrect JSON parsing."}
]
```
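The system prompt above presumes a surrounding loop that executes the tools the model asks for and feeds the results back. The sketch below shows one possible shape for such a loop. Everything in it is illustrative: `stub_model` stands in for a real call to the model, `read_file` is a toy tool backed by an in-memory dictionary, and the reply format is an assumption rather than the model's actual output format.

```python
from typing import Callable


def read_file(path: str) -> str:
    """Toy tool: look up a file in an in-memory 'repository'."""
    repo = {"src/utils/parser.py": "def parse(s):\n    return eval(s)\n"}
    return repo.get(path, f"error: {path} not found")


TOOLS: dict[str, Callable[..., str]] = {"read_file": read_file}


def stub_model(messages: list[dict]) -> dict:
    """Stand-in for the model: request one tool call, then give a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "read_file", "arguments": {"path": "src/utils/parser.py"}}
    return {"content": "parse() uses eval(); json.loads would be the safer choice."}


def run_agent(messages: list[dict], model=stub_model, max_steps: int = 5) -> str:
    """Alternate model turns and tool executions until a final answer appears."""
    for _ in range(max_steps):
        reply = model(messages)
        if "tool" not in reply:
            return reply["content"]
        # Execute the requested tool and append the result to the conversation.
        result = TOOLS[reply["tool"]](**reply["arguments"])
        messages.append({"role": "assistant", "content": f"call {reply['tool']}"})
        messages.append({"role": "tool", "content": result})
    return "stopped: step limit reached"


messages = [
    {"role": "system", "content": "You are an autonomous coding agent. Use the available tools to complete tasks."},
    {"role": "user", "content": "Fix the bug in src/utils/parser.py that causes incorrect JSON parsing."},
]
answer = run_agent(messages)
print(answer)
```

The `max_steps` cap is a simple guard against a model that never stops requesting tools.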
### For Code Generation
```python
# `model` and `inputs` as prepared in the Transformers Quick Start example
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True
)
```
### For Code Explanation
```python
# `model` and `inputs` as prepared in the Transformers Quick Start example
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True
)
```
## Limitations
- The model is primarily trained on agentic coding tasks and may not perform optimally on general conversational tasks
- May occasionally generate incorrect or incomplete code
- Should not be used for malicious code generation
## Citation
```bibtex
@misc{nemotron-9b-opencode,
  author = {Kassadin88},
  title = {Nemotron-9B-OpenCode: An Instruction-Tuned Model for Autonomous Software Engineering},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Kassadin88/Nemotron-9B-OpenCode}
}
```
## Acknowledgments
- **Base Model**: [Qwen Team](https://github.com/QwenLM/Qwen3) for Qwen3.5-9B
- **Training Data**: [NVIDIA](https://huggingface.co/datasets/nvidia/Nemotron-SFT-OpenCode-v1) for Nemotron-SFT-OpenCode-v1
- **Training Framework**: [MS-Swift](https://github.com/modelscope/swift)
---
**Note:** This model is intended for research and educational purposes. Please use responsibly.