---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3.5-9B/blob/main/LICENSE
pipeline_tag: image-text-to-text
base_model:
- Qwen/Qwen3.5-9B
tags:
- code
- instruction-tuned
- software-engineering
- agent
- opencode
- qwen
- python
language:
- en
- zh
---
# Nemotron-9B-OpenCode

A 9B-parameter instruction-tuned model specialized for **autonomous software engineering agents**, fine-tuned from [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) on NVIDIA's Nemotron-SFT-OpenCode-v1 dataset.
## Model Highlights

- **Specialized for Agentic Tasks**: Trained on agent trajectories for the [OpenCode](https://opencode.ai/) CLI framework, enabling autonomous code navigation, multi-step tool use, and software engineering workflows
- **Multi-Capability**: Supports general reasoning, tool calling, bash command execution, and dynamic skill loading
- **Production Ready**: Compatible with Hugging Face Transformers, vLLM, SGLang, and OpenAI-compatible APIs
## Model Description

| Property | Value |
|----------|-------|
| **Base Model** | Qwen3.5-9B |
| **Model Type** | Causal Language Model with Vision Encoder |
| **Parameters** | 9B |
| **Languages** | English, Chinese |
| **License** | Apache 2.0 |
| **Developer** | [Kassadin88](https://huggingface.co/Kassadin88) |
## Training Data

This model was fine-tuned on **[Nemotron-SFT-OpenCode-v1](https://huggingface.co/datasets/nvidia/Nemotron-SFT-OpenCode-v1)**, NVIDIA's agentic instruction-tuning dataset containing **144,468 high-quality samples** derived from 459K total trajectories. The dataset is designed to improve an LLM's ability to operate within autonomous coding environments.
### Dataset Composition

| Subset | Samples | Description |
|--------|---------|-------------|
| `general` | 90K | General agentic CLI questions with/without AGENTS.md context |
| `bash_only_tool` | 97K | Restricted tool set (todo + bash) for foundational agent capabilities |
| `bash_only_tool_skills` | 96K | Bash + skill loading for dynamic capability discovery |
| `question_tool` | 76K | Interactive clarification via user questions during task execution |
| `agent_skills` | 67K | Dynamic skill scanning and loading for task-specific capabilities |
| `agent_skills_question_tool` | 33K | Combined skill loading + user clarification for complex tasks |
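
As a quick sanity check on the table above, the short sketch below adds up the listed subset sizes (in thousands, copied from the table). The variable names are illustrative only.

```python
# Subset sizes in thousands, copied from the Dataset Composition table.
subset_sizes_k = {
    "general": 90,
    "bash_only_tool": 97,
    "bash_only_tool_skills": 96,
    "question_tool": 76,
    "agent_skills": 67,
    "agent_skills_question_tool": 33,
}

total_k = sum(subset_sizes_k.values())
largest = max(subset_sizes_k, key=subset_sizes_k.get)

print(f"total: {total_k}K")   # total: 459K
print(f"largest: {largest}")  # largest: bash_only_tool
```

The total of 459K matches the trajectory count quoted in the Training Data paragraph, not the 144,468-sample figure, so the per-subset numbers appear to be counted before filtering.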
### Key Capabilities Trained

- **Code Navigation**: Repository-aware reasoning and codebase traversal
- **Tool Calling**: Structured tool invocation for bash, file operations, and more
- **Skill Loading**: Dynamic discovery and loading of relevant agent skills
- **Interactive Planning**: User clarification when requirements are ambiguous
- **Multi-Step Reasoning**: SWE-Bench-style problem decomposition and implementation
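
To make the tool-calling capability above concrete, here is a minimal sketch of the host side of a bash tool: it takes a tool call in the OpenAI-style shape (a name plus JSON-encoded arguments), runs the command, and returns text that could be sent back to the model as a tool message. The tool name `bash`, the `command` argument, and the `run_tool_call` helper are assumptions for illustration; they are not taken from the OpenCode tool definitions or from the dataset.

```python
import json
import subprocess


def run_tool_call(name: str, arguments_json: str, timeout_s: float = 10.0) -> str:
    """Execute one hypothetical tool call and return its textual result."""
    if name != "bash":
        return f"error: unknown tool {name!r}"
    try:
        args = json.loads(arguments_json)
        command = args["command"]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        return f"error: bad arguments ({exc})"
    try:
        # Run the command in a shell; only do this inside a sandbox you trust.
        completed = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return f"error: command timed out after {timeout_s}s"
    return f"exit_code={completed.returncode}\n{completed.stdout}{completed.stderr}"


# Example: the shape a model-produced call might take.
result = run_tool_call("bash", json.dumps({"command": "echo hello"}))
print(result)
```

Returning errors as plain strings, rather than raising, lets the agent loop feed the failure back to the model so it can retry with corrected arguments.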
## Benchmark Results

The model inherits its foundational capabilities from Qwen3.5-9B. The scores below are the base model's benchmark results; they are not measurements of this fine-tuned model.
### Language Benchmarks

<div style="font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;max-width:1000px;margin:0 auto;padding:16px 0">
<table style="width:100%;border-collapse:collapse;font-size:13px">
<thead><tr>
<th style="padding:10px 7px;text-align:left;font-weight:600;border-bottom:2px solid #7c3aed;color:#7c3aed">Category</th>
<th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed">Benchmark</th>
<th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed">Qwen3.5-9B</th>
</tr></thead>
<tbody>
<tr><td rowspan="5" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Knowledge & STEM</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMLU-Pro</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">82.5</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMLU-Redux</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.1</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">C-Eval</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.2</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">GPQA Diamond</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.7</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Instruction Following</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">IFEval</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.5</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Long Context</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">LongBench v2</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">55.2</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Reasoning & Coding</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">LiveCodeBench v6</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.6</td></tr>
</tbody>
</table>
</div>
### Vision Language Benchmarks

<div style="font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;max-width:1000px;margin:0 auto;padding:16px 0">
<table style="width:100%;border-collapse:collapse;font-size:13px">
<thead><tr>
<th style="padding:10px 7px;text-align:left;font-weight:600;border-bottom:2px solid #7c3aed;color:#7c3aed">Category</th>
<th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed">Benchmark</th>
<th style="padding:10px 7px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed">Qwen3.5-9B</th>
</tr></thead>
<tbody>
<tr><td rowspan="4" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">STEM & Puzzle</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMMU</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.4</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MathVision</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.9</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MathVista (mini)</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.7</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Document Understanding</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">OCRBench</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.2</td></tr>
<tr><td rowspan="2" style="padding:7px 7px;border-bottom:1px solid rgba(128, 128, 128, 0.15);font-weight:600;color:#7c3aed;background:rgba(124, 58, 237, 0.1)">Video Understanding</td></tr>
<tr><td style="padding:7px 7px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">VideoMME (w/ sub)</td><td style="padding:7px 7px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.5</td></tr>
</tbody>
</table>
</div>

> **Note**: For complete benchmark results across all categories, please refer to the [Qwen3.5-9B model card](https://huggingface.co/Qwen/Qwen3.5-9B).
## Quick Start

### Using Transformers

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Kassadin88/Nemotron-9B-OpenCode"

# Load the tokenizer and the model in bfloat16, placing layers on the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to merge two sorted arrays."},
]

# Render the chat template to text, then tokenize it.
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Sample up to 512 new tokens.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
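
For sizing hardware, a rough back-of-the-envelope estimate of the weight memory implied by `torch_dtype=torch.bfloat16` may help. The sketch assumes exactly 9 billion parameters (the card only says "9B") and counts weights only; activations, the KV cache, and framework overhead are not included.

```python
# Rough weight-memory estimate; assumes exactly 9e9 parameters.
PARAMS = 9_000_000_000
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gib(dtype: str, params: int = PARAMS) -> float:
    """Return the size of the weights alone, in GiB, for the given dtype."""
    return params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(dtype):.1f} GiB")
# float32: 33.5 GiB
# bfloat16: 16.8 GiB
```

Under these assumptions, loading in bfloat16 halves the weight footprint relative to float32, which is why the example above sets the dtype explicitly.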
### Using vLLM (Recommended for Production)

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="Kassadin88/Nemotron-9B-OpenCode",
    trust_remote_code=True,
    dtype="bfloat16",
)

sampling_params = SamplingParams(
    max_tokens=1024,
)

# Example prompt; `generate` takes raw text and does not apply the chat template.
prompts = ["Write a Python function to merge two sorted arrays."]

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```
### Using SGLang

```bash
python -m sglang.launch_server \
  --model-path Kassadin88/Nemotron-9B-OpenCode \
  --port 8000 \
  --tp-size 1
```
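
Because the server needs some time to load the weights, a client script may want to wait until it responds. The following is a minimal polling sketch using only the standard library. It assumes the server exposes the OpenAI-style `/v1/models` route on the port chosen above; `wait_for_server` is a hypothetical helper, not part of SGLang.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Server not up yet; retry after a short pause.
        time.sleep(interval_s)
    return False
```

A typical call would be `wait_for_server("http://localhost:8000/v1/models")`, which returns `True` once the endpoint is reachable and `False` if the timeout expires first.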
### OpenAI-Compatible API

```python
from openai import OpenAI

# Assumes a server is already listening on port 8000, as in the SGLang command above.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",  # Placeholder value; the client requires a non-empty key.
)

response = client.chat.completions.create(
    model="Kassadin88/Nemotron-9B-OpenCode",
    messages=[
        {"role": "user", "content": "Write a quicksort implementation in Python"}
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```
## Usage Tips

### For Agentic Coding Tasks

```python
messages = [
    {"role": "system", "content": "You are an autonomous coding agent. Use the available tools to complete tasks."},
    {"role": "user", "content": "Fix the bug in src/utils/parser.py that causes incorrect JSON parsing."},
]
```
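
The system prompt above refers to available tools, which an OpenAI-compatible server receives through the `tools` field of a chat completion request. A minimal sketch of such a request body follows. The `bash` tool schema here is a hypothetical example, not the OpenCode tool set, and whether tool calls come back as structured output depends on how the serving engine is configured.

```python
import json


def build_agent_request(task: str, model: str = "Kassadin88/Nemotron-9B-OpenCode") -> dict:
    """Assemble a chat-completions request body with one hypothetical tool."""
    bash_tool = {
        "type": "function",
        "function": {
            "name": "bash",  # hypothetical tool name
            "description": "Run a shell command in the project workspace.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {"type": "string", "description": "Command to run."}
                },
                "required": ["command"],
            },
        },
    }
    return {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "You are an autonomous coding agent. "
                "Use the available tools to complete tasks.",
            },
            {"role": "user", "content": task},
        ],
        "tools": [bash_tool],
        "max_tokens": 512,
    }


request_body = build_agent_request(
    "Fix the bug in src/utils/parser.py that causes incorrect JSON parsing."
)
print(json.dumps(request_body, indent=2))
```

With the `openai` client shown in the Quick Start, the same dictionary can be passed as `client.chat.completions.create(**request_body)`.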
### For Code Generation

```python
# Reuses `model` and `inputs` from the Transformers quick start; allows longer outputs.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
)
```

### For Code Explanation

```python
# Reuses `model` and `inputs` from the Transformers quick start; shorter token budget.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
)
```
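
The two snippets above differ only in their token budget, so one way to keep call sites tidy is a small lookup of per-task generation arguments. The helper name `generation_kwargs` and the task labels are illustrative; the values are the ones shown above.

```python
# Per-task generation arguments, taken from the snippets above.
GENERATION_PRESETS = {
    "code_generation": {"max_new_tokens": 1024, "do_sample": True},
    "code_explanation": {"max_new_tokens": 512, "do_sample": True},
}


def generation_kwargs(task: str, **overrides) -> dict:
    """Return a copy of the preset for `task`, with optional overrides applied."""
    if task not in GENERATION_PRESETS:
        raise ValueError(f"unknown task {task!r}; choose from {sorted(GENERATION_PRESETS)}")
    return {**GENERATION_PRESETS[task], **overrides}


print(generation_kwargs("code_generation"))
print(generation_kwargs("code_explanation", max_new_tokens=256))
```

A call site would then read `model.generate(**inputs, **generation_kwargs("code_generation"))`.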
## Limitations

- The model is trained primarily on agentic coding tasks and may not perform optimally on general conversational tasks
- It may occasionally generate incorrect or incomplete code, so review its output before use
- It must not be used to generate malicious code
## Citation

```bibtex
@misc{nemotron-9b-opencode,
  author = {Kassadin88},
  title = {Nemotron-9B-OpenCode: An Instruction-Tuned Model for Autonomous Software Engineering},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Kassadin88/Nemotron-9B-OpenCode}
}
```
## Acknowledgments

- **Base Model**: [Qwen Team](https://github.com/QwenLM/Qwen3) for Qwen3.5-9B
- **Training Data**: [NVIDIA](https://huggingface.co/datasets/nvidia/Nemotron-SFT-OpenCode-v1) for Nemotron-SFT-OpenCode-v1
- **Training Framework**: [MS-Swift](https://github.com/modelscope/swift)

---

**Note:** This model is intended for research and educational purposes. Please use responsibly.