---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3.5-9B/blob/main/LICENSE
pipeline_tag: image-text-to-text
base_model:
- Qwen/Qwen3.5-9B
tags:
- code
- instruction-tuned
- software-engineering
- agent
- opencode
- qwen
- python
language:
- en
- zh
---

# Nemotron-9B-OpenCode

A 9B parameter instruction-tuned model specialized for **autonomous software engineering agents**, fine-tuned from [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) on NVIDIA's Nemotron-SFT-OpenCode-v1 dataset.

## Model Highlights

- **Specialized for Agentic Tasks**: Trained on agent trajectories for the [OpenCode](https://opencode.ai/) CLI framework, enabling autonomous code navigation, multi-step tool use, and software engineering workflows
- **Multi-Capability**: Supports general reasoning, tool calling, bash command execution, and dynamic skill loading
- **Production Ready**: Compatible with Hugging Face Transformers, vLLM, SGLang, and OpenAI-compatible APIs

## Model Description

| Property | Value |
|----------|-------|
| **Base Model** | Qwen3.5-9B |
| **Model Type** | Causal Language Model with Vision Encoder |
| **Parameters** | 9B |
| **Languages** | English, Chinese |
| **License** | Apache 2.0 |
| **Developer** | [Kassadin88](https://huggingface.co/Kassadin88) |

## Training Data

This model was fine-tuned on **[Nemotron-SFT-OpenCode-v1](https://huggingface.co/datasets/nvidia/Nemotron-SFT-OpenCode-v1)**, NVIDIA's agentic instruction tuning dataset containing **144,468 high-quality samples** derived from 459K total trajectories. The dataset enhances LLMs' ability to operate within autonomous coding environments.

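As a quick sanity check on the figures above, the sketch below recomputes how the 144,468 training samples relate to the 459K source trajectories. The per-subset sizes are copied from the Dataset Composition table in this card; the variable names are illustrative, and treating the "K" values as rounded thousands is an assumption.

```python
# Figures copied from this model card; "K" values are assumed to be rounded thousands.
TOTAL_TRAJECTORIES = 459_000   # "459K total trajectories"
TRAINING_SAMPLES = 144_468     # "144,468 high-quality samples"

# Per-subset sizes (in thousands) from the Dataset Composition table.
SUBSET_SIZES_K = {
    "general": 90,
    "bash_only_tool": 97,
    "bash_only_tool_skills": 96,
    "question_tool": 76,
    "agent_skills": 67,
    "agent_skills_question_tool": 33,
}

# Sum of the table entries, still in thousands.
subset_total_k = sum(SUBSET_SIZES_K.values())

# Share of all trajectories that ended up as training samples.
retained_fraction = TRAINING_SAMPLES / TOTAL_TRAJECTORIES

print(f"Subset sizes sum to {subset_total_k}K")
print(f"Samples kept: {retained_fraction:.1%} of all trajectories")
```

The subset sizes sum to 459K, which matches the trajectory total rather than the 144,468 figure. This suggests the table counts trajectories before filtering, though the card does not state that explicitly.
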
### Dataset Composition

| Subset | Samples | Description |
|--------|---------|-------------|
| `general` | 90K | General agentic CLI questions with/without AGENTS.md context |
| `bash_only_tool` | 97K | Restricted tool set (todo + bash) for foundational agent capabilities |
| `bash_only_tool_skills` | 96K | Bash + skill loading for dynamic capability discovery |
| `question_tool` | 76K | Interactive clarification via user questions during task execution |
| `agent_skills` | 67K | Dynamic skill scanning and loading for task-specific capabilities |
| `agent_skills_question_tool` | 33K | Combined skill loading + user clarification for complex tasks |

### Key Capabilities Trained

- **Code Navigation**: Repository-aware reasoning and codebase traversal
- **Tool Calling**: Structured tool invocation for bash, file operations, and more
- **Skill Loading**: Dynamic discovery and loading of relevant agent skills
- **Interactive Planning**: User clarification when requirements are ambiguous
- **Multi-Step Reasoning**: SWE-Bench style problem decomposition and implementation

## Benchmark Results

The model inherits strong foundational capabilities from Qwen3.5-9B. Below are the base model's benchmark performances:

### Language Benchmarks
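
| Category | Benchmark | Qwen3.5-9B |
|----------|-----------|------------|
| Knowledge & STEM | MMLU-Pro | 82.5 |
| | MMLU-Redux | 91.1 |
| | C-Eval | 88.2 |
| | GPQA Diamond | 81.7 |
| Instruction Following | IFEval | 91.5 |
| Long Context | LongBench v2 | 55.2 |
| Reasoning & Coding | LiveCodeBench v6 | 65.6 |
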
### Vision Language Benchmarks

| Category | Benchmark | Qwen3.5-9B |
|----------|-----------|------------|
| STEM & Puzzle | MMMU | 78.4 |
| | MathVision | 78.9 |
| | Mathvista (mini) | 85.7 |
| Document Understanding | OCRBench | 89.2 |
| Video Understanding | VideoMME (w/ sub) | 84.5 |

> **Note**: For complete benchmark results across all categories, please refer to the [Qwen3.5-9B model card](https://huggingface.co/Qwen/Qwen3.5-9B).

## Quick Start

### Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Kassadin88/Nemotron-9B-OpenCode"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to merge two sorted arrays."}
]

# Render the chat template to a prompt string, then tokenize it.
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```

### Using vLLM (Recommended for Production)

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="Kassadin88/Nemotron-9B-OpenCode",
    trust_remote_code=True,
    dtype="bfloat16"
)

sampling_params = SamplingParams(
    max_tokens=1024
)

# Example prompt list; replace with your own prompts.
prompts = ["Write a Python function to merge two sorted arrays."]

outputs = llm.generate(prompts, sampling_params)

# Each result holds one or more completions; print the first for each prompt.
for output in outputs:
    print(output.outputs[0].text)
```

### Using SGLang

```bash
python -m sglang.launch_server \
    --model-path Kassadin88/Nemotron-9B-OpenCode \
    --port 8000 \
    --tp-size 1
```

### OpenAI-Compatible API

This example assumes an OpenAI-compatible server, such as the SGLang server launched above, is listening on port 8000.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY"
)

response = client.chat.completions.create(
    model="Kassadin88/Nemotron-9B-OpenCode",
    messages=[
        {"role": "user", "content": "Write a quicksort implementation in Python"}
    ],
    max_tokens=512
)

print(response.choices[0].message.content)
```

## Usage Tips

### For Agentic Coding Tasks

```python
messages = [
    {"role": "system", "content": "You are an autonomous coding agent. Use the available tools to complete tasks."},
    {"role": "user", "content": "Fix the bug in src/utils/parser.py that causes incorrect JSON parsing."}
]
```

### For Code Generation

```python
# Reuses `model` and `inputs` from the Transformers example above.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True
)
```

### For Code Explanation

```python
# Reuses `model` and `inputs` from the Transformers example above.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True
)
```

## Limitations

- The model is primarily trained on agentic coding tasks and may not perform optimally on general conversational tasks
- May occasionally generate incorrect or incomplete code
- Should not be used for malicious code generation

## Citation

```bibtex
@misc{nemotron-9b-opencode,
  author = {Kassadin88},
  title = {Nemotron-9B-OpenCode: An Instruction-Tuned Model for Autonomous Software Engineering},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Kassadin88/Nemotron-9B-OpenCode}
}
```

## Acknowledgments

- **Base Model**: [Qwen Team](https://github.com/QwenLM/Qwen3) for Qwen3.5-9B
- **Training Data**: [NVIDIA](https://huggingface.co/datasets/nvidia/Nemotron-SFT-OpenCode-v1) for Nemotron-SFT-OpenCode-v1
- **Training Framework**: [MS-Swift](https://github.com/modelscope/swift)

---

**Note:** This model is intended for research and educational purposes. Please use responsibly.