Commit fde9b80 (verified) · smirki committed · 1 parent: 9f0c62c

Upload README.md with huggingface_hub
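The commit message describes a programmatic upload. A minimal sketch of how such a commit is typically created with the `huggingface_hub` client — the repo id below is an assumption (not stated on this page), and a write token is required:

```python
from huggingface_hub import HfApi

def push_readme(repo_id: str, local_path: str = "README.md"):
    """Upload a local README.md to a Hugging Face repo.

    Requires a write token (`huggingface-cli login` or the HF_TOKEN env var).
    """
    api = HfApi()
    # upload_file creates a single commit containing just this one file,
    # with the given commit message.
    return api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo="README.md",
        repo_id=repo_id,
        commit_message="Upload README.md with huggingface_hub",
    )

# Example call (repo id is hypothetical, inferred from the model card):
# push_readme("Tesslate/OmniCoder-9B")
```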
Files changed (1): README.md (+9 −19)
README.md CHANGED

@@ -23,7 +23,7 @@ model-index:
   metrics:
   - name: pass@5
     type: accuracy
-    value: 90
+    value: 90.0
 - task:
   type: text-generation
   dataset:
@@ -44,7 +44,7 @@ model-index:
   metrics:
   - name: Pass Rate
     type: accuracy
-    value: 28
+    value: 28.0
 ---
 
 <div align="center">
@@ -86,12 +86,12 @@ The model shows strong agentic behavior: it recovers from errors (read-before-wr
 
 <div align="center">
 
-| Benchmark | **OmniCoder-9B** | Qwen3.5-9B | Qwen3-Next-80B | GPT-OSS-120B | GPT-OSS-20B | GLM 4.7 |
-|:---|:---:|:---:|:---:|:---:|:---:|:---:|
-| **AIME 2025** (pass@5) | 90 | | | | | |
-| **GPQA Diamond** (pass@1) | **83.8** | 81.7 | 77.2 | 80.1 | 71.5 | |
-| **GPQA Diamond** (pass@3) | **86.4** | | | | | |
-| **Terminal-Bench 2.0** | **28** | 20 | | | | 33.4 |
+| Benchmark | **OmniCoder-9B** | Qwen3.5-9B | Qwen3-Next-80B | GPT-OSS-120B | GPT-OSS-20B | GLM-4.7-Flash | GLM 4.7 | Claude Haiku 4.5 |
+|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+| **AIME 2025** (pass@5) | 90 | | | | 91.7 | 91.6 | | |
+| **GPQA Diamond** (pass@1) | **83.8** | 81.7 | 77.2 | 80.1 | 71.5 | | | 73 |
+| **GPQA Diamond** (pass@3) | **86.4** | | | | | | | |
+| **Terminal-Bench 2.0** | **28** | 20 | | | | | 33.4 | 27 |
 
 </div>
@@ -164,16 +164,6 @@ See all quantizations: [Tesslate/OmniCoder-9B-GGUF](https://huggingface.co/Tessl
 | **Precision** | bf16 |
 | **Optimizer** | AdamW (lr=2e-4, cosine schedule) |
 
-### Training Data Sources
-
-| Source | Samples | Description |
-|:---|---:|:---|
-| NVIDIA Nemotron-Terminal-Corpus | 226K | Terminal agent trajectories |
-| CoderForge-Preview (reward >= 0.5) | 155K | SWE-bench style coding trajectories |
-| Nemotron Skill-Based | 24K | Skill-based coding tasks |
-| Scale-SWE | 20K | Real GitHub issue patches (synthesized trajectories) |
-| Opus Reasoning | 2.3K | Chain-of-thought reasoning |
-
 ---
 
 ## Architecture
@@ -181,7 +171,7 @@ See all quantizations: [Tesslate/OmniCoder-9B-GGUF](https://huggingface.co/Tessl
 OmniCoder inherits Qwen3.5-9B's hybrid architecture:
 
 - **Gated Delta Networks**: Linear attention layers interleaved with standard attention for efficient long-range dependencies
-- **VLM Backbone**: Built on `Qwen3_5ForConditionalGeneration` (supports future multimodal extensions)
+- **VLM Backbone**: Built on `Qwen3_5ForConditionalGeneration`
 
 ---
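The pass@1 / pass@3 / pass@5 figures edited above are conventionally computed with the unbiased pass@k estimator over n sampled generations of which c pass the tests. This is the standard formula, not something stated on this page; the sample counts below are illustrative:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn without replacement from n generations (c of them correct)
    solves the task."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so every size-k draw
        # must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. with n=5 samples of which c=3 pass: pass@5 = 1.0, pass@1 = 0.6
```

Per-problem estimates are then averaged across the benchmark to give a single percentage.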