File size: 8,698 Bytes
bc24c5a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
bad5cb1
bc24c5a
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
---
base_model:
- Qwen/Qwen2.5-Coder-32B-Instruct
language:
- en
license: mit
pipeline_tag: text-generation
tags:
- code
- coding-agent
- SWE-agent
- distillation
- agent
library_name: transformers
---

<h1 style="
  font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Helvetica,Arial,sans-serif;
  font-size:48px;
  font-weight:700;
  line-height:1.25;
  text-align:center;
  margin:0 0 24px;">
  Mocha-Coder-32B
</h1>

<p style="text-align:center; margin:0 0 8px; font-size:16px;">
  <a href="https://junliwang.tech/">Junli Wang</a><sup>*</sup> &nbsp;
  <a href="https://blankcheng.github.io/">Zhoujun Cheng</a><sup>*</sup> &nbsp;
  <a href="https://yuxuan-zhang-dexter.github.io/">Yuxuan Zhang</a><sup>*</sup> &nbsp;
  <a href="https://ber666.github.io/">Shibo Hao</a> &nbsp;
  <a href="https://yaotang23.github.io/">Yao Tang</a> &nbsp;
  <br>
  <a href="https://zhiting.ucsd.edu/">Zhiting Hu</a> &nbsp;
  <a href="https://prithvirajva.com/">Prithviraj Ammanabrolu</a> &nbsp;
  <a href="https://haozhang.ai/">Hao Zhang</a><sup></sup>
</p>

<p style="text-align:center; margin:0 0 24px; font-size:14px; color:#555;">
  University of California, San Diego &nbsp;·&nbsp;
  <sup>*</sup>Equal Contribution &nbsp;·&nbsp;
  <sup></sup>Corresponding Author
</p>

<div style="
  display:flex;
  justify-content:center;
  gap:12px;
  flex-wrap:wrap;
  margin-bottom:28px;">

  <a href="https://github.com/cocoa-org/NanoRollout" style="
     display:inline-block;
     padding:8px 24px;
     background:#2b2b2b;
     color:#ffffff;
     border-radius:36px;
     text-decoration:none;
     font-weight:600;
     font-size:16px;">
    🧑‍💻 NanoRollout Code
  </a>

  <a href="https://huggingface.co/ZeonLap/Mocha-Coder-32B" style="
     display:inline-block;
     padding:8px 24px;
     background:#2b2b2b;
     color:#ffffff;
     border-radius:36px;
     text-decoration:none;
     font-weight:600;
     font-size:16px;">
    🤗 Mocha-Coder-32B Model
  </a>

  <a href="https://cocoa-org.notion.site/nanorollout" style="
     display:inline-block;
     padding:8px 24px;
     background:#2b2b2b;
     color:#ffffff;
     border-radius:36px;
     text-decoration:none;
     font-weight:600;
     font-size:16px;">
    📒 Blog
  </a>
</div>

<div style="max-width:900px;margin:0 auto;">

# Introduction
<div style="
  max-width: 880px;
  margin: 0 auto;
  text-align: justify;
  text-justify: inter-word;
  line-height: 1.6;">

**Mocha-Coder-32B** is a strong open-data coding agent built on top of **Qwen2.5-Coder-32B-Instruct**. It is trained entirely through distillation on a 300K+ trajectory mixture sampled with our lightweight agent-rollout infrastructure, **NanoRollout**, with no reinforcement learning. The full training signal comes from frontier open-source teacher models (Qwen3-Coder-480B-A35B, Kimi-K2.5, Qwen3-Coder-Next, DeepSeek-V3.2) generating trajectories across multiple agent harnesses (OpenHands, mini-swe-agent, Terminus-2 JSON) on SWE-Rebench, SWE-Smith, and SETA.

The result is a simple but strong baseline coding agent: at the ≤32B scale, Mocha-Coder-32B is the state-of-the-art among open-data models and is competitive with much larger open-source models on agentic SWE benchmarks.
</div>

### Key Features

- **Strong agentic SWE performance**: 62.6 Pass@1 on SWE-Bench Verified, 35.3 on SWE-Bench Pro, 23.6 on Terminal-Bench 2.0, competitive with Qwen3-Coder-480B-A35B-Instruct.
- **Multi-harness training**: Trajectories cover OpenHands, mini-swe-agent, and Terminus-2 JSON, mitigating harness-specific overfitting.
- **Open data**: Distilled from a fully released 300K+ trajectory mixture (`ZeonLap/Mocha-trajectories`).

# Performance

### SWE-Bench Verified
<div align="center">

| **Model**                        | **Max Iteration** | **SWE-Bench Verified (Pass@1)** |
|----------------------------------|:-----------------:|:-------------------------------:|
| Qwen3-Coder-480B-A35B-Instruct   | 100               | 67.0                            |
| **Mocha-Coder-32B**              | 100               | **62.6**                        |
| SWE-Master-32B-RL                | 150               | 61.4                            |
| Kimi-Dev-72B                     | Agentless, TTS@40 | 60.4                            |
| CoderForge-Preview-32B           | 100               | 59.4                            |
| GLM-4.7-Flash                    | 100               | 59.2                            |
| daVinci-Dev-72B                  | 100               | 58.5                            |
| daVinci-Dev-32B                  | 100               | 56.1                            |
| SERA-32B                         | 100               | 54.2                            |
| Qwen3-Coder-30B-A3B-Instruct     | 100               | 51.6                            |
| Qwen2.5-Coder-32B-Instruct (Base)| 100               | 6.2                             |
</div>

### SWE-Bench Pro
<div align="center">

| **Model**                        | **Max Iteration** | **SWE-Bench Pro (Pass@1)** |
|----------------------------------|:-----------------:|:--------------------------:|
| Qwen3-Coder-480B-A35B-Instruct   | 250               | 38.7                       |
| **Mocha-Coder-32B**              | 250               | **35.3**                   |
| Gemini-3-flash                   | 250               | 34.6                       |
| Kimi-K2-Instruct                 | 250               | 27.7                       |
| DeepSeek-V3.2                    | 250               | 15.6                       |
| Qwen2.5-Coder-32B-Instruct (Base)| 250               | 0.0                        |
</div>

### Terminal-Bench 2.0
<div align="center">

| **Model**                        | **Terminal-Bench 2.0** |
|----------------------------------|:----------------------:|
| Qwen3-Coder-480B-A35B-Instruct   | 23.9                   |
| **Mocha-Coder-32B**              | **23.6**               |
| Qwen3-Coder-30B-A3B-Instruct     | 13.5                   |
| Qwen2.5-Coder-32B-Instruct (Base)| 3.4                    |
</div>

# Training Data

Mocha-Coder-32B is trained on a **300K+ trajectory** distillation mixture, drawn from previously released distillation sets (120K) and trajectories newly generated with NanoRollout (~180K).

| **Dataset**     | **Teacher Model**           | **Harness**       | **# Trajectories (K)** | **Source**        |
|-----------------|-----------------------------|-------------------|:----------------------:|-------------------|
| SWE-Rebench     | Qwen3-Coder-480B-A35B       | OpenHands         | 32.2                   | Nebius            |
| SWE-Smith       | Qwen3-Coder-480B-A35B       | OpenHands         | 89.5                   | CoderForge        |
| SWE-Rebench     | Kimi-K2.5                   | mini-swe-agent    | 83.6                   | NanoRollout      |
| SWE-Rebench     | Qwen3-Coder-Next            | mini-swe-agent    | 11.5                   | NanoRollout      |
| SWE-Smith       | Qwen3-Coder-480B-A35B       | mini-swe-agent    | 12.8                   | NanoRollout      |
| SWE-Smith       | Qwen3-Coder-Next            | mini-swe-agent    | 9.1                    | NanoRollout      |
| SETA            | Kimi-K2.5 / DeepSeek-V3.2   | Terminus-2 JSON   | 14.0                   | NanoRollout      |

The full mixture is released at [`ZeonLap/Mocha-trajectories`](https://huggingface.co/datasets/ZeonLap/Mocha-trajectories).

# Running as an Agent

Mocha-Coder-32B is trained as an agent and is most useful when paired with a coding-agent harness. We have validated it with:

- **mini-swe-agent** — minimal SWE agent loop, recommended for SWE-Bench Verified / Pro evaluation.
- **OpenHands** — full-featured SWE harness; the model was trained on OpenHands trajectories.
- **Terminus-2 JSON** — for Terminal-Bench 2.0 style shell tasks.

Point each harness's model endpoint at the vLLM server above. For SWE-Bench Verified we report numbers at a 100-iteration budget; for SWE-Bench Pro at 250 iterations.

# License

Mocha-Coder-32B (model weights, training trajectories, and code) is released under the **MIT License** (see `LICENSE`) for research, educational, and commercial use.

# Citation

If you use Mocha-Coder-32B or NanoRollout in your research, please cite NanoRollout:

```bibtex
@misc{nanorollout,
  title  = {NanoRollout: A Lightweight Infra for Digital Agent Rollout at Scale},
  author = {Wang, Junli and Cheng, Zhoujun and Zhang, Yuxuan and Hao, Shibo
            and Tang, Yao and Hu, Zhiting and Ammanabrolu, Prithviraj
            and Zhang, Hao},
  year   = {2026},
  howpublished = {\url{https://github.com/cocoa-org/NanoRollout}},
}
```

</div>