---
library_name: transformers
tags:
- agent
- code
license: mit
datasets:
- ricdomolm/mini-coder-trajs-400k
base_model:
- Qwen/Qwen3-4B-Instruct-2507
---

# mini-coder-4b

`mini-coder-4b` is a 4B parameter model distilled from Qwen 3 Coder 30B A3B. It punches well above its weight, outperforming gpt-oss-120b on [SWE-bench Verified Bash only](https://www.swebench.com/):

<div align="center">
  
| Model                  | pass@1 | pass@100 |
|-------------------------|--------|----------|
| Qwen 3 Coder 30B-A3B    | 33.2   | 67.4     |
| mini-coder-4b           | 26.8   | 60.2     |
| gpt-oss-120b            | 26.0   | –        |
| mini-coder-1.7b         | 18.6   | 50.4     |
| SWE-agent-LM 7B         | 15.2   | –        |
| Qwen 3 4B Instruct 2507 | 4.0    | 25.1     |

</div>

It was trained on 400k trajectories generated with the lightweight [mini-swe-agent](https://mini-swe-agent.com/latest/) scaffolding on the [SWE-smith](https://huggingface.co/datasets/SWE-bench/SWE-smith) dataset of GitHub issues.

Unlike existing agentic SWE models, the `mini-coder` models can be post-trained on a single 80GB GPU (or smaller). They work seamlessly with mini-swe-agent, a lightweight, scalable, and developer-friendly agentic framework well suited to RL fine-tuning. And because they are dense rather than MoE models, they benefit from a more mature fine-tuning ecosystem.

## Example usage: Generating SWE-bench trajectories with mini-swe-agent and vLLM

This example shows how to generate SWE-bench trajectories using [mini-swe-agent](https://mini-swe-agent.com/latest/) as the agentic scaffolding (recommended) and [vLLM](https://docs.vllm.ai/en/latest/) as the local inference engine.  

First, launch a vLLM server with your chosen model. For example:  

```bash
vllm serve ricdomolm/mini-coder-4b &
```

By default, the server will be available at `http://localhost:8000`.
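Before moving on, you can confirm the server is up. Here is a minimal sketch using only the Python standard library; it queries the `/v1/models` endpoint of vLLM's OpenAI-compatible API (adjust the address if you changed the port):

```python
import json
import urllib.request
import urllib.error

def server_ready(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if the vLLM server is reachable and lists at least one model."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            models = json.load(resp).get("data", [])
            return len(models) > 0
    except (urllib.error.URLError, OSError):
        # Server not up yet (connection refused) or still loading weights.
        return False

if __name__ == "__main__":
    print("server ready:", server_ready())
```

Loading a 4B model can take a minute or two, so you may want to poll this in a loop before launching the agent.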

Second, edit the mini-swe-agent SWE-bench config file at `src/minisweagent/config/extra/swebench.yaml` so it points to your local vLLM model:  

```yaml
model:
  model_name: "hosted_vllm/ricdomolm/mini-coder-4b"  # or hosted_vllm/path/to/local/model
  model_kwargs:
    api_base: "http://localhost:8000/v1"  # adjust if using a non-default port/address
```

Third, create a litellm `registry.json` file so litellm knows how to route requests to the model:  

```bash
cat > registry.json <<'EOF'
{
  "ricdomolm/mini-coder-4b": {
    "max_tokens": 64000,
    "input_cost_per_token": 0.0,
    "output_cost_per_token": 0.0,
    "litellm_provider": "hosted_vllm",
    "mode": "chat"
  }
}
EOF
```
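If you want to sanity-check the registry before running the agent, a small local script can verify that each entry carries the fields shown above (this validation is just a convenience sketch, not part of mini-swe-agent or litellm):

```python
import json

# Fields each registry entry above provides for litellm routing and cost tracking.
REQUIRED_FIELDS = {"max_tokens", "input_cost_per_token", "output_cost_per_token",
                   "litellm_provider", "mode"}

def validate_registry(path: str = "registry.json") -> dict:
    """Parse the registry file and check every entry has the required fields."""
    with open(path) as f:
        registry = json.load(f)
    for model_name, entry in registry.items():
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            raise ValueError(f"{model_name} is missing fields: {sorted(missing)}")
    return registry

if __name__ == "__main__":
    registry = validate_registry()
    print("registered models:", list(registry))
```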

Now you’re ready to generate trajectories! Let's solve the `django__django-11099` instance of SWE-bench Verified:

```bash
LITELLM_MODEL_REGISTRY_PATH=registry.json mini-extra swebench --output test/ --subset verified --split test --filter '^(django__django-11099)$'
```

You should now see the generated trajectory in the `test/` directory.
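To inspect the result programmatically, you can glob for JSON files under the output directory. The exact filenames and layout mini-swe-agent writes depend on its version, so this sketch deliberately avoids assuming a specific naming scheme:

```python
import json
from pathlib import Path

def list_trajectories(output_dir: str) -> list:
    """Return all JSON files under the output directory, sorted by path."""
    return sorted(Path(output_dir).rglob("*.json"))

if __name__ == "__main__":
    for path in list_trajectories("test/"):
        data = json.loads(path.read_text())
        # Print the top-level keys so you can see what each file contains.
        keys = sorted(data) if isinstance(data, dict) else type(data).__name__
        print(path, "->", keys)
```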