|
|
--- |
|
|
library_name: transformers |
|
|
tags: |
|
|
- agent |
|
|
- code |
|
|
license: mit |
|
|
datasets: |
|
|
- ricdomolm/mini-coder-trajs-400k |
|
|
base_model: |
|
|
- Qwen/Qwen3-4B-Instruct-2507 |
|
|
--- |
|
|
|
|
|
# mini-coder-4b |
|
|
|
|
|
`mini-coder-4b` is a 4B-parameter model distilled from Qwen 3 Coder 30B A3B. It punches well above its weight, outperforming gpt-oss-120b on the [SWE-bench Verified (Bash Only)](https://www.swebench.com/) setting:
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
| Model | pass@1 | pass@100 | |
|
|
|-------------------------|--------|----------| |
|
|
| Qwen 3 Coder 30B-A3B | 33.2 | 67.4 | |
|
|
| mini-coder-4b           | 26.8   | 60.2     |
|
|
| gpt-oss-120b | 26.0 | – | |
|
|
| mini-coder-1.7b         | 18.6   | 50.4     |
|
|
| SWE-agent-LM 7B | 15.2 | – | |
|
|
| Qwen 3 4B Instruct 2507 | 4.0 | 25.1 | |
|
|
|
|
|
</div> |
|
|
|
|
|
It is trained on 400k trajectories collected with the lightweight [mini-swe-agent](https://mini-swe-agent.com/latest/) scaffolding on the [SWE-smith](https://huggingface.co/datasets/SWE-bench/SWE-smith) dataset of GitHub issues.
|
|
|
|
|
Unlike existing agentic SWE models, the `mini-coder` models can be post-trained on a single 80GB GPU, or even less. They work seamlessly with mini-swe-agent, a lightweight, scalable, and developer-friendly agentic framework well suited to RL fine-tuning. And because they are dense rather than MoE models, they benefit from a more mature fine-tuning ecosystem.
|
|
|
|
|
## Example usage: Generating SWE-bench trajectories with mini-swe-agent and vLLM |
|
|
|
|
|
This example shows how to generate SWE-bench trajectories using [mini-swe-agent](https://mini-swe-agent.com/latest/) as the agentic scaffolding (recommended) and [vLLM](https://docs.vllm.ai/en/latest/) as the local inference engine. |
|
|
|
|
|
First, launch a vLLM server with your chosen model. For example: |
|
|
|
|
|
```bash |
|
|
vllm serve ricdomolm/mini-coder-4b & |
|
|
``` |
|
|
|
|
|
By default, the server will be available at `http://localhost:8000`. |
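Once the server is up, you can sanity-check it with a request to vLLM's OpenAI-compatible chat endpoint. The prompt below is only an illustration; any chat message works:

```shell
# Build a request body (the model name must match the one passed to `vllm serve`).
cat > payload.json <<'EOF'
{
  "model": "ricdomolm/mini-coder-4b",
  "messages": [{"role": "user", "content": "Write a one-line bash command that counts Python files."}],
  "max_tokens": 256
}
EOF

# Query the OpenAI-compatible chat endpoint exposed by vLLM.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @payload.json || echo "Is the vLLM server running on port 8000?"
```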
|
|
|
|
|
Second, edit the mini-swe-agent SWE-bench config file at `src/minisweagent/config/extra/swebench.yaml` to point to your local vLLM model:
|
|
|
|
|
```yaml |
|
|
model: |
|
|
model_name: "hosted_vllm/ricdomolm/mini-coder-4b" # or hosted_vllm/path/to/local/model |
|
|
model_kwargs: |
|
|
api_base: "http://localhost:8000/v1" # adjust if using a non-default port/address |
|
|
``` |
|
|
|
|
|
Third, create a LiteLLM `registry.json` file so LiteLLM knows the model's context window and (zero) token costs:
|
|
|
|
|
```bash |
|
|
cat > registry.json <<'EOF' |
|
|
{ |
|
|
"ricdomolm/mini-coder-4b": { |
|
|
"max_tokens": 64000, |
|
|
"input_cost_per_token": 0.0, |
|
|
"output_cost_per_token": 0.0, |
|
|
"litellm_provider": "hosted_vllm", |
|
|
"mode": "chat" |
|
|
} |
|
|
} |
|
|
EOF |
|
|
``` |
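A malformed registry file can fail silently, leaving the run without context-length or cost metadata. A minimal Python check of the entry's shape, with the entry reproduced inline so the sketch is self-contained (in practice you would `json.load` the `registry.json` written above):

```python
import json

# The registry entry from the step above, reproduced inline for a
# self-contained check; normally: json.load(open("registry.json")).
registry = {
    "ricdomolm/mini-coder-4b": {
        "max_tokens": 64000,
        "input_cost_per_token": 0.0,
        "output_cost_per_token": 0.0,
        "litellm_provider": "hosted_vllm",
        "mode": "chat",
    }
}

entry = registry["ricdomolm/mini-coder-4b"]
# LiteLLM uses the provider and mode to route chat requests correctly.
assert entry["litellm_provider"] == "hosted_vllm"
assert entry["mode"] == "chat"
assert entry["max_tokens"] == 64000
print("registry entry looks valid")
```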
|
|
|
|
|
Now you’re ready to generate trajectories! Let's solve the `django__django-11099` instance of SWE-bench Verified: |
|
|
|
|
|
```bash |
|
|
LITELLM_MODEL_REGISTRY_PATH=registry.json mini-extra swebench --output test/ --subset verified --split test --filter '^(django__django-11099)$' |
|
|
``` |
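The `--filter` flag takes an anchored regular expression over instance IDs, so you can select several instances in one run. A plain-Python sketch of how such a pattern selects instances (the second and third IDs are illustrative of the SWE-bench naming scheme):

```python
import re

# Anchored pattern selecting two Django instances by exact ID.
pattern = re.compile(r"^(django__django-11099|django__django-11133)$")

ids = ["django__django-11099", "django__django-11133", "sympy__sympy-13647"]
selected = [i for i in ids if pattern.match(i)]
print(selected)  # ['django__django-11099', 'django__django-11133']
```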
|
|
|
|
|
You should now see the generated trajectory in the `test/` directory. |