|
|
--- |
|
|
library_name: transformers |
|
|
tags: |
|
|
- agent |
|
|
- code |
|
|
license: mit |
|
|
datasets: |
|
|
- ricdomolm/mini-coder-trajs-400k |
|
|
base_model: |
|
|
- Qwen/Qwen3-4B-Instruct-2507 |
|
|
--- |
|
|
|
|
|
# mini-coder-4b |
|
|
|
|
|
`mini-coder-4b` is a 4B-parameter model distilled from Qwen 3 Coder 30B A3B. It punches well above its weight, outperforming gpt-oss-120b on the [SWE-bench Verified (Bash Only)](https://www.swebench.com/) setting:
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
| Model | pass@1 | pass@100 | |
|
|
|-------------------------|--------|----------| |
|
|
| Qwen 3 Coder 30B-A3B | 33.2 | 67.4 | |
|
|
| mini-coder-4b           | 26.8   | 60.2     |
|
|
| gpt-oss-120b | 26.0 | – | |
|
|
| mini-coder-1.7b         | 18.6   | 50.4     |
|
|
| SWE-agent-LM 7B | 15.2 | – | |
|
|
| Qwen 3 4B Instruct 2507 | 4.0 | 25.1 | |
|
|
|
|
|
</div> |
|
|
|
|
|
It is trained on 400k trajectories collected with the lightweight [mini-swe-agent](https://mini-swe-agent.com/latest/) scaffolding on the [SWE-smith](https://huggingface.co/datasets/SWE-bench/SWE-smith) dataset of GitHub issues.
|
|
|
|
|
Unlike existing agentic SWE models, the `mini-coder` models can be post-trained on a single 80GB GPU, or even less. They work seamlessly with mini-swe-agent, a lightweight, scalable, and developer-friendly agentic framework well suited to RL fine-tuning. And because they are dense rather than MoE models, they benefit from a more mature fine-tuning ecosystem.
|
|
|
|
|
## Example usage: Generating SWE-bench trajectories with mini-swe-agent and vLLM |
|
|
|
|
|
This example shows how to generate SWE-bench trajectories using [mini-swe-agent](https://mini-swe-agent.com/latest/) as the agentic scaffolding (recommended) and [vLLM](https://docs.vllm.ai/en/latest/) as the local inference engine. |
|
|
|
|
|
First, launch a vLLM server with your chosen model. For example: |
|
|
|
|
|
```bash |
|
|
vllm serve ricdomolm/mini-coder-4b & |
|
|
``` |
|
|
|
|
|
By default, the server will be available at `http://localhost:8000`. |
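Once the server is up, you can sanity-check it with a request to vLLM's OpenAI-compatible chat endpoint. The prompt below is only an illustration; any chat message works:

```shell
# Build a request body (the model name must match the one passed to `vllm serve`).
cat > payload.json <<'EOF'
{
  "model": "ricdomolm/mini-coder-4b",
  "messages": [{"role": "user", "content": "Write a one-line bash command that counts Python files."}],
  "max_tokens": 256
}
EOF

# Query the OpenAI-compatible chat endpoint exposed by vLLM.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @payload.json || echo "Is the vLLM server running on port 8000?"
```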
|
|
|
|
|
Second, edit the mini-swe-agent SWE-bench config file at `src/minisweagent/config/extra/swebench.yaml` to point to your local vLLM model:
|
|
|
|
|
```yaml |
|
|
model: |
|
|
model_name: "hosted_vllm/ricdomolm/mini-coder-4b" # or hosted_vllm/path/to/local/model |
|
|
model_kwargs: |
|
|
api_base: "http://localhost:8000/v1" # adjust if using a non-default port/address |
|
|
``` |
|
|
|
|
|
Third, create a LiteLLM `registry.json` file so LiteLLM knows the model's context window and (zero) token costs:
|
|
|
|
|
```bash |
|
|
cat > registry.json <<'EOF' |
|
|
{ |
|
|
"ricdomolm/mini-coder-4b": { |
|
|
"max_tokens": 64000, |
|
|
"input_cost_per_token": 0.0, |
|
|
"output_cost_per_token": 0.0, |
|
|
"litellm_provider": "hosted_vllm", |
|
|
"mode": "chat" |
|
|
} |
|
|
} |
|
|
EOF |
|
|
``` |
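A malformed registry file can fail silently, leaving the run without context-length or cost metadata. A minimal Python check of the entry's shape, with the entry reproduced inline so the sketch is self-contained (in practice you would `json.load` the `registry.json` written above):

```python
import json

# The registry entry from the step above, reproduced inline for a
# self-contained check; normally: json.load(open("registry.json")).
registry = {
    "ricdomolm/mini-coder-4b": {
        "max_tokens": 64000,
        "input_cost_per_token": 0.0,
        "output_cost_per_token": 0.0,
        "litellm_provider": "hosted_vllm",
        "mode": "chat",
    }
}

entry = registry["ricdomolm/mini-coder-4b"]
# LiteLLM uses the provider and mode to route chat requests correctly.
assert entry["litellm_provider"] == "hosted_vllm"
assert entry["mode"] == "chat"
assert entry["max_tokens"] == 64000
print("registry entry looks valid")
```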
|
|
|
|
|
Now you’re ready to generate trajectories! Let's solve the `django__django-11099` instance of SWE-bench Verified: |
|
|
|
|
|
```bash |
|
|
LITELLM_MODEL_REGISTRY_PATH=registry.json mini-extra swebench --output test/ --subset verified --split test --filter '^(django__django-11099)$' |
|
|
``` |
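The `--filter` flag takes an anchored regular expression over instance IDs, so you can select several instances in one run. A plain-Python sketch of how such a pattern selects instances (the second and third IDs are illustrative of the SWE-bench naming scheme):

```python
import re

# Anchored pattern selecting two Django instances by exact ID.
pattern = re.compile(r"^(django__django-11099|django__django-11133)$")

ids = ["django__django-11099", "django__django-11133", "sympy__sympy-13647"]
selected = [i for i in ids if pattern.match(i)]
print(selected)  # ['django__django-11099', 'django__django-11133']
```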
|
|
|
|
|
You should now see the generated trajectory in the `test/` directory. |