# Arche-Codium-500M
A compact, instruction-finetuned code generation model built on Qwen2.5-Coder-0.5B-Instruct. Designed for fast, local code completion with minimal resource requirements.
## TL;DR

- 500M parameters – runs on CPU, MPS, or low-VRAM GPU
- 80% pass rate on a 20-task HumanEval subset (16/20)
- Apache 2.0 – fully open, commercially usable
- CLI-ready – plug into `arche-code` and start generating
## Quick Start

### With arche-code CLI (recommended)

```bash
pip install arche-code
arche-code --provider arche --model ./models/arche-codium-500m write "def fibonacci(n):" --max-tokens 256
```
Full CLI docs: github.com/OpenSynapseLabs/arche-code
### With Hugging Face transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "OpenSynapseLabs/arche-codium-500m",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("OpenSynapseLabs/arche-codium-500m")

prompt = '''def has_close_elements(numbers: list[float], threshold: float) -> bool:
    """Check if any two numbers are closer than threshold."""'''

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
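For interactive use, you can stream tokens to stdout as they are generated with transformers' built-in `TextStreamer`, reusing the `model`, `tokenizer`, and `inputs` from above:

```python
from transformers import TextStreamer

# Prints each decoded token as it arrives, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True)
model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.2, streamer=streamer)
```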
## Benchmarks
| Benchmark | Result |
|---|---|
| HumanEval | 16/20 (80%) |
Tested on 20 selected HumanEval tasks covering list operations, string manipulation, arithmetic, and basic algorithms. See evaluation logs in the model files.
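For context, a HumanEval-style check simply executes the model's completion against the task's unit tests (each task ships a `check(candidate)` test function and an `entry_point` name). A minimal illustrative sketch of the scoring logic, not the actual harness used here:

```python
def passes_task(completion: str, test_code: str, entry_point: str) -> bool:
    """Return True if the generated code passes the task's tests.

    WARNING: exec() of model output is unsafe outside a sandbox;
    this sketch illustrates the pass/fail logic only.
    """
    env: dict = {}
    try:
        exec(completion, env)            # define the generated function
        exec(test_code, env)             # defines check(candidate)
        env["check"](env[entry_point])   # raises AssertionError on failure
        return True
    except Exception:
        return False
```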
## What This Model Is

- Lead magnet – a free, capable entry point into the Arche ecosystem
- Edge-friendly – 0.5B parameters means it runs on laptops, Raspberry Pi, and mobile devices
- Real code – generates executable Python, not just snippets
## What This Model Is Not

- A replacement for 7B+ models on complex architecture tasks
- A chat model – it is instruction-tuned for code generation only
- The final word – larger Arche coding models are shipping this month
## Model Details
| Property | Value |
|---|---|
| Base model | Qwen2.5-Coder-0.5B-Instruct |
| Parameters | 0.49B |
| Context length | 32,768 tokens |
| License | Apache 2.0 |
| Training | Instruction fine-tuning on code-completion tasks |
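To stay inside the 32,768-token context window, budget the prompt and generation lengths together. A small sketch, reusing the `tokenizer` and `prompt` from the Quick Start (`MAX_CONTEXT` and `MAX_NEW` are illustrative names):

```python
MAX_CONTEXT = 32_768  # context length from the table above
MAX_NEW = 256         # planned generation budget

n_prompt = len(tokenizer(prompt)["input_ids"])
assert n_prompt + MAX_NEW <= MAX_CONTEXT, "prompt too long for a single pass"
```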
## Hardware Requirements
| Device | VRAM/RAM | Speed |
|---|---|---|
| Apple Silicon (MPS) | 2 GB unified | ~50 tok/s |
| NVIDIA GPU (CUDA) | 2 GB | ~80 tok/s |
| CPU only | 4 GB RAM | ~10 tok/s |
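A minimal sketch of selecting the backend explicitly to match the rows above; the fp16-on-GPU/MPS, fp32-on-CPU split is an assumption, not something the table states:

```python
import torch
from transformers import AutoModelForCausalLM

# Assumed precision per backend: fp16 on CUDA/MPS, fp32 on CPU.
if torch.cuda.is_available():
    device, dtype = "cuda", torch.float16
elif torch.backends.mps.is_available():
    device, dtype = "mps", torch.float16
else:
    device, dtype = "cpu", torch.float32

model = AutoModelForCausalLM.from_pretrained(
    "OpenSynapseLabs/arche-codium-500m", torch_dtype=dtype
).to(device)
```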
## Limitations

- Struggles with multi-step reasoning (e.g., an LRU cache with TTL)
- May truncate output at `max_tokens` limits – increase the limit if code cuts off
- Occasionally hallucinates imports – always verify generated code (a quick check is sketched below)
- Best for functions under 50 lines; breaks down on large classes
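As a quick guard against the hallucinated-imports failure mode, you can parse a completion and flag any imported top-level module that isn't installed. A standard-library-only sketch (`missing_imports` is a hypothetical helper name):

```python
import ast
import importlib.util

def missing_imports(code: str) -> list[str]:
    """Return top-level modules imported by `code` that are not installed."""
    mods: set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            mods.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module.split(".")[0])
    return [m for m in sorted(mods) if importlib.util.find_spec(m) is None]
```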
## Citation

```bibtex
@software{arche_codium_500m,
  author = {Open Synapse Labs},
  title  = {Arche-Codium-500M: Compact Code Generation Model},
  year   = {2026},
  url    = {https://huggingface.co/OpenSynapseLabs/arche-codium-500m}
}
```
## Contact

📧 opensynapselabs@proton.me
🌐 github.com/OpenSynapseLabs
Built by Open Synapse Labs. Base model: Qwen2.5-Coder-0.5B-Instruct (Apache 2.0).