# Arche-Codium-500M
A compact, instruction-finetuned code generation model built on Qwen2.5-Coder-0.5B-Instruct. Designed for fast, local code completion with minimal resource requirements.
## TL;DR

- 500M parameters – runs on CPU, MPS, or low-VRAM GPU
- 80% pass rate on a 20-task HumanEval subset (16/20)
- Apache 2.0 – fully open, commercially usable
- CLI-ready – plug into `arche-code` and start generating
## Quick Start

### With arche-code CLI (recommended)

```bash
pip install arche-code
arche-code --provider arche --model ./models/arche-codium-500m write "def fibonacci(n):" --max-tokens 256
```
Full CLI docs: github.com/OpenSynapseLabs/arche-code
### With Hugging Face transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "OpenSynapseLabs/arche-codium-500m",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("OpenSynapseLabs/arche-codium-500m")

prompt = '''def has_close_elements(numbers: list[float], threshold: float) -> bool:
    """Check if any two numbers are closer than threshold."""'''

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
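For interactive use, you can stream tokens to stdout as they are generated with transformers' built-in `TextStreamer`, reusing the `model`, `tokenizer`, and `inputs` from above:

```python
from transformers import TextStreamer

# Prints each decoded token as it arrives, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True)
model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.2, streamer=streamer)
```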
## Benchmarks
| Benchmark | Result |
|---|---|
| HumanEval | 16/20 (80%) |
Tested on 20 selected HumanEval tasks covering list operations, string manipulation, arithmetic, and basic algorithms. See evaluation logs in the model files.
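For context, a HumanEval-style check simply executes the model's completion against the task's unit tests (each task ships a `check(candidate)` test function and an `entry_point` name). A minimal illustrative sketch of the scoring logic, not the actual harness used here:

```python
def passes_task(completion: str, test_code: str, entry_point: str) -> bool:
    """Return True if the generated code passes the task's tests.

    WARNING: exec() of model output is unsafe outside a sandbox;
    this sketch illustrates the pass/fail logic only.
    """
    env: dict = {}
    try:
        exec(completion, env)            # define the generated function
        exec(test_code, env)             # defines check(candidate)
        env["check"](env[entry_point])   # raises AssertionError on failure
        return True
    except Exception:
        return False
```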
## What This Model Is

- Lead magnet – a free, capable entry point into the Arche ecosystem
- Edge-friendly – 0.5B parameters means it runs on laptops, Raspberry Pi, and mobile devices
- Real code – generates executable Python, not just snippets
## What This Model Is Not

- A replacement for 7B+ models on complex architecture tasks
- A chat model – it is instruction-tuned for code generation only
- The final word – larger Arche coding models are shipping this month
## Model Details
| Property | Value |
|---|---|
| Base model | Qwen2.5-Coder-0.5B-Instruct |
| Parameters | 0.49B |
| Context length | 32,768 tokens |
| License | Apache 2.0 |
| Training | Instruction fine-tuning on code-completion tasks |
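To stay inside the 32,768-token context window, budget the prompt and generation lengths together. A small sketch, reusing the `tokenizer` and `prompt` from the Quick Start (`MAX_CONTEXT` and `MAX_NEW` are illustrative names):

```python
MAX_CONTEXT = 32_768  # context length from the table above
MAX_NEW = 256         # planned generation budget

n_prompt = len(tokenizer(prompt)["input_ids"])
assert n_prompt + MAX_NEW <= MAX_CONTEXT, "prompt too long for a single pass"
```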
## Hardware Requirements
| Device | VRAM/RAM | Speed |
|---|---|---|
| Apple Silicon (MPS) | 2 GB unified | ~50 tok/s |
| NVIDIA GPU (CUDA) | 2 GB | ~80 tok/s |
| CPU only | 4 GB RAM | ~10 tok/s |
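A minimal sketch of selecting the backend explicitly to match the rows above; the fp16-on-GPU/MPS, fp32-on-CPU split is an assumption, not something the table states:

```python
import torch
from transformers import AutoModelForCausalLM

# Assumed precision per backend: fp16 on CUDA/MPS, fp32 on CPU.
if torch.cuda.is_available():
    device, dtype = "cuda", torch.float16
elif torch.backends.mps.is_available():
    device, dtype = "mps", torch.float16
else:
    device, dtype = "cpu", torch.float32

model = AutoModelForCausalLM.from_pretrained(
    "OpenSynapseLabs/arche-codium-500m", torch_dtype=dtype
).to(device)
```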
## Limitations

- Struggles with multi-step reasoning (e.g., an LRU cache with TTL)
- May truncate output at `max_tokens` limits – increase the limit if code cuts off
- Occasionally hallucinates imports – always verify generated code (a quick check is sketched below)
- Best for functions under 50 lines; breaks down on large classes
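As a quick guard against the hallucinated-imports failure mode, you can parse a completion and flag any imported top-level module that isn't installed. A standard-library-only sketch (`missing_imports` is a hypothetical helper name):

```python
import ast
import importlib.util

def missing_imports(code: str) -> list[str]:
    """Return top-level modules imported by `code` that are not installed."""
    mods: set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            mods.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module.split(".")[0])
    return [m for m in sorted(mods) if importlib.util.find_spec(m) is None]
```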
## Citation

```bibtex
@software{arche_codium_500m,
  author = {Open Synapse Labs},
  title  = {Arche-Codium-500M: Compact Code Generation Model},
  year   = {2026},
  url    = {https://huggingface.co/OpenSynapseLabs/arche-codium-500m}
}
```
## Contact

📧 opensynapselabs@proton.me
🌐 github.com/OpenSynapseLabs
Built by Open Synapse Labs. Base model: Qwen2.5-Coder-0.5B-Instruct (Apache 2.0).