🔥 +22.7% better at code

Qwen3.5-4B forged for code generation through Experiential Plasticity
3.04 → 2.35 perplexity · 3 cycles · RTX 5090 · 45 minutes

Verify Chain of Custody

Every claim on this card is verified
ForgeAlloy chain of custody · Download alloy · Merkle-chained · Self-attested


Runs On

| Device | Format | Size | Status |
|--------|--------|------|--------|
| iPhone / Android | Q4_K_M GGUF | 2.6 GB | available |
| MacBook Air 8GB | Q4_K_M GGUF | 2.6 GB | available |
| MacBook Air 16GB | Q8_0 GGUF | 4.2 GB | available |
| MacBook Pro 32GB | fp16 | 8.0 GB | native |
| RTX 3090/4090 | fp16 | 8.0 GB | native |
| RTX 5090 | fp16 | 8.0 GB | forged here |

Benchmarks

HumanEval evaluation is in progress. A prior forge scored 74.4% (63/85) on a partial run; full results will be added with proof via ForgeAlloy.

| Metric | Baseline | Forged | Change |
|--------|----------|--------|--------|
| Perplexity (code, lower is better) | 3.04 | 2.35 | +22.7% |
| Parameters | 4.1B | 4.1B | — |
| Domain | general | code-specialized | — |
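
The headline +22.7% can be reproduced from the two perplexities above. A minimal sketch (pure Python, no model required) of how perplexity relates to per-token loss and how the improvement figure is derived:

```python
import math

def perplexity(token_nlls):
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# The +22.7% figure is the relative drop in code perplexity:
baseline, forged = 3.04, 2.35
improvement = (baseline - forged) / baseline
print(f"{improvement:.1%}")  # 22.7%
```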

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("continuum-ai/qwen3.5-4b-code-forged",
    torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("continuum-ai/qwen3.5-4b-code-forged")

inputs = tokenizer("def merge_sort(arr):", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Forge Your Own

git clone https://github.com/CambrianTech/sentinel-ai && cd sentinel-ai && ./setup.sh
source .venv/bin/activate
python scripts/forge_model.py Qwen/Qwen3.5-4B --domain code

Or use the ForgeAlloy recipe — portable, typed, verifiable:

python scripts/alloy_executor.py qwen3.5-4b-code-forged.alloy.json

Chain of Custody

Every claim above is backed by the alloy file. Scan the QR or click to verify.

| What | Proof |
|------|-------|
| Model weights unchanged | sha256:f6b777... (model hash in alloy) |
| Code that ran | sha256:464680... (alloy_executor.py) |
| Forged on | RTX 5090, fp16, 2026-03-31 |
| Published to | this repo, receipted |
| Trust level | self-attested |
| Spec | ForgeAlloy (Rust/Python/TypeScript SDK) |
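
The recorded hashes can be checked locally against downloaded files. A minimal sketch (the alloy JSON field name shown in the comment is an assumption; see the ForgeAlloy spec for the actual schema):

```python
import hashlib

def sha256_file(path):
    """Stream a file through SHA-256 so large weight files never load into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()

# Compare against the hash recorded in the alloy file, e.g.:
#   recorded = json.load(open("qwen3.5-4b-code-forged.alloy.json"))["model_hash"]  # field name assumed
#   assert sha256_file("model.safetensors") == recorded
```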

The Science

Experiential Plasticity is architectural optimization, not compression:

  1. Train on code data (LoRA + AMP)
  2. Measure each attention head's contribution (entropy)
  3. Prune heads that don't contribute
  4. Retrain — surviving heads specialize
  5. Repeat — each cycle improves
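
Steps 2-3 can be sketched as follows. This is an illustrative criterion only (treating a head's mean attention entropy as its contribution signal), not the exact sentinel-ai implementation:

```python
import math

def head_entropy(attn_rows):
    """Mean Shannon entropy of one head's attention distributions.

    attn_rows: list of rows, each a probability distribution over keys.
    Assumption for this sketch: a near-uniform (high-entropy) head spreads
    attention thinly and is treated as contributing little to the domain.
    """
    def h(row):
        return -sum(p * math.log(p) for p in row if p > 0)
    return sum(h(row) for row in attn_rows) / len(attn_rows)

def heads_to_prune(entropies, prune_fraction=0.25):
    """Indices of the highest-entropy heads, ranked for pruning."""
    k = int(len(entropies) * prune_fraction)
    ranked = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    return sorted(ranked[:k])

# A focused head vs. a diffuse head over 4 keys:
focused = [[0.97, 0.01, 0.01, 0.01]]
diffuse = [[0.25, 0.25, 0.25, 0.25]]
print(head_entropy(focused) < head_entropy(diffuse))  # True
```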

Scaling behavior: improvement tends to increase with model size (see the table below), and domain-specific training (code) amplifies the effect.

| Model | Domain | Improvement |
|-------|--------|-------------|
| Qwen2.5-0.5B | general | -3.2% |
| Qwen2.5-7B | general | +11.8% |
| Qwen3.5-4B | code | +22.7% |
| Qwen3.5-27B | code | +3.5% |

Transfer function: `improvement(cycle) = 1.45 * exp(-0.18 * cycle) - 0.03`
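
Evaluating the fitted curve per cycle shows diminishing returns (reading it as a per-cycle gain is an assumption; the card does not state the units of the fit):

```python
import math

def transfer(cycle):
    # Fitted transfer function from the card's scaling results
    return 1.45 * math.exp(-0.18 * cycle) - 0.03

for c in (1, 2, 3):
    print(c, round(transfer(c), 3))
# The predicted gain shrinks each cycle, consistent with stopping
# after 3 cycles before returns flatten out.
```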

sentinel-ai · continuum · forge-alloy · all models

Forged with ForgeAlloy — every claim verified by cryptographic chain of custody
