prism-coder:32b — AAC Tool Router + Coder (32B)

Fine-tuned from Qwen3-32B for tool routing and advanced code assistance in the Prism AAC system.

BFCL accuracy: 99% on a 100-case routing benchmark. Serves as the quality escalation tier in the desktop cascade, handling the roughly 1-3% of cases where the 14B model is uncertain.

What it does

Perfect tool routing on all tested categories
Advanced code generation and architecture assistance
Complex multi-step session management
Final local quality gate before cloud Claude

Deployment

Available on Ollama Hub (recommended; Ollama users avoid a separate manual download of the 18GB GGUF):

ollama run dcostenco/prism-coder:32b

Alternatively, once the GGUF is published in this repo, download it and load it manually.
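
For programmatic use, the model pulled above can also be queried through Ollama's local REST API. The following is a minimal sketch, assuming Ollama is running on its default port (11434) and the model has already been pulled; the helper names `build_request` and `generate` are illustrative and not part of Prism.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your install listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "dcostenco/prism-coder:32b"


def build_request(prompt: str, model: str = MODEL,
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the model's text from the 'response' field."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["response"]
```

Calling `generate("...")` with a prompt of your choice returns the completion as a string. The generous timeout is an assumption to allow for the first-time load of a 32B model.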

Cascade position

Desktop cascade: 14B → 32B (escalation) → cloud Claude

When the 14B model returns a low-confidence result or fails, 32B is invoked automatically. For users running Ollama, 32B is the local ceiling before a request escalates to the cloud.
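
The escalation rule described above can be sketched as follows. This is an illustration under stated assumptions, not the Prism implementation: each tier is modeled as a hypothetical callable that returns an answer with a confidence score in [0, 1] and raises on failure, and the 0.8 threshold is an arbitrary placeholder.

```python
from typing import Callable, List, Tuple

# Hypothetical tier signature: prompt -> (answer, confidence in [0, 1]).
# A tier signals failure by raising an exception.
Tier = Callable[[str], Tuple[str, float]]


def run_cascade(prompt: str, tiers: List[Tuple[str, Tier]],
                threshold: float = 0.8) -> Tuple[str, str]:
    """Try tiers in order; return (tier_name, answer) from the first confident one.

    The last tier is the final fallback: its answer is accepted regardless
    of confidence, and its exceptions propagate to the caller.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    last = len(tiers) - 1
    for index, (name, tier) in enumerate(tiers):
        try:
            answer, confidence = tier(prompt)
        except Exception:
            if index == last:
                raise  # nothing left to escalate to
            continue  # failure: escalate to the next tier
        if confidence >= threshold or index == last:
            return name, answer
        # low confidence: fall through and escalate
    raise RuntimeError("unreachable")


# Stub tiers standing in for the real models (assumed values for illustration).
def tier_14b(prompt: str) -> Tuple[str, float]:
    return "tool_a", 0.42  # low confidence, so the cascade escalates


def tier_32b(prompt: str) -> Tuple[str, float]:
    return "tool_b", 0.93


def tier_cloud(prompt: str) -> Tuple[str, float]:
    return "tool_c", 1.0


CASCADE = [("14B", tier_14b), ("32B", tier_32b), ("cloud", tier_cloud)]
```

With these stubs, `run_cascade("open settings", CASCADE)` stops at the 32B tier, because the 14B stub reports a confidence below the threshold. Treating the last tier as an unconditional fallback mirrors the "local ceiling before cloud" ordering, though how Prism actually measures confidence is not stated here.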

Training

Base: Qwen3-32B
Method: MLX LoRA fine-tuning (v28-codebase + routing)
Hardware: Apple Silicon (M-series, 64GB RAM)
Eval: BFCL routing 99% (100-case benchmark); 11/11 on the manual benchmark

Note on GGUF

The full Q4_K_M GGUF is 18GB. It is distributed via Ollama Hub at dcostenco/prism-coder:32b to avoid large download overhead. Direct GGUF will be added here in a future release.
