prism-coder:32b - AAC Tool Router + Coder (32B)

Fine-tuned from Qwen3-32B for tool routing and advanced code assistance in the Prism AAC system.

BFCL accuracy: 99% on a 100-case routing benchmark. This model is the quality escalation tier in the desktop cascade: it catches the ~1-3% of cases where 14B is uncertain.

What it does

  • Tool routing across all tested categories (99% on the 100-case routing benchmark)
  • Advanced code generation and architecture assistance
  • Complex multi-step session management
  • Final local quality gate before cloud Claude

Deployment

Available on Ollama Hub (recommended; avoids 18GB download for Ollama users):

ollama run dcostenco/prism-coder:32b

Or, once the GGUF is published in this repo, download it and load it manually.
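
Once the model has been pulled, it can also be called programmatically. The following is a minimal sketch, assuming an Ollama server running locally on its default port (11434) and using Ollama's standard /api/generate endpoint with streaming disabled. The helper names build_request and generate are illustrative, and the prompt format the model expects for tool routing is not documented here, so the prompt is passed through as plain text.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "dcostenco/prism-coder:32b"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt to Ollama and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object whose
    # "response" field holds the full completion.
    return body["response"]
```

Calling generate("...") with the server running returns the completion as a string. The generous timeout is a guess to allow for the first load of a 32B model into memory.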

Cascade position

Desktop cascade: 14B → 32B (escalation) → cloud Claude

When 14B returns a low-confidence result or fails, 32B is invoked automatically. For users with Ollama running, 32B is the local ceiling before a request escalates to the cloud.
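
The escalation rule can be sketched as a loop over tiers. This is an illustration only, not the Prism implementation: the Tier type, the route function, the confidence scale, and the 0.8 threshold used in the usage note are all hypothetical, since the actual confidence signal and cutoff are not specified here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A tier takes a prompt and returns (answer, confidence in [0, 1]).
TierFn = Callable[[str], Tuple[str, float]]


@dataclass
class Tier:
    name: str
    run: TierFn
    min_confidence: float  # hypothetical threshold; below it we escalate


def route(prompt: str, tiers: List[Tier]) -> Tuple[str, str]:
    """Try tiers in order and return (tier_name, answer) from the first
    tier that succeeds with confidence at or above its threshold."""
    last_error: Optional[Exception] = None
    for tier in tiers:
        try:
            answer, confidence = tier.run(prompt)
        except Exception as exc:  # a failed tier escalates to the next one
            last_error = exc
            continue
        if confidence >= tier.min_confidence:
            return tier.name, answer
    raise RuntimeError("all tiers failed or were low-confidence") from last_error
```

In use, the tiers would be listed as 14B, 32B, then cloud, for example with a 0.8 threshold on the two local tiers and 0.0 on the cloud tier so that the last tier always accepts its own answer.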

Training

  • Base: Qwen3-32B
  • Method: MLX LoRA fine-tuning (v28-codebase + routing)
  • Hardware: Apple Silicon (M-series, 64GB RAM)
  • Eval: BFCL routing 99% (100-case benchmark); 11/11 on manual benchmark

Note on GGUF

The full Q4_K_M GGUF is 18GB. It is distributed via Ollama Hub at dcostenco/prism-coder:32b to avoid large download overhead. A direct GGUF download will be added here in a future release.
