language: en
license: apache-2.0
base_model: Qwen/Qwen3-14B
tags:
  - tool-calling
  - routing
  - code-generation
  - typescript
  - healthcare
  - aac
  - qwen3
  - gguf

prism-coder:14b - Dual-Purpose: Tool Routing + Healthcare TypeScript Coder

Fine-tuned Qwen3-14B for the Prism AAC / Synalux healthcare platform.

Two trained capabilities in one model family:

  • Routing (v36): 6-tool routing for Prism MCP sessions, 100% on BFCL
  • Coding (v42): Synalux-pattern TypeScript code generation, 22/22 checks (100%)

Coding Eval: v42 (Current Production Coder)

22/22 (100%) on the Synalux healthcare TypeScript eval.

Task: write a production Next.js API route for X12 835 ERA reconciliation against existing 837P claims.

| Check | Pass |
|---|---|
| withAudit wrapper | ✓ |
| authenticateRequest | ✓ |
| supabaseAdmin (not client) | ✓ |
| cross-tenant guard (workspace_members + BILLING_ROLES) | ✓ |
| UUID_RX validation | ✓ |
| decryptPhi before PHI access | ✓ |
| HIPAA audit (hipaa_access_log) | ✓ |
| HIPAA non-blocking (.then) | ✓ |
| 409 already-reconciled guard | ✓ |
| 422 no CLP segments | ✓ |
| parse CLP segment | ✓ |
| parse SVC segment | ✓ |
| parse CAS CO (contractual) adjustment | ✓ |
| parse CAS PR (patient responsibility) | ✓ |
| GL cash_received entry | ✓ |
| GL contractual_adjustment entry | ✓ |
| GL patient_ar entry | ✓ |
| claim status map (1=paid) | ✓ |
| claim status map (4=denied) | ✓ |
| no postgres detail in 500 | ✓ |
| belt-and-suspenders workspace_id eq on update | ✓ |
| marks ERA file reconciled | ✓ |
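To make the parsing-related checks concrete, here is a minimal sketch of reading CLP and CAS segments from an 835 payload and mapping the claim status code. It is an illustration only, not the Synalux route and not model output: the type names and `parse835Claims` are hypothetical, the delimiters are assumed to be `~` and `*` (real files declare them in the ISA header), only the first reason/amount pair of each CAS segment is read, and service-level CAS segments are simply attached to the enclosing claim.

```typescript
// Hypothetical types for illustration; the real Synalux schema is not shown in this card.
type ClaimStatus = "paid" | "denied" | "other";

interface Adjustment {
  group: string;  // CAS01, e.g. "CO" (contractual) or "PR" (patient responsibility)
  reason: string; // CAS02 reason code
  amount: number; // CAS03 adjustment amount
}

interface ParsedClaim {
  claimId: string;     // CLP01
  status: ClaimStatus; // mapped from CLP02
  charged: number;     // CLP03
  paid: number;        // CLP04
  adjustments: Adjustment[];
}

// Only the two status codes named in the checklist are mapped.
const CLAIM_STATUS: Record<string, ClaimStatus> = { "1": "paid", "4": "denied" };

// Assumes "~" terminates segments and "*" separates elements.
function parse835Claims(raw: string): ParsedClaim[] {
  const claims: ParsedClaim[] = [];
  let current: ParsedClaim | undefined;
  for (const seg of raw.split("~")) {
    const el = seg.trim().split("*");
    if (el[0] === "CLP") {
      // A CLP segment opens a new claim.
      current = {
        claimId: el[1] ?? "",
        status: CLAIM_STATUS[el[2] ?? ""] ?? "other",
        charged: Number(el[3] ?? 0),
        paid: Number(el[4] ?? 0),
        adjustments: [],
      };
      claims.push(current);
    } else if (el[0] === "CAS" && current) {
      // First reason/amount pair only; later pairs in the same CAS are ignored.
      current.adjustments.push({
        group: el[1] ?? "",
        reason: el[2] ?? "",
        amount: Number(el[3] ?? 0),
      });
    }
  }
  return claims;
}
```

An empty result from a parser like this corresponds to the "422 no CLP segments" check: a route would reject the file rather than reconcile nothing.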

Training chain: Qwen3-14B → v34 (1000-iter routing, 18/22) → v39 (HIPAA+CAS patch, 20/22) → v42 (claim status patch, 22/22).

v42 Training Details

  • Base: Qwen/Qwen3-14B (BF16)
  • Corpus: v28 Synalux codebase SFT + targeted patch (claim status × 50 examples, resume from v39)
  • Training: MLX LoRA, rank=16, 8 layers, 100 iters, LR=5e-7
  • Final loss: 0.036 (converged)
  • Merge: direct safetensors LoRA merge → GGUF F16 → Q4_K_M
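For readers unfamiliar with what a LoRA merge does to the weights, the standard identity is sketched below. The symbols are the usual LoRA notation, not taken from the training scripts, and the scaling factor used for v42 is not stated in this card.

```latex
W_{\text{merged}} = W_0 + s \, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r = 16
```

Here $W_0$ is a frozen base weight matrix, $B$ and $A$ are the trained low-rank factors, and $s$ is the adapter scale ($\alpha / r$ in the original LoRA formulation). Only the weights in adapted layers change; all others are copied through unchanged before GGUF conversion and quantization.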

BFCL Routing Benchmark: v36

Mean: 100.0% (3-seed average; seeds 2027, 2028, 2029; 102 cases each)

| Category | Accuracy |
|---|---|
| aac (AAC phrase requests) | 100% |
| cmpct (ledger compaction) | 100% |
| edge (multi-step compound) | 100% |
| hand (agent handoff) | 100% |
| info (general facts) | 100% |
| irrel (irrelevant/live queries) | 100% |
| know (knowledge base search) | 100% |
| load (session context loading) | 100% |
| pred (factual queries) | 100% |
| save (session ledger save) | 100% |
| smem (session memory search) | 100% |
| tran (translation) | 100% |
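The per-category and 3-seed figures above are simple ratios. The sketch below shows one way such numbers could be computed. The `CaseResult` shape and function names are hypothetical, and the actual eval harness is not part of this card.

```typescript
// Hypothetical record for one benchmark case; the real harness may differ.
interface CaseResult {
  category: string;         // e.g. "aac", "smem", "irrel"
  expected: string | null;  // expected tool name, or null when no tool should fire
  predicted: string | null; // tool the model actually chose
}

// Percentage of cases where the predicted tool matches the expected one.
function accuracy(cases: CaseResult[]): number {
  if (cases.length === 0) return 0;
  const correct = cases.filter((c) => c.predicted === c.expected).length;
  return (100 * correct) / cases.length;
}

// Accuracy broken down by category label.
function accuracyByCategory(cases: CaseResult[]): Record<string, number> {
  const groups: Record<string, CaseResult[]> = {};
  for (const c of cases) {
    const bucket = groups[c.category] ?? [];
    bucket.push(c);
    groups[c.category] = bucket;
  }
  const out: Record<string, number> = {};
  for (const category of Object.keys(groups)) {
    out[category] = accuracy(groups[category] ?? []);
  }
  return out;
}

// Unweighted mean of per-seed accuracies; fair when every seed has the same case count.
function meanOverSeeds(runs: CaseResult[][]): number {
  if (runs.length === 0) return 0;
  return runs.reduce((sum, run) => sum + accuracy(run), 0) / runs.length;
}
```

With three runs of 102 cases each, a reported mean of 100.0% means every one of the 306 predictions matched its expected tool.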

Tools (routing model)

| Tool | Trigger |
|---|---|
| session_load_context | Load/resume project context |
| session_save_ledger | Note/log/record/remember |
| session_save_handoff | Pass state to next agent/session |
| session_compact_ledger | Shrink/prune ledger |
| session_search_memory | Recall prior session discussions |
| knowledge_search | Search stored knowledge base |
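A caller may want to reject any tool name outside this set before dispatching. The sketch below is one way to do that in TypeScript. The six tool names come from the table above, but the `{ name, arguments }` JSON shape and the function names are assumptions for illustration, not the documented output format of the model.

```typescript
// The six tool names listed in this card.
const ROUTING_TOOLS = [
  "session_load_context",
  "session_save_ledger",
  "session_save_handoff",
  "session_compact_ledger",
  "session_search_memory",
  "knowledge_search",
] as const;

type RoutingTool = (typeof ROUTING_TOOLS)[number];

// Type guard: narrows an arbitrary string to a known tool name.
function isRoutingTool(name: string): name is RoutingTool {
  return (ROUTING_TOOLS as readonly string[]).indexOf(name) !== -1;
}

// Hypothetical tool-call shape; the model's real output format may differ.
interface ToolCall {
  name: RoutingTool;
  arguments: Record<string, unknown>;
}

// Returns a validated call, or null for malformed JSON or an unknown tool.
function parseToolCall(raw: string): ToolCall | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { name, arguments: args } = data as { name?: unknown; arguments?: unknown };
  if (typeof name !== "string" || !isRoutingTool(name)) return null;
  const safeArgs =
    typeof args === "object" && args !== null && !Array.isArray(args)
      ? (args as Record<string, unknown>)
      : {};
  return { name, arguments: safeArgs };
}
```

Returning `null` rather than throwing lets the caller treat "no valid tool" as an ordinary outcome, which matters for the irrel category where no tool should fire.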

Version History

| Version | Eval | Type | Notes |
|---|---|---|---|
| v42 | 22/22 coding (100%) | Coder | Claim status patch on v39; zero tolerance policy |
| v39 | 20/22 coding | Coder | HIPAA non-blocking + CAS CO/PR fixes |
| v36 | 100% BFCL routing | Router | smem boundary + hand trigger fixes |
| v34 | 98.0% BFCL routing | Router | hand/save/smem fixes |
| v33 | 97.1% BFCL routing | Router | irrel/tran/smem fixes |

GGUF Files

| File | Use | Size |
|---|---|---|
| qwen3-14b-v42-q4km.gguf | Coding: production Synalux TypeScript | ~9 GB |
| prism-coder-14b-v36-q4km.gguf | Routing: Prism MCP tool routing | ~9 GB |
| qwen3-14b-v34-q4km.gguf | Routing (prior) | ~9 GB |

Usage

# Load as coding model
ollama pull dcostenco/prism-coder-14b
# Then use qwen3-14b-v42-q4km.gguf Modelfile

# Load as routing model
# Use prism-coder-14b-v36-q4km.gguf Modelfile
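Once a model is registered with Ollama, it can be called over Ollama's local HTTP API. The sketch below assumes Ollama is serving on its default port (11434) and that the model is available under the tag used in the pull command above. The tag you actually get depends on how the Modelfile was built, so treat the default here as a placeholder. `buildGenerateRequest` and `generate` are hypothetical helper names.

```typescript
// Assumption: default local Ollama endpoint for single-shot generation.
const OLLAMA_URL = "http://localhost:11434/api/generate";

interface GenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
}

// Builds the request body; the default tag is a placeholder, adjust to your local model name.
function buildGenerateRequest(
  prompt: string,
  model: string = "dcostenco/prism-coder-14b",
): GenerateRequest {
  // stream: false asks Ollama for one JSON object instead of a chunked stream.
  return { model, prompt, stream: false };
}

// Sends the prompt and returns the generated text (requires a runtime with fetch, e.g. Node 18+).
async function generate(prompt: string, model?: string): Promise<string> {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateRequest(prompt, model)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { response?: string };
  return data.response ?? "";
}
```

Switching between the coding and routing variants then comes down to passing a different model tag to `generate`.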