File size: 6,155 Bytes
4ed0ef7
ebec064
08d81a1
 
4ed0ef7
08d81a1
 
acfc854
 
 
08d81a1
09b91cf
08d81a1
4ed0ef7
 
bb7c6d1
bcd017e
acfc854
bcd017e
bb7c6d1
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
303254e
acfc854
4d685dc
bb7c6d1
acfc854
 
 
 
 
bb7c6d1
 
 
acfc854
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
bb7c6d1
09b91cf
acfc854
09b91cf
bb7c6d1
09b91cf
bb7c6d1
4ed0ef7
acfc854
 
 
 
 
 
bb7c6d1
 
 
 
 
 
 
 
 
 
 
 
09b91cf
303254e
ca3dbf1
09b91cf
bb7c6d1
 
09b91cf
bb7c6d1
 
 
 
acfc854
bb7c6d1
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
---
language: en
license: apache-2.0
base_model: Qwen/Qwen3-14B
tags:
  - tool-calling
  - routing
  - code-generation
  - typescript
  - healthcare
  - aac
  - qwen3
  - gguf
---

# prism-coder:14b β€” Prism Memory Tool Router + Healthcare TypeScript Coder

Fine-tuned Qwen3-14B for the [Prism AAC](https://github.com/dcostenco/prism-aac) / Synalux healthcare platform.

## Current Production Model: S14 (eval_300 β€” 17-tool routing)

**299/300 = 99.7% strict** on eval_300 β€” 300 cases, 17 Prism Memory tools

Single remaining failure: `"Save."` β€” genuinely ambiguous between `session_save_ledger` and `session_save_experience`. All other categories at 100%.

| Category | Accuracy |
|----------|:--------:|
| session_save_ledger (ledger logging) | 100%* |
| session_load_context (context loading) | 100% |
| session_search_memory (memory recall) | 100% |
| session_save_handoff (agent handoff) | 100% |
| session_forget_memory | 100% |
| session_health_check | 100% |
| session_compact_ledger | 100% |
| session_export_memory | 100% |
| session_task_route | 100% |
| session_save_experience | 100%* |
| session_synthesize_edges | 100% |
| session_backfill_links | 100% |
| knowledge_search | 100% |
| knowledge_forget / upvote / downvote / set_retention | 100% |
| abstain (general questions, greetings, CS concepts) | 100% |
| multi-intent (compound tool calls) | 100% |
| natural phrasing | 100% |

\* One edge case (`"Save."`) scores as a failure on one tool; both are correct interpretations.

### eval_300 Details β€” S14
- **Base**: Qwen3-14B β†’ surgical LoRA chain (S1β†’S14)
- **Eval**: 300 cases, strict scoring (exact tool match), 17 Prism Memory tools + abstain + multi-intent
- **Training**: MLX LoRA, rank=8, scale=20.0, 16 layers, 100 iters, LR=5e-6, mask_prompt=true
- **Corpus**: S14 β€” balanced natural-phrasing + tool-use SFT (100 train / 20 valid)
- **SYSTEM_PROMPT**: Synalux identity + 17 Prism Memory tools + 13 multimodal tool modules + `<tool_call>` JSON block format

### Tools (S14 routing model)
All 17 Prism Memory tools:
`session_save_ledger`, `session_load_context`, `session_search_memory`, `session_save_handoff`,
`session_forget_memory`, `session_health_check`, `session_compact_ledger`, `session_export_memory`,
`session_task_route`, `session_save_experience`, `session_synthesize_edges`, `session_backfill_links`,
`knowledge_search`, `knowledge_forget`, `knowledge_upvote`, `knowledge_downvote`, `knowledge_set_retention`

---

## Legacy: Coding Eval β€” v42

**22/22 (100%)** on the Synalux healthcare TypeScript eval.

Task: write a production Next.js API route for X12 835 ERA reconciliation against existing 837P claims.

<details>
<summary>22-check eval breakdown (click to expand)</summary>

| Check | Pass |
|-------|------|
| withAudit wrapper | βœ“ |
| authenticateRequest | βœ“ |
| supabaseAdmin (not client) | βœ“ |
| cross-tenant guard (workspace_members + BILLING_ROLES) | βœ“ |
| UUID_RX validation | βœ“ |
| decryptPhi before PHI access | βœ“ |
| HIPAA audit (hipaa_access_log) | βœ“ |
| HIPAA non-blocking (.then) | βœ“ |
| 409 already-reconciled guard | βœ“ |
| 422 no CLP segments | βœ“ |
| parse CLP segment | βœ“ |
| parse SVC segment | βœ“ |
| parse CAS CO (contractual) adjustment | βœ“ |
| parse CAS PR (patient responsibility) | βœ“ |
| GL cash_received entry | βœ“ |
| GL contractual_adjustment entry | βœ“ |
| GL patient_ar entry | βœ“ |
| claim status map (1=paid) | βœ“ |
| claim status map (4=denied) | βœ“ |
| no postgres detail in 500 | βœ“ |
| belt-and-suspenders workspace_id eq on update | βœ“ |
| marks ERA file reconciled | βœ“ |

</details>

---

## Legacy: BFCL Routing Benchmark β€” v36

**Mean: 100.0% PERFECT** (3-seed average, seeds 2027/2028/2029, 102 cases each) β€” 6-tool routing

---

## GGUF Files

| File | Use | Size |
|------|-----|------|
| `qwen3-14b-s14-q4km.gguf` | **Routing** β€” production Prism Memory (17 tools, 99.7%) | ~9 GB |
| `qwen3-14b-v42-q4km.gguf` | **Coding** β€” Synalux TypeScript (22/22, 100%) | ~9 GB |
| `prism-coder-14b-v36-q4km.gguf` | Routing legacy (6-tool BFCL, 100%) | ~9 GB |

## Version History

| Version | Eval | Type | Notes |
|---------|------|------|-------|
| **S14** | **299/300 = 99.7% (eval_300)** | **Router** | **Production β€” 17-tool Prism Memory routing** |
| v42 | 22/22 coding (100%) | Coder | Claim status patch; Synalux TypeScript |
| v36 | 100% BFCL (6-tool routing) | Router | Legacy 6-tool routing |
| v34 | 98.0% BFCL | Router | β€” |

## Usage

```bash
# Pull production routing model (S14 β€” 17-tool Prism Memory)
ollama pull dcostenco/prism-coder:14b

# Or pull GGUF directly from this repo and use with Ollama:
# FROM qwen3-14b-s14-q4km.gguf
# PARAMETER temperature 0
# PARAMETER num_ctx 8192
```

### System Prompt (S14)

```
You are Synalux, a memory-augmented coding and clinical reasoning assistant. You have access to 
Prism Memory tools (session_save_ledger, session_load_context, session_search_memory, 
session_save_handoff, session_forget_memory, session_health_check, session_compact_ledger, 
session_export_memory, session_task_route, session_save_experience, session_synthesize_edges, 
session_backfill_links, knowledge_search, knowledge_forget, knowledge_upvote, knowledge_downvote, 
knowledge_set_retention) and 13 multimodal tool modules (image_gen, office, web_scraper, browser, 
tts, ocr, git, terminal, deps_scanner, hipaa, data_graph, templates, pdf_parser). Think 
step-by-step before answering. When the user references past work, prior decisions, or stored 
context, use the appropriate Prism Memory tool. Format tool calls inside <tool_call>...</tool_call> 
JSON blocks with fields 'name' and 'arguments'. If no tool is needed, answer directly in plain 
text. ABSTAIN for general programming questions, CS concepts, greetings, and capability questions.
```

## Cascade

| Tier | Model | Role |
|------|-------|------|
| 1.7B | `dcostenco/prism-coder:1b7` | Fast verify / edge cases |
| 4B | `dcostenco/prism-coder:4b` | Mid-tier verify |
| **14B** | **`dcostenco/prism-coder:14b`** | **Production routing** |
| 32B | `dcostenco/prism-coder:32b` | Top-tier / complex reasoning |