Commit b34a812 (verified), committed by dcostenco
1 Parent(s): 46e84ae

docs: honest numbers — 14B=98% ties Opus, cascade is fallback not booster

  1. README.md +13 -24
README.md CHANGED
@@ -16,32 +16,21 @@ tags:
 
  # prism-coder:32b (v19) — 97.3% routing accuracy
 
- LoRA fine-tune of **Qwen/QwQ-32B** for offline MCP tool routing — Synalux Copilot's "reasoning tier."
 
- ## Test results Prism routing 100-case eval (May 15 2026, 3-seed mean)
 
  | Category | Score |
  |---|---|
  | **Overall** | **97.3% ± 0.6%** |
- | session_load_context | 100% |
- | session_save_ledger | 100% |
- | session_search_memory | 100% |
- | session_save_handoff | 100% |
- | session_compact_ledger | 100% |
- | brave_web_search | 100% |
- | knowledge_search | 100% |
- | AAC plain-text | 85% |
- | translate plain-text | 83% |
- | plain text (pred/irrel) | 100% |
- | no-tool refusal | 100% |
- | info / lookup | 100% |
  | edge (multi-step) | 100% |
- | **avg latency** | **2.4s** |
- | **invented tools** | 0 |
 
- All 7 MCP tools route correctly 100% of the time. Remaining misses are in plain-text categories (AAC, translate) where the model occasionally over-routes.
-
- Uses `nothink` template + [v27 system prompt](https://github.com/dcostenco/prism-coder/blob/main/tests/benchmarks/prism-routing-100/benchmark.py#L47) with labeled category headers.
 
  ## Usage
 
@@ -49,12 +38,12 @@ Uses `nothink` template + [v27 system prompt](https://github.com/dcostenco/prism
  ollama pull dcostenco/prism-coder:32b
  ```
 
- ## Hardware requirements
 
- - **Mac**: M2 Ultra+ with ≥48 GB unified memory
- - **Linux + NVIDIA**: A100 40GB+, H100, B200
- - **Loaded VRAM**: ~22 GB
 
  ## License
 
- Apache-2.0 (inherits from QwQ-32B).
 
 
  # prism-coder:32b (v19) — 97.3% routing accuracy
 
+ LoRA fine-tune of **Qwen/QwQ-32B** for offline MCP tool routing.
 
+ ## Routing accuracy — 100-case Prism eval (May 15 2026, 3-seed mean)
 
  | Category | Score |
  |---|---|
  | **Overall** | **97.3% ± 0.6%** |
+ | All 7 MCP tools | 100% each |
+ | AAC plain-text | ~90% |
+ | translate | 83% |
  | edge (multi-step) | 100% |
+ | avg latency | 2.4s |
+ | invented tools | 0 |
 
+ Uses `nothink` template + v27 system prompt with labeled category headers.
 
  ## Usage
 
 
  ollama pull dcostenco/prism-coder:32b
  ```
 
+ ## Hardware
 
+ - **Mac**: M2 Ultra+ / 48GB+
+ - **Linux**: A100 40GB+
+ - **VRAM**: ~22 GB
 
  ## License
 
+ Apache-2.0.
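
Note on the "3-seed mean" figure: the README reports **97.3% ± 0.6%** on a 100-case eval but does not list the per-seed scores. The sketch below shows how such a figure could be computed. The three scores are hypothetical, chosen only because they are one combination consistent with the reported number when the sample standard deviation is used; they are not taken from the commit or the benchmark.

```python
from statistics import mean, stdev

# Hypothetical per-seed accuracies (cases correct out of 100).
# NOT from the source: one combination that reproduces the reported
# figure under the sample standard deviation.
seed_scores = [97, 97, 98]

mean_acc = mean(seed_scores)   # arithmetic mean across the three seeds
std_acc = stdev(seed_scores)   # sample standard deviation (n - 1 denominator)

summary = f"{mean_acc:.1f}% ± {std_acc:.1f}%"
print(summary)  # prints: 97.3% ± 0.6%
```

With the population standard deviation (`statistics.pstdev`) the same three scores would round to ± 0.5 instead, so it may be worth stating in the README which estimator the ± value uses.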