Commit f5ef1d0 (verified), committed by dcostenco
1 parent: ce550f2

docs: honest numbers — 14B=98% ties Opus, cascade is fallback not booster

Files changed (1)
  1. README.md +17 -16
README.md CHANGED
@@ -13,31 +13,32 @@ tags:
  - prism-coder
  ---

- # prism-coder:1.7b (v19) — 86% on-device routing

- On-device MCP tool router based on **Qwen/Qwen3-1.7B**. Runs in 1.6 GB RAM at Q4_K_M. Built for iOS / Android / older Macs where larger tiers can't fit.

- ## Test results: Prism routing 100-case eval (May 15 2026, 3-seed mean)

  | Category | Score |
  |---|---|
- | **Overall** | **86.3% ± 0.6%** |
  | session_load_context | 100% |
  | session_search_memory | 100% |
- | brave_web_search | 100% |
- | AAC plain-text | **100%** |
- | translate plain-text | 100% |
- | knowledge_search | 43% |
- | session_save_ledger | 71% |
- | session_save_handoff | 87% |
- | **avg latency** | **2.3s** |
- | **invented tools** | 1 |
 
- **Below the 90% gate**: published for the on-device / cost-sensitive use case, not accuracy-critical work. AAC routing is 100% (life-critical path).
 
  ## iOS deployment

- GGUF: `prism-aac-1b7-q4km.gguf` (1.0 GB, ~1.6 GB RAM). Integrated via llama.cpp Swift SPM in [Prism AAC](https://github.com/dcostenco/prism-aac).

  ## Usage
 
@@ -47,9 +48,9 @@ ollama pull dcostenco/prism-coder:1b7

  ## Hardware

- - **iPhone**: A14 Bionic+ (iPhone 12+), ~1.6 GB free RAM
  - **Mac**: any M-series

  ## License

- Apache-2.0 (inherits from Qwen3-1.7B).
 
  - prism-coder
  ---

+ # prism-coder:1.7b (v19) — 88% on-device routing

+ On-device MCP tool router. Runs in 1.6 GB RAM at Q4_K_M. Built for iPhone / low-memory devices where the 14B can't load.

+ ## Routing accuracy — 100-case Prism eval (May 15 2026, 3-seed mean)

  | Category | Score |
  |---|---|
+ | **Overall** | **88%** |
+ | AAC plain-text | **100%** |
  | session_load_context | 100% |
  | session_search_memory | 100% |
+ | knowledge_search | 71% |
+ | session_save_ledger | 77% |
+ | avg latency | 1.6s |
+ | invented tools | 0-2 |
+
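The table reports per-category accuracy, an overall score checked against a 90% gate, and a count of invented tools. The README does not show the eval harness, so the following is only a minimal sketch of how such numbers could be computed. The case format, the `score` function, and the rule that an "invented" tool is any predicted name outside the known tool set are assumptions for illustration, and the sample cases are made up, not eval data.

```python
from collections import defaultdict

# Tool names copied from the table rows; the real tool registry is not shown
# in this README, so treat this set as an assumption.
KNOWN_TOOLS = {
    "session_load_context",
    "session_search_memory",
    "knowledge_search",
    "session_save_ledger",
}
GATE = 0.90  # the 90% gate mentioned in the README


def score(cases):
    """Score routing cases.

    cases: list of (category, expected, predicted) tuples, where expected and
    predicted are a tool name, or None for a plain-text (no tool) answer.
    """
    if not cases:
        raise ValueError("need at least one case")
    per_cat = defaultdict(lambda: [0, 0])  # category -> [correct, total]
    invented = 0
    for category, expected, predicted in cases:
        per_cat[category][1] += 1
        if predicted == expected:
            per_cat[category][0] += 1
        # Assumed definition: a predicted tool name that does not exist.
        if predicted is not None and predicted not in KNOWN_TOOLS:
            invented += 1
    correct = sum(c for c, _ in per_cat.values())
    total = sum(t for _, t in per_cat.values())
    overall = correct / total
    return {
        "overall": overall,
        "per_category": {k: c / t for k, (c, t) in per_cat.items()},
        "invented_tools": invented,
        "passes_gate": overall >= GATE,
    }


def seed_mean(overalls):
    """Average the overall score across seeds (the README uses 3 seeds)."""
    return sum(overalls) / len(overalls)


# Made-up cases for illustration only.
toy_cases = [
    ("AAC plain-text", None, None),
    ("knowledge_search", "knowledge_search", "knowledge_search"),
    ("knowledge_search", "knowledge_search", "web_lookup"),  # invented tool
    ("session_save_ledger", "session_save_ledger", "session_save_ledger"),
]
report = score(toy_cases)
```

Counting invented tools separately from accuracy keeps a wrong-but-valid route distinct from a call to a tool that does not exist.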
+ **Below the 90% gate** — this is the on-device fallback, not the accuracy tier. In production, the [Prism AAC](https://github.com/dcostenco/prism-aac) cascade tries the 14B (98%) first and only falls back to the 1.7B when the 14B can't be loaded.
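The fallback order described above can be sketched as a "first tier that loads wins" loop. This is a hedged illustration, not the Prism AAC implementation: `Tier`, `pick_tier`, the loader callables, and the 14B memory figure are hypothetical. Only the ~1.6 GB figure for the 1.7B tier comes from this README.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Tier:
    name: str
    min_free_ram_gb: float      # assumed memory needed to load this tier
    load: Callable[[], object]  # hypothetical loader; raises if loading fails


def pick_tier(tiers: Sequence[Tier], free_ram_gb: float) -> Tuple[str, object]:
    """Return (name, model) for the first tier that fits and loads.

    Tiers are tried in order, so the most accurate tier goes first. The
    smaller tier is used only when a larger one cannot be loaded; outputs
    of the tiers are never combined.
    """
    for tier in tiers:
        if free_ram_gb < tier.min_free_ram_gb:
            continue  # not enough memory, try the next (smaller) tier
        try:
            return tier.name, tier.load()
        except (MemoryError, OSError):
            continue  # load failed at runtime, fall through to the next tier
    raise RuntimeError("no tier could be loaded")


def _load_14b_fails() -> object:
    # Stand-in for a device where the 14B cannot be loaded.
    raise MemoryError("14B does not fit")


tiers = [
    Tier("14B", 10.0, _load_14b_fails),       # 10.0 GB is a placeholder
    Tier("1.7B", 1.6, lambda: "small-model"),  # ~1.6 GB per this README
]
chosen_name, chosen_model = pick_tier(tiers, free_ram_gb=16.0)
```

Putting the memory check before the load attempt avoids starting a load that is known not to fit, while the `except` branch still covers loads that fail for other reasons.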
+
+ AAC routing is 100%: the life-critical path (expressing pain, asking for help) did not fail in this eval.

+ Uses system-prompt engineering only (no LoRA: Q4_K_M quantization erases fine-tuning signal at 1.7B scale).

  ## iOS deployment

+ GGUF: `prism-aac-1b7-q4km.gguf` (1.0 GB, ~1.6 GB RAM). Integrated via llama.cpp Swift SPM.

  ## Usage
 
 

  ## Hardware

+ - **iPhone**: A14+ (iPhone 12+), ~1.6 GB RAM
  - **Mac**: any M-series

  ## License

+ Apache-2.0.