dcostenco committed
Commit a586bd2 · verified · 1 Parent(s): d6a986b

Update model card with training details, cascade position, and file table

Files changed (1)
  1. README.md +37 -81
README.md CHANGED
@@ -1,102 +1,58 @@
  ---
- base_model: Qwen/Qwen3-1.7B
- pipeline_tag: text-generation
- license: apache-2.0
  language: en
  tags:
- - function-calling
- - tool-use
- - mcp
- - aac
- - on-device
- - iOS
- - prism-coder
  ---

- # prism-coder:1.7b (v36) 100% on-device routing
-
- On-device MCP tool router. Runs in ~1.4 GB RAM at Q4_K_M. Built for iPhone / low-memory devices where the 14B can't load.
-
- ## Routing accuracy — 100-case Prism eval (May 16 2026, 3-seed mean)
-
- | Category | Score |
- |---|---|
- | **Overall** | **100%** |
- | AAC plain-text | 100% |
- | session_compact_ledger | 100% |
- | session_load_context | 100% |
- | session_save_handoff | 100% |
- | session_save_ledger | 100% |
- | session_search_memory | 100% |
- | knowledge_search | 100% |
- | edge (compound/ambiguous) | 100% |
- | irrel (no-tool) | 100% |
- | avg latency | 0.34s |
- | invented tools | 0 |
-
- **100% across all 3 eval seeds (2027 / 2028 / 2029) and all 12 categories.**
-
- Fine-tuned via MLX LoRA (8 layers, 0.145% trainable params) on 414 targeted routing examples. Training: v36 corpus, LR 5e-6, 900 iters, val loss 0.056.
-
- ## iOS deployment
-
- GGUF: `prism-aac-1b7-q4km.gguf` (1.1 GB, ~1.4 GB RAM). Integrated via llama.cpp Swift SPM into [prism-aac](https://github.com/dcostenco/prism-aac).
-
- ## Usage

- ```bash
- ollama pull dcostenco/prism-coder:1b7
- ```
-
- ## Hardware

- - **iPhone**: A14+ (iPhone 12+), ~1.4 GB RAM
- - **Mac**: any M-series

- ---
-
- ### All Prism Coder models

- | Model | Accuracy | Size | Device | HuggingFace |
- |---|---|---|---|---|
- | **prism-coder:14b** | **98%** | 8.4 GB | Mac / iPad Pro 16GB | [dcostenco/prism-coder-14b](https://huggingface.co/dcostenco/prism-coder-14b) |
- | **prism-coder:8b** | **97%** | 4.7 GB | iPhone / iPad 8GB | [dcostenco/prism-coder-8b](https://huggingface.co/dcostenco/prism-coder-8b) |
- | **prism-coder:32b** | **97.3%** | 19 GB | Mac M2 Ultra+ | [dcostenco/prism-coder-32b](https://huggingface.co/dcostenco/prism-coder-32b) |
- | **prism-coder:1.7b** | **100%** | 1.1 GB | Any device / iPhone | [dcostenco/prism-coder-1.7b](https://huggingface.co/dcostenco/prism-coder-1.7b) |
 
- GitHub: [dcostenco/prism-coder](https://github.com/dcostenco/prism-coder) · AAC app: [dcostenco/prism-aac](https://github.com/dcostenco/prism-aac) · Portal: [synalux.ai](https://synalux.ai)

- ## Get the full stack

- The model routes tool calls but needs a backend to route TO:
 
  ```bash
- # Install the memory server (free, local, no API keys)
- npm install -g prism-mcp-server
-
- # Pull the model
- ollama pull dcostenco/prism-coder:1b7
-
- # Done — your AI agent now has persistent memory + 100% tool routing
  ```
 
- **Free tier:** local SQLite, no cloud, no account needed.
- **Synalux portal:** cloud sync, HIPAA dashboard, team access, Claude fallback → [synalux.ai](https://synalux.ai)
-
- ---
-
- ## Prism Routing Benchmark
 
- This model is evaluated on the [Prism Routing Benchmark](https://github.com/dcostenco/prism-coder/tree/main/tests/benchmarks/prism-routing-100) — a 100-case, 12-category eval for MCP tool routing. Run it yourself:
 
 
 
 
- ```bash
- git clone https://github.com/dcostenco/prism-coder
- cd prism-coder
- python3 tests/benchmarks/prism-routing-100/benchmark.py --models 1b7 --seed 2027
- ```
 
- Not a general function-calling benchmark (BFCL). This measures routing precision on 6 specific MCP tools — the task these models were built for. The value is **offline reliability at zero cost**, not competing with frontier models on arbitrary APIs.
 
 
 
 
 
- ## License

- Apache-2.0.
 
  ---
  language: en
+ license: apache-2.0
+ base_model: Qwen/Qwen3-1.7B
  tags:
+ - tool-calling
+ - routing
+ - aac
+ - gguf
+ - mlx
  ---

+ # prism-coder:1b7 - AAC Tool Router (1.7B)
 
+ Fine-tuned from **Qwen3-1.7B** for deterministic tool routing in the [Prism AAC](https://github.com/dcostenco/prism-aac) system.

+ **BFCL accuracy: 100%** on 100-case × 3 seeds routing benchmark (v36 corpus).
 
+ ## What it does

+ Routes user messages to one of 6 tools or plain text with zero hallucination:
 
+ | Tool | Trigger |
+ |------|---------|
+ | `session_load_context` | Load/fetch context for project X |
+ | `session_save_ledger` | Note / jot down / log / remember |
+ | `session_save_handoff` | Handoff to next agent / pass on |
+ | `session_compact_ledger` | Compact/archive/trim the ledger |
+ | `session_search_memory` | What did we discuss / recall session |
+ | `knowledge_search` | What do I know / stored notes |
+ | *(plain text)* | AAC phrases, math, facts, translation, time |

+ ## Deployment

+ **iOS / edge** runs on-device via llama.cpp (1.0 GB, Q4_K_M):

  ```bash
+ ollama run dcostenco/prism-coder:1b7
  ```
 
+ ## Files

+ | File | Size | Format |
+ |------|------|--------|
+ | `prism-coder-1b7-v36-q4km.gguf` | 1.0 GB | Q4_K_M GGUF (recommended) |
+ | `prism-aac-1b7-q4km.gguf` | 1.0 GB | Q4_K_M GGUF (legacy name) |
 
+ ## Training

+ - **Base**: Qwen3-1.7B
+ - **Method**: MLX LoRA fine-tuning (mlx_lm.lora)
+ - **Dataset**: v36_1b7 routing corpus (414 examples, 6-tool system prompt)
+ - **Hardware**: Apple Silicon (M-series), ~4GB RAM
+ - **Eval**: BFCL 100-case benchmark × 3 seeds → **100%**
 
+ ## System prompt

+ Uses the 13-rule routing system prompt. See [Prism AAC](https://github.com/dcostenco/prism-aac) for the canonical prompt used in training and inference.
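
Review note: the new README says the model routes each message to one of 6 tools or to plain text "with zero hallucination", and the removed version reported an `invented tools` count of 0. A minimal sketch of how a caller might check one response against that claim is below. The six tool names are copied from the routing table in the new README. The JSON shape (`{"name": ..., "arguments": ...}`) and the `classify_output` helper are assumptions made for illustration, not the output format confirmed by this commit or by the benchmark script.

```python
import json
from typing import Optional, Tuple

# Tool names copied from the routing table in the new README.
ALLOWED_TOOLS = frozenset({
    "session_load_context",
    "session_save_ledger",
    "session_save_handoff",
    "session_compact_ledger",
    "session_search_memory",
    "knowledge_search",
})


def classify_output(raw: str) -> Tuple[str, Optional[str]]:
    """Classify one model response (hypothetical helper).

    Returns ("tool", name) for a call to a listed tool,
    ("invented", name) for a call to an unlisted tool, and
    ("text", None) for anything that is not a JSON tool call.
    The {"name": ...} JSON shape is an assumption, not a confirmed format.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        # Not JSON at all: treat as a plain-text reply (AAC phrase, fact, ...).
        return ("text", None)
    if not isinstance(payload, dict) or not isinstance(payload.get("name"), str):
        # Valid JSON but not shaped like a tool call.
        return ("text", None)
    name = payload["name"]
    kind = "tool" if name in ALLOWED_TOOLS else "invented"
    return (kind, name)
```

Counting the `"invented"` results over an eval run would give a figure comparable in spirit to the `invented tools` row of the removed table, under the stated assumption about the output format.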