dcostenco commited on
Commit
395cd09
Β·
verified Β·
1 Parent(s): 8cfe73e

Update model card: add v31 (95.1% BFCL, smem/know boundary fix)

Browse files
Files changed (1) hide show
  1. README.md +38 -36
README.md CHANGED
@@ -1,56 +1,58 @@
1
  ---
2
  language: en
3
  license: apache-2.0
4
- base_model: Qwen/Qwen3-8B
5
  tags:
6
- - tool-calling
7
- - routing
8
- - aac
 
9
  - gguf
 
10
  ---
11
 
12
- # prism-coder:8b β€” AAC Tool Router (8B)
13
 
14
- Fine-tuned from **Qwen3-8B** for deterministic tool routing in the [Prism AAC](https://github.com/dcostenco/prism-aac) system.
 
15
 
16
- **BFCL accuracy: 95%** on 100-case Γ— 3 seeds routing benchmark (v30 corpus).
17
 
18
- ## What it does
 
 
 
19
 
20
- Routes user messages to one of 6 tools or plain text. Intermediate quality tier in the desktop cascade (14B β†’ 32B β†’ cloud). 8B is retained as an **iOS/edge** tier for 8GB-RAM devices where 14B does not fit.
21
 
22
- | Tool | Trigger |
23
- |------|---------|
24
- | `session_load_context` | Load/fetch context for project X |
25
- | `session_save_ledger` | Note / jot down / log / remember |
26
- | `session_save_handoff` | Handoff to next agent / pass on |
27
- | `session_compact_ledger` | Compact/archive/trim the ledger |
28
- | `session_search_memory` | What did we discuss / recall session |
29
- | `knowledge_search` | What do I know / stored notes |
30
- | *(plain text)* | AAC phrases, math, facts, translation, time |
31
 
32
- ## Deployment
33
 
34
- ```bash
35
- ollama run dcostenco/prism-coder:8b-v30
36
- ```
 
 
 
37
 
38
- ## Files
39
 
40
- | File | Size | Format |
41
- |------|------|--------|
42
- | `prism-coder-8b-v30-q4km.gguf` | 4.7 GB | Q4_K_M GGUF (v30, recommended) |
43
- | `prism-aac-8b-q4km.gguf` | 5.0 GB | Q4_K_M GGUF (legacy v29) |
44
 
45
- ## Training
46
 
47
- - **Base**: Qwen3-8B
48
- - **Method**: MLX LoRA fine-tuning (mlx_lm.lora)
49
- - **Dataset**: v36_1b7 routing corpus (6-tool system prompt)
50
- - **Hardware**: Apple Silicon (M-series)
51
- - **Eval**: BFCL 100-case benchmark β†’ **95%**
52
 
53
- ## Cascade position
54
 
55
- Desktop cascade: **14B β†’ 32B β†’ cloud Claude**
56
- iOS cascade: **8B** (primary offline tier for 8GB-RAM devices)
 
 
 
 
1
  ---
2
  language: en
3
  license: apache-2.0
 
4
  tags:
5
+ - tool-routing
6
+ - function-calling
7
+ - prism-aac
8
+ - qwen3
9
  - gguf
10
+ base_model: Qwen/Qwen3-8B
11
  ---
12
 
13
+ # prism-coder:8b β€” Tool Routing Model (iOS Tier)
14
 
15
+ Fine-tuned Qwen3-8B for 6-tool routing in the Prism AAC system.
16
+ Primary deployment: **iOS/edge** via llama.cpp GGUF.
17
 
18
+ ## Versions
19
 
20
+ | Version | File | BFCL | Notes |
21
+ |---------|------|------|-------|
22
+ | v31 | `qwen3-8b-v31-q4km.gguf` | 95.1% | Surgical smem/know boundary + save fixes |
23
+ | v30 | `qwen3-8b-v30-q4km.gguf` | 95.0% | Routing corpus v36_1b7 |
24
 
25
+ ## BFCL Routing Benchmark (v31)
26
 
27
+ - **95.1%** β€” 3-seed mean (seeds 2027/2028/2029), 100 cases each
28
+ - Eval: MLX inference, greedy (temp=0), Qwen3 thinking suppressed
29
+ - Gate: β‰₯90% = deploy
 
 
 
 
 
 
30
 
31
+ ## Tools
32
 
33
+ 1. `session_load_context` β€” load/fetch/resume project context
34
+ 2. `session_save_ledger` β€” note/log/remember/record
35
+ 3. `session_save_handoff` β€” handoff/relay/next-agent transition
36
+ 4. `session_compact_ledger` β€” compact/archive ledger
37
+ 5. `session_search_memory` β€” recall past sessions/conversations
38
+ 6. `knowledge_search` β€” search stored notes/knowledge base
39
 
40
+ ## Cascade Role
41
 
42
+ iOS fallback tier. Desktop cascade uses 14B β†’ 32B β†’ cloud Claude.
43
+ 8B handles edge/offline scenarios where RAM < 6GB.
 
 
44
 
45
+ ## Usage (Ollama)
46
 
47
+ ```bash
48
+ ollama pull dcostenco/prism-coder:8b-v30
49
+ ollama run dcostenco/prism-coder:8b-v30
50
+ ```
 
51
 
52
+ ## Training
53
 
54
+ - Base: `Qwen3-8B` (MLX 4-bit)
55
+ - Framework: MLX-LM LoRA (8 layers, batch 2, grad-checkpoint)
56
+ - v31 data: 361 train / 41 valid (targeted smem/know boundary augmentations)
57
+ - v31 LR: 3e-6 (surgical, 200 iters)
58
+ - Peak memory: 7.0 GB