Commit 4ed0ef7 by dcostenco · verified · 1 Parent(s): 3277b4e

Add model card: 14B v18coder-base, BFCL V4 in progress, sibling to 7B

  1. README.md +129 -0
README.md ADDED
@@ -0,0 +1,129 @@
1
+ ---
2
+ language:
3
+ - en
4
+ - es
5
+ - fr
6
+ - pt
7
+ - de
8
+ - zh
9
+ - ja
10
+ - ko
11
+ - ru
12
+ - ar
13
+ - ro
14
+ - uk
15
+ license: apache-2.0
16
+ base_model: Qwen/Qwen2.5-Coder-14B-Instruct
17
+ pipeline_tag: text-generation
18
+ library_name: transformers
19
+ tags:
20
+ - qwen2
21
+ - function-calling
22
+ - tool-use
23
+ - aac
24
+ - accessibility
25
+ - prism
26
+ - synalux
27
+ - bfcl
28
+ - conversational
29
+ ---
30
+
31
+ # Prism-Coder 14B: Function Calling + AAC Sibling (32K context)
32
+
33
+ A fine-tune of **Qwen2.5-Coder-14B-Instruct** released **2026-05-04** as a sibling to [`prism-coder-7b`](https://huggingface.co/dcostenco/prism-coder-7b). The Synalux portal auto-routes paid-tier, medium-length AAC queries to it, which keeps inference self-hosted on a cloud GPU pool at $0 marginal cost compared with Claude/Gemini.
34
+
35
+ ## Sibling positioning
36
+
37
+ | Model | Use case | Context | RAM (Q4) |
38
+ |---|---|---|---|
39
+ | `prism-coder-7b` | iPad consumer AAC, free portal tier | 32K | ~5 GB |
40
+ | **`prism-coder-14b`** | **Mac/desktop AAC, paid portal tier (medium queries)** | **32K** | **~9 GB** |
41
+ | `prism-coder-32b` (in flight, Phase 1) | Synalux cloud paid-tier complex queries | 32K | ~20 GB |
42
+
43
+ ## Eval (Prism internal, 3-run StdDev 0%)
44
+
45
+ | Metric | Score |
46
+ |---|---|
47
+ | BFCL (Prism 64-test) | 85.9% |
48
+ | AAC realigned | 46/48 (95.8%) |
49
+ | Caregiver targeted | 18/20 |
50
+ | Emergency QA | 13/13 |
51
+ | Text correction | 14/15 |
52
+ | Translation | 8/8 |
53
+ | Ask AI | 5/5 |
54
+
55
+ The 14B is NOT explicitly AAC-trained (its data was BFCL/tool-calling focused). Its high AAC scores are emergent, coming from Qwen2.5-Coder-14B-Instruct's strong instruct-tuning plus format transfer from BFCL training. The 7B sibling explicitly includes AAC SFT data and edges out the 14B on caregiver targeted (20/20 vs 18/20) but not on general reasoning.
56
+
57
+ ## Berkeley BFCL V4 (in progress)
58
+
59
+ Handler integration PR open at [`ShishirPatil/gorilla#1332`](https://github.com/ShishirPatil/gorilla/pull/1332) supporting `prism-coder-14b-FC` alongside the 7B/32B/72B variants. Self-run with the official Berkeley toolkit is in progress; numbers will be appended once complete.
60
+
61
+ ## Use cases
62
+
63
+ ### Synalux portal: paid tier
64
+ Tier-aware routing dispatches:
65
+ - **Simple AAC queries** → 7B local (cheap, fast)
66
+ - **Medium queries (5-40 words)** → **14B local (this model)**: stronger reasoning, $0 marginal cost
67
+ - **Complex queries** → Claude Opus / Haiku per tier
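The tier-aware dispatch above can be sketched as a small word-count router. This is a minimal illustration, not the Synalux implementation: the function name `route_query`, the backend labels, and the rule that queries under 5 words count as simple and queries over 40 words count as complex are assumptions inferred only from the "5-40 words" range given for medium queries.

```python
def route_query(text: str) -> str:
    """Pick a backend label from the query's word count (hypothetical rule)."""
    n_words = len(text.split())
    if n_words < 5:
        return "prism-coder-7b"    # simple AAC query
    if n_words <= 40:
        return "prism-coder-14b"   # medium query (5-40 words)
    return "cloud"                 # complex query, sent to a hosted model per tier


if __name__ == "__main__":
    print(route_query("I want water"))
    print(route_query("Please add eat apples to the food category"))
```

A production router would likely weigh more than length (user tier, intent, tool requirements); word count is used here only because it is the one criterion the list states.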
68
+
69
+ This routing alone is estimated to save $190K-210K/year at 10K-user scale vs all-cloud routing.
70
+
71
+ ### Self-hosted Mac / desktop AAC
72
+ Q4_K_M GGUF (~9 GB) fits on Mac M2/M3/M4 with ≥16 GB RAM. Runs at 15-30 tok/s, which is comfortable for AAC turns.
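As a rough sanity check on the figures above, the weight-file size and per-reply latency can be estimated with simple arithmetic. The inputs are assumptions, not measurements from this repo: about 14.7 billion parameters for the 14B base and roughly 4.85 bits per weight for Q4_K_M. The 160-token reply length matches the `max_new_tokens` value used in the Format example.

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-file size in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def reply_seconds(n_tokens: int, tok_per_s: float) -> float:
    """Time to generate n_tokens at a steady decode rate."""
    return n_tokens / tok_per_s


if __name__ == "__main__":
    # Assumed inputs: 14.7e9 parameters, ~4.85 bits/weight for Q4_K_M.
    print(f"weights: {gguf_size_gb(14.7e9, 4.85):.1f} GB")
    for rate in (15, 30):
        print(f"160 tokens at {rate} tok/s: {reply_seconds(160, rate):.1f} s")
```

Weights are only part of the footprint: the KV cache and runtime overhead come on top, and both grow with context length.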
73
+
74
+ ## Format
75
+
76
+ ```python
77
+ from transformers import AutoModelForCausalLM, AutoTokenizer
78
+ import torch
79
+
80
+ tok = AutoTokenizer.from_pretrained("dcostenco/prism-coder-14b")
81
+ m = AutoModelForCausalLM.from_pretrained(
82
+     "dcostenco/prism-coder-14b",
83
+     torch_dtype=torch.bfloat16,
84
+     device_map="auto",
85
+ )
86
+ prompt = tok.apply_chat_template(
87
+     [{"role": "user", "content": "Add 'eat apples' to the food category."}],
88
+     tokenize=False,
89
+     add_generation_prompt=True,
90
+ )
91
+ inputs = tok(prompt, return_tensors="pt").to(m.device)
92
+ out = m.generate(**inputs, max_new_tokens=160, do_sample=True, temperature=0.3)
93
+ print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
94
+ ```
95
+
96
+ For Ollama users, a Q4_K_M GGUF is available via the `prism-coder:14b` tag in the Synalux ops fleet.
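For readers who do have the `prism-coder:14b` tag available in a local Ollama instance, a request might look like the sketch below. It assumes Ollama's default endpoint on `localhost:11434` and its `/api/chat` route; the helper names `build_chat_request` and `chat` are hypothetical, and the sampling options simply mirror the Transformers example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_chat_request(user_text: str, model: str = "prism-coder:14b") -> urllib.request.Request:
    """Build a non-streaming chat request for a local Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
        "options": {"temperature": 0.3, "num_predict": 160},
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text: str) -> str:
    """Send the request and return the assistant text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_chat_request(user_text)) as resp:
        return json.loads(resp.read())["message"]["content"]
```

Calling `chat("Add 'eat apples' to the food category.")` only works once the model tag has been pulled or created on that machine.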
97
+
98
+ ## Training
99
+
100
+ - Base: `Qwen/Qwen2.5-Coder-14B-Instruct`
101
+ - Method: DoRA SFT (resumed from base 14B SFT checkpoint-5000)
102
+ - Adapter: r=128, alpha=256, lora_dropout=0.05
103
+ - Schedule: 1 epoch, LR 1e-5 cosine, warmup 5%
104
+ - Data: glaive-function-calling-v2 + ToolACE + xlam-function-calling-60k + internal v17.1 BFCL (60K rows subsampled, Hammer-style 24% function-masked)
105
+ - Compute: H100×2 on Modal, ~10h total
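The schedule line above ("LR 1e-5 cosine, warmup 5%") can be illustrated with a small pure-Python function. This is a sketch under assumptions: warmup is taken to be linear from zero, the cosine is taken to decay to zero, and the `total_steps` value in the demo is a placeholder, since the card does not state the actual step count.

```python
import math


def lr_at(step: int, total_steps: int, peak_lr: float = 1e-5, warmup_frac: float = 0.05) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # placeholder step count, not the real training length
    for s in (0, 25, 50, 525, 1000):
        print(s, f"{lr_at(s, total):.2e}")
```

With these assumptions the rate climbs to the 1e-5 peak over the first 5% of steps and then falls smoothly for the remainder of the single epoch.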
106
+
107
+ ## License
108
+
109
+ Apache 2.0. Free for research and commercial use.
110
+
111
+ ## Citation
112
+
113
+ ```bibtex
114
+ @misc{prism-coder-14b-2026,
115
+ title = {Prism-Coder 14B: Function Calling + AAC Sibling Fine-Tune of Qwen2.5-Coder-14B},
116
+ author = {Synalux AI / Dmitri Costenco},
117
+ year = {2026},
118
+ month = {May},
119
+ url = {https://huggingface.co/dcostenco/prism-coder-14b},
120
+ note = {Sibling 7B model: https://huggingface.co/dcostenco/prism-coder-7b. PR: https://github.com/ShishirPatil/gorilla/pull/1332.}
121
+ }
122
+ ```
123
+
124
+ ## Related
125
+
126
+ - 7B sibling: [`dcostenco/prism-coder-7b`](https://huggingface.co/dcostenco/prism-coder-7b)
127
+ - Berkeley BFCL V4 PR: [`ShishirPatil/gorilla#1332`](https://github.com/ShishirPatil/gorilla/pull/1332)
128
+ - Synalux portal: [synalux.ai](https://synalux.ai)
129
+ - PrismAAC consumer app: [github.com/dcostenco/prism-aac](https://github.com/dcostenco/prism-aac)