Commit 9388767 (verified) by dcostenco · 1 Parent(s): 4eb22d2

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +32 -26
README.md CHANGED
@@ -5,6 +5,7 @@ language:
5
  tags:
6
  - tool-calling
7
  - function-calling
 
8
  - prism
9
  - synalux
10
  - memory-augmented
@@ -14,22 +15,23 @@ base_model: Qwen/Qwen3-32B
14
  pipeline_tag: text-generation
15
  ---
16
 
17
- # Prism Coder 32B — Tool-Routing Model
18
 
19
  **100% strict accuracy** on eval_300 (300 cases, 3-seed validated, zero failures).
20
 
21
- Prism Coder 32B is a fine-tuned Qwen3-32B model specialized for routing user requests to the correct Prism Memory tool. It handles 17 distinct tools plus NO_TOOL abstention across natural phrasing, adversarial traps, disambiguation, edge cases, multi-intent, cascades, parameter extraction, and verification categories.
22
 
23
  ## Performance
24
 
25
  | Metric | Score |
26
  |--------|-------|
27
  | **eval_300 strict** | **300/300 (100%)** |
28
- | 3-seed validation | 300/300 × 3 |
29
  | avg latency | 1.4s (M5 Max) |
30
  | hallucinations | 0 |
 
31
 
32
- ### Per-Category Breakdown
33
 
34
  | Category | Score |
35
  |----------|-------|
@@ -43,17 +45,9 @@ Prism Coder 32B is a fine-tuned Qwen3-32B model specialized for routing user req
43
  | param_extraction | 25/25 |
44
  | verifier | 25/25 |
45
 
46
- ## Tools Supported
47
 
48
- 17 Prism Memory tools: `session_load_context`, `session_save_ledger`, `session_save_handoff`, `session_search_memory`, `session_forget_memory`, `session_health_check`, `session_compact_ledger`, `session_export_memory`, `session_task_route`, `session_save_experience`, `session_synthesize_edges`, `session_backfill_links`, `knowledge_search`, `knowledge_forget`, `knowledge_upvote`, `knowledge_downvote`, `knowledge_set_retention`.
49
-
50
- ## Training
51
-
52
- - **Base model**: Qwen/Qwen3-32B (4-bit quantized for training)
53
- - **Method**: MLX LoRA SFT (rank=16, 8 layers, scale=20.0) × 14 iterative rounds
54
- - **Training data**: 300 eval-aligned prompts + targeted failure remediation per round
55
- - **Quantization**: Q4_K_M via llama.cpp (18 GB)
56
- - **Hardware**: Apple M5 Max 48 GB unified memory
57
 
58
  ## Usage
59
 
@@ -61,24 +55,36 @@ Prism Coder 32B is a fine-tuned Qwen3-32B model specialized for routing user req
61
 
62
  ```bash
63
  ollama pull dcostenco/prism-coder:32b
64
- ollama run dcostenco/prism-coder:32b "Load context for the billing-service project."
 
65
  ```
66
 
67
- ### llama.cpp
68
 
69
- ```bash
70
- llama-cli -m prism-coder-32b-q4km.gguf \
71
- -p "<|im_start|>system\nYou are Synalux...<|im_end|>\n<|im_start|>user\nLoad context for billing.<|im_end|>\n<|im_start|>assistant\n"
 
 
 
 
72
  ```
73
 
74
  ## Model Family
75
 
76
- | Model | Size | eval_300 |
77
- |-------|------|----------|
78
- | prism-coder:1b7 | 2.2 GB | 100% |
79
- | prism-coder:4b | 2.5 GB | 100% |
80
- | prism-coder:14b | 9.0 GB | 99.7% |
81
- | **prism-coder:32b** | **18 GB** | **100%** |
 
 
 
 
 
 
 
82
 
83
  ## License
84
 
@@ -86,4 +92,4 @@ Apache 2.0
86
 
87
  ## Author
88
 
89
- [Synalux](https://synalux.com) — AI-powered clinical and development tools.
 
5
  tags:
6
  - tool-calling
7
  - function-calling
8
+ - code-generation
9
  - prism
10
  - synalux
11
  - memory-augmented
 
15
  pipeline_tag: text-generation
16
  ---
17
 
18
+ # Prism Coder 32B — Unified Tool-Routing & Code Generation Model
19
 
20
  **100% strict accuracy** on eval_300 (300 cases, 3-seed validated, zero failures).
21
 
22
+ Prism Coder 32B is a fine-tuned Qwen3-32B model that handles both Prism Memory tool routing (17 tools + NO_TOOL abstention) and general code generation. One model, two jobs: no need for separate routing and IDE models.
23
 
24
  ## Performance
25
 
26
  | Metric | Score |
27
  |--------|-------|
28
  | **eval_300 strict** | **300/300 (100%)** |
29
+ | 3-seed validation | 300/300 x 3 |
30
  | avg latency | 1.4s (M5 Max) |
31
  | hallucinations | 0 |
32
+ | context window | **16,384 tokens** |
33
 
34
+ ### Per-Category Breakdown (eval_300)
35
 
36
  | Category | Score |
37
  |----------|-------|
 
45
  | param_extraction | 25/25 |
46
  | verifier | 25/25 |
47
 
48
+ ## Unified Model
49
 
50
+ This model replaces both `prism-coder:32b` (routing) and `prism-ide:32b` (code generation). The LoRA fine-tuning affects only 8 of 64 layers, preserving the base model's general coding capability while adding tool routing that scores 100% on eval_300.
 
 
 
 
 
 
 
 
51
 
52
  ## Usage
53
 
 
55
 
56
  ```bash
57
  ollama pull dcostenco/prism-coder:32b
58
+ # Same model also available as:
59
+ ollama pull dcostenco/prism-ide:32b
60
  ```
61
 
62
+ ### Modelfile
63
 
64
+ ```
65
+ FROM prism-coder-32b-q4km.gguf
66
+ PARAMETER temperature 0
67
+ PARAMETER num_ctx 16384
68
+ PARAMETER num_predict 512
69
+ PARAMETER stop "<|im_end|>"
70
+ PARAMETER stop "<|endoftext|>"
71
  ```
72
 
73
  ## Model Family
74
 
75
+ | Model | Size | eval_300 | Context |
76
+ |-------|------|----------|---------|
77
+ | prism-coder:1b7 | 2.2 GB | 100% | 8K |
78
+ | prism-coder:4b | 2.5 GB | 100% | 8K |
79
+ | prism-coder:14b | 9.0 GB | 99.7% | 16K |
80
+ | **prism-coder:32b** | **18 GB** | **100%** | **16K** |
81
+
82
+ ## Training
83
+
84
+ - **Base**: Qwen/Qwen3-32B (4-bit quantized for training)
85
+ - **Method**: MLX LoRA SFT (rank=16, 8 layers, scale=20.0) x 14 rounds
86
+ - **Quantization**: Q4_K_M via llama.cpp (18 GB)
87
+ - **Hardware**: Apple M5 Max 48 GB
88
 
89
  ## License
90
 
 
92
 
93
  ## Author
94
 
95
+ [Synalux](https://synalux.com)
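
Reviewer note on the Modelfile hunk: this commit removes the `ollama run` and `llama-cli` examples and adds a Modelfile that pins `temperature 0`, `num_ctx 16384`, `num_predict 512`, and two stop tokens. A client that pulls the published tag rather than building from that Modelfile could pass the same values per request. The following is a minimal sketch, not the author's client code. It assumes a local Ollama server on the default port `11434` and uses Ollama's `/api/generate` endpoint; the helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Values copied from the Modelfile added in this commit.
MODELFILE_OPTIONS = {
    "temperature": 0,
    "num_ctx": 16384,
    "num_predict": 512,
    "stop": ["<|im_end|>", "<|endoftext|>"],
}


def build_generate_request(prompt, model="dcostenco/prism-coder:32b"):
    """Build the JSON body for Ollama's POST /api/generate endpoint.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # request a single JSON object, not a stream
        "options": dict(MODELFILE_OPTIONS),  # shallow copy of the settings
    }


def generate(prompt, host="http://localhost:11434"):
    """Send the request to an Ollama server assumed to be running at `host`."""
    body = json.dumps(build_generate_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With stream=False the generated text is in the "response" field.
        return json.loads(resp.read())["response"]


# Example call (requires a running server and the pulled model):
#   generate("Load context for the billing-service project.")
```

The prompt in the commented call is the one from the removed `ollama run` example. The format of the model's routing output is not documented in this diff, so the sketch returns the raw text without parsing it.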