rockypod committed on
Commit 8b867da · verified · 1 Parent(s): 44044f7

Pivot main to family hub: 8B and 4B now in standalone repos rockypod/neotoi-coder-8b and -4b

Files changed (1)
  1. README.md +20 -27
README.md CHANGED
@@ -4,10 +4,7 @@ license_name: neotoi-coder-community-license
  language:
  - en
  - vi
- base_model:
- - Qwen/Qwen3-Coder-14B
- - Qwen/Qwen3-8B
- - Qwen/Qwen3-4B
  tags:
  - dioxus
  - rust
@@ -19,29 +16,24 @@ tags:
  - server-functions
  - gguf
  - qwen3
  pipeline_tag: text-generation
  ---
 
- # Neotoi Coder
 
- A Rust / Dioxus 0.7 specialist LLM. v3.1 ships in **three sizes** -
- 8B, 4B, and 14B - all fine-tuned via RAFT (Retrieval-Augmented
- Fine-Tuning) on Qwen3 base models. Optimized for production-quality
- Dioxus 0.7 components with Tailwind v4 and WCAG 2.2 AAA accessibility.
 
- ## Variants
-
- | Variant | Base | Params | Q4_K_M | Spec exam (104Q weighted, max 144.5) | Files |
  |---|---|---|---|---|---|
- | **8B** (flagship) | Qwen3-8B | 8.2B (6.95B non-embed) | 4.68 GB | **144.5 / 144.5 - 100.00%** | [`v3.1.0-8b` branch](https://huggingface.co/rockypod/neotoi-coder/tree/v3.1.0-8b) |
- | 4B | Qwen3-4B | 4.0B (3.6B non-embed, tied) | 2.33 GB | 143.5 / 144.5 - 99.31% | [`v3.1.0-4b` branch](https://huggingface.co/rockypod/neotoi-coder/tree/v3.1.0-4b) |
- | 14B (legacy) | Qwen3-Coder-14B | 14.8B (13.2B non-embed) | 8.40 GB | 137.0 / 144.5 - 94.81% | this branch (`main`) |
 
- All three clear the 90% publication bar **and** the 95% release bar with all per-tier floors PASS. The 8B is the recommended default; pick the 4B if disk / RAM is tight, pick the 14B for the broadest coverage.
 
- > **The 8B and 4B GGUFs live on separate branches** - switch the branch
- > dropdown at the top of this page (currently showing `main`) to
- > `v3.1.0-8b` or `v3.1.0-4b` to see and download them.
 
  ## Install via Ollama
 
@@ -79,8 +71,6 @@ Re-graded 2026-04-26 with the patched `run_grade_v31.py` (Q87 now also accepts `
 
  Tier floors (82% on weight-1.0 / 1.5 tiers, 88% on weight-2.0 tiers): all PASS for all three variants.
 
- The 4B's only miss is Q8 (T1 RSX conversion) - generation truncated mid-`<think>` block. The 14B drops on RSX-heavy questions (Q17, Q22, Q30, Q37, Q39, Q43); v3.2 target.
-
  ## What's new in v3.1 (vs v3.0)
 
  - **Two new sizes**: 8B and 4B alongside the 14B base, both surpassing the 14B's score.
@@ -102,16 +92,19 @@ The 4B's only miss is Q8 (T1 RSX conversion) - generation truncated mid-`<thin
  | **v3.1 8B** | **Qwen3-8B (8.2B)** | **144.5/144.5 (100.00%)** | **103Q weighted** | **4,880** |
  | v3.1 4B | Qwen3-4B (4.0B, tied) | 143.5/144.5 (99.31%) | 103Q weighted | 4,880 |
 
- ## Files in this branch (`main`, 14B)
 
  | File | Format | Size | Use case |
  |---|---|---|---|
- | `neotoi-coder-v3.1-q4_k_m.gguf` | GGUF Q4_K_M | 8.4 GB | LM Studio, llama.cpp, Ollama (current) |
  | `neotoi-coder-v3-q4_k_m_patched.gguf` | GGUF Q4_K_M | 9 GB | v3.0 legacy |
  | `neotoi-coder-v2.0-q4_k_m.gguf` | GGUF Q4_K_M | 9 GB | v2.0 legacy |
  | `neotoi-coder-v1-q4_k_m_final.gguf` | GGUF Q4_K_M | 9 GB | v1.0 legacy |
 
- For the 8B and 4B Q4_K_M GGUFs (with and without the `qwen3.thinking=true` patch), switch to the `v3.1.0-8b` or `v3.1.0-4b` branch via the dropdown above.
 
  ## Enabling Thinking Mode
 
@@ -128,7 +121,7 @@ This model emits Qwen3 native `<think>...</think>` blocks. Thinking is on by def
  | Before Assistant | `<\|im_start\|>assistant\n<think>` |
  | After Assistant | `<\|im_end\|>` |
 
- ### Ollama (custom Modelfile)
 
  ```Modelfile
  FROM neotoi-coder-v3.1-q4_k_m.gguf
@@ -145,7 +138,7 @@ TEMPLATE """{{- if .System }}<|im_start|>system
  SYSTEM You are Neotoi, an expert Rust and Dioxus 0.7 developer.
  ```
 
- Or simply pull the published model:
 
  ```
  ollama pull rockypod/neotoi-coder:15b
@@ -183,7 +176,7 @@ ollama pull rockypod/neotoi-coder:15b
 
  ## Transparency
 
- - **Weights:** [HuggingFace - rockypod/neotoi-coder](https://huggingface.co/rockypod/neotoi-coder)
  - **Exam runner, grader, per-question results:** [GitHub - rockypod/neotoi-coder](https://github.com/rockypod/neotoi-coder)
  - **Ollama:** `ollama pull rockypod/neotoi-coder:8b` (or `:4b`, or `:15b`)
 
 
  language:
  - en
  - vi
+ base_model: Qwen/Qwen3-Coder-14B
  tags:
  - dioxus
  - rust
 
  - server-functions
  - gguf
  - qwen3
+ - family-hub
  pipeline_tag: text-generation
  ---
 
+ # Neotoi Coder - Family Hub
 
+ A Rust / Dioxus 0.7 specialist LLM. v3.1 ships in **three sizes**, each as
+ its own standalone repo:
 
+ | Variant | Repo | Base | Params | Q4_K_M | Spec exam (104Q weighted, max 144.5) |
  |---|---|---|---|---|---|
+ | **8B** (flagship) | [`rockypod/neotoi-coder-8b`](https://huggingface.co/rockypod/neotoi-coder-8b) | Qwen3-8B | 8.2B (6.95B non-embed) | 4.68 GB | **144.5 / 144.5 - 100.00%** |
+ | 4B | [`rockypod/neotoi-coder-4b`](https://huggingface.co/rockypod/neotoi-coder-4b) | Qwen3-4B | 4.0B (3.6B non-embed, tied) | 2.33 GB | 143.5 / 144.5 - 99.31% |
+ | 14B (legacy) | this repo (`rockypod/neotoi-coder`) | Qwen3-Coder-14B | 14.8B (13.2B non-embed) | 8.40 GB | 137.0 / 144.5 - 94.81% |
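
The percentage column in the variants table can be re-derived from the raw weighted scores. The following is a minimal sketch, not the project's grader: the names `MAX_SCORE`, `SCORES`, and `percent` are illustrative assumptions, and only the score values and the 144.5 maximum come from the table.

```python
# Weighted spec-exam scores copied from the variants table (maximum 144.5).
# All identifiers here are hypothetical; they are not part of run_grade_v31.py.
MAX_SCORE = 144.5
SCORES = {"8B": 144.5, "4B": 143.5, "14B": 137.0}


def percent(score: float, max_score: float = MAX_SCORE) -> float:
    """Return the score as a percentage of the maximum, rounded to 2 places."""
    return round(100.0 * score / max_score, 2)


# Percentage per variant, in the same order as the table rows.
PERCENTAGES = {name: percent(score) for name, score in SCORES.items()}

if __name__ == "__main__":
    for name, pct in PERCENTAGES.items():
        print(f"{name}: {pct:.2f}%")
```

Comparing these values against the publication and release bars quoted in the README is then a one-line check per variant.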
 
+ All three clear the 90% publication bar **and** the 95% release bar with all per-tier floors PASS. The **8B is the recommended default**; pick the **4B** if disk / RAM is tight (or for ~40% faster generation), pick the **14B** for the broadest coverage of legacy material.
 
+ > Each variant lives in its **own model repo** so it's separately searchable and discoverable on HuggingFace. This page (`rockypod/neotoi-coder`) is the family hub *and* still hosts the legacy 14B GGUFs.
 
  ## Install via Ollama
 
 
 
  Tier floors (82% on weight-1.0 / 1.5 tiers, 88% on weight-2.0 tiers): all PASS for all three variants.
 
  ## What's new in v3.1 (vs v3.0)
 
  - **Two new sizes**: 8B and 4B alongside the 14B base, both surpassing the 14B's score.
 
  | **v3.1 8B** | **Qwen3-8B (8.2B)** | **144.5/144.5 (100.00%)** | **103Q weighted** | **4,880** |
  | v3.1 4B | Qwen3-4B (4.0B, tied) | 143.5/144.5 (99.31%) | 103Q weighted | 4,880 |
 
+ ## Files in this repo (`rockypod/neotoi-coder`, 14B legacy GGUFs)
 
  | File | Format | Size | Use case |
  |---|---|---|---|
+ | `neotoi-coder-v3.1-q4_k_m.gguf` | GGUF Q4_K_M | 8.4 GB | LM Studio, llama.cpp, Ollama (current 14B) |
  | `neotoi-coder-v3-q4_k_m_patched.gguf` | GGUF Q4_K_M | 9 GB | v3.0 legacy |
  | `neotoi-coder-v2.0-q4_k_m.gguf` | GGUF Q4_K_M | 9 GB | v2.0 legacy |
  | `neotoi-coder-v1-q4_k_m_final.gguf` | GGUF Q4_K_M | 9 GB | v1.0 legacy |
 
+ For the **8B** and **4B** Q4_K_M GGUFs, go to their dedicated repos:
+
+ - https://huggingface.co/rockypod/neotoi-coder-8b
+ - https://huggingface.co/rockypod/neotoi-coder-4b
 
  ## Enabling Thinking Mode
 
 
  | Before Assistant | `<\|im_start\|>assistant\n<think>` |
  | After Assistant | `<\|im_end\|>` |
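
Because the "Before Assistant" prefix already ends with `<think>`, the raw completion may begin inside the thinking block and contain only the closing tag. The sketch below shows one way a client might separate the reasoning from the final answer. It assumes the raw completion is available as a plain string; `split_thinking` is a hypothetical helper, not part of any Neotoi or Ollama API.

```python
from typing import Tuple

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def split_thinking(raw: str) -> Tuple[str, str]:
    """Split raw model output into (thinking, answer).

    Handles three cases: a full <think>...</think> block, a block whose
    opening tag was supplied by the prompt prefix (only </think> appears),
    and a generation truncated before </think> (no answer produced).
    """
    has_open = OPEN_TAG in raw
    has_close = CLOSE_TAG in raw
    if not has_open and not has_close:
        # No thinking markers at all: treat everything as the answer.
        return "", raw.strip()
    # Drop everything up to and including the opening tag, if present.
    rest = raw.split(OPEN_TAG, 1)[1] if has_open else raw
    if CLOSE_TAG in rest:
        thinking, answer = rest.split(CLOSE_TAG, 1)
        return thinking.strip(), answer.strip()
    # Opening tag without a closing tag: output was cut off mid-think.
    return rest.strip(), ""
```

An empty answer paired with non-empty thinking is a quick signal that the generation was cut off inside the thinking block, in which case raising the output token limit is one thing to try.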
 
+ ### Ollama (custom Modelfile, 14B)
 
  ```Modelfile
  FROM neotoi-coder-v3.1-q4_k_m.gguf
 
  SYSTEM You are Neotoi, an expert Rust and Dioxus 0.7 developer.
  ```
 
+ Or simply:
 
  ```
  ollama pull rockypod/neotoi-coder:15b
 
 
  ## Transparency
 
+ - **Per-variant weights:** [`-8b`](https://huggingface.co/rockypod/neotoi-coder-8b) · [`-4b`](https://huggingface.co/rockypod/neotoi-coder-4b) · this repo (14B)
  - **Exam runner, grader, per-question results:** [GitHub - rockypod/neotoi-coder](https://github.com/rockypod/neotoi-coder)
  - **Ollama:** `ollama pull rockypod/neotoi-coder:8b` (or `:4b`, or `:15b`)
 
182