rockypod committed 4 days ago

Commit 32d0ca9 (verified) · 1 parent: 93caec9

docs: add 8B + 4B v3.2 scorecards, mark all three variants published

Files changed (1)

README.md +130 -83

@@ -25,90 +25,64 @@ pipeline_tag: text-generation
 # Neotoi Coder
-A Rust / Dioxus 0.7 specialist LLM. **v3.2 ships the 15B variant first**; v3.2
-8B and 4B are trained and pending exam review. v3.1 8B and 4B remain the
-current published smaller variants — pick by hardware, not currency.
 All variants are fine-tuned via RAFT (Retrieval-Augmented Fine-Tuning) on
-Qwen3 base models. Optimized for production-quality Dioxus 0.7 components
-with Tailwind v4 styling and WCAG 2.2 AAA accessibility.
-## Variants (current)
 | Variant | Repo | Base | Params | Q4_K_M | Spec exam |
 |---|---|---|---|---|---|
-| **15B v3.2** (this repo, new) | `rockypod/neotoi-coder` | Qwen3-Coder-14B | 14.8B | 8.4 GB | **156.0 / 164.0 — 95.12%** (114Q, 13 tiers) |
-| 8B v3.1 (flagship score) | [`rockypod/neotoi-coder-8b`](https://huggingface.co/rockypod/neotoi-coder-8b) | Qwen3-8B | 8.2B | 4.68 GB | 144.5 / 144.5 — 100.00% (103Q, 11 tiers) |
-| 4B v3.1 | [`rockypod/neotoi-coder-4b`](https://huggingface.co/rockypod/neotoi-coder-4b) | Qwen3-4B | 4.0B | 2.33 GB | 143.5 / 144.5 — 99.31% (103Q, 11 tiers) |
-> v3.2 8B and 4B are trained and staged on hardware, pending exam review;
-> their HF repos will update shortly after.
-> **MLX format for v3.2 is available now** at `mlx-v3.2/` in this repo
 > (7.7 GB, 4-bit quantized, 2 shards). v3.1 MLX remains at `mlx-v3.1/`.
 ## Install via Ollama
 ```bash
-# 15B v3.2 — new (broadest coverage, 0.7.4–0.7.9 surface)
 ollama pull rockypod/neotoi-coder:latest
 ollama pull rockypod/neotoi-coder:15b      # explicit size tag
-# 8B v3.1 — best score on the v3.1 exam, ~40% faster than 15B
 ollama pull rockypod/neotoi-coder:8b
-# 4B v3.1 — disk / RAM constrained
 ollama pull rockypod/neotoi-coder:4b
 ```
-## What's new in v3.2 (15B vs v3.1 15B)
-### Score deltas
-- **Overall:** 94.81% → **95.12%** on a harder, longer exam (114Q vs 103Q,
-  max 164 vs 144.5, two new tiers).
-- **T4 WCAG / ARIA: 78.6% → 100.0%** — the biggest single jump. v3.1's
-  largest weakness (drops `rsx!` macro on ARIA-heavy components) is fixed
-  in v3.2 by the dedicated WCAG correction set in T55 training.
-- **All 11 original v3.1 tiers stay at ≥87.5%**; nine of them at 100%.
-- 113/114 questions parse cleanly through the patched grader; 1 question
-  hit a generation-side degeneration loop (Q77 in T7 Primitives+CSS).
-### New Dioxus 0.7 surface
-v3.2 expands coverage from Dioxus 0.7.0 through **Dioxus 0.7.9** (full 0.7
-series, with 0.7.6 being the final 0.7 release before 0.8). New training
-topics added:
-- **T44 Scoped CSS and CSS modules** (Dioxus 0.7.3, PR #5087)
-- **T45 SyncStore + `use_store_sync`** (Dioxus 0.7.2, cross-thread reactive state)
-- **T46 New events:** `onauxclick`, `onscrollend` (Dioxus 0.7.3)
-- **T47 Server-only extractors** + `serde_qs` query string support (0.7.1 + 0.7.3)
-- **T48 0.7.2 bug-fix awareness** — optional callback props, child router layouts, drag/drop serialisation, `use_drop` in prelude
-- **T49 0.7.4 APIs:** `WritableResultExt`, WebSocket `Stream + Sink`, FFI for Kotlin/Java/Swift, iOS widget bundling
-- **T50 0.7.6 RSX additions:** `inert` global attribute, web panic resilience, blanket `IntoAttributeValue` for borrowed values, `Action::PartialEq`
-- **T51 `use_context` vs `consume_context` semantics** — panic-on-missing-provider (returns `T`, not `Option<T>`)
-### Eval-driven corrections
-Step 2 of the v3.2 pipeline added correction-style training across six
-failure axes identified in v3.1 evaluation:
-- **T52 Format Compliance** (90 examples) — fenced-code-only outputs, no
-  prose preamble, no orphan `</think>` tags.
-- **T53 Preserve-and-Append** (45) — `.ftl` catalogs, `Cargo.toml`
-  dependencies, Route enums: add to existing files, don't regenerate.
-- **T54 Dioxus 0.7 idiom reinforcement** (35) — `Outlet::<Route>`,
-  `use_init_i18n`, DaisyUI v5 / Tailwind v4 CSS-first patterns.
-- **T55 WCAG / ARIA corrections** (22) — drives the 78.6% → 100% jump.
-- **T56 `dioxus-i18n` + Fluent** (22) — `LanguageIdentifier`, `t!()`,
-  append-not-replace catalog edits.
-- **T57 Scope discipline** (12) — answer exactly what was asked.
-## v3.2 spec-exam scorecard (15B)
-114 questions across 13 tiers, max 164.0 weighted points. Publication bar
-90% (147.6 / 164.0), release bar 95% (155.8 / 164.0).
 | Tier | Count | Max wt | Raw | Wtd | Rate | Floor | Status |
 |---|---|---|---|---|---|---|---|
@@ -127,26 +101,88 @@ failure axes identified in v3.1 evaluation:
 | **T13 SyncStore (NEW)** | 2 | 3.0 | 0 | 0.0 | 0.0% | 82% | ⚠️ |
 | **Total** | **114** | **164.0** | **109** | **156.0** | **95.12%** | — | — |
-T12 and T13 are new in v3.2 and below tier-floor on a strict reading. The
-v3.2 release is published anyway because the **overall score clears both
-the 90% publication bar and the 95% release bar**, and the residual misses
-fall in the two newly-added tiers (T13 has only two questions, which makes
-any single real miss a floor failure).
-## Dataset
-- **5,287 curated examples** (up from v3.1's 4,880; +407 net after gates).
-- **57 topics total** (T1–T57). v3.2 added T44–T51 (new Dioxus 0.7.2–0.7.9
-  surface, ~190 examples) and T52–T57 (eval-driven corrections, ~225
-  examples).
-- All examples grounded in canonical `llms.txt` / `llms-full.txt` from
-  `dioxuslabs.com/learn/0.7/` (frozen with 0.7.6, the final 0.7 release).
-- Dataset is **Dioxus-only**: no standalone Tailwind CSS examples, no
-  Svelte / React / Vue / Python examples. Tailwind v4 is used as a
-  styling tool inside Dioxus components, not as a separate stack.
-- Cross-stack contamination scan during v3.2 build removed 489 rows of
-  `fn app(` → `fn App(` PascalCase fixes, `launch(app)` → `launch(App)`,
-  and three `useEffect(` → `use_effect(` React leaks.
 ## Version History
@@ -158,7 +194,9 @@ any single real miss a floor failure).
 | v3.1 15B | Qwen3-Coder-14B (14.8B) | 137.0/144.5 (94.81%) | 103Q weighted, 11 tiers | 4,880 |
 | v3.1 8B | Qwen3-8B (8.2B) | 144.5/144.5 (100.00%) | 103Q weighted, 11 tiers | 4,880 |
 | v3.1 4B | Qwen3-4B (4.0B, tied) | 143.5/144.5 (99.31%) | 103Q weighted, 11 tiers | 4,880 |
-| **v3.2 15B** | **Qwen3-Coder-14B (14.8B)** | **156.0 / 164.0 (95.12%)** | **114Q weighted, 13 tiers** | **5,287** |
 ## Files in this repo (15B and historical)
@@ -170,11 +208,10 @@ any single real miss a floor failure).
 | `neotoi-coder-v3-q4_k_m_patched.gguf` | GGUF Q4_K_M | 9 GB | v3.0 archive |
 | `neotoi-coder-v2.0-q4_k_m.gguf` | GGUF Q4_K_M | 9 GB | v2.0 archive |
 | `neotoi-coder-v1-q4_k_m_final.gguf` | GGUF Q4_K_M | 9 GB | v1.0 archive |
-| `mlx-v3.1/` | MLX safetensors | — | v3.1 MLX (Apple Silicon, Pro/Max/Ultra GPUs) |
 | `mlx-v3/` | MLX safetensors | — | v3.0 MLX archive |
-For the **8B** and **4B** Q4_K_M GGUFs (currently v3.1), see their
-dedicated repos:
 - https://huggingface.co/rockypod/neotoi-coder-8b
 - https://huggingface.co/rockypod/neotoi-coder-4b
@@ -184,3 +221,13 @@ dedicated repos:
 This model emits Qwen3 native `<think>...</think>` blocks. Thinking is on
 by default with the `_patched.gguf` quants on inference backends that
 honor `qwen3.thinking`.

 # Neotoi Coder
+A Rust / Dioxus 0.7 specialist LLM fine-tuned on 5,287 curated examples
+covering the full Dioxus 0.7 series (0.7.0–0.7.9), Tailwind v4, and
+WCAG 2.2 AAA accessibility. All three v3.2 variants are published.
 All variants are fine-tuned via RAFT (Retrieval-Augmented Fine-Tuning) on
+Qwen3 base models using LoRA adapters (Unsloth), optimized for
+production-quality Dioxus 0.7 components.
+## Variants
 | Variant | Repo | Base | Params | Q4_K_M | Spec exam |
 |---|---|---|---|---|---|
+| **15B v3.2** (this repo) | `rockypod/neotoi-coder` | Qwen3-Coder-14B | 14.8B | 8.4 GB | **156.0 / 164.0 — 95.12%** (114Q, 13 tiers) |
+| **8B v3.2** | [`rockypod/neotoi-coder-8b`](https://huggingface.co/rockypod/neotoi-coder-8b) | Qwen3-8B | 8.2B | 4.68 GB | **160.0 / 164.0 — 97.56%** (114Q, 13 tiers) |
+| **4B v3.2** | [`rockypod/neotoi-coder-4b`](https://huggingface.co/rockypod/neotoi-coder-4b) | Qwen3-4B | 4.0B | 2.33 GB | **160.0 / 164.0 — 97.56%** (114Q, 13 tiers) |
+All three clear the 90% publication bar and the 95% release bar.
+The **8B and 4B tie at 97.56%** with complementary failure patterns:
+- 4B scores **100% on T13 SyncStore** (8B scored 50%) and **100% on T8 GlobalSignal/i18n** (8B scored 87.5%)
+- 8B scores **100% on T12 Format Compliance** (4B scored 66.7%)
+Pick by hardware: 4B (2.3 GB) if disk/RAM is tight (it also scores 100% on SyncStore);
+8B (4.7 GB) for the best format compliance at moderate size; 15B (8.4 GB) for
+the broadest Dioxus 0.7.4–0.7.9 surface coverage.
+> **MLX format for v3.2** is available at `mlx-v3.2/` in this repo
 > (7.7 GB, 4-bit quantized, 2 shards). v3.1 MLX remains at `mlx-v3.1/`.
 ## Install via Ollama
 ```bash
+# 15B v3.2 — broadest Dioxus 0.7.4–0.7.9 surface
 ollama pull rockypod/neotoi-coder:latest
 ollama pull rockypod/neotoi-coder:15b      # explicit size tag
+# 8B v3.2: tied top weighted score (97.56%), ~40% faster than 15B, perfect format compliance
 ollama pull rockypod/neotoi-coder:8b
+# 4B v3.2 — disk / RAM constrained, perfect SyncStore
 ollama pull rockypod/neotoi-coder:4b
 ```
+Tags: `:latest` / `:15b`, `:8b`, `:4b`, `:v3.1` (archive). Each Modelfile
+sets `num_ctx 8192`, `temperature 0.2`, and prefills `<think>` on the
+assistant turn so Qwen3 native chain-of-thought is emitted by default.
+## v3.2 Scorecards (114Q, max 164.0)
+### All-variant summary
+| Variant | Score | Weighted | Raw | T12 Format | T13 SyncStore |
+|---|---|---|---|---|---|
+| **8B** | **97.56%** | 160.0 / 164.0 | 111 / 114 | ✅ 100.0% | ⚠️ 50.0% |
+| **4B** | **97.56%** | 160.0 / 164.0 | 112 / 114 | ⚠️ 66.7% | ✅ 100.0% |
+| **15B** | **95.12%** | 156.0 / 164.0 | 109 / 114 | ⚠️ 83.3% | ⚠️ 0.0% |
+### 15B scorecard
 | Tier | Count | Max wt | Raw | Wtd | Rate | Floor | Status |
 |---|---|---|---|---|---|---|---|
 | **T13 SyncStore (NEW)** | 2 | 3.0 | 0 | 0.0 | 0.0% | 82% | ⚠️ |
 | **Total** | **114** | **164.0** | **109** | **156.0** | **95.12%** | — | — |
+### 8B scorecard
+| Tier | Count | Max wt | Raw | Wtd | Rate | Floor | Status |
+|---|---|---|---|---|---|---|---|
+| T1 Fundamentals | 12 | 12.0 | 12 | 12.0 | 100.0% | 82% | ✅ |
+| T2 RSX Syntax | 12 | 12.0 | 11 | 11.0 | 91.7% | 82% | ✅ |
+| T3 Signal Hygiene | 12 | 12.0 | 12 | 12.0 | 100.0% | 82% | ✅ |
+| T4 WCAG / ARIA | 15 | 22.5 | 15 | 22.5 | **100.0%** | 82% | ✅ |
+| T5 use_resource | 8 | 12.0 | 8 | 12.0 | 100.0% | 82% | ✅ |
+| T6 Hard Reasoning | 10 | 20.0 | 10 | 20.0 | 100.0% | 88% | ✅ |
+| T7 Primitives + CSS | 13 | 19.5 | 13 | 19.5 | **100.0%** | 82% | ✅ |
+| T8 GlobalSignal / i18n | 8 | 12.0 | 7 | 10.5 | 87.5% | 82% | ✅ |
+| T9 Static Navigator | 6 | 9.0 | 6 | 9.0 | 100.0% | 82% | ✅ |
+| T10 Dioxus 0.7.4 | 6 | 12.0 | 6 | 12.0 | 100.0% | 88% | ✅ |
+| T11 Server Functions | 4 | 6.0 | 4 | 6.0 | 100.0% | 82% | ✅ |
+| **T12 Format Compliance** | 6 | 12.0 | 6 | 12.0 | **100.0%** | 88% | ✅ |
+| **T13 SyncStore** | 2 | 3.0 | 1 | 1.5 | 50.0% | 82% | ⚠️ |
+| **Total** | **114** | **164.0** | **111** | **160.0** | **97.56%** | — | — |
+T13 floor failure is structural — only 2 questions means any single miss = 50%.
+### 4B scorecard
+| Tier | Count | Max wt | Raw | Wtd | Rate | Floor | Status |
+|---|---|---|---|---|---|---|---|
+| T1 Fundamentals | 12 | 12.0 | 12 | 12.0 | 100.0% | 82% | ✅ |
+| T2 RSX Syntax | 12 | 12.0 | 12 | 12.0 | 100.0% | 82% | ✅ |
+| T3 Signal Hygiene | 12 | 12.0 | 12 | 12.0 | 100.0% | 82% | ✅ |
+| T4 WCAG / ARIA | 15 | 22.5 | 15 | 22.5 | **100.0%** | 82% | ✅ |
+| T5 use_resource | 8 | 12.0 | 8 | 12.0 | 100.0% | 82% | ✅ |
+| T6 Hard Reasoning | 10 | 20.0 | 10 | 20.0 | 100.0% | 88% | ✅ |
+| T7 Primitives + CSS | 13 | 19.5 | 13 | 19.5 | **100.0%** | 82% | ✅ |
+| T8 GlobalSignal / i18n | 8 | 12.0 | 8 | 12.0 | **100.0%** | 82% | ✅ |
+| T9 Static Navigator | 6 | 9.0 | 6 | 9.0 | 100.0% | 82% | ✅ |
+| T10 Dioxus 0.7.4 | 6 | 12.0 | 6 | 12.0 | 100.0% | 88% | ✅ |
+| T11 Server Functions | 4 | 6.0 | 4 | 6.0 | 100.0% | 82% | ✅ |
+| **T12 Format Compliance** | 6 | 12.0 | 4 | 8.0 | 66.7% | 88% | ⚠️ |
+| **T13 SyncStore** | 2 | 3.0 | 2 | 3.0 | **100.0%** | 82% | ✅ |
+| **Total** | **114** | **164.0** | **112** | **160.0** | **97.56%** | — | — |
+T12 misses: q111 (old `cx.render` idiom + orphan `</think>`), q112 (missing `rsx!`).
+The 4B also scores **100% on T8 GlobalSignal/i18n** where the 8B scored 87.5%.
+## What's new in v3.2
+### Score deltas vs v3.1
+- **15B:** 94.81% → **95.12%** on a harder, longer exam (114Q vs 103Q,
+  max 164 vs 144.5, two new tiers). T4 WCAG/ARIA: **78.6% → 100.0%**.
+- **8B:** 100.00% → **97.56%**. The exam is harder: two new tiers were added, and the 8B
+  misses one of the two T13 SyncStore questions. T7 Primitives+CSS and T12 Format Compliance
+  both hit 100% where the 15B scored 92.3% and 83.3%.
+- **4B:** 99.31% → **97.56%** — same exam difficulty note. T13 SyncStore
+  hits 100% (a new tier where the 8B scores only 50%).
+### New Dioxus 0.7 surface
+v3.2 expands coverage from Dioxus 0.7.0 through **Dioxus 0.7.9** (full 0.7
+series). New training topics:
+- **T44 Scoped CSS and CSS modules** (Dioxus 0.7.3)
+- **T45 SyncStore + `use_store_sync`** (Dioxus 0.7.2, cross-thread reactive state)
+- **T46 New events:** `onauxclick`, `onscrollend` (0.7.3)
+- **T47 Server-only extractors** + `serde_qs` query string support
+- **T48 0.7.2 bug-fix awareness** — optional callback props, child router layouts, `use_drop` in prelude
+- **T49 0.7.4 APIs:** `WritableResultExt`, WebSocket `Stream + Sink`, FFI for Kotlin/Java/Swift, iOS widget bundling
+- **T50 0.7.6 RSX additions:** `inert` attribute, web panic resilience, `IntoAttributeValue` for `&T`, `Action::PartialEq`
+- **T51 `use_context` vs `consume_context`** — panic-on-missing-provider semantics
+### Eval-driven corrections (T52–T57)
+- **T52 Format Compliance** — fenced-code-only outputs, no prose preamble, no orphan `</think>`
+- **T53 Preserve-and-Append** — `.ftl` catalogs, `Cargo.toml`, route enums: add without replacing
+- **T54 Dioxus 0.7 idiom reinforcement** — `Outlet::<Route>`, `t!()`, DaisyUI v5 / Tailwind v4
+- **T55 WCAG / ARIA corrections** — drives the 78.6% → 100% jump on the 15B
+- **T56 `dioxus-i18n` + Fluent** — `LanguageIdentifier`, catalog append
+- **T57 Scope discipline** — answer exactly what was asked
+### Dataset
+- **5,287 curated examples** across **57 topics** (up from 4,880 / 43 in v3.1)
+- Cross-stack contamination scan removed 489 rows: `fn app(` → `fn App(`, `launch(app)` → `launch(App)`, three `useEffect(` → `use_effect(` React leaks
 ## Version History
 | v3.1 15B | Qwen3-Coder-14B (14.8B) | 137.0/144.5 (94.81%) | 103Q weighted, 11 tiers | 4,880 |
 | v3.1 8B | Qwen3-8B (8.2B) | 144.5/144.5 (100.00%) | 103Q weighted, 11 tiers | 4,880 |
 | v3.1 4B | Qwen3-4B (4.0B, tied) | 143.5/144.5 (99.31%) | 103Q weighted, 11 tiers | 4,880 |
+| v3.2 15B | Qwen3-Coder-14B (14.8B) | 156.0/164.0 (95.12%) | 114Q weighted, 13 tiers | 5,287 |
+| v3.2 8B | Qwen3-8B (8.2B) | 160.0/164.0 (97.56%) | 114Q weighted, 13 tiers | 5,287 |
+| **v3.2 4B** | **Qwen3-4B (4.0B, tied)** | **160.0/164.0 (97.56%)** | **114Q weighted, 13 tiers** | **5,287** |
 ## Files in this repo (15B and historical)
 | `neotoi-coder-v3-q4_k_m_patched.gguf` | GGUF Q4_K_M | 9 GB | v3.0 archive |
 | `neotoi-coder-v2.0-q4_k_m.gguf` | GGUF Q4_K_M | 9 GB | v2.0 archive |
 | `neotoi-coder-v1-q4_k_m_final.gguf` | GGUF Q4_K_M | 9 GB | v1.0 archive |
+| `mlx-v3.1/` | MLX safetensors | — | v3.1 MLX archive |
 | `mlx-v3/` | MLX safetensors | — | v3.0 MLX archive |
+For the **8B v3.2** and **4B v3.2** Q4_K_M GGUFs, see their dedicated repos:
 - https://huggingface.co/rockypod/neotoi-coder-8b
 - https://huggingface.co/rockypod/neotoi-coder-4b
 This model emits Qwen3 native `<think>...</think>` blocks. Thinking is on
 by default with the `_patched.gguf` quants on inference backends that
 honor `qwen3.thinking`.
+## License
+**Fine-tuned weights:** Neotoi Coder Community License v1.0 — commercial use
+of outputs permitted, weight redistribution prohibited, mental health deployment
+requires written permission. See [LICENSE](LICENSE).
+**Base model:** [Qwen3-Coder-14B](https://huggingface.co/Qwen/Qwen3-Coder-14B) — Apache 2.0 © Alibaba Cloud.
+Built on a homelab RTX 3090 Ti in Washington State.
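
The scorecard arithmetic in the diff above can be rechecked by hand. The sketch below recomputes the 8B totals from the per-tier rows, compares them against the 90% publication bar and 95% release bar, and lists tiers that fall below their floor. The names `ROWS_8B` and `summarize` are illustrative only and are not part of the repo's grader; the rows are transcribed from the 8B scorecard, and the bars are assumed to be fractions of the maximum weighted points, as the old README's "147.6 / 164.0" and "155.8 / 164.0" figures suggest.

```python
# Rows transcribed from the 8B scorecard: (tier, max weight, weighted score, floor %).
ROWS_8B = [
    ("T1 Fundamentals", 12.0, 12.0, 82),
    ("T2 RSX Syntax", 12.0, 11.0, 82),
    ("T3 Signal Hygiene", 12.0, 12.0, 82),
    ("T4 WCAG / ARIA", 22.5, 22.5, 82),
    ("T5 use_resource", 12.0, 12.0, 82),
    ("T6 Hard Reasoning", 20.0, 20.0, 88),
    ("T7 Primitives + CSS", 19.5, 19.5, 82),
    ("T8 GlobalSignal / i18n", 12.0, 10.5, 82),
    ("T9 Static Navigator", 9.0, 9.0, 82),
    ("T10 Dioxus 0.7.4", 12.0, 12.0, 88),
    ("T11 Server Functions", 6.0, 6.0, 82),
    ("T12 Format Compliance", 12.0, 12.0, 88),
    ("T13 SyncStore", 3.0, 1.5, 82),
]

# Assumption: both bars are fractions of the maximum weighted points.
PUBLICATION_BAR = 0.90
RELEASE_BAR = 0.95


def summarize(rows):
    """Recompute totals, bar checks, and below-floor tiers for one scorecard."""
    max_total = sum(max_wt for _, max_wt, _, _ in rows)
    weighted_total = sum(wtd for _, _, wtd, _ in rows)
    # A tier is below floor when its weighted rate (in percent) is under the floor.
    below_floor = [
        tier for tier, max_wt, wtd, floor in rows
        if wtd / max_wt * 100 < floor
    ]
    return {
        "max_total": max_total,
        "weighted_total": weighted_total,
        "rate_pct": round(weighted_total / max_total * 100, 2),
        "clears_publication": weighted_total >= PUBLICATION_BAR * max_total,
        "clears_release": weighted_total >= RELEASE_BAR * max_total,
        "below_floor": below_floor,
    }


summary = summarize(ROWS_8B)
print(summary)
```

Under these assumptions the recomputed total matches the published 160.0 / 164.0 (97.56%), and T13 SyncStore is the only tier below its floor, which agrees with the note that a single miss on a two-question tier yields 50%.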