rockypod committed on Apr 21

Commit 8bc3864 (verified) · 1 Parent(s): d2f663d

v3.0.0: README update (103Q scorecard, T11 server functions, Q8/Q16/Q21 fixes)

Files changed (1)

README.md +176 -104

@@ -6,113 +6,127 @@ language:
 - vi
 base_model: Qwen/Qwen3-Coder-14B
 tags:
-- rust
 - dioxus
-- dioxus-0.7
-- rsx
 - fine-tuned
 - raft
 - code
-- mlx
-- gguf
-- wcag
-- accessibility
-- tailwind
-- unsloth
-- qwen3
-- local-llm
-- continue-dev
 pipeline_tag: text-generation
-model-index:
-- name: neotoi-coder-v2
-  results:
-  - task:
-      type: text-generation
-    metrics:
-    - type: custom
-      name: Dioxus 0.7 Weighted Exam (100Q)
-      value: 96.8
 ---
-# Neotoi Coder v2.0
-A Rust/Dioxus 0.7 specialist LLM — 96.8% on a 100-question weighted exam.
-Built with RAFT on a homelab RTX 3090 Ti. No cloud GPUs.
-**[Read the whole story on RockyPod.com →](https://rockypod.com/blog/neotoi-coder-v2-release)**
-**[Companion GitHub repo — benchmarks and integration guides](https://github.com/rockypod/neotoi-coder)**
-## Direct Download
-No account or approval required.
-| File | Size | Format |
-|---|---|---|
-| [neotoi-coder-v2.0-q4_k_m.gguf](https://huggingface.co/rockypod/neotoi-coder/resolve/main/neotoi-coder-v2.0-q4_k_m.gguf) | 8.4GB | GGUF Q4_K_M |
-| [mlx/ weights](https://huggingface.co/rockypod/neotoi-coder/tree/main/mlx) | 7.8GB | MLX 4-bit |
-## Quick Start
-### Apple Silicon (mlx_lm)
-```bash
-pip install mlx-lm
-mlx_lm server --model /path/to/neotoi-v2.0-mlx --port 8081
-```
-### Linux (Ollama)
-```bash
-ollama create neotoi-coder-v2 -f Modelfile
-ollama run neotoi-coder-v2
-```
-### LM Studio
-Download `neotoi-coder-v2.0-q4_k_m.gguf` above.
-See the [LM Studio setup guide](https://github.com/rockypod/neotoi-coder/blob/main/integration/lm_studio.md).
-### Continue.dev
-See the [Continue.dev config](https://github.com/rockypod/neotoi-coder/blob/main/integration/continue_dev.json).
-### Zed Editor
-See the [Zed setup guide](https://github.com/rockypod/neotoi-coder/blob/main/integration/zed.md).
-## Exam Results
-| Tier | Score | Max | Status |
-|---|---|---|---|
-| T1 Fundamentals | 11/12 | 12 | ✅ |
-| T2 RSX Syntax | 10/12 | 12 | ✅ |
-| T3 Signal Hygiene | 12/12 | 12 | ✅ Perfect |
-| T4 WCAG/ARIA | 14/14 × 1.5 | 21 | ✅ Perfect |
-| T5 use_resource | 8/8 × 1.5 | 12 | ✅ Perfect |
-| T6 Hard Reasoning | 10/10 × 2.0 | 20 | ✅ Perfect |
-| T7 Primitives+CSS | 11/12 × 1.5 | 18 | ✅ |
-| T8 GlobalSignal/i18n | 8/8 × 1.5 | 12 | ✅ Perfect |
-| T9 Static Navigator | 6/6 × 1.5 | 9 | ✅ Perfect |
-| T10 Dioxus 0.7.4 | 6/6 × 2.0 | 12 | ✅ Perfect |
-| **Total** | **135.5/140** | **140** | **96.8%** |
-## What It Knows
-- Dioxus 0.7 RSX brace syntax — never function-call style
-- `use_signal`, `use_resource` with correct three-arm match
-- `r#for` on label elements only, never inputs
-- `GlobalSignal` — `.write()` not `.set()` for statics
-- WCAG 2.2 AAA: tooltip always in DOM, listbox/option nesting,
-  `aria_labelledby` on all role containers
-- dioxus-primitives — no manual ARIA on managed components
-- `styles!()` macro for CSS modules
-- Tailwind v4 utility classes and semantic tokens
-- EN/VI i18n via pre-rsx! let bindings
-- Dark mode via `document::eval` + CSS custom properties
-- Static content navigation with `use_memo` filtering
-- `use_context` panics without provider — never returns None
-- `WritableResultExt` from Dioxus 0.7.4
-## What It Does Not Know
-- Playwright/E2E testing (out of scope)
-- Non-Dioxus web frameworks
-- WebSocket Stream+Sink real patterns (v2.1 target)
 ## Enabling Thinking Mode
@@ -120,40 +134,98 @@ See the [Zed setup guide](https://github.com/rockypod/neotoi-coder/blob/main/int
 | Field | Value |
 |---|---|
-| Before System | `<|im_start|>system` |
-| After System | `<|im_end|>` |
-| Before User | `<|im_start|>user` |
-| After User | `<|im_end|>` |
-| Before Assistant | `<|im_start|>assistant\n<think>` |
 ### llama.cpp
 ```bash
 ./llama-cli \
-  -m neotoi-coder-v2.0-q4_k_m.gguf \
   -ngl 99 \
   --temp 0.2 \
   -p "<|im_start|>user\nYour question<|im_end|>\n<|im_start|>assistant\n<think>"
 ```
-## Model Details
-- **Base model:** Qwen3-Coder-14B
-- **Method:** RAFT (Retrieval-Augmented Fine-Tuning)
-- **Dataset:** 4,185 curated Dioxus 0.7 examples
-- **Training:** 4 epochs, RTX 3090 Ti, ~4 hours
-- **Train loss:** 0.3727 (from clean Qwen3-14B base)
-- **Quantization:** Q4_K_M (8.4 GB) and MLX 4-bit (7.8 GB)
 ## License
-Neotoi Coder Community License v1.0.
-Commercial use of outputs permitted.
 Weight redistribution prohibited.
 ## Credits
-Built with [Unsloth](https://github.com/unslothai/unsloth),
-[Qwen3-Coder-14B](https://huggingface.co/Qwen/Qwen3-Coder-14B),
-[MLX](https://github.com/ml-explore/mlx), and
-[Claude Code](https://claude.ai/code).

 - vi
 base_model: Qwen/Qwen3-Coder-14B
 tags:
 - dioxus
+- rust
+- accessibility
+- wcag
 - fine-tuned
 - raft
 - code
+- server-functions
 pipeline_tag: text-generation
 ---
+# Neotoi Coder v3.0
+A Rust/Dioxus 0.7 specialist fine-tuned from Qwen3-Coder-14B using RAFT
+(Retrieval-Augmented Fine-Tuning). Expanded for Dioxus 0.7.3–0.7.5:
+scoped CSS, new event handlers, real WebSocket Stream+Sink, GlobalSignal
+cache rebuilds, and fullstack server functions — on top of v2.0's
+Tailwind v4 + WCAG 2.2 AAA + i18n coverage.
+## What's New in v3.0
+- **New Tier 11 — Server Functions:** Clean sweep 4.5/4.5.
+  `#[server]` with server-only extractors, fullstack WebSocket
+  one-line syntax, `ServerFnError` with custom HTTP status codes
+- **Scoped CSS (0.7.3):** `css!()` macro for inline scoped styles,
+  native `.module.css` imports
+- **New event handlers (0.7.3):** `onauxclick` (middle-click),
+  `onscrollend` (scroll-end detection)
+- **WebSocket Stream+Sink (0.7.4):** Real `stream.next()` and
+  `sink.send()` — no more `tokio::sleep` simulation (v2.0 gap closed)
+- **GlobalSignal cache rebuild:** Idiomatic `.write()` on pre-rsx! lets
+- **v2.0 regression fixes:** Q8 (button `r#type:`), Q16 and Q21
+  (RSX details/summary fidelity)
+- **Dataset:** 4,535 curated examples — 4,185 v2.0 base plus 350
+  new cross-topic pairs covering the surface above
+## Exam Results
+### v3.0 — 103 Question Weighted Exam
+| Tier | Questions | Weight | Score | Max | Status |
+|---|---|---|---|---|---|
+| T1 Fundamentals | Q1–12 | 1.0 | 11.0/12 | 12 | ✅ |
+| T2 RSX Syntax | Q13–24 | 1.0 | 8.0/12 | 12 | ⚠️ Regression |
+| T3 Signal Hygiene | Q25–36 | 1.0 | 9.5/12 | 12 | ✅ |
+| T4 WCAG/ARIA | Q37–50 | 1.5 | 19.5/21 | 21 | ✅ |
+| T5 use_resource | Q51–58 | 1.5 | 12.0/12 | 12 | ✅ Perfect |
+| T6 Hard Reasoning | Q59–68 | 2.0 | 15.0/20 | 20 | ✅ |
+| T7 Primitives+CSS | Q69–80 | 1.5 | 15.0/18 | 18 | ✅ |
+| T8 GlobalSignal/i18n | Q81–88 | 1.5 | 10.5/12 | 12 | ✅ |
+| T9 Static Navigator | Q89–94 | 1.5 | 9.0/9 | 9 | ✅ Perfect |
+| T10 Dioxus 0.7.4 | Q95–100 | 2.0 | 10.0/12 | 12 | ✅ |
+| T11 Server Functions | Q101–103 | 1.5 | 4.5/4.5 | 4.5 | ✅ Perfect |
+| **Overall** | **Q1–103** | | **124.0/144.5** | **144.5** | **✅ 85.8%** |
+Release threshold: 85% (123.0/144.5). v3.0 clears it with 1.0 point to spare.
+### Version History
+| Version | Score | Exam | Status |
+|---|---|---|---|
+| v1.0 | 51/60 (85.0%) | 60Q standard | Published |
+| v2.0 | 135.5/140 (96.8%) | 100Q weighted | Published |
+| v3.0 | 124.0/144.5 (85.8%) | 103Q weighted | Published |
+v3.0's overall percentage is lower than v2.0 because the 103Q exam is
+meaningfully harder: T11 Server Functions was added, T6 Hard Reasoning
+and T10 Dioxus 0.7.4 got new items, and T2 RSX Syntax has regressed
+against v2.0. The clean sweeps on T5 / T9 / T11 and the perfect fixes
+at Q8 / Q16 / Q21 reflect the intended v3 surface expansion.
+### Improvements Over v2.0
+- **Q8** — button `r#type:` attribute precision — fixed
+- **Q16** — RSX details/summary fidelity — fixed
+- **Q21** — semantic tag preservation — fixed
+- **T11 clean sweep** — 4.5/4.5 on the brand-new Server Functions tier
+- **T5 use_resource** — held perfect (12.0/12)
+- **T9 Static Navigator** — held perfect (9.0/9)
+### Known Regressions — v3.1 Targets
+T2 RSX Syntax dropped from 10/12 (v2.0) to 8.0/12 (v3.0):
+- **Q14** — RSX attribute placement precision
+- **Q15** — RSX attribute placement precision
+- **Q22** — RSX attribute placement precision
+Root cause under investigation. Targeted for v3.1.
+## Model Details
+- **Base model:** Qwen3-Coder-14B (fresh base — never fine-tune a fine-tune)
+- **Method:** RAFT (Retrieval-Augmented Fine-Tuning), Unsloth LoRA
+- **Epochs:** 4
+- **Training hardware:** RTX 3090 Ti (homelab)
+- **Dataset:** 4,535 curated examples (4,185 v2.0 base + 350 new)
+- **Scope:** Rust + Dioxus 0.7.5 + Tailwind v4 + WCAG 2.2 AAA +
+  fullstack server functions
+- **Quantization:** GGUF Q4_K_M (9 GB). MLX 4-bit: coming in v3.1
+- **Author:** Kevin Miller, Jr.
+## Install via Ollama
+```
+ollama pull rockypod/neotoi-coder
+```
+## Read the Full Story
+**[Read the whole story on RockyPod.com →](https://rockypod.com/blog/neotoi-coder-v2-release)**
+---
+## Files
+| File | Format | Size | Use case |
+|---|---|---|---|
+| `neotoi-coder-v3-q4_k_m_patched.gguf` | GGUF Q4_K_M | 9 GB | LM Studio, llama.cpp, Ollama |
+| `mlx/` | MLX 4-bit | coming in v3.1 | Apple Silicon via mlx_lm / Ollama 0.19+ |
+| `neotoi-coder-v2-q4_k_m.gguf` | GGUF Q4_K_M | 8.4 GB | v2.0 legacy |
 ## Enabling Thinking Mode
 | Field | Value |
 |---|---|
+| Before System | `<\|im_start\|>system` |
+| After System | `<\|im_end\|>` |
+| Before User | `<\|im_start\|>user` |
+| After User | `<\|im_end\|>` |
+| Before Assistant | `<\|im_start\|>assistant\n<think>` |
+| After Assistant | `<\|im_end\|>` |
+### Ollama (GGUF)
+```
+FROM neotoi-coder-v3-q4_k_m_patched.gguf
+PARAMETER temperature 0.2
+PARAMETER num_ctx 16384
+PARAMETER stop "<|im_end|>"
+TEMPLATE """{{- if .System }}<|im_start|>system
+{{ .System }}<|im_end|>
+{{ end }}<|im_start|>user
+{{ .Prompt }}<|im_end|>
+<|im_start|>assistant
+<think>
+"""
+SYSTEM You are Neotoi, an expert Rust and Dioxus 0.7 developer.
+```
+Or simply pull the published model:
+```
+ollama pull rockypod/neotoi-coder
+```
 ### llama.cpp
 ```bash
 ./llama-cli \
+  -m neotoi-coder-v3-q4_k_m_patched.gguf \
   -ngl 99 \
   --temp 0.2 \
   -p "<|im_start|>user\nYour question<|im_end|>\n<|im_start|>assistant\n<think>"
 ```
+## What It Knows
+Everything v2.0 knew, plus:
+- Native scoped CSS via `css!()` macro (0.7.3)
+- Native CSS modules with `.module.css` imports (0.7.3)
+- `onauxclick` (middle-click) and `onscrollend` event handlers (0.7.3)
+- Real WebSocket Stream+Sink — `stream.next()`, `sink.send()` (0.7.4)
+- GlobalSignal cache rebuild patterns
+- T11 server functions — `#[server]` with extractors, fullstack
+  WebSocket one-liner, `ServerFnError` with HTTP status codes (0.7.3)
+- `use_context_provider` / `use_context` placement — body only, never
+  inside rsx!
+Carried forward from v2.0: Dioxus 0.7 RSX brace syntax (never function-
+call), `use_signal`, `use_resource` three-arm match, `r#for` on labels
+only, `GlobalSignal` `.write()` semantics, WCAG 2.2 AAA (tooltip always
+in DOM, listbox/option nesting, `aria_labelledby` on role containers),
+dioxus-primitives discipline, `styles!()` macro, Tailwind v4 utilities
+and semantic tokens, EN/VI i18n via pre-rsx! let bindings, dark mode
+via `document::eval`, static content navigation with `use_memo`,
+`use_context` panic behavior, `WritableResultExt`.
+## Known Limitations
+- **T2 RSX precision at Q14 / Q15 / Q22** — attribute placement
+  regression vs v2.0; v3.1 target
+- **MLX format** — GGUF Q4_K_M only at v3.0 release; MLX build coming in v3.1
+- **Non-Dioxus web frameworks** — out of scope by design
+- **Playwright / E2E testing** — out of scope (see the SvelteCoder line)
+## Transparency
+Full dataset, exam questions, and per-question model outputs are
+published alongside the weights:
+- **Weights:** [HuggingFace — rockypod/neotoi-coder](https://huggingface.co/rockypod/neotoi-coder)
+- **Dataset + exam + per-question results:** [GitHub — rockypod/neotoi-coder](https://github.com/rockypod/neotoi-coder)
+- **Ollama:** `ollama pull rockypod/neotoi-coder`
 ## License
+Neotoi Coder Community License v1.0 — see LICENSE file.
+Commercial use of model outputs permitted.
 Weight redistribution prohibited.
+Mental health deployment requires written permission.
 ## Credits
+Built with:
+- [Unsloth](https://github.com/unslothai/unsloth) — 2x faster fine-tuning
+- [TRL](https://github.com/huggingface/trl) — SFTTrainer
+- [Qwen3-Coder-14B](https://huggingface.co/Qwen/Qwen3-Coder-14B) — base model
+- [MLX](https://github.com/ml-explore/mlx) — Apple Silicon inference (coming in v3.1)
+- [Claude Code](https://claude.ai/code) — dataset pipeline and training infrastructure
+- [Dioxus](https://dioxuslabs.com) — the framework this model specializes in
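
The v3.0 scorecard added in this diff can be cross-checked by summing the per-tier rows. The sketch below is a reader-side sanity check, not part of the commit. The tier names, weighted scores, and maxima are copied from the table. The `TIERS` list and the `scorecard` helper are hypothetical names introduced only for illustration.

```python
# Recompute the v3.0 weighted totals from the per-tier rows in the README diff.
# Each tuple is (tier, weighted score, weighted max), copied from the table.
# TIERS and scorecard() are illustrative names, not part of the repository.
TIERS = [
    ("T1 Fundamentals", 11.0, 12.0),
    ("T2 RSX Syntax", 8.0, 12.0),
    ("T3 Signal Hygiene", 9.5, 12.0),
    ("T4 WCAG/ARIA", 19.5, 21.0),
    ("T5 use_resource", 12.0, 12.0),
    ("T6 Hard Reasoning", 15.0, 20.0),
    ("T7 Primitives+CSS", 15.0, 18.0),
    ("T8 GlobalSignal/i18n", 10.5, 12.0),
    ("T9 Static Navigator", 9.0, 9.0),
    ("T10 Dioxus 0.7.4", 10.0, 12.0),
    ("T11 Server Functions", 4.5, 4.5),
]


def scorecard(tiers):
    """Return (total score, total max, percentage rounded to one decimal)."""
    total = sum(score for _, score, _ in tiers)
    maximum = sum(mx for _, _, mx in tiers)
    return total, maximum, round(100 * total / maximum, 1)


total, maximum, percent = scorecard(TIERS)

# Tiers where the weighted score equals the weighted maximum.
perfect = [name for name, score, mx in TIERS if score == mx]

# Margin over the stated release threshold of 123.0 points.
margin = total - 123.0

print(f"{total:.1f}/{maximum:.1f} = {percent}%")
print("Perfect tiers:", ", ".join(perfect))
print(f"Margin over threshold: {margin:.1f}")
```

Run as-is, this reproduces the headline figures in the table (124.0/144.5, 85.8%), the three tiers marked Perfect (T5, T9, T11), and the 1.0-point margin over the stated 123.0 threshold.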