restokes92 committed 5 days ago

Commit f2f4b95 (verified) · Parent: 158eccb

Kaiju-Coder MLX 1.0: model card, license, notice, Modelfile

Files changed (4)

LICENSE +201 -0
Modelfile +30 -0
NOTICE +57 -0
README.md +187 -0

LICENSE ADDED

	@@ -0,0 +1,201 @@

+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+   1. Definitions.
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+   END OF TERMS AND CONDITIONS
+   APPENDIX: How to apply the Apache License to your work.
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+   Copyright [yyyy] [name of copyright owner]
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+       http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.

Modelfile ADDED

	@@ -0,0 +1,30 @@

+# Modelfile generated by "ollama show"
+# To build a new Modelfile based on this, replace FROM with:
+# FROM kaiju-coder-mlx:1.0
+FROM ./kaiju-coder-mlx-1.0-q8_0.gguf
+TEMPLATE "{{ if .System }}<|im_start|>system
+{{ .System }}<|im_end|>
+{{ end }}{{ if .Prompt }}<|im_start|>user
+{{ .Prompt }} /no_think<|im_end|>
+<|im_start|>assistant
+<think>
+</think>
+{{ end }}{{ .Response }}"
+SYSTEM You are Kaiju-Coder MLX 1.0, a local coding and business-building assistant for premium websites, invoices, leads, staffing, and small-business operating systems. Answer directly, produce usable artifacts, and do not reveal hidden reasoning or thinking tags.
+RENDERER qwen3.5
+PARSER qwen3.5
+PARAMETER num_ctx 8192
+PARAMETER num_predict 2048
+PARAMETER top_k 20
+PARAMETER top_p 0.95
+PARAMETER min_p 0
+PARAMETER presence_penalty 0.2
+PARAMETER repeat_penalty 1.05
+PARAMETER stop <|im_end|>
+PARAMETER stop <|endoftext|>
+PARAMETER stop <|im_start|>
+PARAMETER temperature 0.2
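
To make the TEMPLATE above easier to read, here is a rough Python mirror of the prompt string it appears to produce. This is only an illustration of the ChatML layout as the diff shows it: Ollama renders the real template itself, `render_prompt` is a hypothetical helper, and the sketch does not model how the `RENDERER` and `PARSER` directives interact with `TEMPLATE` at runtime.

```python
def render_prompt(prompt, system="", response=""):
    """Mirror the Modelfile TEMPLATE: ChatML turns plus an empty think block."""
    parts = []
    if system:
        # Optional system turn.
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    if prompt:
        # User turn with the /no_think suffix, then a pre-closed think block
        # so the assistant starts answering directly.
        parts.append(
            f"<|im_start|>user\n{prompt} /no_think<|im_end|>\n"
            "<|im_start|>assistant\n<think>\n</think>\n"
        )
    parts.append(response)
    return "".join(parts)


if __name__ == "__main__":
    print(render_prompt("Write an invoice template.", system="You are Kaiju-Coder MLX 1.0."))
```

The empty `<think>` / `</think>` pair in the assistant turn is what the README later calls think scaffolding.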

NOTICE ADDED

	@@ -0,0 +1,57 @@

+Kaiju-Coder MLX 1.0 by Kiyomi
+=============================
+This product includes a fine-tune of and derivative work from a third-party
+base model, redistributed under the Apache License, Version 2.0. The full
+license text is in the accompanying LICENSE file.
+-------------------------------------------------------------------------------
+Base model
+-------------------------------------------------------------------------------
+Qwen3.6-35B-A3B
+Copyright 2026 Alibaba Cloud
+Licensed under the Apache License, Version 2.0.
+The base model is available from the Qwen team at Alibaba Cloud
+(Hugging Face repo: Qwen/Qwen3.6-35B-A3B). Architecture id: qwen3_5_moe;
+35.9B total parameters with roughly 3B active per token (mixture-of-experts).
+-------------------------------------------------------------------------------
+Modifications made in this work
+-------------------------------------------------------------------------------
+This work MODIFIED the base model. Specifically:
+- A LoRA fine-tune was trained on top of the base model and fused into the
+  released weights. The fine-tune data is RMDW/Kiyomi-owned deterministic
+  output for a business-niche builder use case (websites, Stripe, invoices,
+  leads, CRM/intake, automations).
+- The redistributed GGUF is a TEXT-ONLY derivative. The base Qwen3.6-35B-A3B
+  is a vision-language model; the vision pathway is stripped in this GGUF.
+  This work does not provide and does not advertise vision capabilities.
+- Tokenizer, chat-template, and serving configuration were adapted for the
+  GGUF/Ollama/llama.cpp local-serving path.
+These modifications were made by Richard Echols / RMDW. As required by the
+Apache License, Version 2.0, Section 4(b), the changed files carry notices
+stating that they were changed.
+-------------------------------------------------------------------------------
+Additions and fine-tune by this work
+-------------------------------------------------------------------------------
+Kaiju-Coder MLX additions, fine-tune weights, training and packaging scripts,
+model card, and documentation
+Copyright 2026 Richard Echols / RMDW
+Licensed under the Apache License, Version 2.0.
+-------------------------------------------------------------------------------
+Attribution and endorsement
+-------------------------------------------------------------------------------
+This is an independent fine-tune. Alibaba Cloud and the Qwen team do not
+endorse, sponsor, or support this work, and nothing in this distribution
+should be read as implying such endorsement. "Qwen" and "Alibaba Cloud" are
+used only to describe the origin of the base model, as permitted by Section 6
+of the Apache License, Version 2.0.

README.md ADDED

	@@ -0,0 +1,187 @@

+---
+license: apache-2.0
+base_model: Qwen/Qwen3.6-35B-A3B
+base_model_relation: finetune
+pipeline_tag: text-generation
+library_name: gguf
+language:
+  - en
+tags:
+  - qwen3_5_moe
+  - moe
+  - agent
+  - business
+  - tool-calling
+  - gguf
+  - coding
+  - local
+  - apache-2.0
+---
+# Kaiju-Coder MLX 1.0 by Kiyomi
+The local model that runs your business, not just your IDE.
+Kaiju-Coder MLX 1.0 is a local-first builder model for solo founders and small-business
+owners. It is tuned for the work that actually moves a one-person business: shipping a
+website, wiring Stripe checkout, writing invoices and proposals, capturing leads, building
+CRM/intake flows, and standing up small automations. It runs on your own machine through
+Ollama, LM Studio, or llama.cpp. No API key, no data leaving your laptop, Apache-2.0.
+This is a text-only GGUF derived from Qwen3.6-35B-A3B. It is a scoped business-niche model,
+not a frontier general-purpose coder. See Limitations before you rely on it.
+## Quant table
+Sizes are the on-disk GGUF size; RAM figures are approximate working-set estimates.
+| File | Bits | Size | RAM (approx) | Use |
+|---|---|---|---|---|
+| `kaiju-coder-mlx-1.0-q8_0.gguf` | Q8_0 | ~34.4 GB | ~40 GB | Highest fidelity, the verified release artifact (available now) |
+| `kaiju-coder-mlx-1.0-q5_k_m.gguf` | Q5_K_M | ~25 GB | ~28 GB | Balanced quality/size (coming soon) |
+| `kaiju-coder-mlx-1.0-q4_k_m.gguf` | Q4_K_M | ~21 GB | ~24 GB | Smallest, runs on more machines (coming soon) |
+The Q8_0 file is the Goku-verified release artifact and is available now (SHA256
+`514169306484b4eb4ebd936d28c5bf590c5e68a938ea44e2b18d988d0c157cc5`). The LoRA adapter is also
+included under `adapter/` for use on top of the base model. Smaller K-quants (Q5_K_M, Q4_K_M)
+are being added; community re-quants are welcome.
+This is a 35.9B-total mixture-of-experts model (architecture id `qwen3_5_moe`) with roughly
+3B active parameters per token, so it is lighter to run than its total size suggests, but it
+still needs enough memory to hold the full weight set.
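
Since the README publishes a SHA256 for the Q8_0 file, a download can be checked before loading it. The following is a minimal sketch using only the Python standard library; the local file path and the helper names are assumptions for illustration, not part of the release.

```python
import hashlib
from pathlib import Path

# Digest published in the README for the Q8_0 GGUF.
EXPECTED_SHA256 = "514169306484b4eb4ebd936d28c5bf590c5e68a938ea44e2b18d988d0c157cc5"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Stream the file in chunks so a ~34 GB GGUF never has to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True when the local file hashes to the expected digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    gguf = Path("kaiju-coder-mlx-1.0-q8_0.gguf")  # assumed local download path
    if gguf.exists():
        print("digest OK" if matches_published_digest(gguf) else "digest MISMATCH")
    else:
        print(f"{gguf} not found; download it first")
```

Hashing a file of this size takes a while, so it is reasonable to run the check once after download rather than on every start.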
+## Quickstart
+Kaiju-Coder is a chat/instruct model. Run it with thinking output turned off for
+customer-visible work, or you may see empty `<think></think>` scaffolding.
+### Ollama
+Download the GGUF and the `Modelfile` into the same folder, then:
+```bash
+ollama create kaiju-coder-mlx:1.0 -f Modelfile
+ollama run kaiju-coder-mlx:1.0 --think=false --hidethinking \
+  "Build a one-page landing site for a Charlotte roofing company with a Request an Inspection CTA."
+```
+API clients should pass top-level `think: false`:
+```bash
+curl http://127.0.0.1:11434/api/chat -d '{
+  "model": "kaiju-coder-mlx:1.0",
+  "think": false,
+  "messages": [{"role": "user", "content": "Write a Stripe Checkout route for a $250 deposit."}]
+}'
+```
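
For clients that are not shell scripts, the same request can be sent from Python. This is a hedged sketch with the standard library only: the endpoint and the top-level `think` flag come from the curl call above, while `"stream": False`, the timeout, and the helper names are assumptions added here so the reply arrives as a single JSON object.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://127.0.0.1:11434/api/chat"  # same endpoint as the curl call


def build_chat_payload(prompt, model="kaiju-coder-mlx:1.0"):
    """Build the request body with thinking disabled at the top level."""
    return {
        "model": model,
        "think": False,   # top-level flag, as the README recommends
        "stream": False,  # assumption: ask for one JSON object instead of a stream
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt, url=OLLAMA_CHAT_URL, timeout=600):
    """POST the payload and return the assistant text (needs a running Ollama)."""
    body = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["message"]["content"]
```

With the model created as shown above, `chat("Write a Stripe Checkout route for a $250 deposit.")` should return the reply text.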
+### LM Studio
+1. Download the GGUF into your LM Studio models folder (or use the in-app Hugging Face search).
+2. Load the model.
+3. In the model settings, keep the system prompt that ships in the GGUF metadata, and disable
+   any reasoning/thinking display so customer output is clean.
+4. Chat normally. For tool-calling agent workflows, use the Ollama path above or the llama.cpp
+   path below; LM Studio is supported for direct chat and artifact generation.
+Note: LM Studio import is expected to work because the GGUF metadata is correct, but it has
+not yet been smoke-tested end to end. Treat LM Studio as a chat path until that smoke test is
+published.
+### llama.cpp
+```bash
+./llama-cli \
+  -m kaiju-coder-mlx-1.0-q8_0.gguf \
+  --jinja \
+  -p "Write a clean invoice template in HTML for a landscaping business, deposit and balance lines included."
+```
+For server / tool-calling use:
+```bash
+./llama-server -m kaiju-coder-mlx-1.0-q8_0.gguf --jinja --port 8080
+```
+Raw `llama-cli` may render an empty `<think></think>` block. Use Ollama
+`--think=false --hidethinking` or an API `think: false` flag for clean customer-facing output.
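
When raw output still carries the empty scaffolding, a small post-processing step can remove it before the text reaches a customer. This is a sketch of one possible filter, not something shipped with the model; `strip_empty_think` is a hypothetical helper and deliberately leaves non-empty think blocks alone.

```python
import re

# Matches a <think>...</think> block that contains only whitespace.
EMPTY_THINK_BLOCK = re.compile(r"<think>\s*</think>\s*")


def strip_empty_think(text):
    """Remove empty think scaffolding and any leading whitespace left behind."""
    return EMPTY_THINK_BLOCK.sub("", text).lstrip()


if __name__ == "__main__":
    raw = "<think>\n\n</think>\n\n<!DOCTYPE html>\n<html>...</html>"
    print(strip_empty_think(raw))
```

Disabling thinking at the runtime, as described above, remains the cleaner fix; the filter only covers paths where that flag is not available.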
+## Benchmarks
+Coding numbers below are measured. They come from a controlled EvalPlus run: think-off,
+greedy, the identical harness for both models, varying only the weights, served through the
+same Ollama runtime. Tool-calling (BFCL v3) and the BizAgent-Gold deliverable score are still
+pending and labeled TBD; they are not invented.
+| Benchmark | Base (Qwen3.6-35B-A3B) | Kaiju-Coder MLX 1.0 (adapter) |
+|---|---|---|
+| EvalPlus pass@1 (HumanEval+) | 89.6% | 88.4% |
+| EvalPlus pass@1 (HumanEval base) | 93.3% | 92.1% |
+| EvalPlus pass@1 (MBPP+) | 78.0% | 75.9% |
+| EvalPlus pass@1 (MBPP base) | 91.8% | 87.0% |
+| BFCL v3 (tool/function calling) | TBD (run pending) | TBD (run pending) |
+| BizAgent-Gold deliverable quality | TBD (run pending) | TBD (run pending) |
+Read honestly: the business fine-tune costs a little coding accuracy. On the rigorous EvalPlus
+"+" sets the gap is about 1 to 2 points (HumanEval+ 88.4 vs 89.6, MBPP+ 75.9 vs 78.0); the
+largest gap is on MBPP base tests (87.0 vs 91.8, or 4.8 points). The model is still a strong coder and keeps
+the base's frontier-class agentic foundation, while adding the business-owner workflows it is
+built for.
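
The gaps quoted above can be recomputed directly from the table. The snippet below is a small arithmetic check using only the published pass@1 figures; the names `SCORES` and `regression_points` are illustrative.

```python
# pass@1 scores copied from the benchmark table: (base, fine-tune).
SCORES = {
    "HumanEval+": (89.6, 88.4),
    "HumanEval base": (93.3, 92.1),
    "MBPP+": (78.0, 75.9),
    "MBPP base": (91.8, 87.0),
}


def regression_points(scores):
    """Return base minus fine-tune, in percentage points, rounded to one decimal."""
    return {name: round(base - tuned, 1) for name, (base, tuned) in scores.items()}


gaps = regression_points(SCORES)
largest = max(gaps, key=gaps.get)
for name, gap in gaps.items():
    print(f"{name}: {gap} points below base")
print(f"largest gap: {largest}")
```

This gives 1.2 points on both HumanEval sets, 2.1 on MBPP+, and 4.8 on MBPP base.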
+The retired keyword smoke test is not a benchmark and is excluded on purpose. The real gates
+are EvalPlus pass@1 for coding, BFCL v3 for tool-calls, and a deliverable-quality run for
+business artifacts.
+Open rubric: the BizAgent-Gold task set and scoring rubric used to judge business deliverables
+are open. Rubric and tasks: `benchmarks/golden-bizagent-tasks.json` and
+`benchmarks/niche-config.json` in the source repository. The judge for any published score is
+an open model, named in the result; closed-model judges are not used.
+## What works raw vs needs the harness
+- Works raw (model alone, Ollama or LM Studio): identity and voice, safe refusals, invoices,
+  proposals, follow-up sequences, CRM/intake route files, Stripe reasoning with environment
+  placeholders, and compact single-file websites and components.
+- Needs the harness (a verifier/retry loop around the model): full, polished, multi-file
+  customer websites with screenshots. Raw single-shot agent runs do not reliably clear the
+  customer-grade website bar. For that work, drive the model through a file-write/retry harness
+  rather than a single raw call. The blessed agentic serving path is GGUF/Ollama/llama.cpp; the
+  end-of-tool-call token is baked into the tool training so tool calls close cleanly.
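
The "harness" idea above can be pictured as a generate, verify, retry loop. The sketch below is an assumption-heavy illustration, not the project's actual harness: `run_with_retries` and `verify_html` are hypothetical names, `generate` stands in for any call into the model, and the toy verifier only checks two strings where a real one would write files, render the site, and take screenshots.

```python
from typing import Callable, List, Optional, Tuple


def run_with_retries(
    generate: Callable[[str], str],
    verify: Callable[[str], List[str]],
    task: str,
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Call the model, check the result, and feed problems back until it passes."""
    prompt = task
    for attempt in range(1, max_attempts + 1):
        artifact = generate(prompt)
        problems = verify(artifact)
        if not problems:
            return artifact, attempt
        # Fold the verifier's findings into the next prompt.
        prompt = task + "\n\nFix these problems:\n" + "\n".join(f"- {p}" for p in problems)
    return None, max_attempts


def verify_html(artifact: str) -> List[str]:
    """Toy verifier: a real harness would render, screenshot, and lint the site."""
    problems = []
    if "<html" not in artifact.lower():
        problems.append("missing <html> element")
    if "request an inspection" not in artifact.lower():
        problems.append("missing 'Request an Inspection' call to action")
    return problems
```

Passing a function that wraps the Ollama or llama.cpp endpoint as `generate` turns this into a minimal retry driver; returning `None` after the last attempt makes a failed run explicit instead of shipping an unverified artifact.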
+## Limitations
+- Scoped, not frontier. This is a business-niche builder model, not a general-purpose frontier
+  coder. It is strongest on the founder workflows listed above and weaker outside them.
+- Text-only GGUF. The base Qwen3.6-35B-A3B is a vision-language model. This GGUF strips the
+  vision pathway. It does not see images and does not advertise vision.
+- Small coding regression vs base. On the rigorous EvalPlus "+" sets this fine-tune is within
+  about 1 to 2 points of the base (HumanEval+ 88.4 vs 89.6, MBPP+ 75.9 vs 78.0); on MBPP base
+  tests the gap is larger (87.0 vs 91.8). This regression is the expected cost of business tuning. Tool-calling
+  (BFCL v3) is measured separately and not yet published.
+- Validated on a focused lane. Beyond the EvalPlus coding runs above, the model has been checked
+  only on a Kaiju/RMDW business-owner task set, not on broad public benchmarks.
+- Run with thinking off. Direct CLI use can expose `<think>` scaffolding; pass `think:false`
+  (or `--think=false --hidethinking`) for customer-visible output.
+- Raw website delivery. Raw single-shot website generation is not customer-grade; use a
+  harness or file/retry path for polished multi-file sites.
+- Human review. Customer-facing deliverables should get a human review pass during early use.
+## Identity
+Kaiju-Coder MLX 1.0 by Kiyomi is a local-first builder for solo founders and small-business
+owners. It is honest about what it is: it does not pretend to be Claude, GPT, or any other
+model, and it does not claim vision. Voice: direct, ship-first, no corporate filler.
+## License and attribution
+Licensed under the Apache License, Version 2.0. See `LICENSE` and `NOTICE`.
+- Base model: Qwen/Qwen3.6-35B-A3B, Copyright 2026 Alibaba Cloud, licensed under Apache-2.0.
+- This work is a LoRA fine-tune that modified the base model, packaged as a text-only GGUF.
+- Fine-tuned from Qwen3.6-35B-A3B by Richard Echols / RMDW.
+- Not endorsed by Alibaba Cloud or the Qwen team. "Qwen" and "Alibaba Cloud" are referenced
+  only to describe the origin of the base model.
+Training-data policy: the fine-tune uses RMDW/Kiyomi-owned deterministic output only. No
+closed-model completions were used as supervised training targets. Any open-model judge used
+for evaluation scoring is named in the result.