Instructions to use RMDWLLC/kaiju-coder-7-adapter with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use RMDWLLC/kaiju-coder-7-adapter with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("/workspace/kaiju-coder/models/Qwen3.6-27B") model = PeftModel.from_pretrained(base_model, "RMDWLLC/kaiju-coder-7-adapter") - Notebooks
- Google Colab
- Kaggle
Kaiju Coder 7 Public Testing Quickstart
Kaiju Coder 7 is the public model name. The OpenAI-compatible model id is:
kaiju-coder-7
Use this guide for serious public testing. It avoids internal checkpoint names and keeps the current limitations clear.
Pick A Test Path
Path 1: OpenCode Against An Existing Endpoint
Use this if you already have Kaiju Coder 7 served at an OpenAI-compatible
/v1 endpoint.
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
cd kaiju-coder-7-opencode
python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18181/v1
Then run OpenCode inside the project you want to edit:
opencode
The installer sets kaiju/kaiju-coder-7 as the OpenCode model and
kaiju-coder-7 as the default agent. You can still select
kaiju/kaiju-coder-7 manually from OpenCode's model picker if you switch away.
For a bounded smoke test:
mkdir -p /tmp/kaiju-public-smoke
opencode run --dir /tmp/kaiju-public-smoke \
"Create hello.txt with exactly: Kaiju Coder 7 is ready"
Or run the packaged verifier, which checks the installer, live model endpoint, OpenCode binary, actual file creation, and wrong-directory behavior:
python3 scripts/run_kaiju_public_opencode_smoke.py
The helper installer adds:
- the
kaijuOpenAI-compatible provider model: kaiju/kaiju-coder-7anddefault_agent: kaiju-coder-7- the lean
kaiju-coder-7OpenCode agent - Kaiju as the default primary agent, so selecting Kaiju Coder 7 uses the
hidden fast artifact path without requiring
/kaiju - the
kaiju-coder-7-runrouter command for fast websites, owner packs, and Desktop artifact folders - the
kaiju_artifactOpenCode custom tool and/kaijucommand for routing large artifact prompts through the fast local router - a scoped no-autocontinue plugin that prevents false completion loops after compaction or output limits
For a fast website or owner-pack artifact without waiting on raw OpenCode multi-file streaming, run:
kaiju-coder-7-run \
--no-planner \
--kind website \
--out-dir "$HOME/Desktop/Kaiju-Coder-7-Test" \
--prompt "Build a premium one-page website for Harborline Bookkeeping with pricing, FAQ, and a cleanup-call CTA."
OpenCode should use this same command internally for large website, business-pack, and Desktop-output requests after the helper is installed.
Inside OpenCode, /kaiju is optional for large generated artifacts. The command
is prompt-backed, but it points the Kaiju agent at the kaiju_artifact custom
tool instead of making the model hand-write every file.
Path 2: Full Local Weights
Use this if the full RMDWLLC/kaiju-coder-7 Hugging Face repo has been
uploaded and you have suitable local GPU hardware.
hf download RMDWLLC/kaiju-coder-7 --local-dir ./kaiju-coder-7
Serve the downloaded folder with an OpenAI-compatible local server. Configure the server to expose:
model id: kaiju-coder-7
base URL: http://127.0.0.1:18084/v1
context: 16384
For the fastest OpenCode behavior, run the bundled fast proxy in a separate terminal and point OpenCode at the proxy:
KAIJU_OPENAI_BASE_URL=http://127.0.0.1:18084/v1 \
python3 scripts/kaiju_opencode_fast_proxy.py --host 127.0.0.1 --port 18181
Then install the OpenCode helper with:
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
cd kaiju-coder-7-opencode
python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18181/v1
Path 3: Runtime-Quantized Local Candidate
Use this only if you are comfortable with advanced serving setups. The current working quantized option is a runtime bitsandbytes recipe. A Q8_0 GGUF artifact has been converted, but it is still a candidate until runtime smoke passes.
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-quantized-runtime
cd kaiju-coder-7-quantized-runtime
Read README.md in that repo before serving. This path can reduce model memory
at runtime, but it still depends on access to the full Kaiju Coder 7 weights.
Recommended Test Prompt
Run this from an empty project folder:
Build a launch-ready local service business website and operating pack. Include
index.html, a Stripe checkout safety plan, a CSV parser with tests, a simple CRM
schema, a weekly money report, and a safety/provenance note. Write the files,
not just advice.
Expected result:
- files are written in the requested project folder
index.htmlis complete HTML- business docs start with Markdown H1 headings
- code includes a test or smoke-check command where practical
- no fake API keys, OAuth tokens, payment secrets, or private customer data
Current Recommended Defaults
- Public model id:
kaiju-coder-7 - OpenCode context:
16384 - Output cap for public testing:
2500 - Fast OpenCode path: vLLM bitsandbytes runtime behind the Kaiju fast proxy
- Current reliable product path: model plus deterministic business-owner harness/router plus verifier
- Raw multi-file OpenCode generation: still too slow for broad paid claims;
use
kaiju-coder-7-runfor fast public website and owner-pack tests while broader raw-model latency gates continue - Paid API: not public until launch preflight passes and the Stripe live-mode switch is deliberately completed
What Not To Claim Yet
Do not claim:
- that raw model weights alone reliably build every business-owner artifact
- that a paid hosted API is generally available
- that persisted quantized weights exist
- that 32k context is the current live default
Do claim:
- Kaiju Coder 7 has a working local/OpenCode release candidate
- the current tested OpenCode default is 16k context
- the helper package includes a lean agent and compaction loop guard
- the helper package includes the
kaiju-coder-7-runrouter command for fast artifact generation - the fast proxy keeps OpenCode tool calls intact while forcing bounded, non-thinking generation
- the paid API scaffold has tests and a launch preflight, but is not yet public
- the packaged public smoke verifies a fresh OpenCode one-file write before public claims are refreshed
- a GGUF Q8_0 candidate exists, but is not public quantized-weights release evidence until runtime smoke passes
Remaining Caveats Before Broader Claims
- Hugging Face public release repos are uploaded and public under
RMDWLLC. - The GGUF Q8_0 candidate still needs a runtime smoke before public quantized-weights upload.
- Raw multi-file OpenCode generation is still not the public speed story; use the deterministic router/harness for websites and business-owner packs.
- Public paid API launch has approval and preflight evidence, but real customer charging still needs a deliberate Stripe live-mode switch and controlled live payment verification.
- Do not claim 32k context as the live default until it is freshly restarted and re-confirmed.