Instructions to use RMDWLLC/kaiju-coder-7-adapter with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use RMDWLLC/kaiju-coder-7-adapter with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("/workspace/kaiju-coder/models/Qwen3.6-27B") model = PeftModel.from_pretrained(base_model, "RMDWLLC/kaiju-coder-7-adapter") - Notebooks
- Google Colab
- Kaggle
Kaiju Coder 7 Local Test Instructions
Use these commands from the repo root. The public release name is Kaiju Coder 7. Internally, this build is backed by the v1.8 adapter under runs/qwen36-27b-lora-v1.8-business-owner/adapter. The release-candidate raw model path is the merged full model on Gojira B at /home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged. The deterministic harness commands work locally now; the fastest current runtime is vLLM bitsandbytes on Gojira B over Tailscale with the local OpenCode fast proxy.
Run The Local Release-Candidate Gate
python3 scripts/run_kaiju_business_owner_rc_smoke.py
This validates reviewed data, checks v1.7 targets, builds the oversampled business-owner SFT file, smokes the local OpenAI-compatible harness API, runs the hard router suite, and runs static artifact checks.
For release status, read release/COMPLETION_AUDIT.md and release/HUGGINGFACE_RELEASE_DRAFT.md.
Merge The v1.8 Adapter
Use this if the merged full model must be rebuilt:
KAIJU_LORA_ADAPTER=/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter \
KAIJU_MERGED_MODEL_DIR=/workspace/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged \
./scripts/run-gojira-b-qwen36-lora-merge.sh
Start Kaiju Coder 7 Serving
Use this for the fastest current model-side candidate:
KAIJU_VLLM_CONTEXT=16384 \
KAIJU_VLLM_QUANTIZATION=bitsandbytes \
KAIJU_VLLM_LOAD_FORMAT=bitsandbytes \
KAIJU_VLLM_GPU_UTIL=0.90 \
./scripts/start-qwen36-merged-vllm.sh
Confirm readiness:
curl http://100.109.109.14:18084/v1/models
Then keep the Mac-side fast proxy pointed at that vLLM endpoint:
KAIJU_OPENAI_BASE_URL=http://100.109.109.14:18084/v1 \
python3 scripts/kaiju_opencode_fast_proxy.py --host 127.0.0.1 --port 18181
The high-context 32768 target has benchmark evidence in
release/SERVING_BENCHMARKS.md, but the current speed/default path is 16k
runtime-quantized vLLM plus the local fast proxy.
Prepare Merged-Model Hugging Face Metadata
Use this before any full merged-model upload review. It syncs release metadata into the Gojira-B model folder but does not upload or read Hugging Face tokens. If the remote merged folder is root-owned, the helper automatically uses passwordless sudo for rsync without changing model ownership:
bash scripts/prepare_hf_merged_model_metadata.sh
KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh
bash scripts/upload_hf_merged_model_from_gojira_b.sh
Install And Smoke OpenCode
python3 scripts/install_kaiju_opencode_profile.py
opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
--dir /tmp/kaiju-opencode-loopguard-smoke \
--dangerously-skip-permissions \
'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'
The installer writes the kaiju provider, the lean kaiju-coder-7 agent, and
the scoped no-autocontinue plugin at
~/.config/opencode/kaiju-no-autocontinue.mjs.
Run The Deterministic Harness Smoke
python3 scripts/run_kaiju_api_harness_smoke.py
Run A Direct Model Eval
python3 evals/run_openai_compat_smoke.py \
--base-url http://100.109.109.14:18084/v1 \
--model kaiju-coder-7 \
--tasks evals/tasks/smoke.jsonl \
--max-tasks 1 \
--timeout 300 \
--max-tokens 768 \
--temperature 0 \
--disable-thinking \
--system-prompt-file prompts/kaiju-coder-api-system.md
For the selected final business-owner checkpoint, run the focused v1.8 business-owner pack and then score it. Raw merged model generation is slow, so use the harness for practical paid website delivery until broader raw website evals pass at acceptable latency:
python3 evals/run_openai_compat_smoke.py \
--base-url http://100.109.109.14:18084/v1 \
--model kaiju-coder-7 \
--tasks evals/tasks/business-owner-v18-comparison.jsonl \
--timeout 900 \
--max-tokens 2500 \
--temperature 0 \
--disable-thinking \
--stream \
--system-prompt-file prompts/kaiju-coder-api-system.md
python3 evals/score_quality_gate.py runs/evals/<merged-v18-run>/results.jsonl
Current merged evidence:
- Probe:
1,155visible chars in60.17s. - Proposal rerun:
1/1paid-ready,4.0/4.0,4,014chars in212.72s. - Jah credits backend:
4.0/4.0,9,718chars in566.36s.
Dynamic LoRA Serving Caveat
Do not use dynamic SGLang LoRA serving as release evidence for v1.8. The adapter-name-only path can be base-equivalent, and the corrected selector qwen36-27b:kaiju_v18_business_owner crashes this SGLang build with a fused-module LoRA buffer shape mismatch. Use the merged full-model path above.
Run The Business-Owner Harness
python3 evals/run_router_harness_eval.py --tasks evals/tasks/router-hard-harness.jsonl
python3 evals/run_router_static_checks.py runs/evals/<router-run>/results.jsonl
Manual Prompt To Try First
Build me the full Kiyomi 7.7.7 AI company operating pack for a local business owner. I need the launch kit, website, content engine, connector checklist, intake CRM, money report, automations, operator handbook, lead generator, sales closer, ROI dashboard, and Workshop golden run. Make it owner-ready with no developer setup required.
Expected shape:
- A project folder with multiple files, not advice only.
- Complete HTML where HTML is requested.
- Lead/sales CSVs.
- Connector verification gates.
- ROI audit gate.
- Workshop golden-run gate.
- Clear owner commands such as
/kiyomiand/kiyomi-do.