# Kaiju Coder 7 Local Test Instructions

Use these commands from the repo root.

The public release name is Kaiju Coder 7. Internally, this build is backed by the v1.8 adapter under `runs/qwen36-27b-lora-v1.8-business-owner/adapter`. The release-candidate raw model path is the merged full model on Gojira B at `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`.

The deterministic harness commands work locally now; the fastest current runtime is vLLM bitsandbytes on Gojira B over Tailscale with the local OpenCode fast proxy.

## Run The Local Release-Candidate Gate

```bash
python3 scripts/run_kaiju_business_owner_rc_smoke.py
```

This validates reviewed data, checks v1.7 targets, builds the oversampled business-owner SFT file, smokes the local OpenAI-compatible harness API, runs the hard router suite, and runs static artifact checks.

For release status, read `release/COMPLETION_AUDIT.md` and `release/HUGGINGFACE_RELEASE_DRAFT.md`.

## Merge The v1.8 Adapter

Use this if the merged full model must be rebuilt:

```bash
KAIJU_LORA_ADAPTER=/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter \
KAIJU_MERGED_MODEL_DIR=/workspace/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged \
./scripts/run-gojira-b-qwen36-lora-merge.sh
```

## Start Kaiju Coder 7 Serving

Use this for the fastest current model-side candidate:

```bash
KAIJU_VLLM_CONTEXT=16384 \
KAIJU_VLLM_QUANTIZATION=bitsandbytes \
KAIJU_VLLM_LOAD_FORMAT=bitsandbytes \
KAIJU_VLLM_GPU_UTIL=0.90 \
./scripts/start-qwen36-merged-vllm.sh
```

Confirm readiness:

```bash
curl http://100.109.109.14:18084/v1/models
```

Then keep the Mac-side fast proxy pointed at that vLLM endpoint:

```bash
KAIJU_OPENAI_BASE_URL=http://100.109.109.14:18084/v1 \
python3 scripts/kaiju_opencode_fast_proxy.py --host 127.0.0.1 --port 18181
```

The high-context `32768` target has benchmark evidence in `release/SERVING_BENCHMARKS.md`, but the current speed/default path is 16k runtime-quantized vLLM plus the local fast proxy.

## Prepare Merged-Model Hugging Face Metadata

Use this before any full merged-model upload review. It syncs release metadata into the Gojira-B model folder but does not upload or read Hugging Face tokens. If the remote merged folder is root-owned, the helper automatically uses passwordless sudo for rsync without changing model ownership:

```bash
bash scripts/prepare_hf_merged_model_metadata.sh
KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh
bash scripts/upload_hf_merged_model_from_gojira_b.sh
```

## Install And Smoke OpenCode

```bash
python3 scripts/install_kaiju_opencode_profile.py
opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
  --dir /tmp/kaiju-opencode-loopguard-smoke \
  --dangerously-skip-permissions \
  'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'
```

The installer writes the `kaiju` provider, the lean `kaiju-coder-7` agent, and the scoped no-autocontinue plugin at `~/.config/opencode/kaiju-no-autocontinue.mjs`.

## Run The Deterministic Harness Smoke

```bash
python3 scripts/run_kaiju_api_harness_smoke.py
```

## Run A Direct Model Eval

```bash
python3 evals/run_openai_compat_smoke.py \
  --base-url http://100.109.109.14:18084/v1 \
  --model kaiju-coder-7 \
  --tasks evals/tasks/smoke.jsonl \
  --max-tasks 1 \
  --timeout 300 \
  --max-tokens 768 \
  --temperature 0 \
  --disable-thinking \
  --system-prompt-file prompts/kaiju-coder-api-system.md
```

For the selected final business-owner checkpoint, run the focused v1.8 business-owner pack and then score it. Raw merged model generation is slow, so use the harness for practical paid website delivery until broader raw website evals pass at acceptable latency:

```bash
python3 evals/run_openai_compat_smoke.py \
  --base-url http://100.109.109.14:18084/v1 \
  --model kaiju-coder-7 \
  --tasks evals/tasks/business-owner-v18-comparison.jsonl \
  --timeout 900 \
  --max-tokens 2500 \
  --temperature 0 \
  --disable-thinking \
  --stream \
  --system-prompt-file prompts/kaiju-coder-api-system.md
python3 evals/score_quality_gate.py runs/evals//results.jsonl
```

Current merged evidence:

- Probe: `1,155` visible chars in `60.17s`.
- Proposal rerun: `1/1` paid-ready, `4.0/4.0`, `4,014` chars in `212.72s`.
- Jah credits backend: `4.0/4.0`, `9,718` chars in `566.36s`.

## Dynamic LoRA Serving Caveat

Do not use dynamic SGLang LoRA serving as release evidence for v1.8. The adapter-name-only path can be base-equivalent, and the corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes this SGLang build with a fused-module LoRA buffer shape mismatch. Use the merged full-model path above.

## Run The Business-Owner Harness

```bash
python3 evals/run_router_harness_eval.py --tasks evals/tasks/router-hard-harness.jsonl
python3 evals/run_router_static_checks.py runs/evals//results.jsonl
```

## Manual Prompt To Try First

```text
Build me the full Kiyomi 7.7.7 AI company operating pack for a local business owner. I need the launch kit, website, content engine, connector checklist, intake CRM, money report, automations, operator handbook, lead generator, sales closer, ROI dashboard, and Workshop golden run. Make it owner-ready with no developer setup required.
```

Expected shape:

- A project folder with multiple files, not advice only.
- Complete HTML where HTML is requested.
- Lead/sales CSVs.
- Connector verification gates.
- ROI audit gate.
- Workshop golden-run gate.
- Clear owner commands such as `/kiyomi` and `/kiyomi-do`.
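
Part of the expected-shape list can be spot-checked mechanically before a manual review. The following is a minimal sketch, not a script from this repo: the function name `check_expected_shape` and its heuristics (more than one file, every HTML file containing both an opening and a closing `html` tag, at least one CSV, and some file mentioning the owner commands) are assumptions made for illustration. It does not verify the connector, ROI, or Workshop golden-run gates, which still need the harness checks or a human reviewer.

```python
from pathlib import Path


def check_expected_shape(project_dir):
    """Heuristic spot check of a generated project folder.

    Hypothetical helper, not part of the Kaiju Coder repo. Returns a dict
    mapping a check name to True/False.
    """
    root = Path(project_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    html_files = [p for p in files if p.suffix.lower() in {".html", ".htm"}]
    csv_files = [p for p in files if p.suffix.lower() == ".csv"]

    def read_text(path):
        # Decode leniently so one odd file does not abort the whole check.
        return path.read_text(encoding="utf-8", errors="replace")

    def looks_complete(path):
        # Assumption: "complete HTML" means an opening and a closing html tag.
        text = read_text(path).lower()
        return "<html" in text and "</html>" in text

    # Owner commands named in the list above; match anywhere in any file.
    owner_commands = ("/kiyomi", "/kiyomi-do")
    texts = [read_text(p) for p in files]

    return {
        "multiple_files": len(files) > 1,
        # Vacuously True when no HTML was produced (HTML is only required
        # "where HTML is requested").
        "html_complete": all(looks_complete(p) for p in html_files),
        "has_csv": len(csv_files) > 0,
        "mentions_owner_commands": all(
            any(cmd in t for t in texts) for cmd in owner_commands
        ),
    }
```

To use it, call `check_expected_shape("/path/to/generated/project")` and treat any `False` value as a prompt for closer inspection rather than as a release verdict.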