restokes92 committed 6 days ago

Commit d95f073 (verified)

1 Parent(s): c239309

Upload Kaiju Coder 7 adapter release package

Files changed (30)

.gitattributes +1 -0
COMPLETION_AUDIT.md +84 -0
DATA_PROVENANCE_DRAFT.md +87 -0
EVAL_SCOREBOARD.md +70 -0
FINAL_RELEASE_REPORT.md +269 -0
GOAL_COMPLETION_AUDIT.md +60 -0
LOCAL_TEST_INSTRUCTIONS.md +147 -0
PAID_API_READINESS.md +266 -0
PUBLIC_TESTING_QUICKSTART.md +149 -0
QUANTIZATION_PLAN.md +96 -0
README.md +121 -0
SERVING_BENCHMARKS.md +358 -0
SOURCE_INVENTORY.md +41 -0
UPSTREAM_LICENSE_CHECK.md +38 -0
adapter_config.json +48 -0
adapter_model.safetensors +3 -0
chat_template.jinja +154 -0
cloudflare-bindings.example.json +16 -0
hf-release-permission-evidence.example.json +9 -0
paid-api-launch-evidence.example.json +61 -0
scripts/apply_paid_api_cloudflare_bindings.py +162 -0
scripts/check_hf_release_permission_evidence.py +164 -0
scripts/check_hf_uploaded_release.py +384 -0
scripts/check_paid_api_readiness.py +518 -0
scripts/collect_hf_release_permission_evidence.py +156 -0
scripts/collect_paid_api_launch_evidence.py +286 -0
tokenizer.json +3 -0
tokenizer_config.json +32 -0
training_args.bin +3 -0
upstream/qwen3.6-27b/LICENSE +202 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

COMPLETION_AUDIT.md ADDED

	@@ -0,0 +1,84 @@

+# Kaiju Business-Owner Release Completion Audit
+Date: 2026-06-03
+This audit maps the active goal to current evidence. It is intentionally
+conservative: the product-path harness is release-candidate ready for local
+testing, the fresh v1.8 Qwen 3.6 LoRA adapter exists, and a merged full-model
+artifact serves locally on Gojira-B. Dynamic SGLang LoRA serving is not counted
+as release evidence because the corrected LoRA selector crashes on this
+adapter. Human review, website latency/SLA decisions, broader comparison evals,
+and Hugging Face write permissions are still required before publishing
+externally.
+## Requirement Status
+| Requirement | Current evidence | Status |
+|---|---|---|
+| Continue from `RichardEchols/kaiju-coder`, not a restart | Branch `codex/kaiju-business-owner-rc` is based on `3d57eae92ad523519473f0ff3eca6661a9736de3`, matching `origin/main`. | Passed |
+| GitHub and local source inventory for Kaiju, Kiyomi, RMDW, Makoto, Mezzal, and wiki sources | `release/SOURCE_INVENTORY.md` and `release/source-inventory.json` generated from GitHub metadata, `git ls-remote` SHAs, and the requested local `/Users/richardecholsai7/Documents/RMDW-Wiki` snapshot marked non-authoritative/selective-reference-only. | Passed |
+| Legally reusable, provenance-preserving dataset update | `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl` adds reviewed RMDW-owned examples with `source_repos`, `source_paths`, and `provenance_notes`. | Passed |
+| Dataset validation | `python3 scripts/validate_training_data.py --min-examples 350` passes with `1,689` reviewed examples across `14` files. | Passed |
+| v1.7 business-owner SFT build | `python3 scripts/build_v17_business_owner_sft_dataset.py` writes `1,881` rows and `192` controlled business-owner repeats. | Passed |
+| Hard evals for business-owner workflows | `evals/tasks/router-hard-harness.jsonl` includes `business_suite` prompts; latest local RC smoke run produced `23/23` static pass. | Passed |
+| Local Kaiju product path runs | `python3 scripts/run_kaiju_business_owner_rc_smoke.py` validates data, builds SFT, smokes the local API harness, runs router hard eval, and runs static checks. | Passed |
+| Complete Kiyomi 7.7.7 AI-company artifact generation | `business_suite` route writes a 19-file pack including launch kit, content engine, connector checklist, intake CRM, reporting, automations, operator handbook, leads, sales, ROI dashboard, and Workshop artifact. | Passed |
+| Secret/private-data guardrails | Dataset validation scans common secret patterns; verifier checks `no_hardcoded_secrets`; source inventory excludes credentials, tokens, private client data, and raw logs. | Passed |
+| Release artifacts | `release/MODEL_CARD_DRAFT.md`, `release/HF_ADAPTER_MODEL_CARD.md`, `release/DATA_PROVENANCE_DRAFT.md`, `release/EVAL_SCOREBOARD.md`, `release/LOCAL_TEST_INSTRUCTIONS.md`, `release/HUGGINGFACE_RELEASE_DRAFT.md`, `release/FINAL_RELEASE_REPORT.md`, `release/UPSTREAM_LICENSE_CHECK.md`, and this audit. | Passed |
+| Fresh Qwen 3.6 v1.7 fine-tune | After clearing old ComfyUI/Ollama workloads from Gojira-B, training finished with `metrics.json`, train runtime `1663.7101s`, train loss `1.7260706673065822`, and an adapter directory. | Passed |
+| Local inference against new v1.7 checkpoint | SGLang served `kaiju_v17_business_owner` over Tailscale at `http://100.109.109.14:18083/v1` with `context=4096` and `mem_fraction=0.90`; website and proposal smoke tasks returned non-empty outputs. | Passed |
+| Stronger Qwen 3.6 v1.8 fine-tune | Gojira-B was cleared of ComfyUI/SGLang/Ollama GPU conflicts; v1.8 finished with `metrics.json`, train runtime `11666.7564s`, train loss `0.9281658741335074`, and an adapter directory. | Passed |
+| v1.8 adapter merged into full model | `scripts/run-gojira-b-qwen36-lora-merge.sh` merged `/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter` into `/workspace/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`; remote artifact is `51G` with `14` safetensor shards and preserved base config/processor sidecars. | Passed |
+| Local inference against v1.8 merged checkpoint | `scripts/start-qwen36-merged-sglang.sh` serves `kaiju-coder-7` over Tailscale at `http://100.109.109.14:18083/v1`; the currently restored live endpoint reports max model len `16384`. Prior benchmarks recorded startup and smoke evidence at 12k/16k/24k/32k, with 32k treated as the high-context target rather than the currently parked runtime. | Passed |
+| v1.8 merged business-owner eval | Probe returned `1,155` visible chars in `60.17s`; proposal rerun scored `1/1`, `4.0/4.0`, `4,014` chars in `212.72s`; Jah credits backend scored `4.0/4.0`, `9,718` chars in `566.36s`. | Passed with latency caveat |
+| OpenCode local run path | Local OpenCode provider/agent is installed for `kaiju/kaiju-coder-7` with 16k context and the scoped no-autocontinue plugin at `/Users/richardecholsai7/.config/opencode/kaiju-no-autocontinue.mjs`. Fresh public smoke wrote `hello.txt` with exactly `Kaiju Coder 7 fresh public smoke ok`; packaged public verifier `python3 scripts/run_kaiju_public_opencode_smoke.py --timeout 900 --keep-dir` passed `4/4` in `runs/public-opencode-smoke/20260603T182222Z/summary.md`, including wrong-directory leakage checks; loop-guard smoke wrote `loopguard.txt` with exactly `Kaiju Coder 7 loop guard installed`; latest harnessed customer-readiness pack `runs/opencode-customer-readiness/20260603T185835Z/summary.md` passed `4/4` with `28/28` required files, including release provenance and safety review. | Passed for harnessed/product path |
+| Runtime-quantized local path | vLLM bitsandbytes runtime quantization passed identity/code/business-doc smokes at 8k/16k, reported about `17.8 GiB` model memory, and passed OpenCode one-file smoke with exact content `Kaiju Coder 7 quantized runtime ok`. Persisted quantized weights are still pending. | Runtime recipe passed; persisted weights pending |
+| Paid API gateway scaffold | `cd gateway/cloudflare-worker && npm run check` passes `16/16` Worker tests covering bearer auth, inactive keys, insufficient credits, debit/refund, rate limit before debit, model `kaiju-coder-7` enforcement, streaming/thinking/token caps, secret-content rejection without logging, signed Stripe Checkout top-up idempotency, origin-only R2 artifact upload, and account-scoped artifact download. `python3 scripts/check_paid_api_readiness.py --mode scaffold` now passes `17` checks, including the guarded `npm run prepare:cloudflare` resource-prep path, Wrangler dry-run deploy wiring, artifact route controls, sanitized launch-evidence template, and reviewed Cloudflare bindings template. `scripts/apply_paid_api_cloudflare_bindings.py` previews/applies real D1/KV/R2 bindings while refusing placeholders and secret-looking input. `scripts/collect_paid_api_launch_evidence.py` can preview or write the remaining sanitized staging evidence without storing API keys, full prompts, or model responses. `--mode launch` fails by design until real D1/KV/R2 bindings, Wrangler secrets, Stripe webhook staging evidence, paid-route staging request, latency evidence, and rollback proof are attached through `release/paid-api-launch-evidence.json`. | Local scaffold passed; live deployment pending |
+| Dynamic SGLang LoRA selector | Adapter-name-only serving can be base-equivalent; corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes with `LoRA buffer shape torch.Size([8192, 16]) does not match weight shape torch.Size([14336, 16])`. | Not release path |
+| Hugging Face helper repo upload readiness | Adapter, OpenCode helper, and runtime-quantized recipe staging folders build under `/tmp/kaiju-coder-7-hf-staging`; upload script is dry-run safe and namespace-configurable. Apply mode now requires staged checksum/integrity validation and `check_human_release_review.py --mode public` before repo creation. Local `hf` CLI is installed and authenticated as `restokes92`, but private repo creation attempts under `RichardEchols`, `RMDWLLC`, and `restokes92` returned `403 Forbidden`. | Package ready; upload blocked by review/token permissions |
+| Hugging Face merged model upload readiness | `scripts/prepare_hf_merged_model_metadata.sh` stages the model card, quickstarts, provenance, benchmarks, evals, paid API status, final report, upstream license, and `MERGED_MODEL_RELEASE_MANIFEST.json` for the remote merged-model directory. Latest apply-mode metadata sync passed on Gojira-B using passwordless sudo rsync for the root-owned folder. `scripts/upload_hf_merged_model_from_gojira_b.sh` refuses to preview or upload unless that metadata and the `51G`/`14`-shard merged model are present; latest dry run confirmed `Metadata: present` and printed the correct `hf upload-large-folder` command. Apply mode requires `check_human_release_review.py --mode public --require-merged-upload` before remote upload. Gojira-B has `hf` `1.17.0` and auth, but repo creation still needs a write-capable namespace. | Package ready; upload blocked by review/token permissions |
+| Consolidated release readiness check | `python3 scripts/check_kaiju_public_release_readiness.py --mode local` reports local public-testing readiness while keeping Hugging Face namespace permission, paid API launch preflight, and human review as explicit manual blockers. `--mode public` remains red until those external gates pass. The local check calls `scripts/check_hf_staging_integrity.py` to validate staged files, public naming hygiene, raw secret-looking values, and checksums. It also requires `release/FINAL_RELEASE_REPORT.md`, generated by `scripts/generate_kaiju_final_report.py`, and the local `release/bundles/LATEST.json` archive checksum produced by `scripts/create_hf_release_bundle.py`, so the final release state has exact commands, blockers, changed files, first-test instructions, and a reviewable HF bundle. It also calls `scripts/check_human_release_review.py` so `release/HUMAN_RELEASE_REVIEW.md` is the structured human signoff gate. | Local mode passed; public mode pending |
+## Commands With Current Passing Evidence
+```bash
+python3 -m unittest discover -s tests -p 'test_*.py'
+python3 scripts/run_kaiju_business_owner_rc_smoke.py
+python3 scripts/run_kaiju_opencode_customer_pack.py --mode harnessed
+python3 scripts/install_kaiju_opencode_profile.py
+mkdir -p /tmp/kaiju-opencode-fresh-public-smoke
+opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-fresh-public-smoke --dangerously-skip-permissions 'Create hello.txt with exactly: Kaiju Coder 7 fresh public smoke ok'
+opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-loopguard-smoke --dangerously-skip-permissions 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'
+python3 scripts/check_paid_api_readiness.py --mode scaffold
+python3 -m py_compile scripts/run_kaiju_api_harness_smoke.py scripts/run_kaiju_business_owner_rc_smoke.py scripts/build_v17_business_owner_sft_dataset.py kaiju_harness/business_suite.py kaiju_harness/router.py kaiju_harness/verification.py
+git diff --check
+bash scripts/upload_hf_merged_model_from_gojira_b.sh
+KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh
+python3 scripts/check_hf_staging_integrity.py
+python3 scripts/check_human_release_review.py --mode local
+python3 scripts/generate_kaiju_final_report.py
+python3 scripts/create_hf_release_bundle.py
+python3 scripts/check_kaiju_public_release_readiness.py --mode local
+```
+## Remaining Blockers
+The fresh v1.8 adapter, merged full-model artifact, and direct merged-model inference path are proven. The current completed local release candidate is:
+```text
+Kaiju Coder 7 merged model + deterministic business-owner harness + verifier + source-backed v1.7/v1.8 dataset/release package
+```
+Describe it in exactly those terms until external release review resolves the following:
+- human review of generated artifacts
+- raw website latency/SLA positioning or explicit harness-first website positioning
+- base Qwen and GLM comparison results
+- final human review of upstream license/notice packaging
+- Hugging Face write-capable token or namespace permission
+- Hugging Face repo creation permission for the 51GB merged model upload from
+  Gojira-B
+- final Hugging Face upload metadata and public/private release decision
+- live Cloudflare D1/KV/R2 resources, Stripe products/webhook endpoint,
+  deployment secrets, staging end-to-end paid API requests, rollback, and
+  support boundaries if exposed commercially
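
Editorial sketch (not part of the committed file): the audit above describes a consolidated readiness check whose `--mode local` passes while `--mode public` stays red until the manual blockers clear. The Python below is one minimal way to express that gating idea. The `Gate` type, the `evaluate` function, and the gate names are hypothetical illustrations; they are not taken from `scripts/check_kaiju_public_release_readiness.py`, whose implementation does not appear in this commit view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    name: str
    passed: bool
    external: bool  # True for manual blockers that live outside the local machine


def evaluate(gates, mode):
    """Return (ok, blockers) for the given mode.

    In "local" mode, failing external gates are reported as blockers but do
    not fail the check. In "public" mode, every gate must pass.
    """
    if mode not in ("local", "public"):
        raise ValueError(f"unknown mode: {mode!r}")
    failing = [g for g in gates if not g.passed]
    if mode == "local":
        hard_failures = [g for g in failing if not g.external]
    else:
        hard_failures = failing
    blockers = [g.name for g in failing]
    return (not hard_failures, blockers)


# Hypothetical gate states mirroring the blockers listed in the audit.
GATES = [
    Gate("hf_staging_integrity", passed=True, external=False),
    Gate("final_release_report", passed=True, external=False),
    Gate("hf_namespace_permission", passed=False, external=True),
    Gate("paid_api_launch_preflight", passed=False, external=True),
    Gate("human_release_review", passed=False, external=True),
]

if __name__ == "__main__":
    for mode in ("local", "public"):
        ok, blockers = evaluate(GATES, mode)
        print(mode, "PASS" if ok else "FAIL", "blockers:", ", ".join(blockers))
```

Keeping the blocker list identical in both modes means a local pass never hides the external work that remains; only the pass/fail verdict changes with the mode.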

DATA_PROVENANCE_DRAFT.md ADDED

	@@ -0,0 +1,87 @@

+# Kaiju Coder 7 by Kiyomi - Data Provenance Draft
+This draft records the current data boundary for release review.
+## Policy
+Kaiju Coder training data must be legally usable for a commercial derivative model.
+Allowed:
+- RMDW-authored examples.
+- RMDW-owned repository diffs and documentation.
+- Human-reviewed examples created specifically for Kaiju.
+- Public permissive data only when license review confirms compatibility.
+Not allowed:
+- Closed-model answers from OpenAI, Anthropic, Gemini, or similar services as supervised completions.
+- Unreviewed customer data.
+- Private customer code without consent.
+- Secrets, tokens, credentials, cookies, or private keys.
+- Unlicensed scraped code.
+## v0.1 Dataset Snapshot
+- Total reviewed examples: 575
+- Dataset build: `datasets/build/kaiju-sft-v0.1.jsonl`
+- Candidate sources:
+  - `datasets/candidates/rmdw-git-patches.jsonl`
+  - `datasets/candidates/v0.1-safe-git-backlog.jsonl`
+  - `datasets/candidates/v0.1-file-level-git.jsonl`
+  - `datasets/candidates/v0.1-wiki-strategy-business-identity.jsonl`
+## v1.7 Business-Owner Suite Addendum
+- Date prepared: 2026-06-03
+- Reviewed examples: 8
+- Candidate file: `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl`
+- Addendum-only SFT build: `datasets/build/kaiju-sft-v1.7-business-owner-suite.jsonl`
+- Training SFT build: `datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl`
+- Training config: `training/configs/qwen36-27b-lora-v1.7.example.json`
+- v1.8 training config: `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json`
+- New task type: `business_suite`
+- Source inventory: `release/SOURCE_INVENTORY.md`, refreshed from GitHub source-of-truth repositories and the requested local RMDW wiki snapshot.
+This addendum targets Kiyomi 7.7.7-style business-owner work: complete AI-company build packs, premium service websites, intake and CRM flows, sales follow-up, proposals, ROI dashboards, operator handbooks, and Workshop golden-run automations.
+Every row includes:
+- `source_repos`
+- `source_paths`
+- `provenance_notes`
+- `reviewed: true`
+- `license: RMDW-owned`
+For the v1.7 LoRA run, the 8 reviewed business-owner rows are oversampled 24 times by `scripts/build_v17_business_owner_sft_dataset.py`. Repeated rows receive unique IDs ending in `__v17_business_repeat_NN` and preserve the original source repository, source path, and provenance metadata.
+Client-site repositories are used only as eval and generalized pattern sources unless a row is explicitly reviewed for training eligibility. Do not bulk-train on client-specific text, contact details, contracts, or private business data.
+The local wiki path `/Users/richardecholsai7/Documents/RMDW-Wiki` is present but is not a git checkout. It is recorded as `RMDW-Wiki-local`, `selective-reference-only`, with `credentials.md`, `customers.md`, `customers/`, and `raw/` excluded. The GitHub `RichardEchols/rmdw-agent-wiki` repo remains the authoritative wiki source for training/eval provenance unless a reviewer documents a local exception.
+## Category Mix
+The v0.1 category gate passed:
+- Website/UI: at least 75 examples
+- Coding: at least 75 examples
+- Debugging: at least 50 examples
+- Automation: at least 50 examples
+- Tool-use: at least 50 examples
+- Strategy: at least 25 examples
+- Business: at least 15 examples
+- Identity: at least 10 examples
+## Release Review Checklist
+Before public release:
+- Re-run dataset validation.
+- Re-run source inventory against the current GitHub source-of-truth SHAs.
+- Spot-check examples for secrets and private data.
+- Confirm client-site rows are generalized pattern examples or eval-only.
+- Confirm closed-model outputs are not used as supervised completions.
+- Record exact base model revision.
+- Attach upstream license and notices.
+- Attach eval summary.
+- Document known limitations and unsafe use boundaries.
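
Editorial sketch (not part of the committed file): the v1.7 addendum above says the 8 reviewed business-owner rows are oversampled 24 times, that repeats get unique IDs ending in `__v17_business_repeat_NN`, and that provenance metadata is preserved. The Python below illustrates that step and the row arithmetic under stated assumptions. The `id` field name, the two-digit counter starting at 01, and the placeholder rows are assumptions; the actual logic in `scripts/build_v17_business_owner_sft_dataset.py` is not shown in this commit view.

```python
import copy

REPEAT_SUFFIX = "__v17_business_repeat_"  # suffix named in the provenance draft


def oversample(rows, repeats):
    """Return `repeats` copies of each row with unique IDs and intact provenance."""
    repeated = []
    for row in rows:
        for n in range(1, repeats + 1):
            # Deep copy so source_repos, source_paths, and provenance_notes
            # carry over without sharing mutable lists between rows.
            clone = copy.deepcopy(row)
            # Assumption: NN is a zero-padded counter starting at 01.
            clone["id"] = f"{row['id']}{REPEAT_SUFFIX}{n:02d}"
            repeated.append(clone)
    return repeated


def make_placeholder_rows(count):
    """Build placeholder rows; real rows come from the reviewed candidate file."""
    return [
        {
            "id": f"business_suite_{i:02d}",
            "source_repos": ["example-org/example-repo"],
            "source_paths": [f"docs/example-{i}.md"],
            "provenance_notes": "placeholder row for illustration",
            "reviewed": True,
            "license": "RMDW-owned",
        }
        for i in range(1, count + 1)
    ]


if __name__ == "__main__":
    base_rows = make_placeholder_rows(8)
    repeats = oversample(base_rows, 24)
    print(len(repeats))         # 192 = 8 rows x 24 repeats
    print(1689 + len(repeats))  # 1881 = validated examples plus repeats
```

The two printed totals match the figures reported elsewhere in this commit: `192` controlled repeats and `1,881` SFT rows built on top of `1,689` validated examples.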

EVAL_SCOREBOARD.md ADDED

	@@ -0,0 +1,70 @@

+# Kaiju Coder 7 Business-Owner Eval Scoreboard
+This scoreboard tracks the current release-candidate evidence. Do not publish weights or make paid API claims until every required row has a dated result and a reviewer.
+## Completed Local Gates
+| Gate | Command | Result | Date |
+|---|---|---:|---|
+| Source inventory refresh | `python3 scripts/build_source_inventory.py` | Passed | 2026-06-03 |
+| Candidate validation | `python3 scripts/validate_training_data.py --min-examples 350` | 1,689 examples / passed | 2026-06-03 |
+| v1.7 category targets | `python3 scripts/check_dataset_targets.py --targets datasets/v1.7-targets.json` | Passed | 2026-06-03 |
+| Business-owner SFT build | `python3 scripts/build_v17_business_owner_sft_dataset.py` | 1,881 rows / 192 repeats | 2026-06-03 |
+| Router hard harness | `python3 evals/run_router_harness_eval.py --tasks evals/tasks/router-hard-harness.jsonl` | 23/23 | 2026-06-03 |
+| Router static checks | `python3 evals/run_router_static_checks.py runs/evals/20260603T103915Z-kaiju_router_harness/results.jsonl` | 23/23 | 2026-06-03 |
+| Business-suite prompts | Included in router hard harness | 2/2 | 2026-06-03 |
+| Deterministic API harness smoke | `python3 scripts/run_kaiju_api_harness_smoke.py` | Passed: website + business-suite API artifacts | 2026-06-03 |
+| Direct business-suite artifact | `python3 scripts/run_kaiju_router.py --prompt "...Kiyomi 7.7.7 AI company operating pack..." --print-manifest` | 19 files / passed | 2026-06-03 |
+| Full local RC smoke gate | `python3 scripts/run_kaiju_business_owner_rc_smoke.py` | Passed; latest router/static run `20260603T103915Z-kaiju_router_harness` | 2026-06-03 |
+| v1.7 LoRA train | `./scripts/run-gojira-b-qwen36-lora-train.sh` | Finished; runtime `1663.7101s`, train loss `1.7260706673065822`, adapter present | 2026-06-03 |
+| v1.7 SGLang serve | `./scripts/start-qwen36-lora-sglang.sh` with `KAIJU_QWEN36_LORA_CONTEXT=4096`, `KAIJU_QWEN36_LORA_MEM_FRACTION=0.90` | `/v1/models` returned `kaiju_v17_business_owner` | 2026-06-03 |
+| Raw served adapter smoke: website | `python3 evals/run_openai_compat_smoke.py --base-url http://100.109.109.14:18083/v1 --model kaiju_v17_business_owner --tasks evals/tasks/smoke.jsonl --max-tasks 1 --disable-thinking` | Passed; `20260603T031300Z-kaiju_v17_business_owner`, 2,726 chars in 174.49s | 2026-06-03 |
+| Raw served adapter smoke: proposal | `python3 evals/run_openai_compat_smoke.py --base-url http://100.109.109.14:18083/v1 --model kaiju_v17_business_owner --tasks /tmp/kaiju-proposal-smoke.jsonl --system-prompt-file prompts/kaiju-coder-api-system.md --disable-thinking` | Passed; `20260603T032107Z-kaiju_v17_business_owner`, 4,306 chars in 232.27s | 2026-06-03 |
+| Raw served adapter quality: website | `python3 evals/score_quality_gate.py runs/evals/20260603T033825Z-kaiju_v17_business_owner/results.jsonl` | Failed paid-ready: `3.71/4.0`, missing complete HTML after 12,706 chars / 793.96s | 2026-06-03 |
+| Raw served adapter quality: proposal | `python3 evals/score_quality_gate.py runs/evals/20260603T032107Z-kaiju_v17_business_owner/results.jsonl` | Passed paid-ready: `4.0/4.0` | 2026-06-03 |
+| Raw served adapter quality: Jah credits | `python3 evals/score_quality_gate.py runs/evals/20260603T035612Z-kaiju_v17_business_owner/results.jsonl` | Passed paid-ready: `4.0/4.0` | 2026-06-03 |
+| Base Qwen comparison: proposal | `python3 evals/compare_quality_runs.py runs/quality-gates/20260603T035200Z-qwen36-27b/scores.jsonl runs/quality-gates/20260603T032107Z-kaiju_v17_business_owner/scores.jsonl` | Tie: base `4.0/4.0`, Kaiju v1.7 `4.0/4.0` | 2026-06-03 |
+| Base Qwen comparison: Jah credits | `python3 evals/compare_quality_runs.py runs/quality-gates/20260603T040140Z-qwen36-27b/scores.jsonl runs/quality-gates/20260603T035612Z-kaiju_v17_business_owner/scores.jsonl` | Tie: base `4.0/4.0`, Kaiju v1.7 `4.0/4.0`; deterministic outputs were byte-identical | 2026-06-03 |
+| Raw adapter differentiation probe | Identity and Jah probes comparing `qwen36-27b` to `kaiju_v17_business_owner` | Current v1.7 SGLang outputs can be byte-identical to base on deterministic prompts; 24-step v1.7 is too weak as a raw-weight differentiator | 2026-06-03 |
+| v1.8 stronger LoRA train | `KAIJU_LORA_CONFIG=training/configs/qwen36-27b-lora-v1.8-business-owner.example.json KAIJU_SFT_DATASET=datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl KAIJU_LORA_RUN_DIR=runs/qwen36-27b-lora-v1.8-business-owner KAIJU_MIN_TRAIN_EXAMPLES=350 KAIJU_SKIP_DATASET_BUILD=1 KAIJU_TRAIN_BACKGROUND=1 ./scripts/run-gojira-b-qwen36-lora-train.sh` | Finished; runtime `11666.7564s`, train loss `0.9281658741335074`, adapter present | 2026-06-03 |
+| v1.8 SGLang dynamic LoRA serve | `./scripts/start-qwen36-lora-sglang.sh` with v1.8 adapter, `KAIJU_QWEN36_LORA_CONTEXT=8192`, `KAIJU_QWEN36_LORA_MEM_FRACTION=0.90` | Historical only: `/v1/models` listed `kaiju_v18_business_owner`, but adapter-name-only output can be base-equivalent; not release evidence | 2026-06-03 |
+| Corrected v1.8 dynamic LoRA selector | Model selector `qwen36-27b:kaiju_v18_business_owner` under SGLang with fused target modules | Fails: `LoRA buffer shape torch.Size([8192, 16]) does not match weight shape torch.Size([14336, 16])`; dynamic LoRA is not the release path | 2026-06-03 |
+| v1.8 LoRA merge | `KAIJU_LORA_ADAPTER=/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter ./scripts/run-gojira-b-qwen36-lora-merge.sh` | Passed; merged full model at `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`, `51G`, `14` shards | 2026-06-03 |
+| Kaiju Coder 7 merged SGLang serve | `./scripts/start-qwen36-merged-sglang.sh` with `KAIJU_QWEN36_MERGED_CONTEXT=32768`, `KAIJU_QWEN36_MERGED_MEM_FRACTION=0.90` | `/v1/models` returned `kaiju-coder-7`, max model len `32768`; 12k/16k/24k/32k evidence is recorded in `release/SERVING_BENCHMARKS.md` | 2026-06-03 |
+| Kaiju Coder 7 restored 32k direct API smoke | `python3 scripts/benchmark_kaiju_serving.py --contexts 32768 --prompts identity business_doc --max-tokens 768 --timeout 420` | Passed; `/v1/models` returned `kaiju-coder-7`, max model len `32768`; identity `2.92s`; business proposal `94.28s`, `1,737` chars | 2026-06-03 |
+| Kaiju Coder 7 restored 32k OpenCode one-file smoke | `opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-32k-final-smoke 'Create hello.txt with exactly: Kaiju Coder 7 final 32k ok'` | Passed; wrote `hello.txt` with exactly `Kaiju Coder 7 final 32k ok` | 2026-06-03 |
+| Kaiju Coder 7 current restored 16k direct API smoke | `python3 scripts/benchmark_kaiju_serving.py --contexts 16384 --prompts identity --max-tokens 64 --timeout 120` | Passed; latest run `runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`, identity `2.3s`, `26` chars | 2026-06-03 |
+| Kaiju Coder 7 current restored 16k OpenCode one-file smoke | `mkdir -p /tmp/kaiju-opencode-fresh-public-smoke && opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-fresh-public-smoke --dangerously-skip-permissions 'Create hello.txt with exactly: Kaiju Coder 7 fresh public smoke ok'` | Passed; `/v1/models` returned `kaiju-coder-7`, max model len `16384`; wrote `hello.txt` with exactly `Kaiju Coder 7 fresh public smoke ok` | 2026-06-03 |
+| Kaiju Coder 7 packaged public OpenCode smoke | `python3 scripts/run_kaiju_public_opencode_smoke.py --timeout 900 --keep-dir` | Passed; latest run `runs/public-opencode-smoke/20260603T182222Z/summary.md`, `4/4` checks passed; installer dry-run, OpenCode `1.15.13`, live 16k model, and file written only in the requested temp workspace | 2026-06-03 |
+| Kaiju Coder 7 loop-guarded OpenCode install | `python3 scripts/install_kaiju_opencode_profile.py`; `opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-loopguard-smoke --dangerously-skip-permissions 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'` | Passed; config includes `/Users/richardecholsai7/.config/opencode/kaiju-no-autocontinue.mjs`; wrote `loopguard.txt` with exact requested content and exited cleanly | 2026-06-03 |
+| Current harnessed OpenCode customer-readiness pack | `python3 scripts/run_kaiju_opencode_customer_pack.py --mode harnessed` | Passed; latest run `runs/opencode-customer-readiness/20260603T185835Z/summary.md`, `4/4` tasks passed and `28/28` required files written, including release provenance and safety review | 2026-06-03 |
+| Paid API Worker scaffold | `cd gateway/cloudflare-worker && npm run check && npm run preflight` | Passed `16/16` Worker tests and `17` scaffold preflight checks; covers bearer auth, inactive keys, insufficient credits, debit/refund, rate limit before debit, model `kaiju-coder-7` enforcement, stream/thinking/token caps, secret-content rejection without logging, signed Stripe Checkout top-up idempotency, origin-only R2 artifact upload, account-scoped artifact download, guarded Cloudflare resource prep, Wrangler dry-run deploy, sanitized paid-launch evidence template packaging, reviewed Cloudflare bindings template, binding applier guardrails, and sanitized evidence collection helper | 2026-06-03 |
+| Kaiju Coder 7 merged vLLM serve | `KAIJU_VLLM_CONTEXT=16384 ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed at 16k with Gojira nightly vLLM after `pandas` preinstall and `--language-model-only`; identity `19.99s`, code patch `28.8s`; not fast enough to replace SGLang | 2026-06-03 |
+| Kaiju Coder 7 runtime-quantized vLLM serve | `KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_QUANTIZATION=bitsandbytes KAIJU_VLLM_LOAD_FORMAT=bitsandbytes ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed at 8k and 16k; 16k identity `19.51s`, code patch `11.3s`; vLLM log reported about `17.8 GiB` model memory | 2026-06-03 |
+| Kaiju Coder 7 runtime-quantized business-doc smoke | `KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_QUANTIZATION=bitsandbytes KAIJU_VLLM_LOAD_FORMAT=bitsandbytes KAIJU_VLLM_PROMPTS=business_doc KAIJU_VLLM_MAX_TOKENS=768 KAIJU_VLLM_PROMPT_TIMEOUT=420 ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed; business proposal `53.44s`, `1,610` chars, `30.127` chars/s; wrapper restored SGLang after completion | 2026-06-03 |
+| Kaiju Coder 7 runtime-quantized OpenCode one-file smoke | `bash scripts/run_kaiju_quantized_opencode_smoke.sh` | Passed at 16k after vLLM `--enable-auto-tool-choice`; OpenCode wrote `hello.txt` with exactly `Kaiju Coder 7 quantized runtime ok` | 2026-06-03 |
+| Hugging Face CLI install/auth check | `hf version && hf auth whoami && hf auth list` | `hf` installed locally at version `1.17.0`; auth user `restokes92`; token name `gojirakiyomikode` | 2026-06-03 |
+| Hugging Face private repo create attempt | `KAIJU_HF_UPLOAD_APPLY=1 bash scripts/upload_hf_release_staging.sh` with namespaces `RichardEchols`, `RMDWLLC`, and `restokes92` | Blocked by Hugging Face `403 Forbidden`; current token cannot create model repos in those namespaces | 2026-06-03 |
+| Hugging Face merged-model metadata and upload boundary | `bash scripts/prepare_hf_merged_model_metadata.sh`; `KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh`; `bash scripts/upload_hf_merged_model_from_gojira_b.sh`; `KAIJU_HF_UPLOAD_APPLY=1 bash scripts/upload_hf_merged_model_from_gojira_b.sh` | Metadata prep synced model card, quickstarts, provenance, benchmarks, evals, paid API status, final report, upstream license, and `MERGED_MODEL_RELEASE_MANIFEST.json` to Gojira-B; sudo rsync handled the root-owned merged folder; upload dry run confirmed metadata plus the `51G`/`14`-shard merged model before printing `hf upload-large-folder`; apply remains blocked by human review and Hugging Face namespace permission before any large upload | 2026-06-03 |
+| v1.8 merged endpoint probe | Direct OpenAI-compatible chat request with top-level `chat_template_kwargs` disabling thinking | Passed; `1,155` visible chars in `60.17s`, normal `content` response | 2026-06-03 |
+| Kaiju Coder 7 merged focused proposal eval | `python3 evals/run_openai_compat_smoke.py --model kaiju-coder-7 --tasks evals/tasks/business-owner-v18-comparison.jsonl --max-tasks 1 --max-tokens 1800 ...` then `python3 evals/score_quality_gate.py <results.jsonl>` | Passed: `1/1` paid-ready, `4.0/4.0`, `4,014` chars, `212.72s` | 2026-06-03 |
+| Kaiju Coder 7 merged focused Jah credits eval | `python3 evals/run_openai_compat_smoke.py --model kaiju-coder-7 --tasks evals/tasks/business-owner-v18-comparison.jsonl ...` then `python3 evals/score_quality_gate.py <results.jsonl>` | Passed: `4.0/4.0`, `9,718` chars, `566.36s` | 2026-06-03 |
+| Full local RC smoke gate | `python3 scripts/run_kaiju_business_owner_rc_smoke.py` | Passed; latest router/static run `20260603T103915Z-kaiju_router_harness` | 2026-06-03 |
+## Required Before Release
+| Gate | Required result | Status |
+|---|---|---|
+| v1.7 LoRA train | Finished metrics and adapter under `runs/qwen36-27b-lora-v1.7-business-owner` | Passed |
+| v1.8 stronger LoRA train | Finished metrics and adapter under `runs/qwen36-27b-lora-v1.8-business-owner` | Passed |
+| v1.8 merged focused smoke | `python3 evals/run_openai_compat_smoke.py --tasks evals/tasks/business-owner-v18-comparison.jsonl --model kaiju-coder-7 ...` then `python3 evals/score_quality_gate.py` | Passed for proposal rerun and Jah credits backend; broader sweep pending |
+| Direct commercial eval | No critical failures, scored summary attached | Passed for targeted high-value tasks when using the product harness plus 8k raw website mode; broader task sweep still pending |
+| Base Qwen comparison | Kaiju beats base Qwen on RMDW/Kiyomi practical tasks | Not yet: raw deterministic identity still matches base; compare broader tasks before model-level improvement claims |
+| GLM comparison | Kaiju is near or above GLM on highest-value business-owner tasks | Pending |
+| Local inference smoke | OpenAI-compatible endpoint returns usable business-owner artifact | Passed for v1.8 merged SGLang endpoint and product harness |
+| Human review | Richard reviews artifacts for usefulness, privacy, and sellability | Pending |
+| Release package | Model card, provenance, license notes, eval summary, limitations, Hugging Face draft, completion audit, and run instructions complete | Staged and upload-scripted; upload blocked by HF token permissions and human/public-review decision |
+## Decision Rule
+The v1.8 adapter is a completed local checkpoint and the merged full model is the current served raw-model path. The business-owner product should still be published honestly as merged model plus deterministic harness plus verifier. Raw merged v1.8 is useful on business documents and Jah credits but slow on this SGLang stack. Do not claim raw-weight superiority until broader base/GLM and raw website comparisons pass.

FINAL_RELEASE_REPORT.md ADDED

	@@ -0,0 +1,269 @@

+# Kaiju Coder 7 Final Release Report
+Generated: `2026-06-03T20:03:02Z`
+Product name: `Kaiju Coder 7`
+Public model id: `kaiju-coder-7`
+Current source branch: `codex/kaiju-business-owner-rc`
+Current HEAD: `3d57eae92ad523519473f0ff3eca6661a9736de3`
+Current `origin/main`: `3d57eae92ad523519473f0ff3eca6661a9736de3`
+## Current Verdict
+Kaiju Coder 7 is a local public-testing release candidate, not a fully public
+commercial launch yet. The local model path, OpenCode profile, harnessed
+business-owner evals, Hugging Face staging package, runtime-quantized recipe,
+and paid API scaffold are in place. Public release still requires human
+approval, a write-capable Hugging Face namespace/token, and live paid API
+resources before the hosted API can be sold.
+## Runtime
+| Field | Value |
+|---|---|
+| Status | `pass` |
+| Base URL | `http://100.109.109.14:18083/v1` |
+| Model id | `kaiju-coder-7` |
+| Max model length | `16384` |
+| Detail | (empty) |
+Recommended default today: `16k` context through `kaiju-coder-7`. Higher
+context has benchmark evidence, but the currently parked default is 16k for
+stability and speed.
+## Readiness Summary
+| Area | Result |
+|---|---|
+| Local public-testing readiness | `ready=True pass=23 fail=0 manual=1 rc=0` |
+| Hugging Face release readiness | `ready=True pass=23 fail=0 manual=1 rc=0` |
+| Public launch readiness | `ready=False pass=23 fail=1 manual=0 rc=1` |
+| Hugging Face staging integrity | `ready=True pass=6 fail=0 manual=0 rc=0` |
+| Paid API launch readiness | `ready=False pass=17 fail=3 manual=7 rc=1` |
+## Hugging Face Release Blockers
+| Status | Check | Detail |
+|---|---|---|
+| manual | paid API launch preflight | 17 pass, 3 fail, 7 manual |
+## Public Launch Blockers
+| Status | Check | Detail |
+|---|---|---|
+| fail | paid API launch preflight | 17 pass, 3 fail, 7 manual |
+## Paid API Launch Blockers
+| Status | Check | Detail |
+|---|---|---|
+| fail | live D1 binding | KAIJU_BILLING_DB is missing or still placeholder/commented |
+| fail | live KV binding | KAIJU_RATE_LIMIT_KV is missing or still placeholder/commented |
+| fail | artifact R2 binding | KAIJU_ARTIFACT_BUCKET is missing; artifact routes cannot launch |
+| manual | public route mode | attach sanitized evidence in `release/paid-api-launch-evidence.json` under key `public_route_mode` |
+| manual | wrangler secret list confirms KAIJU_ORIGIN_URL, KAIJU_ORIGIN_SECRET, and KAIJU_STRIPE_WEBHOOK_SECRET | attach sanitized evidence in `release/paid-api-launch-evidence.json` under key `wrangler_secrets_verified` |
+| manual | D1 migration 0001_paid_api.sql applied to the live billing database | attach sanitized evidence in `release/paid-api-launch-evidence.json` under key `d1_migration_applied` |
+| manual | Stripe Checkout top-up products and webhook endpoint tested with metadata.kaiju_api_key_id | attach sanitized evidence in `release/paid-api-launch-evidence.json` under key `stripe_checkout_topup_staging` |
+| manual | staging request passed through Worker to Gojira-B origin with model=kaiju-coder-7 | attach sanitized evidence in `release/paid-api-launch-evidence.json` under key `worker_to_gojira_staging_request` |
+| manual | rollback command or route switch was exercised and recorded | attach sanitized evidence in `release/paid-api-launch-evidence.json` under key `rollback_exercised` |
+| manual | p95 latency for paid routes is recorded after staging traffic | attach sanitized evidence in `release/paid-api-launch-evidence.json` under key `paid_route_latency` |
+## Evidence Paths
+| Evidence | Path |
+|---|---|
+| Completion audit | `release/COMPLETION_AUDIT.md` |
+| Goal completion audit | `release/GOAL_COMPLETION_AUDIT.md` |
+| Release evidence refresh runner | `scripts/refresh_kaiju_release_evidence.py` |
+| Eval scoreboard | `release/EVAL_SCOREBOARD.md` |
+| Public testing quickstart | `release/PUBLIC_TESTING_QUICKSTART.md` |
+| Serving benchmarks | `release/SERVING_BENCHMARKS.md` |
+| Hugging Face release draft | `release/HUGGINGFACE_RELEASE_DRAFT.md` |
+| Hugging Face release bundle | `release/bundles/LATEST.md` |
+| Hugging Face bundle integrity checker | `scripts/check_hf_release_bundle_integrity.py` |
+| Hugging Face permission evidence template | `release/hf-release-permission-evidence.example.json` |
+| Hugging Face permission evidence collector | `scripts/collect_hf_release_permission_evidence.py` |
+| Hugging Face permission evidence checker | `scripts/check_hf_release_permission_evidence.py` |
+| Merged-model metadata prep | `scripts/prepare_hf_merged_model_metadata.sh` |
+| Human release review gate | `release/HUMAN_RELEASE_REVIEW.md` |
+| Paid API readiness | `release/PAID_API_READINESS.md` |
+| Paid API evidence collector | `scripts/collect_paid_api_launch_evidence.py` |
+| Paid API launch evidence template | `release/paid-api-launch-evidence.example.json` |
+| Cloudflare bindings template | `release/cloudflare-bindings.example.json` |
+| Cloudflare bindings applier | `scripts/apply_paid_api_cloudflare_bindings.py` |
+| Latest direct API smoke | `runs/benchmarks/20260603T193000Z-kaiju-coder-7-serving/summary.md` |
+| Latest OpenCode customer pack | `runs/opencode-customer-readiness/20260603T185835Z/summary.md` |
+| Latest public OpenCode smoke | `runs/public-opencode-smoke` |
+## What Richard Should Test First
+```bash
+python3 scripts/check_kaiju_public_release_readiness.py --mode local
+python3 scripts/install_kaiju_opencode_profile.py
+mkdir -p /tmp/kaiju-public-smoke
+opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-public-smoke --dangerously-skip-permissions 'Create hello.txt with exactly: Kaiju Coder 7 public smoke ok'
+python3 scripts/run_kaiju_public_opencode_smoke.py
+python3 scripts/run_kaiju_opencode_customer_pack.py --mode harnessed
+bash scripts/prepare_hf_merged_model_metadata.sh
+bash scripts/prepare_hf_release_staging.sh
+python3 scripts/check_hf_staging_integrity.py --require-checksums
+python3 scripts/create_hf_release_bundle.py
+python3 scripts/check_hf_release_bundle_integrity.py
+python3 scripts/check_kaiju_goal_completion.py --write
+python3 scripts/refresh_kaiju_release_evidence.py --skip-opencode-smoke
+python3 scripts/collect_hf_release_permission_evidence.py
+# After HF repo-create permission is fixed:
+python3 scripts/collect_hf_release_permission_evidence.py --apply --write
+python3 scripts/check_hf_release_permission_evidence.py
+python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
+python3 scripts/check_kaiju_public_release_readiness.py --mode public
+cp release/cloudflare-bindings.example.json release/cloudflare-bindings.json
+# Replace placeholder D1/KV IDs in release/cloudflare-bindings.json first.
+python3 scripts/apply_paid_api_cloudflare_bindings.py --bindings-file release/cloudflare-bindings.json
+cp release/paid-api-launch-evidence.example.json release/paid-api-launch-evidence.json
+python3 scripts/collect_paid_api_launch_evidence.py --help
+python3 scripts/check_paid_api_readiness.py --mode launch --evidence-file release/paid-api-launch-evidence.json
+```
+Do not expose the paid hosted API until
+`python3 scripts/check_paid_api_readiness.py --mode launch` has no failures and
+the human release review explicitly approves public paid API launch.
+## Changed Files
+`git status --short` currently reports `112` changed paths.
+| State | Path |
+|---|---|
+| M | `.gitignore` |
+| M | `LICENSE_NOTES.md` |
+| M | `README.md` |
+| M | `datasets/schema.json` |
+| M | `docs/custom-harness.md` |
+| M | `evals/BAKEOFF_CURRENT.md` |
+| M | `evals/run_openai_compat_smoke.py` |
+| M | `evals/run_router_static_checks.py` |
+| M | `evals/tasks/router-hard-harness.jsonl` |
+| M | `gateway/README.md` |
+| M | `gateway/cloudflare-worker/README.md` |
+| M | `gateway/cloudflare-worker/migrations/0001_paid_api.sql` |
+| M | `gateway/cloudflare-worker/package.json` |
+| M | `gateway/cloudflare-worker/src/index.js` |
+| M | `gateway/cloudflare-worker/test/index.test.js` |
+| M | `kaiju_harness/router.py` |
+| M | `kaiju_harness/verification.py` |
+| D | `models/README.md` |
+| D | `models/qwen3.6-27b-base.md` |
+| D | `models/qwen3.6-27b-fp8.md` |
+| M | `prompts/kaiju-coder-api-system.md` |
+| M | `prompts/kaiju-coder-speed-system.md` |
+| M | `release/DATA_PROVENANCE_DRAFT.md` |
+| M | `release/MODEL_CARD_DRAFT.md` |
+| M | `scripts/build_sft_dataset.py` |
+| M | `scripts/check-gojira-b-capacity.sh` |
+| M | `scripts/run-gojira-b-qwen36-lora-eval.sh` |
+| M | `scripts/run-gojira-b-qwen36-lora-sglang-eval.sh` |
+| M | `scripts/run-gojira-b-qwen36-lora-train.sh` |
+| M | `scripts/run_kaiju_api_harness_smoke.py` |
+| M | `scripts/start-qwen36-lora-sglang.sh` |
+| M | `scripts/stop-qwen36-lora-sglang.sh` |
+| M | `scripts/validate_training_data.py` |
+| M | `scripts/watch-gojira-b-qwen36-lora-train.sh` |
+| ?? | `.opencode/` |
+| ?? | `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl` |
+| ?? | `datasets/v1.7-targets.json` |
+| ?? | `evals/tasks/business-owner-v18-comparison.jsonl` |
+| ?? | `evals/tasks/business-owner-v18-smoke.jsonl` |
+| ?? | `evals/tasks/opencode-customer-readiness.jsonl` |
+| ?? | `kaiju_harness/business_suite.py` |
+| ?? | `release/COMPLETION_AUDIT.md` |
+| ?? | `release/EVAL_SCOREBOARD.md` |
+| ?? | `release/FINAL_RELEASE_REPORT.md` |
+| ?? | `release/GOAL_COMPLETION_AUDIT.md` |
+| ?? | `release/HF_ADAPTER_MODEL_CARD.md` |
+| ?? | `release/HUGGINGFACE_RELEASE_DRAFT.md` |
+| ?? | `release/HUMAN_RELEASE_REVIEW.md` |
+| ?? | `release/LOCAL_TEST_INSTRUCTIONS.md` |
+| ?? | `release/PAID_API_READINESS.md` |
+| ?? | `release/PUBLIC_TESTING_QUICKSTART.md` |
+| ?? | `release/QUANTIZATION_PLAN.md` |
+| ?? | `release/SERVING_BENCHMARKS.md` |
+| ?? | `release/SOURCE_INVENTORY.md` |
+| ?? | `release/UPSTREAM_LICENSE_CHECK.md` |
+| ?? | `release/bundles/` |
+| ?? | `release/cloudflare-bindings.example.json` |
+| ?? | `release/hf-release-permission-evidence.example.json` |
+| ?? | `release/hf-release-permission-evidence.json` |
+| ?? | `release/huggingface/` |
+| ?? | `release/opencode/` |
+| ?? | `release/paid-api-launch-evidence.example.json` |
+| ?? | `release/quantized-runtime/` |
+| ?? | `release/source-inventory.json` |
+| ?? | `release/upstream/` |
+| ?? | `scripts/apply_paid_api_cloudflare_bindings.py` |
+| ?? | `scripts/benchmark_kaiju_serving.py` |
+| ?? | `scripts/build_source_inventory.py` |
+| ?? | `scripts/build_v17_business_owner_sft_dataset.py` |
+| ?? | `scripts/check_hf_release_bundle_integrity.py` |
+| ?? | `scripts/check_hf_release_permission_evidence.py` |
+| ?? | `scripts/check_hf_release_permissions.sh` |
+| ?? | `scripts/check_hf_staging_integrity.py` |
+| ?? | `scripts/check_hf_uploaded_release.py` |
+| ?? | `scripts/check_human_release_review.py` |
+| ?? | `scripts/check_kaiju_goal_completion.py` |
+| ?? | `scripts/check_kaiju_public_release_readiness.py` |
+| ?? | `scripts/check_kaiju_quantization_prereqs.py` |
+| ?? | `scripts/check_paid_api_readiness.py` |
+| ?? | `scripts/collect_hf_release_permission_evidence.py` |
+| ?? | `scripts/collect_paid_api_launch_evidence.py` |
+| ?? | `scripts/create_hf_release_bundle.py` |
+| ?? | `scripts/generate_kaiju_final_report.py` |
+| ?? | `scripts/gojira-b-ssh-lib.sh` |
+| ?? | `scripts/install_kaiju_opencode_profile.py` |
+| ?? | `scripts/opencode-kaiju-no-autocontinue.mjs` |
+| ?? | `scripts/prepare_hf_merged_model_metadata.sh` |
+| ?? | `scripts/prepare_hf_release_staging.sh` |
+| ?? | `scripts/prepare_paid_api_cloudflare_resources.sh` |
+| ?? | `scripts/probe-gojira-b-kaiju-quantization.sh` |
+| ?? | `scripts/refresh_kaiju_release_evidence.py` |
+| ?? | `scripts/run-gojira-b-qwen36-lora-merge.sh` |
+| ?? | `scripts/run-gojira-b-vllm-serving-benchmark.sh` |
+| ?? | `scripts/run_kaiju_business_owner_rc_smoke.py` |
+| ?? | `scripts/run_kaiju_opencode_customer_pack.py` |
+| ?? | `scripts/run_kaiju_public_opencode_smoke.py` |
+| ?? | `scripts/run_kaiju_quantized_opencode_smoke.sh` |
+| ?? | `scripts/start-qwen36-merged-sglang.sh` |
+| ?? | `scripts/start-qwen36-merged-vllm.sh` |
+| ?? | `scripts/stop-qwen36-merged-sglang.sh` |
+| ?? | `scripts/stop-qwen36-merged-vllm.sh` |
+| ?? | `scripts/upload_hf_merged_model_from_gojira_b.sh` |
+| ?? | `scripts/upload_hf_release_staging.sh` |
+| ?? | `tests/test_kiyomi_business_suite.py` |
+| ?? | `tests/test_release_package.py` |
+| ?? | `tests/test_source_inventory.py` |
+| ?? | `tests/test_training_provenance.py` |
+| ?? | `tests/test_v17_business_dataset.py` |
+| ?? | `training/configs/qwen36-27b-lora-v1.7.example.json` |
+| ?? | `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json` |
+| ?? | `training/scripts/qwen36_lora_merge.py` |
+| ?? | `training/v1.7-business-owner-runbook.md` |
+## Commands Run During Report Generation
+| Label | Command | Return code |
+|---|---|---|
+| git branch | `git branch --show-current` | 0 |
+| git HEAD | `git rev-parse HEAD` | 0 |
+| git origin/main | `git rev-parse origin/main` | 0 |
+| git status | `git status --short` | 0 |
+| local readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode local --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 0 |
+| HF release readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode hf-release --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 0 |
+| public readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode public --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 1 |
+| HF staging integrity | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_hf_staging_integrity.py --staging-dir /tmp/kaiju-coder-7-hf-staging --require-checksums --json` | 0 |
+| paid API launch readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_paid_api_readiness.py --mode launch --json` | 1 |
+## Report Safety
+This generator intentionally avoids secret-bearing commands such as auth token
+lists, environment dumps, process command-line scans, Wrangler secret lists, and
+payment-provider credential output.
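The readiness summaries in this report share a `key=value` layout, for example `ready=True pass=23 fail=0 manual=1 rc=0`. The following is a minimal Python sketch for parsing such a line, assuming that layout holds for every gate. `parse_readiness` and `is_launchable` are hypothetical helpers written for illustration; they are not part of the repository's scripts, and the launch rule they encode is an assumption.

```python
import re


def parse_readiness(summary: str) -> dict:
    """Parse 'ready=True pass=23 fail=0 manual=1 rc=0' into typed fields."""
    fields = dict(re.findall(r"(\w+)=(\S+)", summary))
    parsed = {"ready": fields["ready"] == "True"}
    for key in ("pass", "fail", "manual", "rc"):
        parsed[key] = int(fields[key])
    return parsed


def is_launchable(parsed: dict) -> bool:
    # Assumption: a gate counts as launchable only when it reports ready,
    # has zero failures, and exited with return code 0.
    return parsed["ready"] and parsed["fail"] == 0 and parsed["rc"] == 0


if __name__ == "__main__":
    gates = {
        "local": "ready=True pass=23 fail=0 manual=1 rc=0",
        "paid-api-launch": "ready=False pass=17 fail=3 manual=7 rc=1",
    }
    for name, line in gates.items():
        print(name, is_launchable(parse_readiness(line)))
```

A parser like this makes it easy to fail a CI step on any gate whose `fail` count is nonzero, without re-running the underlying readiness command.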

GOAL_COMPLETION_AUDIT.md ADDED

	@@ -0,0 +1,60 @@

+# Kaiju Coder 7 Goal Completion Audit
+Generated: `2026-06-03T20:03:23Z`
+Overall: `not complete`
+Summary: `16 passed / 1 blocked / 0 manual`
+This audit maps the active Kaiju Coder 7 objective to current evidence. It is stricter than local readiness: local public testing can pass while Hugging Face upload, human review, and paid API launch remain blocked.
+## Readiness Commands
+| Check | Ready | Return Code |
+|---|---:|---:|
+| Local public-testing readiness | `True` | `0` |
+| Hugging Face release readiness | `True` | `0` |
+| Public launch readiness | `False` | `1` |
+| Paid API scaffold | `True` | `0` |
+| Paid API launch | `False` | `1` |
+| HF staging integrity | `True` | `0` |
+| HF namespace permission evidence | `True` | `0` |
+| Human public review | `True` | `0` |
+## Requirement Audit
+| Area | Requirement | Status | Evidence | Blocker |
+|---|---|---|---|---|
+| Identity | Product name is Kaiju Coder 7 and public/API model id is kaiju-coder-7. | `passed` | scripts/check_kaiju_public_release_readiness.py --mode local; release/PUBLIC_TESTING_QUICKSTART.md |  |
+| OpenCode | Lean Kaiju-specific OpenCode config/agent minimizes prompt overhead and disables synthetic auto-continue loops. | `passed` | .opencode/agents/kaiju-coder-7.md; scripts/opencode-kaiju-no-autocontinue.mjs; scripts/install_kaiju_opencode_profile.py |  |
+| OpenCode | opencode -m kaiju/kaiju-coder-7 works from this Mac with the recommended config. | `passed` | runs/public-opencode-smoke latest passing summary; scripts/run_kaiju_public_opencode_smoke.py |  |
+| OpenCode | Customer-readiness pack passes without wrong-directory output, fake compaction completion, missing files, or secret leakage. | `passed` | runs/opencode-customer-readiness/20260603T185835Z/summary.md |  |
+| Runtime | Direct API smoke passes using model=kaiju-coder-7. | `passed` | runs/benchmarks/20260603T193000Z-kaiju-coder-7-serving/summary.md |  |
+| Runtime | 12k, 16k, 24k, and 32k context benchmarks are recorded with a recommended default. | `passed` | release/SERVING_BENCHMARKS.md records 12288, 16384, 24576, 32768 and recommends 16k live default |  |
+| Runtime | SGLang and vLLM/practical faster serving path are benchmarked honestly. | `passed` | release/SERVING_BENCHMARKS.md; release/quantized-runtime/README.md |  |
+| Runtime | At least one public-friendly quantized/local candidate is working or clearly documented as blocked with evidence. | `passed` | release/quantized-runtime/README.md documents vLLM bitsandbytes runtime candidate and persisted-weights limitation |  |
+| Hugging Face | Public-friendly HF release structure is staged with adapter, OpenCode helper, runtime-quantized helper, model cards, provenance, evals, and docs. | `passed` | python3 scripts/check_hf_staging_integrity.py --require-checksums |  |
+| Hugging Face | At least one public Hugging Face release path is ready to upload or uploaded. | `passed` | python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release |  |
+| Hugging Face | Merged 51GB model repo upload is guarded and ready after human review/namespace permission. | `passed` | scripts/prepare_hf_merged_model_metadata.sh; scripts/upload_hf_merged_model_from_gojira_b.sh dry run |  |
+| Quality | Customer-style evals cover website, proposal, Stripe/payment, CRM/reporting, CSV/parser, Kiyomi operating pack, and safety/provenance. | `passed` | evals/tasks/opencode-customer-readiness.jsonl; runs/opencode-customer-readiness/20260603T185835Z/summary.md |  |
+| Quality | Model/harness prompts produce file-oriented business-owner artifacts rather than vague advice. | `passed` | kaiju_harness/business_suite.py; release/EVAL_SCOREBOARD.md |  |
+| Provenance | Training/eval provenance is preserved and public docs avoid internal checkpoint naming except license/provenance attribution. | `passed` | release/SOURCE_INVENTORY.md; release/DATA_PROVENANCE_DRAFT.md; release/PUBLIC_TESTING_QUICKSTART.md |  |
+| Paid API | Paid API scaffold covers API keys, Stripe billing, rate limits, logging controls, abuse controls, rollback plan, and pricing assumptions. | `passed` | python3 scripts/check_paid_api_readiness.py --mode scaffold; gateway/cloudflare-worker tests |  |
+| Paid API | Paid API is ready for public charging. | `blocked` | python3 scripts/check_paid_api_readiness.py --mode launch | Requires live D1/KV/R2 bindings, Wrangler secrets, Stripe staging evidence, Worker-to-Gojira staging request, rollback proof, latency evidence, and human approval. |
+| Final Report | Final report includes exact commands run, eval results, changed files, remaining risks, and what Richard should test first. | `passed` | release/FINAL_RELEASE_REPORT.md |  |
+## Blocking Items
+- Paid API: the requirement "Paid API is ready for public charging" is blocked. It requires live D1/KV/R2 bindings, Wrangler secrets, Stripe staging evidence, a Worker-to-Gojira staging request, rollback proof, latency evidence, and human approval.
+## Commands To Re-run
+```bash
+python3 scripts/check_kaiju_public_release_readiness.py --mode local
+python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
+python3 scripts/check_kaiju_public_release_readiness.py --mode public
+python3 scripts/check_paid_api_readiness.py --mode scaffold
+python3 scripts/check_paid_api_readiness.py --mode launch
+python3 scripts/check_hf_staging_integrity.py --require-checksums
+python3 scripts/check_hf_release_permission_evidence.py
+python3 scripts/check_human_release_review.py --mode public
+```
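The `Summary` line of this audit follows from tallying the Status column of the requirement table. Here is a minimal Python sketch of that tally, assuming the only statuses are `passed`, `blocked`, and `manual`. `summarize` is a hypothetical helper, and the `complete` label for the all-passed case is an assumption (only `not complete` appears in the audit).

```python
from collections import Counter


def summarize(statuses: list[str]) -> tuple[str, str]:
    """Return (overall, summary) for a list of requirement statuses."""
    counts = Counter(statuses)
    summary = (
        f"{counts['passed']} passed / "
        f"{counts['blocked']} blocked / "
        f"{counts['manual']} manual"
    )
    # Assumption: the goal is complete only when every requirement passed.
    all_passed = bool(statuses) and counts["passed"] == len(statuses)
    overall = "complete" if all_passed else "not complete"
    return overall, summary


if __name__ == "__main__":
    # 17 requirement rows: 16 passed and the paid API launch row blocked.
    rows = ["passed"] * 16 + ["blocked"]
    print(summarize(rows))
```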

LOCAL_TEST_INSTRUCTIONS.md ADDED

	@@ -0,0 +1,147 @@

+# Kaiju Coder 7 Local Test Instructions
+Use these commands from the repo root. The public release name is Kaiju Coder 7. Internally, this build is backed by the v1.8 adapter under `runs/qwen36-27b-lora-v1.8-business-owner/adapter`. The release-candidate raw model path is the merged full model on Gojira B at `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`. The deterministic harness commands work locally now; the SGLang commands require Gojira B over Tailscale.
+## Run The Local Release-Candidate Gate
+```bash
+python3 scripts/run_kaiju_business_owner_rc_smoke.py
+```
+This validates reviewed data, checks v1.7 targets, builds the oversampled business-owner SFT file, smoke-tests the local OpenAI-compatible harness API, runs the hard router suite, and runs static artifact checks.
+For release status, read `release/COMPLETION_AUDIT.md` and `release/HUGGINGFACE_RELEASE_DRAFT.md`.
+## Merge The v1.8 Adapter
+Use this if the merged full model must be rebuilt:
+```bash
+KAIJU_LORA_ADAPTER=/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter \
+KAIJU_MERGED_MODEL_DIR=/workspace/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged \
+  ./scripts/run-gojira-b-qwen36-lora-merge.sh
+```
+## Start Kaiju Coder 7 Serving
+Use this for the current model-side candidate:
+```bash
+KAIJU_QWEN36_MERGED_PORT=18083 \
+KAIJU_QWEN36_MERGED_SESSION=kaiju_qwen36_v18_merged_sglang \
+KAIJU_QWEN36_MERGED_CONTEXT=16384 \
+KAIJU_QWEN36_MERGED_MEM_FRACTION=0.85 \
+  ./scripts/start-qwen36-merged-sglang.sh
+```
+Confirm readiness:
+```bash
+curl http://100.109.109.14:18083/v1/models
+```
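The readiness check can also be scripted. The sketch below assumes the endpoint returns the OpenAI-style model list shape (`{"object": "list", "data": [{"id": ...}]}`). `served_model_ids` and `is_model_served` are hypothetical helpers, and the sample payload is illustrative rather than captured output from the endpoint.

```python
import json


def served_model_ids(models_response: str) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models JSON body."""
    payload = json.loads(models_response)
    return [entry["id"] for entry in payload.get("data", [])]


def is_model_served(models_response: str, model_id: str = "kaiju-coder-7") -> bool:
    return model_id in served_model_ids(models_response)


if __name__ == "__main__":
    # Illustrative body only; a real run would read the curl output instead.
    sample = json.dumps(
        {"object": "list", "data": [{"id": "kaiju-coder-7", "object": "model"}]}
    )
    print(is_model_served(sample))
```

In practice the body would come from the `curl` command above, for example by piping its output into a small script that calls `is_model_served`.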
+The high-context `32768` target has benchmark evidence in
+`release/SERVING_BENCHMARKS.md`, but the current restored Gojira-B endpoint is
+parked at `16384` for reliable local/OpenCode testing after the quantized-vLLM
+smoke work.
+## Prepare Merged-Model Hugging Face Metadata
+Use this before any full merged-model upload review. It syncs release metadata
+into the Gojira-B model folder but does not upload or read Hugging Face tokens.
+If the remote merged folder is root-owned, the helper automatically uses
+passwordless sudo for rsync without changing model ownership:
+```bash
+bash scripts/prepare_hf_merged_model_metadata.sh
+KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh
+bash scripts/upload_hf_merged_model_from_gojira_b.sh
+```
+## Install And Smoke OpenCode
+```bash
+python3 scripts/install_kaiju_opencode_profile.py
+opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
+  --dir /tmp/kaiju-opencode-loopguard-smoke \
+  --dangerously-skip-permissions \
+  'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'
+```
+The installer writes the `kaiju` provider, the lean `kaiju-coder-7` agent, and
+the scoped no-autocontinue plugin at
+`~/.config/opencode/kaiju-no-autocontinue.mjs`.
+## Run The Deterministic Harness Smoke
+```bash
+python3 scripts/run_kaiju_api_harness_smoke.py
+```
+## Run A Direct Model Eval
+```bash
+python3 evals/run_openai_compat_smoke.py \
+  --base-url http://100.109.109.14:18083/v1 \
+  --model kaiju-coder-7 \
+  --tasks evals/tasks/smoke.jsonl \
+  --max-tasks 1 \
+  --timeout 300 \
+  --max-tokens 768 \
+  --temperature 0 \
+  --disable-thinking \
+  --system-prompt-file prompts/kaiju-coder-api-system.md
+```
+Raw merged-model generation is slow, so use the harness for practical paid
+website delivery until broader raw website evals pass at acceptable latency.
+To evaluate the selected final business-owner checkpoint directly, run the
+focused v1.8 business-owner pack and then score it:
+```bash
+python3 evals/run_openai_compat_smoke.py \
+  --base-url http://100.109.109.14:18083/v1 \
+  --model kaiju-coder-7 \
+  --tasks evals/tasks/business-owner-v18-comparison.jsonl \
+  --timeout 900 \
+  --max-tokens 2500 \
+  --temperature 0 \
+  --disable-thinking \
+  --stream \
+  --system-prompt-file prompts/kaiju-coder-api-system.md
+python3 evals/score_quality_gate.py runs/evals/<merged-v18-run>/results.jsonl
+```
+Current merged evidence:
+- Probe: `1,155` visible chars in `60.17s`.
+- Proposal rerun: `1/1` paid-ready, `4.0/4.0`, `4,014` chars in `212.72s`.
+- Jah credits backend: `4.0/4.0`, `9,718` chars in `566.36s`.
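Dividing visible characters by wall-clock seconds gives a rough throughput for each run listed above. The Python sketch below shows that arithmetic; the characters-per-second metric is an illustration added here and is not a figure reported by the eval scripts.

```python
# (visible characters, wall-clock seconds) taken from the evidence list above.
RUNS = {
    "probe": (1155, 60.17),
    "proposal rerun": (4014, 212.72),
    "Jah credits backend": (9718, 566.36),
}


def chars_per_second(chars: int, seconds: float) -> float:
    """Rough throughput in visible characters per second."""
    return chars / seconds


if __name__ == "__main__":
    for name, (chars, seconds) in RUNS.items():
        print(f"{name}: {chars_per_second(chars, seconds):.1f} chars/s")
```

All three runs land at roughly 17 to 19 visible characters per second, which is consistent with the note that raw merged generation is slow on this stack.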
+## Dynamic LoRA Serving Caveat
+Do not use dynamic SGLang LoRA serving as release evidence for v1.8. The adapter-name-only path can produce output equivalent to the base model, and the corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes this SGLang build with a fused-module LoRA buffer shape mismatch. Use the merged full-model path above.
+## Run The Business-Owner Harness
+```bash
+python3 evals/run_router_harness_eval.py --tasks evals/tasks/router-hard-harness.jsonl
+python3 evals/run_router_static_checks.py runs/evals/<router-run>/results.jsonl
+```
+## Manual Prompt To Try First
+```text
+Build me the full Kiyomi 7.7.7 AI company operating pack for a local business owner. I need the launch kit, website, content engine, connector checklist, intake CRM, money report, automations, operator handbook, lead generator, sales closer, ROI dashboard, and Workshop golden run. Make it owner-ready with no developer setup required.
+```
+Expected shape:
+- A project folder with multiple files, not advice only.
+- Complete HTML where HTML is requested.
+- Lead/sales CSVs.
+- Connector verification gates.
+- ROI audit gate.
+- Workshop golden-run gate.
+- Clear owner commands such as `/kiyomi` and `/kiyomi-do`.

PAID_API_READINESS.md ADDED

	@@ -0,0 +1,266 @@

+# Kaiju Coder 7 Paid API Readiness
+Do not sell the hosted API as generally available until the gates below pass.
+## Current Position
+Kaiju Coder 7 can be served locally through an OpenAI-compatible SGLang
+endpoint. The reliable commercial product path is:
+```text
+Kaiju Coder 7 model + deterministic business-owner harness + verifier + gateway controls
+```
+Raw multi-file OpenCode generation is not yet fast enough to be the paid API
+promise by itself. The harnessed customer-readiness pack passes and should be
+the paid-route baseline until raw-agent generation improves.
+## Required Gateway Behavior
+- Use model id `kaiju-coder-7`.
+- Disable hidden thinking where the serving stack supports it.
+- Stream responses for long outputs.
+- Cap max output by route.
+- Reject requests with secret-looking prompt content when possible.
+- Never log API keys, bearer tokens, OAuth tokens, payment credentials, or full
+  private customer prompts by default.
+- Keep request ids, customer id, route, token counts, latency, status, and coarse
+  failure reason.
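The gateway rules above can be sketched in Python for illustration. This is not the Worker's code (the scaffold is JavaScript in `gateway/cloudflare-worker/src/index.js`). The per-route caps, the secret patterns, the `enable_thinking` template flag, and every function name here are assumptions.

```python
import re

MODEL_ID = "kaiju-coder-7"
# Hypothetical per-route output caps; the real limits are not stated here.
ROUTE_MAX_TOKENS = {"chat": 2500, "artifact": 8000}
# Illustrative patterns only; a real gateway would need a broader detector.
SECRET_PATTERN = re.compile(
    r"sk_live_[A-Za-z0-9]+"
    r"|Bearer\s+[A-Za-z0-9._\-]{20,}"
    r"|-----BEGIN [A-Z ]*PRIVATE KEY-----"
)


def looks_secret(text: str) -> bool:
    return bool(SECRET_PATTERN.search(text))


def normalize_chat_payload(payload: dict, route: str = "chat") -> dict:
    """Reject unsafe requests, then force the public serving settings."""
    if payload.get("model", MODEL_ID) != MODEL_ID:
        raise ValueError("unsupported model")
    for message in payload.get("messages", []):
        content = message.get("content")
        if isinstance(content, str) and looks_secret(content):
            raise ValueError("secret-looking prompt content")
    cap = ROUTE_MAX_TOKENS[route]
    normalized = dict(payload)
    normalized["model"] = MODEL_ID
    normalized["stream"] = True  # stream long outputs
    normalized["max_tokens"] = min(int(payload.get("max_tokens", cap)), cap)
    # Assumed flag name for disabling thinking through the chat template.
    normalized["chat_template_kwargs"] = {"enable_thinking": False}
    return normalized


def log_record(request_id, customer_id, route, status, latency_ms,
               prompt_tokens, completion_tokens, failure_reason=None) -> dict:
    """Build a log entry with coarse metadata only: no prompt, no credentials."""
    return {
        "request_id": request_id,
        "customer_id": customer_id,
        "route": route,
        "status": status,
        "latency_ms": latency_ms,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "failure_reason": failure_reason,
    }
```

Keeping the log builder's signature free of any prompt or header argument is one way to make accidental logging of private content harder.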
+## Billing And Access
+- API keys must be scoped per customer/account.
+- Stripe subscription or prepaid credit balance must be checked before serving.
+- Rate limits must be per key and per account.
+- Failed auth and rate-limit events should be logged without prompt content.
+- Admin override keys must be separate from customer keys.
+## Current Gateway Scaffold Evidence
+Local Worker scaffold:
+- `gateway/cloudflare-worker/src/index.js`
+- `gateway/cloudflare-worker/migrations/0001_paid_api.sql`
+- `gateway/cloudflare-worker/test/index.test.js`
+Verified on 2026-06-03 with:
+```bash
+cd gateway/cloudflare-worker
+npm run check
+npm run preflight
+```
+Result: `16/16` Worker tests passed and `17` paid API scaffold preflight checks
+passed.
+The scaffold preflight also checks that the guarded Cloudflare resource-prep
+script, `scripts/prepare_paid_api_cloudflare_resources.sh`, is wired through
+`npm run prepare:cloudflare`, and that the reviewed binding template is present.
+Covered locally:
+- missing bearer token returns `401`
+- inactive API key returns `403`
+- insufficient credits return `402` before origin fetch
+- successful chat request forwards `x-kaiju-origin-secret` and debits credits
+- origin fetch failure refunds credits
+- fixed-window rate limit blocks before debit
+- public chat payload is forced to model `kaiju-coder-7`, streaming, thinking
+  disabled, and token capped
+- unsupported model is rejected before debit
+- secret-looking prompt content is rejected before debit, origin fetch, or logs
+- signed Stripe Checkout webhook credits prepaid balance
+- duplicate Stripe Checkout webhook does not double-credit
+- invalid Stripe signature is rejected
+- origin-only artifact upload stores bounded text artifacts in R2
+- authenticated artifact download is scoped to the caller's account namespace
+- unsafe artifact paths are rejected before R2 storage
+- secret-looking artifact content is rejected before R2 storage
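The ordering these tests imply (authenticate, check key state, rate-limit, debit, then refund on origin failure) can be sketched in Python. This is an in-memory stand-in for illustration only; the scaffold itself uses D1 and KV. All names, the window size, the treatment of unknown keys as `401`, and the `429` and `502` status codes are assumptions. Only `401`, `403`, and `402` come from the list above.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Account:
    active: bool = True
    credits: int = 0
    window_start: int = 0
    window_count: int = 0


@dataclass
class Gateway:
    accounts: dict = field(default_factory=dict)  # api key -> Account
    limit: int = 60      # hypothetical requests allowed per window
    window_s: int = 60   # hypothetical fixed-window length in seconds

    def handle(self, api_key, cost, call_origin, now=None) -> int:
        now = int(time.time()) if now is None else int(now)
        if not api_key or api_key not in self.accounts:
            return 401  # missing (or, by assumption, unknown) bearer token
        account = self.accounts[api_key]
        if not account.active:
            return 403  # inactive API key
        # Fixed-window rate limit, evaluated before any debit.
        window = now - (now % self.window_s)
        if account.window_start != window:
            account.window_start, account.window_count = window, 0
        if account.window_count >= self.limit:
            return 429
        account.window_count += 1
        if account.credits < cost:
            return 402  # insufficient credits, origin is never called
        account.credits -= cost
        try:
            call_origin()
        except Exception:
            account.credits += cost  # refund when the origin fetch fails
            return 502
        return 200
```

Checking the rate limit before the debit means a throttled caller never loses credits, which matches the "blocks before debit" behaviour listed above.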
+Executable preflight:
+```bash
+python3 scripts/check_kaiju_public_release_readiness.py --mode local
+python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
+python3 scripts/check_kaiju_public_release_readiness.py --mode public
+python3 scripts/generate_kaiju_final_report.py
+python3 scripts/check_kaiju_goal_completion.py --write
+python3 scripts/refresh_kaiju_release_evidence.py --skip-opencode-smoke
+python3 scripts/check_hf_staging_integrity.py
+python3 scripts/check_hf_release_bundle_integrity.py
+python3 scripts/collect_hf_release_permission_evidence.py
+python3 scripts/check_hf_release_permission_evidence.py
+python3 scripts/check_human_release_review.py --mode local
+python3 scripts/check_human_release_review.py --mode public
+cd gateway/cloudflare-worker
+npm run prepare:cloudflare
+cd ../..
+cp release/cloudflare-bindings.example.json release/cloudflare-bindings.json
+# Replace placeholder D1/KV IDs in release/cloudflare-bindings.json first.
+python3 scripts/apply_paid_api_cloudflare_bindings.py --bindings-file release/cloudflare-bindings.json
+python3 scripts/check_paid_api_readiness.py --mode scaffold
+python3 scripts/check_paid_api_readiness.py --mode launch
+```
+`check_kaiju_public_release_readiness.py --mode local` is the consolidated
+public-testing readiness command. It can pass while public upload and paid API
+launch remain manual blockers. `--mode hf-release` checks the downloadable
+model/helper release and requires sanitized Hugging Face namespace permission
+evidence plus human review while keeping paid API launch manual. `--mode public`
+must remain red until Hugging Face write permissions, live Cloudflare resources,
+Stripe staging evidence, rollback proof, and human review are complete.
+`generate_kaiju_final_report.py` writes `release/FINAL_RELEASE_REPORT.md` with
+the current local/public readiness summaries, launch blockers, changed files,
+commands run, and first commands Richard should test. It is part of the release
+packet and does not inspect tokens, environment variables, or process command
+lines.
+`check_kaiju_goal_completion.py --write` writes
+`release/GOAL_COMPLETION_AUDIT.md`, a stricter objective-level audit. It should
+remain red while Hugging Face upload, human review, or live paid API launch
+evidence is missing.
+`refresh_kaiju_release_evidence.py` is a safe local refresh runner. It updates
+direct API smoke evidence, goal audit, final report, HF staging, local bundle,
+merged-model metadata on Gojira-B, and dry-run upload previews without reading
+tokens or uploading anything.
+`check_hf_staging_integrity.py` validates the staged Hugging Face package for
+required files, public naming hygiene, raw secret-looking values, and staging
+checksums. It does not upload, create repos, or print matched secret values.
+`check_hf_release_permission_evidence.py` validates sanitized Hugging Face
+repo-create evidence in `release/hf-release-permission-evidence.json`. Start
+from `release/hf-release-permission-evidence.example.json` only after the
+private permission probe succeeds, or use
+`scripts/collect_hf_release_permission_evidence.py --apply --write` to run the
+probe and write the sanitized evidence automatically. Never include raw auth
+output or tokens.
+`check_human_release_review.py` reads `release/HUMAN_RELEASE_REVIEW.md`. Local
+mode may pass with pending/manual review fields; public mode must fail until
+Richard changes the signoff fields to approved decisions.
+`npm run prepare:cloudflare` is dry-run safe by default. It prints the exact
+Wrangler commands for creating `KAIJU_BILLING_DB`, `KAIJU_RATE_LIMIT_KV`, and
+`KAIJU_ARTIFACT_BUCKET`, applying the D1 migration, setting required secrets,
+deploying, listing deployments, and exercising rollback. `npm run check` also
+runs `npx wrangler deploy --dry-run` so the current Worker build path is validated
+without publishing. Set
+`KAIJU_CF_RESOURCE_APPLY=1` only when the intended Cloudflare account is active.
+After real D1/KV/R2 resources exist, copy
+`release/cloudflare-bindings.example.json` to `release/cloudflare-bindings.json`,
+replace the placeholder IDs, and preview the reviewed config update:
+```bash
+python3 scripts/apply_paid_api_cloudflare_bindings.py \
+  --bindings-file release/cloudflare-bindings.json
+```
+The applier refuses placeholder values and secret-looking input. Only after the
+preview is reviewed should it update `gateway/cloudflare-worker/wrangler.jsonc`:
+```bash
+python3 scripts/apply_paid_api_cloudflare_bindings.py \
+  --bindings-file release/cloudflare-bindings.json \
+  --write
+```
+`--mode scaffold` verifies the local gateway implementation and should pass.
+`--mode launch` is stricter and should fail until real Cloudflare bindings,
+Wrangler secrets, Stripe webhook evidence, staging traffic, latency evidence,
+and rollback proof are attached.
+Launch evidence is attached through a sanitized JSON file:
+```bash
+cp release/paid-api-launch-evidence.example.json release/paid-api-launch-evidence.json
+python3 scripts/collect_paid_api_launch_evidence.py --help
+python3 scripts/check_paid_api_readiness.py --mode launch \
+  --evidence-file release/paid-api-launch-evidence.json
+```
+Use `scripts/collect_paid_api_launch_evidence.py` to preview or write sanitized
+launch evidence after staging resources exist. It can read the staging API key
+from an environment variable for live probes, but it never writes the key, full
+prompt, or model response to the evidence file. By default it prints a preview;
+pass `--write` only after reviewing the target file path.
+Only record secret names, route names, request ids, coarse latency numbers, and
+pass/fail facts. Do not put raw API keys, bearer tokens, OAuth tokens, Stripe
+secret keys, webhook signing secrets, tunnel credentials, full private prompts,
+or customer private data in the evidence file. The checker scans the evidence
+file for common secret-looking values and fails launch readiness if it finds
+them.
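The secret scan described above can be sketched roughly as follows. This is a minimal illustration, not the logic in `scripts/check_paid_api_readiness.py`: the pattern names, the regexes, and the helper names `iter_strings` and `find_secret_like` are all assumptions made for the example. It walks a parsed evidence JSON value and reports where a secret-looking string sits without ever echoing the matched text.

```python
import re

# Hypothetical patterns for illustration only; the real checker's rules are not shown here.
SECRET_PATTERNS = {
    "stripe_secret_key": re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]{8,}"),
    "stripe_webhook_secret": re.compile(r"\bwhsec_[A-Za-z0-9]{8,}"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._\-]{16,}"),
}


def iter_strings(value, path="$"):
    """Yield (json_path, text) for every string value in a parsed JSON document."""
    if isinstance(value, str):
        yield path, value
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from iter_strings(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from iter_strings(child, f"{path}[{index}]")


def find_secret_like(evidence):
    """Return sorted (json_path, pattern_name) hits; the matched text is never returned."""
    hits = []
    for path, text in iter_strings(evidence):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                hits.append((path, name))
    return sorted(hits)
```

Recording a secret name such as `STRIPE_WEBHOOK_SECRET` produces no hit in this sketch, while a pasted bearer header would. Returning only the path and pattern name keeps the checker's own output safe to attach to a release packet.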
+## Minimum API Gates
+| Gate | Required Evidence |
+| --- | --- |
+| Auth | Unauthorized requests fail; valid test key works |
+| Billing | Unpaid/suspended account is denied before model call |
+| Rate limit | Burst and daily caps work per key |
+| Logging | Logs omit secrets and full private prompts |
+| Abuse control | Secret-looking payloads and obviously unsafe automation requests are rejected or redacted |
+| Artifacts | Origin-only R2 upload and account-scoped artifact download pass |
+| Rollback | One command can route traffic back to previous stable model/harness |
+| Latency | p95 for paid routes is documented and acceptable |
+| Quality | Business-owner eval pack passes with complete files/artifacts |
+Current quality evidence:
+- Harnessed customer-readiness pack:
+  `runs/opencode-customer-readiness/20260603T185835Z/summary.md`, `4/4`
+  passed, `28/28` required files written, including the release provenance and
+  safety review task.
+- Restored 32k SGLang direct API smoke:
+  `runs/benchmarks/20260603T155233Z-kaiju-coder-7-serving/summary.md`,
+  identity passed in `2.92s`; business proposal passed in `94.28s` with
+  `1,737` chars.
+- Runtime-quantized vLLM OpenCode smoke:
+  `bash scripts/run_kaiju_quantized_opencode_smoke.sh` passed at 16k after
+  vLLM launched with `--enable-auto-tool-choice`; OpenCode wrote
+  `hello.txt` with exactly `Kaiju Coder 7 quantized runtime ok`.
+- Current restored 16k SGLang direct API smoke:
+  `runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`,
+  identity passed in `2.3s`.
+- Raw OpenCode multi-file pack remains a blocker for raw-agent claims.
+## Pricing Assumptions To Validate
+- Raw model tokens are slow and expensive enough that per-token pricing alone is
+  not the right first product.
+- Better first API product: priced business-owner routes such as website pack,
+  proposal pack, ROI/report pack, and Kiyomi operating pack.
+- Charge for complete artifacts and verified workflow output, with token usage
+  as an internal cost-control metric.
+## Release Blockers
+- Raw OpenCode customer-readiness task currently times out on multi-file work.
+- Harnessed customer-readiness route passes; paid API must route through that
+  deterministic product path until a faster raw/quantized path passes.
+- Context-size benchmarks passed at 12k, 16k, 24k, and 32k, but the current
+  parked Gojira-B/OpenCode profile is 16k. Treat 32k as the high-context target
+  to re-confirm after restart before using it as a public default.
+- Restored 32k business-document direct API smoke passed, but the `94.28s`
+  latency is too slow for ungated paid API use without streaming, queueing,
+  and route-level caps.
+- vLLM serving has been tested at 16k, but it is not clearly faster than SGLang
+  and needs the Gojira nightly image plus text-only launch flags.
+- Runtime-quantized vLLM bitsandbytes has passed 8k and 16k identity/code
+  smoke tests, passed a 16k business-document smoke in `53.44s`, and reduces
+  model memory to about `17.8 GiB`; its OpenCode one-file smoke now passes.
+- Persisted quantized public weights are still pending.
+- Hosted gateway scaffold now has local-tested API key, D1 prepaid credits,
+  fixed-window rate limit, model enforcement, secret-content rejection, and
+  signed Stripe webhook top-up behavior. It also has a sanitized launch-evidence
+  collector for the remaining staging proof. It is not live-paid ready until real
+  Cloudflare resources, Stripe products/webhook endpoint, deployment secrets,
+  sanitized launch evidence, and staging end-to-end requests pass.
+- `python3 scripts/check_paid_api_readiness.py --mode launch` currently fails
+  by design because live D1/KV/R2 bindings and manual launch evidence are not
+  attached. This prevents local scaffold readiness from being mistaken for
+  paid public launch approval.
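For readers unfamiliar with the fixed-window rate limit mentioned in the gateway scaffold bullet, here is a minimal in-memory sketch. It is an assumption-laden illustration, not the Worker's implementation: the class name `FixedWindowLimiter`, the `clock` parameter, and the in-process dictionary are hypothetical, and a deployed gateway would keep counts in shared storage such as KV rather than local memory.

```python
import time


class FixedWindowLimiter:
    """Count requests per API key inside fixed, non-overlapping time windows."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self.counts = {}  # (api_key_id, window_index) -> requests seen

    def allow(self, api_key_id):
        """Return True and count the request if the key is under its cap for this window."""
        window_index = int(self.clock() // self.window_seconds)
        bucket = (api_key_id, window_index)
        used = self.counts.get(bucket, 0)
        if used >= self.limit:
            return False
        self.counts[bucket] = used + 1
        return True
```

Burst and daily caps could be modeled as two limiters with different `window_seconds` values, both of which must allow a request. This sketch never prunes old buckets, which a long-running service would need to do.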

PUBLIC_TESTING_QUICKSTART.md ADDED

	@@ -0,0 +1,149 @@

+# Kaiju Coder 7 Public Testing Quickstart
+Kaiju Coder 7 is the public model name. The OpenAI-compatible model id is:
+```text
+kaiju-coder-7
+```
+Use this guide for serious public testing. It avoids internal checkpoint names
+and keeps the current limitations clear.
+## Pick A Test Path
+### Path 1: OpenCode Against An Existing Endpoint
+Use this if you already have Kaiju Coder 7 served at an OpenAI-compatible
+`/v1` endpoint.
+```bash
+git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
+cd kaiju-coder-7-opencode
+python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18083/v1
+```
+Then run OpenCode inside the project you want to edit:
+```bash
+opencode -m kaiju/kaiju-coder-7 --agent kaiju-coder-7
+```
+For a bounded smoke test:
+```bash
+mkdir -p /tmp/kaiju-public-smoke
+opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
+  --dir /tmp/kaiju-public-smoke \
+  "Create hello.txt with exactly: Kaiju Coder 7 is ready"
+```
+Or run the packaged verifier, which checks the installer, live model endpoint,
+OpenCode binary, actual file creation, and wrong-directory behavior:
+```bash
+python3 scripts/run_kaiju_public_opencode_smoke.py
+```
+The helper installer adds:
+- the `kaiju` OpenAI-compatible provider
+- the lean `kaiju-coder-7` OpenCode agent
+- a scoped no-autocontinue plugin that prevents false completion loops after
+  compaction or output limits
+### Path 2: Full Local Weights
+Use this if the full `RMDWLLC/kaiju-coder-7` Hugging Face repo has been
+uploaded and you have suitable local GPU hardware.
+```bash
+hf download RMDWLLC/kaiju-coder-7 --local-dir ./kaiju-coder-7
+```
+Serve the downloaded folder with an OpenAI-compatible local server. Configure
+the server to expose:
+```text
+model id: kaiju-coder-7
+base URL: http://127.0.0.1:18083/v1
+context: 16384
+```
+Then install the OpenCode helper with:
+```bash
+git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
+cd kaiju-coder-7-opencode
+python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18083/v1
+```
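Once a local server exposes the model id and base URL listed above, a direct request can confirm the endpoint before involving OpenCode. The sketch below only builds the request; the helper name `build_chat_request` is hypothetical, and it assumes the server follows the usual OpenAI-compatible `/chat/completions` route and needs no API key on localhost. Add an `Authorization` header if your server requires one.

```python
import json
import urllib.request


def build_chat_request(base_url, model_id, prompt, max_tokens=256):
    """Build (but do not send) a POST to an OpenAI-compatible chat completions route."""
    payload = {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://127.0.0.1:18083/v1", "kaiju-coder-7", "Reply with your model name."
)
# To send it against a running server, uncomment:
# with urllib.request.urlopen(request, timeout=120) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```

Keeping `max_tokens` small is deliberate here, since the benchmark notes in this release report slow raw generation on longer outputs.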
+### Path 3: Runtime-Quantized Local Candidate
+Use this only if you are comfortable with advanced serving setups. The current
+working quantized option is a runtime bitsandbytes recipe, not a separate
+persisted quantized weights repo.
+```bash
+git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-quantized-runtime
+cd kaiju-coder-7-quantized-runtime
+```
+Read `README.md` in that repo before serving. This path can reduce model memory
+at runtime, but it still depends on access to the full Kaiju Coder 7 weights.
+## Recommended Test Prompt
+Run this from an empty project folder:
+```text
+Build a launch-ready local service business website and operating pack. Include
+index.html, a Stripe checkout safety plan, a CSV parser with tests, a simple CRM
+schema, a weekly money report, and a safety/provenance note. Write the files,
+not just advice.
+```
+Expected result:
+- files are written in the requested project folder
+- `index.html` is complete HTML
+- business docs start with Markdown H1 headings
+- code includes a test or smoke-check command where practical
+- no fake API keys, OAuth tokens, payment secrets, or private customer data
+## Current Recommended Defaults
+- Public model id: `kaiju-coder-7`
+- OpenCode context: `16384`
+- Output cap for public testing: `2500`
+- Current reliable product path: model plus deterministic business-owner
+  harness plus verifier
+- Raw multi-file OpenCode generation: still too slow for broad paid API claims
+- Paid API: not public until launch preflight passes
+## What Not To Claim Yet
+Do not claim:
+- that raw model weights alone reliably build every business-owner artifact
+- that a paid hosted API is generally available
+- that persisted quantized weights exist
+- that 32k context is the current live default
+Do claim:
+- Kaiju Coder 7 has a working local/OpenCode release candidate
+- the current tested OpenCode default is 16k context
+- the helper package includes a lean agent and compaction loop guard
+- the paid API scaffold has tests and a launch preflight, but is not yet public
+- the packaged public smoke verifies a fresh OpenCode one-file write before
+  public claims are refreshed
+## Current Blockers Before Public Release
+- Hugging Face repo creation still requires a write-capable token or namespace.
+- Full merged model upload has not completed; the merged folder must first have
+  the metadata packet synced by `prepare_hf_merged_model_metadata.sh`.
+- Public paid API launch needs real Cloudflare D1/KV/R2 bindings, Wrangler
+  secret verification, Stripe webhook staging evidence, staging traffic, latency
+  evidence, and rollback proof.
+- Human review is still required before public upload.

QUANTIZATION_PLAN.md ADDED

	@@ -0,0 +1,96 @@

+# Kaiju Coder 7 Quantization Plan
+Kaiju Coder 7 needs a public-friendly quantized variant before broad local
+OpenCode release. The merged full model is too large and slow for most users.
+## Current Prerequisite Check
+Checked on 2026-06-03 with:
+```bash
+python3 scripts/check_kaiju_quantization_prereqs.py
+```
+- Local Mac default Python: no `torch`, `transformers`, `safetensors`,
+  `llmcompressor`, `auto_gptq`, `autoawq`, or `bitsandbytes`.
+- Gojira-B default Python: same libraries unavailable outside model-serving
+  containers.
+- SGLang container: has `torch 2.9.1+cu130`, `transformers 5.3.0`,
+  `safetensors 0.7.0`, and `huggingface_hub 1.10.2`; missing
+  `llmcompressor`, `autoawq`, `auto_gptq`, and `bitsandbytes`.
+- HF CLI: unavailable in the current shell.
+This does not block quantization, but it means persistent weight quantization
+must run in a pinned container or a dedicated virtual environment, not the
+default shell.
+## Gojira-B Probe Evidence
+Run:
+```bash
+./scripts/probe-gojira-b-kaiju-quantization.sh
+```
+Findings on 2026-06-03:
+- Merged model artifact is `51G`.
+- Architecture is `qwen3_5` / `Qwen3_5ForConditionalGeneration`.
+- Text config uses both `linear_attention` and `full_attention` layers.
+- Standard `vllm/vllm-openai:latest` cannot load the config because its
+  Transformers build does not recognize `qwen3_5`.
+- `gojira/vllm-openai-ray:nightly` can load the config.
+- vLLM serving requires the text-only launch path for this checkpoint because
+  the public text-serving merge does not include visual encoder weights.
+- vLLM bitsandbytes runtime quantization works at 8k and 16k with the Gojira
+  nightly image, `pandas` preinstall, `--language-model-only`,
+  `--quantization bitsandbytes`, and `--load-format bitsandbytes`.
+- The 16k runtime bitsandbytes business-document smoke passed:
+  `runs/benchmarks/20260603T161316Z-kaiju-coder-7-serving/summary.md`,
+  `53.44s`, `1,610` chars, `30.127` chars/s.
+- The 16k runtime bitsandbytes OpenCode one-file smoke passed after adding
+  vLLM `--enable-auto-tool-choice`:
+  `bash scripts/run_kaiju_quantized_opencode_smoke.sh` wrote
+  `/tmp/kaiju-opencode-quantized-smoke/hello.txt` with exactly
+  `Kaiju Coder 7 quantized runtime ok`.
+## Candidate Order
+1. **FP8/AWQ-style GPU serving candidate**
+   - Best for hosted API or serious local GPU users.
+   - Benchmark against current SGLang merged full model.
+   - Must keep model id `kaiju-coder-7` in serving docs.
+   - Current working runtime candidate: vLLM bitsandbytes at `8192` and
+     `16384`, documented in `release/quantized-runtime/README.md`.
+2. **GGUF/llama.cpp candidate**
+   - Best for broad local distribution if the architecture converts cleanly.
+   - Publish only if a real local smoke test passes.
+3. **MLX candidate**
+   - Best for Apple Silicon users if conversion supports this architecture.
+   - Useful for Richard's local testing and Kiyomi/OpenCode demos.
+## Quantization Success Gate
+A quantized candidate is not release-ready until it passes:
+- `/v1/models` or local runtime identifies it as Kaiju Coder 7.
+- Direct identity and code smoke pass.
+- At least one business-owner document task passes.
+- OpenCode one-file write smoke passes.
+- Latency and memory are recorded in `release/SERVING_BENCHMARKS.md`.
+- Model card states exact quantization format, hardware tested, and known
+  quality tradeoffs.
+The runtime bitsandbytes candidate has passed the direct identity and code smoke
+at 8k and 16k, a 16k business-owner document task, and an OpenCode one-file
+write smoke. It can be documented as an advanced runtime-quantized OpenCode
+path, but not as a public quantized-weights release.
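The success gate above is effectively an all-or-nothing checklist, which can be sketched as a small function. The gate keys and the function names `missing_gates` and `is_release_ready` are hypothetical labels chosen for this example; they paraphrase the bullets and are not taken from any script in this repo.

```python
# Hypothetical gate names paraphrasing the checklist above.
REQUIRED_GATES = (
    "identifies_as_kaiju_coder_7",
    "identity_smoke",
    "code_smoke",
    "business_doc_task",
    "opencode_one_file_smoke",
    "benchmarks_recorded",
    "model_card_complete",
)


def missing_gates(evidence):
    """List every required gate that is not explicitly recorded as True."""
    return [gate for gate in REQUIRED_GATES if evidence.get(gate) is not True]


def is_release_ready(evidence):
    """A candidate is release-ready only when no required gate is missing."""
    return not missing_gates(evidence)


# Made-up candidate for illustration: smokes pass, documentation gates do not.
example_candidate = {gate: True for gate in REQUIRED_GATES}
example_candidate["benchmarks_recorded"] = False
del example_candidate["model_card_complete"]
print(missing_gates(example_candidate))
```

Treating an absent key the same as a failed gate keeps the check conservative: a candidate cannot pass simply because nobody recorded a result.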
+## Next Concrete Step
+Create a pinned Docker/UV quantization environment on Gojira-B with the
+Qwen3.5-capable Transformers/runtime stack plus one persistent-weight
+quantization package at a time. Do not upload a quantized-weights repo until a
+smoke-tested persisted artifact exists.

README.md ADDED

	@@ -0,0 +1,121 @@

+# Kaiju Coder 7 by Kiyomi - Adapter Model Card
+This model card is for the LoRA adapter package, not a standalone base model.
+## Summary
+Kaiju Coder 7 by Kiyomi is an RMDW/Kiyomi business-owner coding adapter trained on reviewed, RMDW-owned or RMDW-authored examples. It is designed for practical small-business build work: websites, proposals, intake/CRM flows, Stripe/payment implementation planning, reports, ROI dashboards, automations, operator handbooks, lead generation, sales follow-up, repo patches, and Kiyomi 7.7.7 style AI-company setup packs.
+The current release-candidate product path is:
+```text
+Qwen3.6-27B base
+-> Kaiju v1.8 LoRA adapter
+-> merged full-model artifact for raw local serving
+-> Kaiju system prompt
+-> deterministic business-owner harnesses
+-> verifier/static checks
+```
+Do not describe this package as raw weights alone producing every final artifact. The deterministic harness is part of the tested product path.
+## Base Model
+- Base model: `Qwen/Qwen3.6-27B`
+- Checked upstream revision: `6a9e13bd6fc8f0983b9b99948120bc37f49c13e9`
+- Upstream license metadata: `apache-2.0`
+- Upstream license copy: `release/upstream/qwen3.6-27b/LICENSE`
+Attribution wording:
+```text
+Kaiju Coder 7 by Kiyomi is fine-tuned from Qwen under Apache 2.0.
+```
+Do not imply endorsement by Qwen, Alibaba, or upstream authors.
+## Adapter
+- Adapter path: `runs/qwen36-27b-lora-v1.8-business-owner/adapter`
+- Adapter type: LoRA / PEFT
+- LoRA rank: `16`
+- LoRA alpha: `32`
+- LoRA dropout: `0.02`
+- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
+- Trainable parameter count: approximately `79.7M`
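As a rough aid for reading these hyperparameters, the sketch below shows the standard LoRA scaling factor (`alpha / rank`) and the per-layer trainable weight count (`rank * (d_in + d_out)`). It assumes plain LoRA scaling rather than a rank-stabilized variant, and the `4096` layer dimensions are placeholders, not the base model's real shapes, so it does not reproduce the `79.7M` figure.

```python
def lora_scaling(alpha, rank):
    """Standard LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


def lora_params_per_module(d_in, d_out, rank):
    """Trainable weights for one adapted linear layer: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# Values from the adapter list above: rank 16, alpha 32.
print(lora_scaling(32, 16))  # 2.0

# Hypothetical square projection used only to show the formula.
print(lora_params_per_module(4096, 4096, 16))  # 131072
```

Summing `lora_params_per_module` over every targeted projection in every layer, using the real dimensions from the base model config, is how a total like the one listed above would be derived.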
+## Merged Local Artifact
+- Remote merged path: `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`
+- Size: `51G`
+- Shards: `14` safetensor shards plus tokenizer/config sidecars
+- Served model name: `kaiju-coder-7`
+- Merge script: `scripts/run-gojira-b-qwen36-lora-merge.sh`
+- Serving script: `scripts/start-qwen36-merged-sglang.sh`
+## Training
+- Dataset build: `datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl`
+- Reviewed candidate examples: `1,689`
+- SFT rows after controlled business-owner oversampling: `1,881`
+- Train examples: `1,769`
+- Eval examples: `112`
+- Training runtime: `11666.7564s`
+- Training loss: `0.9281658741335074`
+- Max training length: `2048`
+- Training config: `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json`
+## Data Provenance
+Training data is source-backed and RMDW-owned or RMDW-authored. Client-site repositories are used only as generalized pattern/eval sources unless explicitly reviewed for training eligibility.
+Relevant release files:
+- `release/SOURCE_INVENTORY.md`
+- `release/source-inventory.json`
+- `release/DATA_PROVENANCE_DRAFT.md`
+- `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl`
+Excluded from training:
+- Raw secrets, API keys, OAuth tokens, private keys, cookies, and credentials.
+- Closed-model answers from OpenAI, Anthropic, Gemini, or similar providers as supervised completions unless terms clearly allow it.
+- Private client data, customer notes, contracts, raw support logs, and client-specific website copy without explicit review and consent.
+## Evaluation Snapshot
+Local product-path evidence:
+- Unit tests: `65` passing.
+- Full local RC smoke: passed.
+- Router hard harness: `23/23`.
+- Router static checks: `23/23`.
+- Business-suite prompts: `2/2`.
+- Local API harness: website and business-suite artifacts pass.
+Merged serving evidence:
+- Endpoint: `http://100.109.109.14:18083/v1`
+- Served model: `kaiju-coder-7`
+- Tested context: `32768` on Gojira-B, with `16384` documented as the lower-load fallback.
+- Probe: `1,155` visible chars in `60.17s`.
+- Proposal rerun: `1/1` paid-ready, `4.0/4.0`, `4,014` chars in `212.72s`.
+- Jah credits backend: `4.0/4.0`, `9,718` chars in `566.36s`.
+- OpenCode customer-readiness harness: `4/4` tasks passed, `28/28` required files written, including source/provenance and release-claim safety review.
+- vLLM nightly serving probe: passed at `16384` after `pandas` preinstall and
+  `--language-model-only`, but it was not clearly faster than SGLang, so it
+  does not replace SGLang.
+- Runtime-quantized vLLM bitsandbytes: passed at `8192` and `16384`; 16k code
+  patch completed in `11.3s`, and logs reported about `17.8 GiB` model memory.
+Known comparison caveat:
+- Dynamic SGLang LoRA serving is not release evidence for this adapter: adapter-name-only output can be base-equivalent, and corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes with a fused-module LoRA buffer shape mismatch.
+- Do not claim raw-weight superiority until broader base-Qwen and GLM/current-production comparisons are complete.
+## Limitations
+- Raw full-website generation has not yet passed the merged-model release sweep and should remain harness-first for paid delivery.
+- The deterministic harness remains the practical paid website workflow.
+- The adapter needs a strong app layer for file editing, tool use, auth, billing, rate limits, logging, and rollback.
+- Human review is still required before any public upload or paid production claim.
+- Not intended for high-risk medical, legal, financial, or safety-critical decisions without expert review.

SERVING_BENCHMARKS.md ADDED

	@@ -0,0 +1,358 @@

+# Kaiju Coder 7 Serving Benchmarks
+This file records serving evidence for public download and paid API decisions.
+The model id must remain `kaiju-coder-7`.
+## Current Live Runtime
+- Host: Gojira-B over Tailscale
+- Base URL: `http://100.109.109.14:18083/v1`
+- Serving stack: SGLang merged full model
+- Current verified context (restored after quantized-runtime testing): `16384`
+- Tested high-context target: `32768`
+- Current container: `qwen36-merged-sglang-18083`
+- Current caveat: direct raw generation is slow for multi-file OpenCode work.
+## Benchmark Command
+For current-context latency without restart:
+```bash
+python3 scripts/benchmark_kaiju_serving.py \
+  --contexts 12288 \
+  --prompts identity business_doc code_patch \
+  --max-tokens 768 \
+  --timeout 420
+```
+For context restart benchmarking:
+```bash
+python3 scripts/benchmark_kaiju_serving.py \
+  --restart \
+  --contexts 12288 16384 24576 32768 \
+  --prompts identity business_doc \
+  --max-tokens 768 \
+  --timeout 420 \
+  --ready-timeout 1200
+```
+Use `--contexts 16384` for the current restored Gojira-B endpoint. Use
+`32768` when explicitly testing the high-context target; it has passed earlier
+benchmarks but should be re-confirmed after a fresh restart before calling it
+the live default.
+## Current 12k Direct API Benchmark
+Command:
+```bash
+python3 scripts/benchmark_kaiju_serving.py \
+  --contexts 12288 \
+  --prompts identity code_patch \
+  --max-tokens 256 \
+  --timeout 300
+```
+Run: `runs/benchmarks/20260603T135017Z-kaiju-coder-7-serving/summary.md`
+| Context | Prompt | OK | Seconds | Chars | Chars/s |
+| --- | --- | --- | ---: | ---: | ---: |
+| 12288 | identity | True | 2.41 | 26 | 10.788 |
+| 12288 | code_patch | True | 57.61 | 860 | 14.928 |
+Interpretation: direct API calls are usable for short tasks, but latency is too
+high for a paid raw-code API unless outputs are streamed and route-specific
+limits are enforced.
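The `Chars/s` column in these tables is simply visible characters divided by wall-clock seconds, rounded to three places. A minimal sketch follows, using the two rows from the 12k table; the function name `chars_per_second` is hypothetical and is not taken from `scripts/benchmark_kaiju_serving.py`.

```python
def chars_per_second(chars, seconds):
    """Throughput as reported in the tables: visible characters per wall-clock second."""
    return round(chars / seconds, 3)


# (prompt, chars, seconds) copied from the 12k direct API table above.
rows = [("identity", 26, 2.41), ("code_patch", 860, 57.61)]
for prompt, chars, seconds in rows:
    print(prompt, chars_per_second(chars, seconds))
# identity 10.788
# code_patch 14.928
```

Short identity replies understate steady-state throughput, because fixed per-request overhead dominates a 26-character answer. The longer `code_patch` and `business_doc` rows are the more useful comparison across stacks.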
+## 16k Context Benchmark
+16k was tested to reduce OpenCode compaction pressure.
+Commands:
+```bash
+python3 scripts/benchmark_kaiju_serving.py \
+  --restart \
+  --contexts 16384 \
+  --prompts identity \
+  --max-tokens 128 \
+  --timeout 300 \
+  --ready-timeout 1200
+python3 scripts/benchmark_kaiju_serving.py \
+  --contexts 16384 \
+  --prompts code_patch \
+  --max-tokens 128 \
+  --timeout 300
+```
+Runs:
+- `runs/benchmarks/20260603T135651Z-kaiju-coder-7-serving/summary.md`
+- `runs/benchmarks/20260603T140318Z-kaiju-coder-7-serving/summary.md`
+| Context | Prompt | OK | Load Wait | Seconds | Chars | Chars/s |
+| --- | --- | --- | ---: | ---: | ---: | ---: |
+| 16384 | identity | True | 354.16 | 14.9 | 26 | 1.745 |
+| 16384 | code_patch | True | n/a | 28.99 | 416 | 14.35 |
+Interpretation: `16384` is a stable lower-load fallback and still leaves more
+room above OpenCode's prompt/tool overhead than the original 12k setting.
+## 24k And 32k Context Benchmarks
+24k and 32k were tested after 16k proved stable. Both loaded and returned the
+same code-patch latency profile as 16k on the short patch benchmark.
+Commands:
+```bash
+python3 scripts/benchmark_kaiju_serving.py \
+  --restart \
+  --contexts 24576 \
+  --prompts identity \
+  --max-tokens 128 \
+  --timeout 300 \
+  --ready-timeout 1200
+python3 scripts/benchmark_kaiju_serving.py \
+  --contexts 24576 \
+  --prompts code_patch \
+  --max-tokens 128 \
+  --timeout 300
+python3 scripts/benchmark_kaiju_serving.py \
+  --restart \
+  --contexts 32768 \
+  --prompts identity \
+  --max-tokens 64 \
+  --timeout 300 \
+  --ready-timeout 1200
+python3 scripts/benchmark_kaiju_serving.py \
+  --contexts 32768 \
+  --prompts code_patch \
+  --max-tokens 128 \
+  --timeout 300
+```
+Runs:
+- `runs/benchmarks/20260603T141559Z-kaiju-coder-7-serving/summary.md`
+- `runs/benchmarks/20260603T142354Z-kaiju-coder-7-serving/summary.md`
+- `runs/benchmarks/20260603T142439Z-kaiju-coder-7-serving/summary.md`
+- `runs/benchmarks/20260603T143256Z-kaiju-coder-7-serving/summary.md`
+| Context | Prompt | OK | Load Wait | Seconds | Chars | Chars/s |
+| --- | --- | --- | ---: | ---: | ---: | ---: |
+| 24576 | identity | True | 439.54 | 16.84 | 26 | 1.544 |
+| 24576 | code_patch | True | n/a | 29.03 | 416 | 14.33 |
+| 32768 | identity | True | 386.53 | 16.27 | 26 | 1.598 |
+| 32768 | code_patch | True | n/a | 28.99 | 416 | 14.35 |
+Interpretation: `32768` is a proven high-context target from this benchmark set,
+but it is not the currently parked live endpoint after the later
+quantized-runtime testing. The current Gojira-B/OpenCode profile should stay at
+`16384` until `32768` is freshly restarted and re-confirmed. Keep `12288` for
+direct API smoke tests and constrained hardware.
+Restored-service 32k direct API smoke after vLLM testing:
+- Run: `runs/benchmarks/20260603T155233Z-kaiju-coder-7-serving/summary.md`
+- `/v1/models`: `kaiju-coder-7`, max model len `32768`
+| Context | Prompt | OK | Seconds | Chars | Chars/s |
+| --- | --- | --- | ---: | ---: | ---: |
+| 32768 | identity | True | 2.92 | 26 | 8.904 |
+| 32768 | business_doc | True | 94.28 | 1737 | 18.424 |
+Interpretation: the restored default endpoint is usable for business-owner
+document work, but a long proposal response still takes about 94 seconds. Paid
+routes must stream, cap output, queue carefully, and prefer verified
+artifact routes over raw open-ended generation.
+## OpenCode Customer-Readiness Evidence
+Final restored-service small OpenCode smoke:
+```bash
+opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
+  --dir /tmp/kaiju-opencode-32k-final-smoke \
+  'Create hello.txt with exactly: Kaiju Coder 7 final 32k ok'
+```
+Result: passed. OpenCode wrote `hello.txt` with exactly
+`Kaiju Coder 7 final 32k ok`.
+Current restored 16k OpenCode smoke after quantized-vLLM testing:
+```bash
+mkdir -p /tmp/kaiju-opencode-fresh-public-smoke
+opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
+  --dir /tmp/kaiju-opencode-fresh-public-smoke \
+  --dangerously-skip-permissions \
+  'Create hello.txt with exactly: Kaiju Coder 7 fresh public smoke ok'
+```
+Result: passed. OpenCode wrote `hello.txt` with exactly
+`Kaiju Coder 7 fresh public smoke ok` in
+`/tmp/kaiju-opencode-fresh-public-smoke`, and `/v1/models` returned
+`kaiju-coder-7` with max model len `16384`.
+Current restored 16k direct API identity smoke:
+- Run: `runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`
+- `/v1/models`: `kaiju-coder-7`, max model len `16384`
+| Context | Prompt | OK | Seconds | Chars | Chars/s |
+| --- | --- | --- | ---: | ---: | ---: |
+| 16384 | identity | True | 2.3 | 26 | 11.304 |
+Command:
+```bash
+python3 scripts/run_kaiju_opencode_customer_pack.py
+```
+Latest harnessed product-path result on 2026-06-03:
+- Run: `runs/opencode-customer-readiness/20260603T185835Z/summary.md`
+- Mode: `harnessed`
+- Status: `4/4` passed
+- Tasks:
+  - `fade-flow-service-site`
+  - `kiyomi-owner-operating-pack`
+  - `paid-api-safety-scaffold`
+  - `release-provenance-safety-review`
+- Required files written: `28/28`
+- Forbidden secret-looking tokens: none found by verifier
+Loop-guarded OpenCode install smoke:
+- Command: `python3 scripts/install_kaiju_opencode_profile.py`, then
+  `opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-loopguard-smoke --dangerously-skip-permissions 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'`
+- Result: passed. OpenCode wrote `loopguard.txt` in the requested directory with
+  exactly `Kaiju Coder 7 loop guard installed` and exited cleanly.
+- Installed guard: `/Users/richardecholsai7/.config/opencode/kaiju-no-autocontinue.mjs`
+Raw OpenCode-agent result on 2026-06-03:
+- Task: `fade-flow-service-site`
+- Status: timed out after `900s`
+- Required files written: `0`
+- Observed Gojira-B decode throughput while running: about `4.4` tokens/sec
+- Follow-up runner fix: workspaces now run outside the repo and pass `opencode
+  run --dir <workspace>` explicitly.
+- Structured follow-up run:
+  `runs/opencode-customer-readiness/20260603T135520Z/results.jsonl`
+  timed out after `60s`, wrote `0` files, and recorded `pwd` as the intended
+  temp workspace.
+- 16k/stricter-agent follow-up runs:
+  - `runs/opencode-customer-readiness/20260603T140650Z/results.jsonl`
+    timed out after `120s`, wrote `0` files, and recorded the intended temp
+    workspace.
+  - `runs/opencode-customer-readiness/20260603T140908Z/results.jsonl`
+    timed out after `120s`, wrote `0` files after adding stricter "write first
+    file immediately" prompt guidance.
+- Interpretation: the lean OpenCode agent fits and can write small files.
+  Harnessed file-plan delivery passes the customer pack. Current raw multi-file
+  OpenCode generation is still not public/API ready, so public and paid claims
+  must describe the reliable product path as model plus deterministic harness
+  and verifier.
+## Recommendation Until Faster Serving Is Proven
+- Public local release can proceed only with clear speed/hardware caveats.
+- Paid API should route business-owner deliverables through deterministic
+  harnesses and verifiers, not raw OpenCode multi-file generation.
+- Quantized candidates and/or a smaller distilled variant are required for
+  broad public OpenCode usability.
+## vLLM Serving Probe
+vLLM was tested as the practical alternative serving path after SGLang. The
+standard `vllm/vllm-openai:latest` image cannot read the merged checkpoint's
+`qwen3_5` config. The Gojira nightly image can read it, but needed two launch
+fixes for this checkpoint:
+- preinstall `pandas`, because the Qwen3.5 model path imports it in this image
+- pass `--language-model-only`, because the merged text-serving checkpoint does
+  not include the visual encoder weights expected by the multimodal config
+Guarded benchmark command:
+```bash
+KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_READY_TIMEOUT=900 \
+  ./scripts/run-gojira-b-vllm-serving-benchmark.sh
+```
+Run: `runs/benchmarks/20260603T151244Z-kaiju-coder-7-serving/summary.md`
+| Stack | Context | Prompt | OK | Seconds | Chars | Chars/s |
+| --- | ---: | --- | --- | ---: | ---: | ---: |
+| vLLM nightly | 16384 | identity | True | 19.99 | 26 | 1.301 |
+| vLLM nightly | 16384 | code_patch | True | 28.8 | 416 | 14.444 |
+Interpretation: vLLM now runs Kaiju Coder 7 at 16k, but it is not clearly
+faster than SGLang on the current smoke prompts. Keep SGLang as the recommended
+runtime because it has stable OpenCode smoke evidence, a simpler launch path,
+and historical 32k proof. Keep the live/default OpenCode profile at 16k until
+32k is freshly re-confirmed. Keep the vLLM scripts for future nightly-image or
+quantized-weight testing.
+## vLLM bitsandbytes Runtime-Quantized Candidate
+The first working quantized local variant is a runtime bitsandbytes vLLM path.
+It does not create separate quantized weights yet; it loads the full merged
+model through vLLM's bitsandbytes loader.
+Command:
+```bash
+KAIJU_VLLM_CONTEXT=16384 \
+KAIJU_VLLM_READY_TIMEOUT=1200 \
+KAIJU_VLLM_QUANTIZATION=bitsandbytes \
+KAIJU_VLLM_LOAD_FORMAT=bitsandbytes \
+  ./scripts/run-gojira-b-vllm-serving-benchmark.sh
+```
+Runs:
+- `runs/benchmarks/20260603T153257Z-kaiju-coder-7-serving/summary.md`
+- `runs/benchmarks/20260603T154450Z-kaiju-coder-7-serving/summary.md`
+- `runs/benchmarks/20260603T161316Z-kaiju-coder-7-serving/summary.md`
+- `runs/benchmarks/20260603T165512Z-kaiju-coder-7-serving/summary.md`
+| Stack | Context | Prompt | OK | Seconds | Chars | Chars/s |
+| --- | ---: | --- | --- | ---: | ---: | ---: |
+| vLLM bitsandbytes | 8192 | identity | True | 21.19 | 26 | 1.227 |
+| vLLM bitsandbytes | 8192 | code_patch | True | 11.31 | 424 | 37.489 |
+| vLLM bitsandbytes | 16384 | identity | True | 19.51 | 26 | 1.333 |
+| vLLM bitsandbytes | 16384 | code_patch | True | 11.3 | 416 | 36.814 |
+| vLLM bitsandbytes | 16384 | business_doc | True | 53.44 | 1610 | 30.127 |
+| vLLM bitsandbytes | 16384 | identity | True | 19.65 | 26 | 1.323 |
+Gojira-B vLLM logs reported about `17.8 GiB` model memory for the bitsandbytes
+load at both 8k and 16k, compared with about `50.22 GiB` for the unquantized
+vLLM load. Code-patch latency at 16k dropped from `28.8s` on unquantized vLLM
+to `11.3s` with bitsandbytes on this smoke prompt.
+Business-document latency improved versus the restored 32k SGLang business-doc
+smoke (`53.44s` at 16k vLLM bitsandbytes versus `94.28s` at 32k SGLang).
+Identity latency remains slower than SGLang.
+Quantized OpenCode one-file smoke passed after launching vLLM with
+`--enable-auto-tool-choice` plus `--tool-call-parser qwen3_coder` and running:
+```bash
+bash scripts/run_kaiju_quantized_opencode_smoke.sh
+```
+Result: OpenCode wrote `/tmp/kaiju-opencode-quantized-smoke/hello.txt` with
+exactly `Kaiju Coder 7 quantized runtime ok`.
+Recommendation: keep SGLang as the default public/OpenCode runtime and keep the
+currently installed OpenCode profile at 16k unless the 32k target has just been
+restarted and re-confirmed. Treat vLLM bitsandbytes as the current working
+quantized local candidate for advanced GPU users and future paid API speed
+experiments. It now has direct identity/code/business-doc evidence plus an
+OpenCode one-file smoke, but it is not a persisted quantized-weights repo.
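The `Chars/s` column in the benchmark tables above appears to be the character count divided by wall-clock seconds. The following is a minimal sketch under that assumption, using rows copied from the tables; the names `ROWS`, `chars_per_second`, and `mismatches` are illustrative and are not part of the repository.

```python
# Recompute the Chars/s column from the Seconds and Chars columns.
# Assumption: Chars/s = Chars / Seconds, rounded to three decimals.

ROWS = [
    # (stack, context, prompt, seconds, chars, reported_chars_per_s)
    ("vLLM nightly", 16384, "identity", 19.99, 26, 1.301),
    ("vLLM nightly", 16384, "code_patch", 28.8, 416, 14.444),
    ("vLLM bitsandbytes", 8192, "code_patch", 11.31, 424, 37.489),
    ("vLLM bitsandbytes", 16384, "code_patch", 11.3, 416, 36.814),
    ("vLLM bitsandbytes", 16384, "business_doc", 53.44, 1610, 30.127),
]


def chars_per_second(chars: int, seconds: float) -> float:
    """Throughput in characters per second, rounded like the tables."""
    return round(chars / seconds, 3)


def mismatches(rows):
    """Return rows whose reported Chars/s differs from the recomputed value."""
    return [row for row in rows if chars_per_second(row[4], row[3]) != row[5]]


if __name__ == "__main__":
    for stack, context, prompt, seconds, chars, reported in ROWS:
        computed = chars_per_second(chars, seconds)
        print(f"{stack} {context} {prompt}: computed={computed} reported={reported}")
    print("mismatched rows:", len(mismatches(ROWS)))
```

The same arithmetic applied to the reported memory figures (`17.8 GiB` versus `50.22 GiB`) puts the bitsandbytes load at roughly 35% of the unquantized model memory.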

SOURCE_INVENTORY.md ADDED

	@@ -0,0 +1,41 @@

+# Kaiju Source Inventory
+Generated from GitHub source-of-truth repositories plus the requested local RMDW wiki snapshot. This inventory defines what may become Kaiju training data, what is eval-only, and what must stay excluded.
+## Global Training Rules
+- Do not train on raw secrets, API keys, OAuth tokens, cookies, private keys, or credential files.
+- Do not train on closed-model responses from OpenAI, Anthropic, Gemini, or similar providers unless the terms clearly allow it.
+- Do not train on client-specific private data without explicit review and consent.
+- Preserve repository name, commit SHA, source path, license, and reviewer status for every promoted dataset row.
+## GitHub Repository Inventory
+| Repo | SHA | Role | Training use | Required gates | Exclusions | Notes |
+|---|---|---|---|---|---|---|
+| [RichardEchols/kaiju-coder](https://github.com/RichardEchols/kaiju-coder) | `3d57eae92ad5` | model lab, harness, evals, training scripts | candidate-after-review | secret-scan, closed-model-output-check, license-review | runs, models, .secrets, private datasets, raw logs | Use repo-owned harnesses, evals, docs, scripts, and curated datasets. Exclude weights, generated runs, and local secrets. |
+| [RichardEchols/Kiyomi-7.7.7](https://github.com/RichardEchols/Kiyomi-7.7.7) | `294b31008135` | business-owner AI-company module contracts | candidate-after-review | secret-scan, closed-model-output-check, private-data-review | credentials, tokens, private client state, closed-model transcripts | Use module contracts, templates, acceptance gates, and owner-facing task structure as high-signal business-owner curriculum. |
+| [RichardEchols/kiyomi-agent](https://github.com/RichardEchols/kiyomi-agent) | `b192c910f3f7` | business OS wrapper and local-agent patterns | candidate-after-review | secret-scan, closed-model-output-check, private-data-review | credentials, tokens, local runtime state, private support logs | Use architecture, docs, scripts, and safe wrapper patterns. Do not train on runtime secrets or private logs. |
+| [RichardEchols/rmdw-site](https://github.com/RichardEchols/rmdw-site) | `df089dc3b2d3` | public RMDW offer, site, and conversion surface | candidate-after-review | secret-scan, closed-model-output-check, public-copy-review | environment files, deployment secrets, analytics tokens | Use public offer copy, app structure, pricing/CTA patterns, and website implementation patterns. |
+| [RichardEchols/makotoair](https://github.com/RichardEchols/makotoair) | `7568f07fea6e` | client website implementation pattern | eval-and-patterns-only | secret-scan, client-data-review, consent-review | client-specific, contact data, contracts, private business details | Use as eval/pattern inspiration for local service business sites. Do not bulk-train on client-specific text without explicit review. |
+| [RichardEchols/Mezzal-Construction](https://github.com/RichardEchols/Mezzal-Construction) | `e8f2eede0405` | client website implementation pattern | eval-and-patterns-only | secret-scan, client-data-review, consent-review | client-specific, contact data, contracts, private business details | Use as eval/pattern inspiration for premium contractor site work. Do not bulk-train on client-specific text without explicit review. |
+| [RichardEchols/rmdw-agent-wiki](https://github.com/RichardEchols/rmdw-agent-wiki) | `ae1b8e85d3fe` | RMDW/Kiyomi operational wiki | selective-reference-only | secret-scan, credentials-redaction, private-data-review, closed-model-output-check | credentials.md, customers.md, raw, contracts, private client notes, support logs | Use only redacted strategy/product notes and documented decisions. Never use raw credentials or private client data. |
+## Local Source Inventory
+Local files are context snapshots, not the source of truth. Promote local wiki material into training only after explicit review, redaction, and either sync/diff against the GitHub wiki or a documented reviewer exception.
+| Source | Path | Git repo | Files | Training use | Required gates | Excluded paths present | Safe reference candidates | Notes |
+|---|---|---:|---:|---|---|---|---|---|
+| RMDW-Wiki-local | `/Users/richardecholsai7/Documents/RMDW-Wiki` | no | 93 | selective-reference-only | secret-scan, credentials-redaction, private-data-review, sync-or-diff-against-github | credentials.md, customers.md, customers/, raw/ | README.md, kaiju-coder-build-log.md, kaiju-coder-business-plan.md, kaiju-coder-soul.md, kiyomi-agent-build-log.md, pricing-history.md, product/kiyomi-private-ai-workstation.md, ops/product-ops-automation.md, client-acquisition-engine/README.md | Use as a local context snapshot only after explicit row-level review. Do not treat unsynced local files as the authoritative training source. |
+## Training Eligibility Meaning
+- `candidate-after-review`: source can produce training or eval examples only after secret scanning, closed-model-output review, and row-level provenance.
+- `eval-and-patterns-only`: use for hard eval prompts, harness behavior, screenshots, or generalized patterns. Do not bulk-train on client-specific source text.
+- `selective-reference-only`: use narrowly after redaction. Treat credentials, customer notes, and raw operational data as excluded by default.
+- Local snapshots require review against the GitHub source of truth before promotion into dataset rows.
+## Next Dataset Step
+Generate candidate examples only from reviewed paths, attach this inventory SHA or local snapshot data to each row, then run `scripts/validate_training_data.py` before any training run.
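The provenance rule above (repository name, commit SHA, source path, license, and reviewer status on every promoted row) can be sketched as a small row check. This is a hypothetical illustration only: the field names, the `approved` status value, and both function names are assumptions, and it does not describe what `scripts/validate_training_data.py` actually does.

```python
# Hypothetical provenance gate for a single candidate dataset row.
# Field names and the "approved" reviewer status are assumptions for illustration.

REQUIRED_FIELDS = ("repo", "commit_sha", "source_path", "license", "reviewer_status")

# Per the inventory, this label means "do not bulk-train on the source text".
BLOCKED_TRAINING_USES = {"eval-and-patterns-only"}


def missing_provenance(row: dict) -> list[str]:
    """Return the required provenance fields that are absent or blank."""
    return [name for name in REQUIRED_FIELDS if not str(row.get(name) or "").strip()]


def may_promote(row: dict) -> bool:
    """True only for fully attributed, reviewer-approved rows from a trainable source."""
    if missing_provenance(row):
        return False
    if row.get("training_use") in BLOCKED_TRAINING_USES:
        return False
    return row["reviewer_status"] == "approved"


if __name__ == "__main__":
    row = {
        "repo": "RichardEchols/kaiju-coder",
        "commit_sha": "3d57eae92ad5",
        "source_path": "docs/example.md",  # illustrative path, not a real file
        "license": "reviewed",
        "reviewer_status": "approved",
        "training_use": "candidate-after-review",
    }
    print(may_promote(row))
```

A stricter variant could also require a redaction flag for `selective-reference-only` sources; that is left out here because the inventory does not say how redaction is recorded.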

UPSTREAM_LICENSE_CHECK.md ADDED

	@@ -0,0 +1,38 @@

+# Upstream License Check
+Date: 2026-06-03
+This is an engineering release check, not legal advice.
+## Base Model
+- Upstream model: `Qwen/Qwen3.6-27B`
+- Hugging Face URL: `https://huggingface.co/Qwen/Qwen3.6-27B`
+- Checked revision from Hugging Face API: `6a9e13bd6fc8f0983b9b99948120bc37f49c13e9`
+- Hugging Face license metadata: `apache-2.0`
+- Local license copy: `release/upstream/qwen3.6-27b/LICENSE`
+- Common upstream notice files checked: `NOTICE`, `NOTICE.txt`, `NOTICE.md`
+- Notice result: no common notice file found at the checked upstream paths
+## Release Obligations To Preserve
+- Include the upstream Apache 2.0 license with the adapter release package.
+- Keep the upstream base model name and revision in the model card.
+- State that Kaiju Coder is fine-tuned from Qwen; do not imply Qwen, Alibaba, or upstream-author endorsement.
+- Include a modification note for the LoRA adapter and RMDW/Kiyomi training/eval package.
+- Retain warranty and limitation language through the included Apache 2.0 license.
+## Current Packaging Status
+Passed for release review:
+- Upstream license file copied locally.
+- Upstream revision recorded.
+- Upstream license metadata recorded.
+- Notice check recorded.
+Still requires human release review:
+- Confirm no upstream files changed before upload.
+- Confirm the final Hugging Face repository includes the copied license file and model card.
+- Confirm public wording avoids endorsement or trademark confusion.

adapter_config.json ADDED

	@@ -0,0 +1,48 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": null,
+  "base_model_name_or_path": "/workspace/kaiju-coder/models/Qwen3.6-27B",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 32,
+  "lora_bias": false,
+  "lora_dropout": 0.02,
+  "lora_ga_config": null,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.19.1",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "up_proj",
+    "v_proj",
+    "q_proj",
+    "o_proj",
+    "down_proj",
+    "gate_proj",
+    "k_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_bdlora": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
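For readers interpreting the adapter config: in PEFT, the LoRA update is scaled by `lora_alpha / r`, or by `lora_alpha / sqrt(r)` when `use_rslora` is enabled. The sketch below applies that rule to the values in this file; the helper name `lora_scaling` is illustrative.

```python
import math

# Values copied from adapter_config.json above.
CONFIG = {"r": 16, "lora_alpha": 32, "use_rslora": False}


def lora_scaling(config: dict) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    rank = config["r"]
    alpha = config["lora_alpha"]
    if config.get("use_rslora"):
        # Rank-stabilized LoRA divides by sqrt(r) instead of r.
        return alpha / math.sqrt(rank)
    return alpha / rank


if __name__ == "__main__":
    print(lora_scaling(CONFIG))  # 32 / 16 = 2.0
```

Note that `base_model_name_or_path` in this config is a local `/workspace/...` path, so anyone loading the adapter on another machine will likely need to supply the base model explicitly.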

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6e4c8842d3e7cc98303a0eab68c6c24cd0bb526e95834395e828d2440a929c85
+size 318835672

chat_template.jinja ADDED

	@@ -0,0 +1,154 @@

+{%- set image_count = namespace(value=0) %}
+{%- set video_count = namespace(value=0) %}
+{%- macro render_content(content, do_vision_count, is_system_content=false) %}
+    {%- if content is string %}
+        {{- content }}
+    {%- elif content is iterable and content is not mapping %}
+        {%- for item in content %}
+            {%- if 'image' in item or 'image_url' in item or item.type == 'image' %}
+                {%- if is_system_content %}
+                    {{- raise_exception('System message cannot contain images.') }}
+                {%- endif %}
+                {%- if do_vision_count %}
+                    {%- set image_count.value = image_count.value + 1 %}
+                {%- endif %}
+                {%- if add_vision_id %}
+                    {{- 'Picture ' ~ image_count.value ~ ': ' }}
+                {%- endif %}
+                {{- '<|vision_start|><|image_pad|><|vision_end|>' }}
+            {%- elif 'video' in item or item.type == 'video' %}
+                {%- if is_system_content %}
+                    {{- raise_exception('System message cannot contain videos.') }}
+                {%- endif %}
+                {%- if do_vision_count %}
+                    {%- set video_count.value = video_count.value + 1 %}
+                {%- endif %}
+                {%- if add_vision_id %}
+                    {{- 'Video ' ~ video_count.value ~ ': ' }}
+                {%- endif %}
+                {{- '<|vision_start|><|video_pad|><|vision_end|>' }}
+            {%- elif 'text' in item %}
+                {{- item.text }}
+            {%- else %}
+                {{- raise_exception('Unexpected item type in content.') }}
+            {%- endif %}
+        {%- endfor %}
+    {%- elif content is none or content is undefined %}
+        {{- '' }}
+    {%- else %}
+        {{- raise_exception('Unexpected content type.') }}
+    {%- endif %}
+{%- endmacro %}
+{%- if not messages %}
+    {{- raise_exception('No messages provided.') }}
+{%- endif %}
+{%- if tools and tools is iterable and tools is not mapping %}
+    {{- '<|im_start|>system\n' }}
+    {{- "# Tools\n\nYou have access to the following functions:\n\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>" }}
+    {{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
+    {%- if messages[0].role == 'system' %}
+        {%- set content = render_content(messages[0].content, false, true)|trim %}
+        {%- if content %}
+            {{- '\n\n' + content }}
+        {%- endif %}
+    {%- endif %}
+    {{- '<|im_end|>\n' }}
+{%- else %}
+    {%- if messages[0].role == 'system' %}
+        {%- set content = render_content(messages[0].content, false, true)|trim %}
+        {{- '<|im_start|>system\n' + content + '<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+{%- for message in messages[::-1] %}
+    {%- set index = (messages|length - 1) - loop.index0 %}
+    {%- if ns.multi_step_tool and message.role == "user" %}
+        {%- set content = render_content(message.content, false)|trim %}
+        {%- if not(content.startswith('<tool_response>') and content.endswith('</tool_response>')) %}
+            {%- set ns.multi_step_tool = false %}
+            {%- set ns.last_query_index = index %}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if ns.multi_step_tool %}
+    {{- raise_exception('No user query found in messages.') }}
+{%- endif %}
+{%- for message in messages %}
+    {%- set content = render_content(message.content, true)|trim %}
+    {%- if message.role == "system" %}
+        {%- if not loop.first %}
+            {{- raise_exception('System message must be at the beginning.') }}
+        {%- endif %}
+    {%- elif message.role == "user" %}
+        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {%- set reasoning_content = '' %}
+        {%- if message.reasoning_content is string %}
+            {%- set reasoning_content = message.reasoning_content %}
+        {%- else %}
+            {%- if '</think>' in content %}
+                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
+            {%- endif %}
+        {%- endif %}
+        {%- set reasoning_content = reasoning_content|trim %}
+        {%- if (preserve_thinking is defined and preserve_thinking is true) or (loop.index0 > ns.last_query_index) %}
+            {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n\n' + content }}
+        {%- else %}
+            {{- '<|im_start|>' + message.role + '\n' + content }}
+        {%- endif %}
+        {%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %}
+            {%- for tool_call in message.tool_calls %}
+                {%- if tool_call.function is defined %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {%- if loop.first %}
+                    {%- if content|trim %}
+                        {{- '\n\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
+                    {%- else %}
+                        {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
+                    {%- endif %}
+                {%- else %}
+                    {{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
+                {%- endif %}
+                {%- if tool_call.arguments is defined %}
+                    {%- for args_name, args_value in tool_call.arguments|items %}
+                        {{- '<parameter=' + args_name + '>\n' }}
+                        {%- set args_value = args_value | string if args_value is string else args_value | tojson | safe %}
+                        {{- args_value }}
+                        {{- '\n</parameter>\n' }}
+                    {%- endfor %}
+                {%- endif %}
+                {{- '</function>\n</tool_call>' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if loop.previtem and loop.previtem.role != "tool" %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- content }}
+        {{- '\n</tool_response>' }}
+        {%- if not loop.last and loop.nextitem.role != "tool" %}
+            {{- '<|im_end|>\n' }}
+        {%- elif loop.last %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- else %}
+        {{- raise_exception('Unexpected message role.') }}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+    {%- if enable_thinking is defined and enable_thinking is false %}
+        {{- '<think>\n\n</think>\n\n' }}
+    {%- else %}
+        {{- '<think>\n' }}
+    {%- endif %}
+{%- endif %}
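The assistant branch of the template separates reasoning from the final answer by splitting on `</think>` whenever `message.reasoning_content` is not already a string. The Python sketch below mirrors that fallback expression for illustration; `split_reasoning` is a hypothetical helper and is not part of the release.

```python
def split_reasoning(content: str) -> tuple[str, str]:
    """Mirror the template's fallback: return (reasoning, answer)."""
    reasoning = ""
    if "</think>" in content:
        # Text before the first </think>, minus anything up to the last <think> tag.
        reasoning = (
            content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
        )
        # Text after the last </think> becomes the visible answer.
        content = content.split("</think>")[-1].lstrip("\n")
    # The template applies |trim to the reasoning afterwards.
    return reasoning.strip(), content


if __name__ == "__main__":
    print(split_reasoning("<think>\nplan the patch\n</think>\n\nHere is the patch."))
    print(split_reasoning("No reasoning block here."))
```

In the template, the extracted reasoning is re-emitted inside `<think>` tags only for assistant turns after the last real user query, or for all turns when `preserve_thinking` is true.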

cloudflare-bindings.example.json ADDED

	@@ -0,0 +1,16 @@

+{
+  "d1_database": {
+    "binding": "KAIJU_BILLING_DB",
+    "database_name": "kaiju_api_billing",
+    "database_id": "replace_with_real_d1_database_id"
+  },
+  "kv_namespace": {
+    "binding": "KAIJU_RATE_LIMIT_KV",
+    "id": "replace_with_real_kv_namespace_id"
+  },
+  "r2_bucket": {
+    "binding": "KAIJU_ARTIFACT_BUCKET",
+    "bucket_name": "kaiju-api-artifacts"
+  },
+  "workers_dev": false
+}

hf-release-permission-evidence.example.json ADDED

	@@ -0,0 +1,9 @@

+{
+  "status": "pending",
+  "checked_at": "replace_with_utc_timestamp",
+  "namespace": "RMDWLLC",
+  "authenticated_user": "replace_with_hf_username",
+  "probe_repo": "RMDWLLC/kaiju-coder-7-permission-probe",
+  "command": "KAIJU_HF_PERMISSION_PROBE_APPLY=1 bash scripts/check_hf_release_permissions.sh",
+  "result": "replace_with_private_model_repo_create_result"
+}

paid-api-launch-evidence.example.json ADDED

	@@ -0,0 +1,61 @@

+{
+  "public_route_mode": {
+    "status": "pending",
+    "checked_at": "2026-06-03T00:00:00Z",
+    "exposure_mode": "custom_domain",
+    "route": "https://api.example.com",
+    "result": "custom domain resolves to the intended Kaiju Worker"
+  },
+  "wrangler_secrets_verified": {
+    "status": "pending",
+    "checked_at": "2026-06-03T00:00:00Z",
+    "command": "wrangler secret list",
+    "observed_names": [
+      "KAIJU_ORIGIN_URL",
+      "KAIJU_ORIGIN_SECRET",
+      "KAIJU_STRIPE_WEBHOOK_SECRET"
+    ],
+    "notes": "Record only secret names. Never include secret values."
+  },
+  "d1_migration_applied": {
+    "status": "pending",
+    "checked_at": "2026-06-03T00:00:00Z",
+    "command": "wrangler d1 migrations apply kaiju_api_billing --remote",
+    "migration": "0001_paid_api.sql",
+    "result": "success"
+  },
+  "stripe_checkout_topup_staging": {
+    "status": "pending",
+    "checked_at": "2026-06-03T00:00:00Z",
+    "mode": "test",
+    "webhook_event": "checkout.session.completed",
+    "credited_api_key_id": "key_staging_001",
+    "idempotency_checked": true,
+    "notes": "Do not include Stripe secret keys or webhook signing secrets."
+  },
+  "worker_to_gojira_staging_request": {
+    "status": "pending",
+    "checked_at": "2026-06-03T00:00:00Z",
+    "route": "/v1/chat/completions",
+    "model": "kaiju-coder-7",
+    "http_status": 200,
+    "streamed": true,
+    "request_id": "req_staging_001",
+    "notes": "Do not include bearer tokens or full private prompts."
+  },
+  "rollback_exercised": {
+    "status": "pending",
+    "checked_at": "2026-06-03T00:00:00Z",
+    "command": "wrangler rollback",
+    "result": "success"
+  },
+  "paid_route_latency": {
+    "status": "pending",
+    "checked_at": "2026-06-03T00:00:00Z",
+    "route": "/v1/chat/completions",
+    "sample_count": 5,
+    "p95_ms": 90000,
+    "max_acceptable_ms": 120000,
+    "notes": "Use staging traffic. Record coarse metrics only."
+  }
+}
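Every section in this example starts at `"status": "pending"`. As a rough sketch of how such a file could be summarized, assuming that `"pass"` marks a completed section (the convention used by the Hugging Face permission evidence checker), the snippet below lists unfinished sections and compares the recorded p95 latency with its budget. Both function names are hypothetical, and this is not the logic of `scripts/check_paid_api_readiness.py`.

```python
# Hypothetical summary of a paid-API launch evidence payload.
# Assumption: a section counts as complete only when its status is "pass".


def unfinished_sections(evidence: dict) -> list[str]:
    """Names of sections that are missing, malformed, or not yet passed."""
    return sorted(
        name
        for name, section in evidence.items()
        if not isinstance(section, dict) or section.get("status") != "pass"
    )


def latency_within_budget(section: dict) -> bool:
    """True when the recorded p95 latency does not exceed the stated maximum."""
    return section["p95_ms"] <= section["max_acceptable_ms"]


if __name__ == "__main__":
    example = {
        "rollback_exercised": {"status": "pending", "result": "success"},
        "paid_route_latency": {
            "status": "pending",
            "p95_ms": 90000,
            "max_acceptable_ms": 120000,
        },
    }
    print(unfinished_sections(example))
    print(latency_within_budget(example["paid_route_latency"]))
```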

scripts/apply_paid_api_cloudflare_bindings.py ADDED

	@@ -0,0 +1,162 @@

+#!/usr/bin/env python3
+"""Apply reviewed Cloudflare D1/KV/R2 bindings to the Kaiju Worker config.
+The script is preview-only by default. It accepts resource IDs/names, not
+secrets, and refuses placeholder or secret-looking values.
+"""
+from __future__ import annotations
+import argparse
+import json
+import re
+import sys
+from pathlib import Path
+from typing import Any
+ROOT = Path(__file__).resolve().parents[1]
+DEFAULT_BINDINGS = ROOT / "release/cloudflare-bindings.json"
+DEFAULT_WRANGLER = ROOT / "gateway/cloudflare-worker/wrangler.jsonc"
+SECRET_PATTERNS = [
+    ("openai_api_key", re.compile(r"\bsk-[A-Za-z0-9][A-Za-z0-9_-]{20,}\b")),
+    ("anthropic_api_key", re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}\b")),
+    ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
+    ("stripe_webhook_secret", re.compile(r"\bwhsec_[A-Za-z0-9]{16,}\b")),
+    ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
+    ("github_token", re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{22,})\b")),
+    ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{24,}={0,2}\b", re.IGNORECASE)),
+    ("private_key_block", re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC |DSA )?PRIVATE KEY-----")),
+]
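+def strip_jsonc(text: str) -> str:
+    """Remove comments and trailing commas so json.loads can parse JSONC.
+    Only `//` comments are string-aware; the block-comment and trailing-comma
+    regexes also rewrite matching text inside string values.
+    """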
+    text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
+    lines = []
+    for line in text.splitlines():
+        in_string = False
+        escaped = False
+        output = []
+        index = 0
+        while index < len(line):
+            char = line[index]
+            nxt = line[index + 1] if index + 1 < len(line) else ""
+            if char == "\\" and in_string:
+                escaped = not escaped
+                output.append(char)
+            elif char == '"' and not escaped:
+                in_string = not in_string
+                output.append(char)
+            elif char == "/" and nxt == "/" and not in_string:
+                break
+            else:
+                escaped = False
+                output.append(char)
+            index += 1
+        lines.append("".join(output))
+    return re.sub(r",\s*([}\]])", r"\1", "\n".join(lines))
+def load_json_or_jsonc(path: Path) -> dict[str, Any]:
+    text = path.read_text(encoding="utf-8")
+    try:
+        return json.loads(strip_jsonc(text))
+    except json.JSONDecodeError as exc:
+        raise SystemExit(f"{path} is not valid JSON/JSONC: {exc}") from exc
+def secret_findings(payload: Any) -> list[str]:
+    rendered = json.dumps(payload, sort_keys=True)
+    return sorted({label for label, pattern in SECRET_PATTERNS if pattern.search(rendered)})
+def safe_value(value: Any, *, name: str, pattern: str, allow_placeholder: bool = False) -> str:
+    text = str(value or "").strip()
+    if not text:
+        raise SystemExit(f"Missing required Cloudflare binding value: {name}")
+    if not allow_placeholder and text.startswith("replace_with_"):
+        raise SystemExit(f"Refusing placeholder Cloudflare binding value for {name}: {text}")
+    if not re.fullmatch(pattern, text):
+        raise SystemExit(f"Unsafe Cloudflare binding value for {name}: {text!r}")
+    return text
+def build_bindings(raw: dict[str, Any]) -> dict[str, Any]:
+    findings = secret_findings(raw)
+    if findings:
+        raise SystemExit("Refusing secret-looking Cloudflare binding input: " + ", ".join(findings))
+    d1 = raw.get("d1_database") or {}
+    kv = raw.get("kv_namespace") or {}
+    r2 = raw.get("r2_bucket") or {}
+    result: dict[str, Any] = {
+        "d1_databases": [
+            {
+                "binding": safe_value(d1.get("binding", "KAIJU_BILLING_DB"), name="d1_database.binding", pattern=r"[A-Z0-9_]{3,64}"),
+                "database_name": safe_value(d1.get("database_name", "kaiju_api_billing"), name="d1_database.database_name", pattern=r"[A-Za-z0-9._-]{3,128}"),
+                "database_id": safe_value(d1.get("database_id"), name="d1_database.database_id", pattern=r"[A-Za-z0-9_-]{12,128}"),
+            }
+        ],
+        "kv_namespaces": [
+            {
+                "binding": safe_value(kv.get("binding", "KAIJU_RATE_LIMIT_KV"), name="kv_namespace.binding", pattern=r"[A-Z0-9_]{3,64}"),
+                "id": safe_value(kv.get("id"), name="kv_namespace.id", pattern=r"[A-Za-z0-9_-]{12,128}"),
+            }
+        ],
+        "r2_buckets": [
+            {
+                "binding": safe_value(r2.get("binding", "KAIJU_ARTIFACT_BUCKET"), name="r2_bucket.binding", pattern=r"[A-Z0-9_]{3,64}"),
+                "bucket_name": safe_value(r2.get("bucket_name", "kaiju-api-artifacts"), name="r2_bucket.bucket_name", pattern=r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]"),
+            }
+        ],
+    }
+    if "workers_dev" in raw:
+        if not isinstance(raw["workers_dev"], bool):
+            raise SystemExit("workers_dev must be true or false")
+        result["workers_dev"] = raw["workers_dev"]
+    return result
+def apply_bindings(config: dict[str, Any], bindings: dict[str, Any]) -> dict[str, Any]:
+    updated = dict(config)
+    for key in ["d1_databases", "kv_namespaces", "r2_buckets"]:
+        updated[key] = bindings[key]
+    if "workers_dev" in bindings:
+        updated["workers_dev"] = bindings["workers_dev"]
+    return updated
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--bindings-file", type=Path, default=DEFAULT_BINDINGS)
+    parser.add_argument("--wrangler-config", type=Path, default=DEFAULT_WRANGLER)
+    parser.add_argument("--out", type=Path, help="Write preview to this file instead of stdout. With --write, defaults to --wrangler-config.")
+    parser.add_argument("--write", action="store_true", help="Update wrangler.jsonc. Default is preview only.")
+    return parser.parse_args()
+def main() -> int:
+    args = parse_args()
+    raw = load_json_or_jsonc(args.bindings_file)
+    config = load_json_or_jsonc(args.wrangler_config)
+    updated = apply_bindings(config, build_bindings(raw))
+    # json.dumps re-serializes the parsed config, so JSONC comments are not preserved.
+    rendered = json.dumps(updated, indent=2, sort_keys=False) + "\n"
+    if args.write:
+        target = args.out or args.wrangler_config
+        target.parent.mkdir(parents=True, exist_ok=True)
+        target.write_text(rendered, encoding="utf-8")
+        print(f"Wrote reviewed Cloudflare bindings to {target}")
+    elif args.out:
+        args.out.parent.mkdir(parents=True, exist_ok=True)
+        args.out.write_text(rendered, encoding="utf-8")
+        print(f"Wrote preview Cloudflare config to {args.out}")
+    else:
+        print(rendered, end="")
+        print("Preview only. Pass --write to update wrangler.jsonc.", file=sys.stderr)
+    return 0
+if __name__ == "__main__":
+    raise SystemExit(main())
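One detail of `safe_value` above is easy to miss: the example placeholder `replace_with_real_d1_database_id` satisfies the ID character pattern, so the explicit `replace_with_` prefix check is what rejects it. The standalone sketch below reuses two patterns copied from `build_bindings`; the helper `classify` is illustrative and returns a label instead of raising `SystemExit` as the script does.

```python
import re

# Patterns copied from build_bindings in the script above.
ID_PATTERN = r"[A-Za-z0-9_-]{12,128}"  # d1_database.database_id, kv_namespace.id
BUCKET_PATTERN = r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]"  # r2_bucket.bucket_name


def classify(value: str, pattern: str) -> str:
    """Label a binding value as 'missing', 'placeholder', 'unsafe', or 'ok'."""
    text = str(value or "").strip()
    if not text:
        return "missing"
    if text.startswith("replace_with_"):
        return "placeholder"
    if not re.fullmatch(pattern, text):
        return "unsafe"
    return "ok"


if __name__ == "__main__":
    print(classify("replace_with_real_d1_database_id", ID_PATTERN))
    print(classify("kaiju-api-artifacts", BUCKET_PATTERN))
    print(classify("Kaiju_Artifacts", BUCKET_PATTERN))
    print(classify("short", ID_PATTERN))
```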

scripts/check_hf_release_permission_evidence.py ADDED

	@@ -0,0 +1,164 @@

+#!/usr/bin/env python3
+"""Validate sanitized Hugging Face repo-create permission evidence.
+The evidence file must not contain tokens or credentials. It records only that
+the authenticated account successfully created a private model repo probe for
+the intended namespace.
+"""
+from __future__ import annotations
+import argparse
+import json
+import re
+from dataclasses import asdict, dataclass
+from pathlib import Path
+from typing import Any
+ROOT = Path(__file__).resolve().parents[1]
+DEFAULT_EVIDENCE = ROOT / "release/hf-release-permission-evidence.json"
+EXAMPLE_EVIDENCE = ROOT / "release/hf-release-permission-evidence.example.json"
+SECRET_PATTERNS = [
+    ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
+    ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{24,}={0,2}\b", re.IGNORECASE)),
+    ("github_token", re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{22,})\b")),
+    ("openai_api_key", re.compile(r"\bsk-[A-Za-z0-9][A-Za-z0-9_-]{20,}\b")),
+    ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
+    ("private_key_block", re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC |DSA )?PRIVATE KEY-----")),
+]
+@dataclass
+class Check:
+    name: str
+    status: str
+    detail: str
+def read_json(path: Path) -> tuple[dict[str, Any], Check | None]:
+    if not path.is_file():
+        return {}, Check("HF permission evidence file", "fail", f"missing file: {path}")
+    text = path.read_text(encoding="utf-8")
+    findings = sorted({label for label, pattern in SECRET_PATTERNS if pattern.search(text)})
+    if findings:
+        return {}, Check("HF permission evidence file", "fail", "secret-looking values found: " + ", ".join(findings))
+    try:
+        payload = json.loads(text)
+    except json.JSONDecodeError as exc:
+        return {}, Check("HF permission evidence file", "fail", f"invalid JSON: {exc}")
+    if not isinstance(payload, dict):
+        return {}, Check("HF permission evidence file", "fail", f"{path} must contain a JSON object")
+    return payload, Check("HF permission evidence file", "pass", f"loaded sanitized evidence from {path}")
+def text_field(payload: dict[str, Any], field: str) -> str:
+    return str(payload.get(field) or "").strip()
+def has_placeholder(value: str) -> bool:
+    return "replace_with_" in value.lower()
+def validate(path: Path) -> list[Check]:
+    payload, file_check = read_json(path)
+    checks = [file_check] if file_check else []
+    if not payload:
+        return checks
+    required = ["status", "checked_at", "namespace", "authenticated_user", "probe_repo", "command", "result"]
+    missing = [field for field in required if not text_field(payload, field)]
+    if missing:
+        checks.append(Check("HF permission evidence fields", "fail", "missing fields: " + ", ".join(missing)))
+        return checks
+    checks.append(Check("HF permission evidence fields", "pass", f"{len(required)} required fields present"))
+    placeholders = [field for field in required if has_placeholder(text_field(payload, field))]
+    if placeholders:
+        checks.append(Check("HF permission placeholders", "fail", "replace placeholder values: " + ", ".join(placeholders)))
+    else:
+        checks.append(Check("HF permission placeholders", "pass", "no placeholder values remain"))
+    namespace = text_field(payload, "namespace")
+    user = text_field(payload, "authenticated_user")
+    probe_repo = text_field(payload, "probe_repo")
+    command = text_field(payload, "command")
+    result = text_field(payload, "result").lower()
+    checked_at = text_field(payload, "checked_at")
+    if payload.get("status") != "pass":
+        checks.append(Check("HF permission status", "fail", f"status is {payload.get('status')!r}; expected pass"))
+    else:
+        checks.append(Check("HF permission status", "pass", "repo-create permission probe passed"))
+    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9_.-]{1,95}", namespace):
+        checks.append(Check("HF namespace format", "fail", f"unsafe namespace: {namespace!r}"))
+    elif probe_repo != f"{namespace}/kaiju-coder-7-permission-probe":
+        checks.append(Check("HF probe repo", "fail", f"probe_repo must be {namespace}/kaiju-coder-7-permission-probe"))
+    else:
+        checks.append(Check("HF probe repo", "pass", probe_repo))
+    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9_.-]{1,95}", user):
+        checks.append(Check("HF authenticated user format", "fail", f"unsafe authenticated_user: {user!r}"))
+    else:
+        checks.append(Check("HF authenticated user format", "pass", user))
+    if not re.match(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$", checked_at):
+        checks.append(Check("HF permission checked_at", "fail", "checked_at must be UTC like 2026-06-03T19:00:00Z"))
+    else:
+        checks.append(Check("HF permission checked_at", "pass", checked_at))
+    if "hf repos create" not in command and "check_hf_release_permissions.sh" not in command:
+        checks.append(Check("HF permission command", "fail", "command must name the repo-create probe command"))
+    elif "auth list" in command:
+        checks.append(Check("HF permission command", "fail", "command must not include auth list output"))
+    else:
+        checks.append(Check("HF permission command", "pass", "repo-create probe command recorded"))
+    if "succeeded" not in result and "passed" not in result:
+        checks.append(Check("HF permission result", "fail", "result must record that private model repo creation succeeded"))
+    else:
+        checks.append(Check("HF permission result", "pass", "private model repo-create permission recorded"))
+    return checks
+def summarize(checks: list[Check]) -> dict[str, Any]:
+    return {
+        "ready": not any(check.status == "fail" for check in checks),
+        "summary": {
+            "pass": sum(1 for check in checks if check.status == "pass"),
+            "fail": sum(1 for check in checks if check.status == "fail"),
+            "manual": sum(1 for check in checks if check.status == "manual"),
+        },
+        "checks": [asdict(check) for check in checks],
+    }
+def print_text(result: dict[str, Any]) -> None:
+    print(f"Kaiju Coder 7 HF permission evidence: ready={result['ready']}")
+    print(
+        "Summary: "
+        f"{result['summary']['pass']} pass, "
+        f"{result['summary']['fail']} fail, "
+        f"{result['summary']['manual']} manual"
+    )
+    for check in result["checks"]:
+        print(f"[{check['status']}] {check['name']} - {check['detail']}")
+def main() -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--evidence-file", type=Path, default=DEFAULT_EVIDENCE)
+    parser.add_argument("--json", action="store_true")
+    args = parser.parse_args()
+    result = summarize(validate(args.evidence_file))
+    if args.json:
+        print(json.dumps(result, indent=2))
+    else:
+        print_text(result)
+    return 0 if result["ready"] else 1
+if __name__ == "__main__":
+    raise SystemExit(main())
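
Reviewer note: `read_json` above scans the raw file text for secret-looking values before it parses any JSON, so a leaked token fails the check even if the JSON is malformed. The sketch below shows that ordering in isolation. It copies two entries from `SECRET_PATTERNS`; the `scan_for_secrets` helper name and both sample payloads are illustrative assumptions, not part of the script.

```python
import json
import re

# Two patterns copied from SECRET_PATTERNS in the script; the rest are omitted here.
PATTERNS = [
    ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
    ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
]


def scan_for_secrets(text: str) -> list[str]:
    """Return the sorted labels of every pattern that matches the raw text."""
    return sorted({label for label, pattern in PATTERNS if pattern.search(text)})


# Hypothetical payloads: one sanitized, one carrying a fake token-shaped string.
clean = json.dumps({"status": "pass", "namespace": "example-org"})
leaky = json.dumps({"status": "pass", "result": "used hf_" + "a" * 24})

clean_findings = scan_for_secrets(clean)
leaky_findings = scan_for_secrets(leaky)
print(clean_findings, leaky_findings)
```

Scanning the text rather than the parsed object means a token hidden in a key name or a nested value is caught the same way as one in a top-level field.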

scripts/check_hf_uploaded_release.py ADDED

	@@ -0,0 +1,384 @@

+#!/usr/bin/env python3
+"""Verify uploaded Kaiju Coder 7 Hugging Face repos after private upload.
+The default mode is a dry run that prints the exact checks without downloading
+or reading auth tokens. Pass --apply after Hugging Face namespace permission and
+human review are complete. Private repos are verified through the existing HF
+CLI login; tokens are never accepted as arguments or printed.
+"""
+from __future__ import annotations
+import argparse
+import json
+import shutil
+import subprocess
+import sys
+import tempfile
+import urllib.error
+import urllib.request
+from dataclasses import asdict, dataclass
+from pathlib import Path
+from typing import Any
+MODEL_ID = "kaiju-coder-7"
+DEFAULT_NAMESPACE = "RMDWLLC"
+DEFAULT_BASE_URL = "http://100.109.109.14:18083/v1"
+@dataclass(frozen=True)
+class RepoSpec:
+    key: str
+    suffix: str
+    label: str
+    required_files: tuple[str, ...]
+    marker_files: tuple[tuple[str, tuple[str, ...]], ...]
+    def repo_id(self, namespace: str) -> str:
+        return f"{namespace}/{self.suffix}"
+@dataclass
+class Check:
+    name: str
+    status: str
+    detail: str
+REPOS: tuple[RepoSpec, ...] = (
+    RepoSpec(
+        key="adapter",
+        suffix="kaiju-coder-7-adapter",
+        label="adapter repo",
+        required_files=(
+            "README.md",
+            "adapter_config.json",
+            "adapter_model.safetensors",
+            "DATA_PROVENANCE_DRAFT.md",
+            "SOURCE_INVENTORY.md",
+            "EVAL_SCOREBOARD.md",
+            "SERVING_BENCHMARKS.md",
+            "PAID_API_READINESS.md",
+            "PUBLIC_TESTING_QUICKSTART.md",
+            "FINAL_RELEASE_REPORT.md",
+            "GOAL_COMPLETION_AUDIT.md",
+            "UPSTREAM_LICENSE_CHECK.md",
+            "upstream/qwen3.6-27b/LICENSE",
+            "scripts/check_hf_uploaded_release.py",
+            "scripts/check_hf_release_permission_evidence.py",
+        ),
+        marker_files=(
+            ("README.md", ("Kaiju Coder 7", MODEL_ID)),
+            ("PUBLIC_TESTING_QUICKSTART.md", ("Kaiju Coder 7 Public Testing Quickstart", MODEL_ID)),
+            ("FINAL_RELEASE_REPORT.md", ("Kaiju Coder 7 Final Release Report", "Public Launch Blockers")),
+        ),
+    ),
+    RepoSpec(
+        key="opencode",
+        suffix="kaiju-coder-7-opencode",
+        label="OpenCode helper repo",
+        required_files=(
+            "README.md",
+            "PUBLIC_TESTING_QUICKSTART.md",
+            "opencode.kaiju-coder-7.jsonc",
+            ".opencode/agents/kaiju-coder-7.md",
+            "scripts/install_kaiju_opencode_profile.py",
+            "scripts/opencode-kaiju-no-autocontinue.mjs",
+            "scripts/run_kaiju_public_opencode_smoke.py",
+            "scripts/run_kaiju_opencode_customer_pack.py",
+            "scripts/check_hf_uploaded_release.py",
+            "evals/tasks/opencode-customer-readiness.jsonl",
+        ),
+        marker_files=(
+            ("README.md", ("Kaiju Coder 7", "opencode -m kaiju/kaiju-coder-7")),
+            ("opencode.kaiju-coder-7.jsonc", (MODEL_ID, '"context": 16384')),
+            (".opencode/agents/kaiju-coder-7.md", ("You are Kaiju Coder 7", "Confirm the current working directory")),
+            ("scripts/opencode-kaiju-no-autocontinue.mjs", ("experimental.compaction.autocontinue", MODEL_ID)),
+        ),
+    ),
+    RepoSpec(
+        key="quantized-runtime",
+        suffix="kaiju-coder-7-quantized-runtime",
+        label="runtime quantization helper repo",
+        required_files=(
+            "README.md",
+            "PUBLIC_TESTING_QUICKSTART.md",
+            "scripts/start-qwen36-merged-vllm.sh",
+            "scripts/stop-qwen36-merged-vllm.sh",
+            "scripts/run-gojira-b-vllm-serving-benchmark.sh",
+        ),
+        marker_files=(
+            ("README.md", ("Runtime-Quantized Local Candidate", "bitsandbytes", "Kaiju Coder 7")),
+            ("PUBLIC_TESTING_QUICKSTART.md", ("Kaiju Coder 7 Public Testing Quickstart", MODEL_ID)),
+        ),
+    ),
+)
+def shell_join(args: list[str]) -> str:
+    import shlex
+    return " ".join(shlex.quote(arg) for arg in args)
+def run_command(args: list[str], *, cwd: Path | None = None, timeout: int) -> subprocess.CompletedProcess[str]:
+    return subprocess.run(
+        args,
+        cwd=cwd,
+        check=False,
+        text=True,
+        stdout=subprocess.PIPE,
+        stderr=subprocess.STDOUT,
+        timeout=timeout,
+    )
+def read_text(path: Path) -> str:
+    return path.read_text(encoding="utf-8", errors="replace")
+def selected_repos(args: argparse.Namespace) -> list[RepoSpec]:
+    skipped = {
+        "adapter": args.skip_adapter,
+        "opencode": args.skip_opencode,
+        "quantized-runtime": args.skip_quantized_runtime,
+    }
+    return [spec for spec in REPOS if not skipped[spec.key]]
+def add_dry_run_checks(checks: list[Check], repos: list[RepoSpec], namespace: str, download_dir: Path) -> None:
+    checks.append(Check("HF uploaded release mode", "manual", "dry run only; pass --apply to download and verify repos"))
+    for spec in repos:
+        target = download_dir / spec.suffix
+        command = ["hf", "download", spec.repo_id(namespace), "--repo-type", "model", "--local-dir", str(target)]
+        checks.append(Check(f"{spec.label} download command", "manual", shell_join(command)))
+def add_hf_cli_check(checks: list[Check], timeout: int) -> bool:
+    hf_bin = shutil.which("hf")
+    if not hf_bin:
+        checks.append(Check("HF CLI", "fail", "`hf` is not on PATH"))
+        return False
+    result = run_command([hf_bin, "auth", "whoami"], timeout=timeout)
+    if result.returncode == 0 and "user=" in result.stdout:
+        checks.append(Check("HF CLI", "pass", result.stdout.strip().replace("\n", "; ")))
+        return True
+    checks.append(Check("HF CLI", "fail", result.stdout.strip()[:800]))
+    return False
+def download_repo(checks: list[Check], spec: RepoSpec, namespace: str, download_root: Path, timeout: int) -> Path | None:
+    target = download_root / spec.suffix
+    target.mkdir(parents=True, exist_ok=True)
+    command = ["hf", "download", spec.repo_id(namespace), "--repo-type", "model", "--local-dir", str(target)]
+    result = run_command(command, timeout=timeout)
+    if result.returncode == 0:
+        checks.append(Check(f"{spec.label} download", "pass", f"{spec.repo_id(namespace)} downloaded to {target}"))
+        return target
+    checks.append(Check(f"{spec.label} download", "fail", result.stdout.strip()[-1200:]))
+    return None
+def check_required_files(checks: list[Check], spec: RepoSpec, root: Path) -> None:
+    missing = [name for name in spec.required_files if not (root / name).is_file()]
+    if missing:
+        checks.append(Check(f"{spec.label} required files", "fail", "missing: " + ", ".join(missing)))
+    else:
+        checks.append(Check(f"{spec.label} required files", "pass", f"{len(spec.required_files)} files present"))
+def check_markers(checks: list[Check], spec: RepoSpec, root: Path) -> None:
+    failures: list[str] = []
+    for file_name, markers in spec.marker_files:
+        path = root / file_name
+        if not path.is_file():
+            failures.append(f"{file_name} missing")
+            continue
+        text = read_text(path)
+        missing = [marker for marker in markers if marker not in text]
+        if missing:
+            failures.append(f"{file_name} missing {', '.join(missing)}")
+    if failures:
+        checks.append(Check(f"{spec.label} content markers", "fail", "; ".join(failures)))
+    else:
+        checks.append(Check(f"{spec.label} content markers", "pass", "expected Kaiju Coder 7 markers found"))
+def check_public_quickstart_naming(checks: list[Check], spec: RepoSpec, root: Path) -> None:
+    path = root / "PUBLIC_TESTING_QUICKSTART.md"
+    if not path.is_file():
+        return
+    lowered = read_text(path).lower()
+    forbidden = [term for term in ("qwen", "v1.8") if term in lowered]
+    if forbidden:
+        checks.append(Check(f"{spec.label} public naming hygiene", "fail", "contains: " + ", ".join(forbidden)))
+    else:
+        checks.append(Check(f"{spec.label} public naming hygiene", "pass", "public quickstart avoids internal upstream/checkpoint naming"))
+def check_opencode_installer(checks: list[Check], opencode_root: Path, timeout: int) -> None:
+    installer = opencode_root / "scripts/install_kaiju_opencode_profile.py"
+    if not installer.is_file():
+        checks.append(Check("uploaded OpenCode installer dry-run", "fail", f"missing {installer}"))
+        return
+    with tempfile.TemporaryDirectory(prefix="kaiju-uploaded-opencode-config-") as tmp:
+        result = run_command(
+            [sys.executable, str(installer), "--config-dir", tmp, "--dry-run"],
+            cwd=opencode_root,
+            timeout=timeout,
+        )
+    if result.returncode == 0 and "kaiju-no-autocontinue.mjs" in result.stdout and MODEL_ID in result.stdout:
+        checks.append(Check("uploaded OpenCode installer dry-run", "pass", "staged helper installs provider, agent, and loop guard"))
+    else:
+        checks.append(Check("uploaded OpenCode installer dry-run", "fail", result.stdout.strip()[:1000]))
+def run_opencode_smoke(checks: list[Check], opencode_root: Path, base_url: str, timeout: int) -> None:
+    script = opencode_root / "scripts/run_kaiju_public_opencode_smoke.py"
+    if not script.is_file():
+        checks.append(Check("uploaded OpenCode smoke", "fail", f"missing {script}"))
+        return
+    result = run_command([sys.executable, str(script), "--base-url", base_url, "--timeout", str(timeout)], cwd=opencode_root, timeout=timeout + 120)
+    if result.returncode == 0:
+        checks.append(Check("uploaded OpenCode smoke", "pass", "downloaded helper completed live public OpenCode smoke"))
+    else:
+        checks.append(Check("uploaded OpenCode smoke", "fail", result.stdout.strip()[-1200:]))
+def check_public_visibility(checks: list[Check], spec: RepoSpec, namespace: str, timeout: int) -> None:
+    repo_id = spec.repo_id(namespace)
+    url = f"https://huggingface.co/api/models/{repo_id}"
+    request = urllib.request.Request(url, headers={"User-Agent": "kaiju-coder-7-release-check"})
+    try:
+        with urllib.request.urlopen(request, timeout=timeout) as response:
+            if response.status == 200:
+                checks.append(Check(f"{spec.label} public visibility", "pass", f"{repo_id} is publicly readable"))
+                return
+            checks.append(Check(f"{spec.label} public visibility", "fail", f"{url} returned HTTP {response.status}"))
+    except urllib.error.HTTPError as exc:
+        checks.append(Check(f"{spec.label} public visibility", "fail", f"{url} returned HTTP {exc.code}"))
+    except Exception as exc:  # noqa: BLE001 - report network failures clearly.
+        checks.append(Check(f"{spec.label} public visibility", "fail", f"{url} failed: {exc!r}"))
+def verify_downloaded_repo(checks: list[Check], spec: RepoSpec, root: Path, *, installer_timeout: int) -> None:
+    check_required_files(checks, spec, root)
+    check_markers(checks, spec, root)
+    check_public_quickstart_naming(checks, spec, root)
+    if spec.key == "opencode":
+        check_opencode_installer(checks, root, timeout=installer_timeout)
+def summarize(checks: list[Check], *, applied: bool) -> dict[str, Any]:
+    return {
+        "ready": applied and not any(check.status in {"fail", "manual"} for check in checks),
+        "applied": applied,
+        "summary": {
+            "pass": sum(1 for check in checks if check.status == "pass"),
+            "fail": sum(1 for check in checks if check.status == "fail"),
+            "manual": sum(1 for check in checks if check.status == "manual"),
+        },
+        "checks": [asdict(check) for check in checks],
+    }
+def print_text(result: dict[str, Any]) -> None:
+    print(f"Kaiju Coder 7 uploaded HF release verification: ready={result['ready']} applied={result['applied']}")
+    print(
+        "Summary: "
+        f"{result['summary']['pass']} pass, "
+        f"{result['summary']['fail']} fail, "
+        f"{result['summary']['manual']} manual"
+    )
+    for check in result["checks"]:
+        print(f"[{check['status']}] {check['name']} - {check['detail']}")
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--namespace", default=DEFAULT_NAMESPACE)
+    parser.add_argument("--download-dir", type=Path, default=None)
+    parser.add_argument("--apply", action="store_true", help="Download uploaded repos and verify contents.")
+    parser.add_argument("--require-public", action="store_true", help="Require repos to be publicly readable without auth.")
+    parser.add_argument("--run-opencode-smoke", action="store_true", help="Run the downloaded OpenCode helper live smoke.")
+    parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
+    parser.add_argument("--download-timeout", type=int, default=900)
+    parser.add_argument("--installer-timeout", type=int, default=60)
+    parser.add_argument("--public-timeout", type=int, default=15)
+    parser.add_argument("--opencode-timeout", type=int, default=900)
+    parser.add_argument("--skip-adapter", action="store_true")
+    parser.add_argument("--skip-opencode", action="store_true")
+    parser.add_argument("--skip-quantized-runtime", action="store_true")
+    parser.add_argument("--json", action="store_true")
+    return parser.parse_args()
+def main() -> int:
+    args = parse_args()
+    repos = selected_repos(args)
+    checks: list[Check] = []
+    if not repos:
+        checks.append(Check("repo selection", "fail", "all repos were skipped"))
+        result = summarize(checks, applied=args.apply)
+        if args.json:
+            print(json.dumps(result, indent=2))
+        else:
+            print_text(result)
+        return 1
+    if args.download_dir:
+        download_root = args.download_dir
+        download_root.mkdir(parents=True, exist_ok=True)
+        temp_context: Any = None
+    else:
+        temp_context = tempfile.TemporaryDirectory(prefix="kaiju-hf-uploaded-")
+        download_root = Path(temp_context.name)
+    try:
+        if not args.apply:
+            add_dry_run_checks(checks, repos, args.namespace, download_root)
+            result = summarize(checks, applied=False)
+            if args.json:
+                print(json.dumps(result, indent=2))
+            else:
+                print_text(result)
+            return 0
+        if not add_hf_cli_check(checks, timeout=30):
+            result = summarize(checks, applied=True)
+            if args.json:
+                print(json.dumps(result, indent=2))
+            else:
+                print_text(result)
+            return 1
+        downloaded: dict[str, Path] = {}
+        for spec in repos:
+            if args.require_public:
+                check_public_visibility(checks, spec, args.namespace, timeout=args.public_timeout)
+            root = download_repo(checks, spec, args.namespace, download_root, timeout=args.download_timeout)
+            if root:
+                downloaded[spec.key] = root
+                verify_downloaded_repo(checks, spec, root, installer_timeout=args.installer_timeout)
+        if args.run_opencode_smoke:
+            opencode_root = downloaded.get("opencode")
+            if opencode_root:
+                run_opencode_smoke(checks, opencode_root, base_url=args.base_url, timeout=args.opencode_timeout)
+            else:
+                checks.append(Check("uploaded OpenCode smoke", "fail", "OpenCode helper repo was not downloaded"))
+        result = summarize(checks, applied=True)
+        if args.json:
+            print(json.dumps(result, indent=2))
+        else:
+            print_text(result)
+        return 0 if result["ready"] else 1
+    finally:
+        if temp_context is not None:
+            temp_context.cleanup()
+if __name__ == "__main__":
+    raise SystemExit(main())
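
Reviewer note: the content-marker check in this script is a plain case-sensitive substring search over each downloaded file. The sketch below reproduces that logic against a throwaway directory so the failure strings are easy to see. The `missing_markers` helper name, the `QUICKSTART.md` file name, and the sample file contents are assumptions made for illustration; only the matching rule mirrors `check_markers`.

```python
import tempfile
from pathlib import Path


def missing_markers(root: Path, marker_files) -> list[str]:
    """Collect one failure string per file that is absent or lacks a marker."""
    failures = []
    for file_name, markers in marker_files:
        path = root / file_name
        if not path.is_file():
            failures.append(f"{file_name} missing")
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        missing = [marker for marker in markers if marker not in text]
        if missing:
            failures.append(f"{file_name} missing {', '.join(missing)}")
    return failures


# Hypothetical repo layout: a README with the display name but not the model id,
# and a second file that was never uploaded.
spec = (
    ("README.md", ("Kaiju Coder 7", "kaiju-coder-7")),
    ("QUICKSTART.md", ("Kaiju Coder 7",)),
)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "README.md").write_text("# Kaiju Coder 7\n", encoding="utf-8")
    failures = missing_markers(root, spec)

print(failures)
```

Because the match is case-sensitive, a README that only says "Kaiju Coder 7" does not satisfy the `kaiju-coder-7` model-id marker; both strings have to appear.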

scripts/check_paid_api_readiness.py ADDED

	@@ -0,0 +1,518 @@

+#!/usr/bin/env python3
+"""Check Kaiju Coder 7 paid API readiness without reading secrets.
+The scaffold mode should pass for the local Worker implementation. The launch
+mode is intentionally stricter and should fail until real Cloudflare bindings,
+Stripe webhook evidence, staging requests, and rollback proof are attached.
+"""
+from __future__ import annotations
+import argparse
+import json
+import re
+import sys
+from dataclasses import asdict, dataclass
+from pathlib import Path
+from typing import Any
+ROOT = Path(__file__).resolve().parents[1]
+WORKER = ROOT / "gateway/cloudflare-worker"
+WRANGLER = WORKER / "wrangler.jsonc"
+SOURCE = WORKER / "src/index.js"
+TESTS = WORKER / "test/index.test.js"
+MIGRATION = WORKER / "migrations/0001_paid_api.sql"
+PACKAGE = WORKER / "package.json"
+PAID_DOC = ROOT / "release/PAID_API_READINESS.md"
+RESOURCE_SCRIPT = ROOT / "scripts/prepare_paid_api_cloudflare_resources.sh"
+DEFAULT_EVIDENCE = ROOT / "release/paid-api-launch-evidence.json"
+EVIDENCE_EXAMPLE = ROOT / "release/paid-api-launch-evidence.example.json"
+CF_BINDINGS_EXAMPLE = ROOT / "release/cloudflare-bindings.example.json"
+SECRET_PATTERNS = [
+    ("openai_api_key", re.compile(r"\bsk-[A-Za-z0-9][A-Za-z0-9_-]{20,}\b")),
+    ("anthropic_api_key", re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}\b")),
+    ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
+    ("stripe_webhook_secret", re.compile(r"\bwhsec_[A-Za-z0-9]{16,}\b")),
+    ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
+    ("github_token", re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{22,})\b")),
+    ("google_api_key", re.compile(r"\bAIza[0-9A-Za-z_-]{20,}\b")),
+    ("private_key_block", re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC |DSA )?PRIVATE KEY-----")),
+    ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{24,}={0,2}\b", re.IGNORECASE)),
+]
+@dataclass
+class Check:
+    name: str
+    status: str
+    detail: str
+def file_text(path: Path) -> str:
+    return path.read_text(encoding="utf-8")
+def strip_jsonc(text: str) -> str:
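+    # Strip /* ... */ block comments. This pass is not string-aware, so a "/*" inside a
+    # string value would be removed too; wrangler.jsonc is assumed not to contain one.
+    text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)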
+    lines = []
+    for line in text.splitlines():
+        in_string = False
+        escaped = False
+        output = []
+        index = 0
+        while index < len(line):
+            char = line[index]
+            nxt = line[index + 1] if index + 1 < len(line) else ""
+            if char == "\\" and in_string:
+                escaped = not escaped
+                output.append(char)
+            elif char == '"' and not escaped:
+                in_string = not in_string
+                output.append(char)
+            elif char == "/" and nxt == "/" and not in_string:
+                break
+            else:
+                escaped = False
+                output.append(char)
+            index += 1
+        lines.append("".join(output))
+    stripped = "\n".join(lines)
+    # Drop trailing commas before a closing brace or bracket (also not string-aware).
+    return re.sub(r",\s*([}\]])", r"\1", stripped)
+def load_wrangler(path: Path = WRANGLER) -> dict[str, Any]:
+    return json.loads(strip_jsonc(file_text(path)))
+def has_real_binding(bindings: list[dict[str, Any]], binding_name: str, id_field: str) -> bool:
+    for binding in bindings:
+        if binding.get("binding") != binding_name:
+            continue
+        value = str(binding.get(id_field, "")).strip()
+        return bool(value) and not value.startswith("replace_with_")
+    return False
+def load_launch_evidence(path: Path) -> tuple[dict[str, Any], Check | None]:
+    if not path.is_file():
+        return {}, None
+    text = file_text(path)
+    findings = [label for label, pattern in SECRET_PATTERNS if pattern.search(text)]
+    if findings:
+        return {}, Check(
+            "paid API launch evidence file",
+            "fail",
+            f"{path} appears to contain secret-looking values: {', '.join(sorted(set(findings)))}",
+        )
+    try:
+        evidence = json.loads(text)
+    except json.JSONDecodeError as exc:
+        return {}, Check("paid API launch evidence file", "fail", f"{path} is invalid JSON: {exc}")
+    if not isinstance(evidence, dict):
+        return {}, Check("paid API launch evidence file", "fail", f"{path} must contain a JSON object")
+    return evidence, Check("paid API launch evidence file", "pass", f"loaded sanitized launch evidence from {path}")
+def evidence_item(
+    checks: list[Check],
+    evidence: dict[str, Any],
+    path: Path,
+    key: str,
+    name: str,
+    required_fields: list[str],
+    validator: Any | None = None,
+) -> None:
+    item = evidence.get(key)
+    if not item:
+        checks.append(Check(name, "manual", f"attach sanitized evidence in {path} key `{key}`"))
+        return
+    if not isinstance(item, dict):
+        checks.append(Check(name, "fail", f"{path} key `{key}` must be an object"))
+        return
+    if item.get("status") != "pass":
+        checks.append(Check(name, "manual", f"{path} key `{key}` status is not pass"))
+        return
+    missing = [field for field in required_fields if item.get(field) in (None, "", [])]
+    if missing:
+        checks.append(Check(name, "manual", f"{path} key `{key}` missing fields: {', '.join(missing)}"))
+        return
+    if validator:
+        validation = validator(item)
+        if validation:
+            checks.append(Check(name, validation[0], validation[1]))
+            return
+    checks.append(Check(name, "pass", f"{path} key `{key}` has required sanitized evidence"))
+def validate_public_route_mode(item: dict[str, Any]) -> tuple[str, str] | None:
+    if item.get("exposure_mode") != "custom_domain":
+        return ("manual", "public route evidence must use exposure_mode=custom_domain before paid launch")
+    route = str(item.get("route", ""))
+    if not route.startswith("https://"):
+        return ("manual", "public route evidence route must be an https URL")
+    return None
+def validate_secrets_verified(item: dict[str, Any]) -> tuple[str, str] | None:
+    required = {"KAIJU_ORIGIN_URL", "KAIJU_ORIGIN_SECRET", "KAIJU_STRIPE_WEBHOOK_SECRET"}
+    observed = set(item.get("observed_names") or [])
+    missing = sorted(required - observed)
+    if missing:
+        return ("manual", "secret-name evidence missing: " + ", ".join(missing))
+    return None
+def validate_d1_migration(item: dict[str, Any]) -> tuple[str, str] | None:
+    if item.get("migration") != "0001_paid_api.sql":
+        return ("manual", "D1 migration evidence must name 0001_paid_api.sql")
+    if item.get("result") not in {"success", "already_applied"}:
+        return ("manual", "D1 migration result must be success or already_applied")
+    return None
+def validate_stripe_staging(item: dict[str, Any]) -> tuple[str, str] | None:
+    if item.get("webhook_event") != "checkout.session.completed":
+        return ("manual", "Stripe evidence must include checkout.session.completed")
+    if item.get("idempotency_checked") is not True:
+        return ("manual", "Stripe evidence must confirm duplicate webhook idempotency")
+    return None
+def validate_staging_request(item: dict[str, Any]) -> tuple[str, str] | None:
+    if item.get("model") != "kaiju-coder-7":
+        return ("fail", "staging request evidence must use model=kaiju-coder-7")
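+    # Treat a non-numeric http_status as missing instead of raising ValueError.
+    try:
+        http_status = int(item.get("http_status") or 0)
+    except (TypeError, ValueError):
+        http_status = 0
+    if http_status != 200: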
+        return ("manual", "staging request evidence must show HTTP 200")
+    if item.get("streamed") is not True:
+        return ("manual", "staging request evidence must confirm streaming")
+    return None
+def validate_rollback(item: dict[str, Any]) -> tuple[str, str] | None:
+    if item.get("result") != "success":
+        return ("manual", "rollback evidence must be a successful exercised rollback or route switch")
+    return None
+def validate_latency(item: dict[str, Any]) -> tuple[str, str] | None:
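+    # Non-numeric evidence values are reported as a manual follow-up instead of crashing.
+    try:
+        p95_ms = float(item.get("p95_ms") or 0)
+        sample_count = int(item.get("sample_count") or 0)
+        max_acceptable_ms = float(item.get("max_acceptable_ms") or 0)
+    except (TypeError, ValueError):
+        return ("manual", "latency evidence fields must be numeric")
+    if sample_count < 5:
+        return ("manual", "latency evidence needs at least 5 staging samples")
+    if max_acceptable_ms <= 0:
+        return ("manual", "latency evidence must set max_acceptable_ms")
+    if p95_ms <= 0:
+        return ("manual", "latency evidence must set a positive p95_ms")
+    if p95_ms > max_acceptable_ms:
+        return ("manual", f"p95_ms={p95_ms:g} exceeds max_acceptable_ms={max_acceptable_ms:g}")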
+    return None
+def add_marker_check(checks: list[Check], name: str, text: str, markers: list[str], path: Path) -> None:
+    missing = [marker for marker in markers if marker not in text]
+    if missing:
+        checks.append(Check(name, "fail", f"{path} missing markers: {', '.join(missing)}"))
+    else:
+        checks.append(Check(name, "pass", f"{path} contains required markers"))
+def scaffold_checks(wrangler_path: Path = WRANGLER) -> list[Check]:
+    checks: list[Check] = []
+    source = file_text(SOURCE)
+    tests = file_text(TESTS)
+    migration = file_text(MIGRATION)
+    package = json.loads(file_text(PACKAGE))
+    paid_doc = file_text(PAID_DOC)
+    resource_script = file_text(RESOURCE_SCRIPT)
+    wrangler = load_wrangler(wrangler_path)
+    add_marker_check(
+        checks,
+        "model id enforcement",
+        source,
+        [
+            'const DEFAULT_MODEL_ID = "kaiju-coder-7"',
+            "Unsupported model. Use",
+            "payload.model = modelId",
+        ],
+        SOURCE,
+    )
+    add_marker_check(
+        checks,
+        "streaming and thinking controls",
+        source,
+        ["payload.stream = true", "enable_thinking: false", "thinking: false", "streamHeaders"],
+        SOURCE,
+    )
+    add_marker_check(
+        checks,
+        "billing and debit/refund controls",
+        source,
+        ["KAIJU_BILLING_DB", "reserveCredit", "refundCredit", "markUsageDebited"],
+        SOURCE,
+    )
+    add_marker_check(
+        checks,
+        "rate limit controls",
+        source,
+        ["KAIJU_RATE_LIMIT_KV", "rateLimit", "Rate limit exceeded"],
+        SOURCE,
+    )
+    add_marker_check(
+        checks,
+        "secret-like prompt rejection",
+        source,
+        ["SECRET_PATTERNS", "secret_like_content", "Remove them before using Kaiju Coder 7"],
+        SOURCE,
+    )
+    add_marker_check(
+        checks,
+        "stripe top-up webhook",
+        source,
+        ["verifyStripeSignature", "checkout.session.completed", "stripe_topup_credited"],
+        SOURCE,
+    )
+    add_marker_check(
+        checks,
+        "artifact route controls",
+        source,
+        [
+            "KAIJU_ARTIFACT_BUCKET",
+            "uploadArtifact",
+            "downloadArtifact",
+            "artifact_stored",
+            "Artifact appears to contain secrets or credentials",
+        ],
+        SOURCE,
+    )
+    add_marker_check(
+        checks,
+        "paid API tests",
+        tests,
+        [
+            "rejects inactive paid API keys",
+            "rejects paid API requests with insufficient credits before origin fetch",
+            "rate limits authenticated paid API keys before debit",
+            "credits paid API balance from signed Stripe checkout webhook",
+            "rejects secret-looking prompt content before debit",
+        ],
+        TESTS,
+    )
+    add_marker_check(
+        checks,
+        "artifact route tests",
+        tests,
+        [
+            "stores origin-uploaded artifacts in the account/request R2 namespace",
+            "serves customer artifacts only through the authenticated account namespace",
+            "rejects unsafe artifact paths before R2 storage",
+            "rejects secret-looking artifact content before R2 storage",
+        ],
+        TESTS,
+    )
+    add_marker_check(
+        checks,
+        "D1 schema",
+        migration,
+        ["kaiju_api_keys", "kaiju_credit_ledger", "kaiju_usage_events"],
+        MIGRATION,
+    )
+    add_marker_check(
+        checks,
+        "paid readiness docs",
+        paid_doc,
+        [
+            "Do not sell the hosted API",
+            "Harnessed customer-readiness pack",
+            "Raw OpenCode multi-file pack remains a blocker",
+        ],
+        PAID_DOC,
+    )
+    if (
+        package.get("scripts", {}).get("check")
+        == "node --check src/index.js && node --check scripts/create-api-key.mjs && npm test && npm run check:deploy"
+        and package.get("scripts", {}).get("check:deploy") == "npx wrangler deploy --dry-run"
+    ):
+        checks.append(Check("gateway check command", "pass", "npm run check covers syntax, Worker tests, and Wrangler dry-run deploy"))
+    else:
+        checks.append(Check("gateway check command", "fail", "package.json check/check:deploy scripts changed or missing"))
+    if package.get("scripts", {}).get("prepare:cloudflare") == "bash ../../scripts/prepare_paid_api_cloudflare_resources.sh":
+        checks.append(Check("Cloudflare resource prep command", "pass", "npm run prepare:cloudflare is wired"))
+    else:
+        checks.append(Check("Cloudflare resource prep command", "fail", "package.json prepare:cloudflare script is missing"))
+    add_marker_check(
+        checks,
+        "Cloudflare resource prep script",
+        resource_script,
+        [
+            "KAIJU_CF_RESOURCE_APPLY",
+            "wrangler d1 create",
+            "wrangler kv namespace create",
+            "wrangler r2 bucket create",
+            "wrangler d1 migrations apply",
+            "wrangler rollback",
+            "preflight:launch",
+        ],
+        RESOURCE_SCRIPT,
+    )
+    if EVIDENCE_EXAMPLE.is_file():
+        checks.append(Check("paid API launch evidence template", "pass", f"{EVIDENCE_EXAMPLE} exists"))
+    else:
+        checks.append(Check("paid API launch evidence template", "fail", f"missing {EVIDENCE_EXAMPLE}"))
+    if CF_BINDINGS_EXAMPLE.is_file():
+        checks.append(Check("Cloudflare bindings template", "pass", f"{CF_BINDINGS_EXAMPLE} exists"))
+    else:
+        checks.append(Check("Cloudflare bindings template", "fail", f"missing {CF_BINDINGS_EXAMPLE}"))
+    if wrangler.get("name") == "kaiju-api-gateway" and wrangler.get("main") == "src/index.js":
+        checks.append(Check("wrangler scaffold config", "pass", "Worker name and entrypoint are present"))
+    else:
+        checks.append(Check("wrangler scaffold config", "fail", "wrangler name or entrypoint is missing"))
+    return checks
+def launch_checks(evidence_path: Path, wrangler_path: Path = WRANGLER) -> list[Check]:
+    checks = scaffold_checks(wrangler_path)
+    wrangler = load_wrangler(wrangler_path)
+    evidence, evidence_check = load_launch_evidence(evidence_path)
+    if evidence_check and evidence_check.status == "fail":
+        checks.append(evidence_check)
+    if has_real_binding(wrangler.get("d1_databases", []), "KAIJU_BILLING_DB", "database_id"):
+        checks.append(Check("live D1 binding", "pass", "KAIJU_BILLING_DB has a non-placeholder database_id"))
+    else:
+        checks.append(Check("live D1 binding", "fail", "KAIJU_BILLING_DB is missing or still placeholder/commented"))
+    if has_real_binding(wrangler.get("kv_namespaces", []), "KAIJU_RATE_LIMIT_KV", "id"):
+        checks.append(Check("live KV binding", "pass", "KAIJU_RATE_LIMIT_KV has a non-placeholder id"))
+    else:
+        checks.append(Check("live KV binding", "fail", "KAIJU_RATE_LIMIT_KV is missing or still placeholder/commented"))
+    if has_real_binding(wrangler.get("r2_buckets", []), "KAIJU_ARTIFACT_BUCKET", "bucket_name"):
+        checks.append(Check("artifact R2 binding", "pass", "KAIJU_ARTIFACT_BUCKET is configured"))
+    else:
+        checks.append(Check("artifact R2 binding", "fail", "KAIJU_ARTIFACT_BUCKET is missing; artifact routes cannot launch"))
+    if wrangler.get("workers_dev") is False:
+        checks.append(Check("public route mode", "pass", "workers_dev is disabled for custom-domain launch"))
+    else:
+        evidence_item(
+            checks,
+            evidence,
+            evidence_path,
+            "public_route_mode",
+            "public route mode",
+            ["checked_at", "exposure_mode", "route", "result"],
+            validate_public_route_mode,
+        )
+    evidence_item(
+        checks,
+        evidence,
+        evidence_path,
+        "wrangler_secrets_verified",
+        "wrangler secret list confirms KAIJU_ORIGIN_URL, KAIJU_ORIGIN_SECRET, and KAIJU_STRIPE_WEBHOOK_SECRET",
+        ["checked_at", "command", "observed_names"],
+        validate_secrets_verified,
+    )
+    evidence_item(
+        checks,
+        evidence,
+        evidence_path,
+        "d1_migration_applied",
+        "D1 migration 0001_paid_api.sql applied to the live billing database",
+        ["checked_at", "command", "migration", "result"],
+        validate_d1_migration,
+    )
+    evidence_item(
+        checks,
+        evidence,
+        evidence_path,
+        "stripe_checkout_topup_staging",
+        "Stripe Checkout top-up products and webhook endpoint tested with metadata.kaiju_api_key_id",
+        ["checked_at", "mode", "webhook_event", "credited_api_key_id", "idempotency_checked"],
+        validate_stripe_staging,
+    )
+    evidence_item(
+        checks,
+        evidence,
+        evidence_path,
+        "worker_to_gojira_staging_request",
+        "staging request passed through Worker to Gojira-B origin with model=kaiju-coder-7",
+        ["checked_at", "route", "model", "http_status", "streamed", "request_id"],
+        validate_staging_request,
+    )
+    evidence_item(
+        checks,
+        evidence,
+        evidence_path,
+        "rollback_exercised",
+        "rollback command or route switch was exercised and recorded",
+        ["checked_at", "command", "result"],
+        validate_rollback,
+    )
+    evidence_item(
+        checks,
+        evidence,
+        evidence_path,
+        "paid_route_latency",
+        "p95 latency for paid routes is recorded after staging traffic",
+        ["checked_at", "route", "sample_count", "p95_ms", "max_acceptable_ms"],
+        validate_latency,
+    )
+    return checks
+def summarize(checks: list[Check], mode: str) -> dict[str, Any]:
+    hard_fail = any(check.status == "fail" for check in checks)
+    manual = any(check.status == "manual" for check in checks)
+    ready = not hard_fail and (mode == "scaffold" or not manual)
+    return {
+        "mode": mode,
+        "ready": ready,
+        "summary": {
+            "pass": sum(1 for check in checks if check.status == "pass"),
+            "fail": sum(1 for check in checks if check.status == "fail"),
+            "manual": sum(1 for check in checks if check.status == "manual"),
+        },
+        "checks": [asdict(check) for check in checks],
+    }
+def print_text(result: dict[str, Any]) -> None:
+    print(f"Kaiju Coder 7 paid API readiness: mode={result['mode']} ready={result['ready']}")
+    print(
+        "Summary: "
+        f"{result['summary']['pass']} pass, "
+        f"{result['summary']['fail']} fail, "
+        f"{result['summary']['manual']} manual"
+    )
+    for check in result["checks"]:
+        print(f"[{check['status']}] {check['name']} - {check['detail']}")
+def main() -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--mode", choices=["scaffold", "launch"], default="scaffold")
+    parser.add_argument("--evidence-file", type=Path, default=DEFAULT_EVIDENCE)
+    parser.add_argument("--wrangler-config", type=Path, default=WRANGLER)
+    parser.add_argument("--json", action="store_true", help="Print machine-readable JSON.")
+    args = parser.parse_args()
+    checks = scaffold_checks(args.wrangler_config) if args.mode == "scaffold" else launch_checks(args.evidence_file, args.wrangler_config)
+    result = summarize(checks, args.mode)
+    if args.json:
+        print(json.dumps(result, indent=2))
+    else:
+        print_text(result)
+    return 0 if result["ready"] else 1
+if __name__ == "__main__":
+    raise SystemExit(main())
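
The readiness rule in `summarize` above can be checked in isolation. The following is a minimal sketch, not the script itself: `Check` is a stand-in dataclass (the real definition sits outside this hunk, and its `name`, `status`, `detail` fields are inferred from `print_text`), and `is_ready` is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass
class Check:
    # Stand-in for the script's Check type; field names are inferred, not confirmed.
    name: str
    status: str  # "pass", "fail", or "manual"
    detail: str


def is_ready(checks: list[Check], mode: str) -> bool:
    # Same rule as summarize(): any "fail" blocks readiness in every mode,
    # while "manual" items only block readiness outside scaffold mode.
    hard_fail = any(check.status == "fail" for check in checks)
    manual = any(check.status == "manual" for check in checks)
    return not hard_fail and (mode == "scaffold" or not manual)


checks = [Check("a", "pass", "ok"), Check("b", "manual", "needs evidence")]
print(is_ready(checks, "scaffold"))
print(is_ready(checks, "launch"))
```

Under this rule a check left at `manual` is tolerated in `scaffold` mode but keeps `ready` false in `launch` mode, so the process exit code stays 1 until every item passes.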

scripts/collect_hf_release_permission_evidence.py ADDED

	@@ -0,0 +1,156 @@

+#!/usr/bin/env python3
+"""Create sanitized Hugging Face repo-create permission evidence.
+This helper never reads or writes Hugging Face tokens. In apply mode it runs a
+private model repo-create probe for the intended namespace, then writes only
+the sanitized facts required by check_hf_release_permission_evidence.py.
+"""
+from __future__ import annotations
+import argparse
+import json
+import shutil
+import subprocess
+import sys
+import tempfile
+from datetime import datetime, timezone
+from pathlib import Path
+ROOT = Path(__file__).resolve().parents[1]
+DEFAULT_OUT = ROOT / "release/hf-release-permission-evidence.json"
+DEFAULT_NAMESPACE = "RMDWLLC"
+MODEL_ID = "kaiju-coder-7"
+def run(args: list[str]) -> subprocess.CompletedProcess[str]:
+    return subprocess.run(
+        args,
+        cwd=ROOT,
+        check=False,
+        text=True,
+        stdout=subprocess.PIPE,
+        stderr=subprocess.STDOUT,
+    )
+def parse_whoami(text: str) -> str:
+    for part in text.replace("\n", " ").split():
+        if part.startswith("user="):
+            return part.split("=", 1)[1].strip()
+    first = text.strip().splitlines()[0].strip() if text.strip() else ""
+    return first.split()[0] if first else ""
+def validate_payload(payload: dict[str, str]) -> None:
+    with tempfile.TemporaryDirectory() as tmp:
+        evidence_path = Path(tmp) / "hf-release-permission-evidence.json"
+        evidence_path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
+        result = run(
+            [
+                sys.executable,
+                str(ROOT / "scripts/check_hf_release_permission_evidence.py"),
+                "--evidence-file",
+                str(evidence_path),
+                "--json",
+            ]
+        )
+    if result.returncode != 0:
+        raise RuntimeError("generated evidence did not validate:\n" + result.stdout)
+def build_payload(namespace: str, user: str, probe_repo: str, command: str) -> dict[str, str]:
+    return {
+        "status": "pass",
+        "checked_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
+        "namespace": namespace,
+        "authenticated_user": user,
+        "probe_repo": probe_repo,
+        "command": command,
+        "result": "private model repo creation succeeded",
+    }
+def print_json(payload: dict[str, object]) -> None:
+    print(json.dumps(payload, indent=2))
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--namespace", default=DEFAULT_NAMESPACE)
+    parser.add_argument("--out", type=Path, default=DEFAULT_OUT)
+    parser.add_argument("--apply", action="store_true", help="create the private permission probe repo")
+    parser.add_argument("--write", action="store_true", help="write the sanitized evidence file after the probe passes")
+    parser.add_argument("--json", action="store_true")
+    return parser.parse_args()
+def main() -> int:
+    args = parse_args()
+    if shutil.which("hf") is None:
+        print("Missing Hugging Face CLI: hf", file=sys.stderr)
+        print("Install: curl -LsSf https://hf.co/cli/install.sh | bash -s", file=sys.stderr)
+        return 2
+    whoami = run(["hf", "auth", "whoami"])
+    if whoami.returncode != 0:
+        print("hf auth whoami failed. Run `hf auth login` with a write-capable token.", file=sys.stderr)
+        print(whoami.stdout.strip(), file=sys.stderr)
+        return whoami.returncode or 1
+    user = parse_whoami(whoami.stdout)
+    if not user:
+        print("Could not parse authenticated Hugging Face username from `hf auth whoami`.", file=sys.stderr)
+        return 2
+    probe_repo = f"{args.namespace}/{MODEL_ID}-permission-probe"
+    command_args = ["hf", "repos", "create", probe_repo, "--type", "model", "--private", "--exist-ok"]
+    command_text = " ".join(command_args)
+    if not args.apply:
+        preview = {
+            "ready": False,
+            "authenticated_user": user,
+            "namespace": args.namespace,
+            "probe_repo": probe_repo,
+            "next_command": command_text,
+            "detail": "dry run only; pass --apply --write after the intended namespace/token is active",
+        }
+        if args.json:
+            print_json(preview)
+        else:
+            print("Dry run. No repo was created and no evidence file was written.")
+            print(f"Authenticated user: {user}")
+            print(f"Namespace: {args.namespace}")
+            print(f"Probe command: {command_text}")
+            print(f"Write evidence after a successful probe: {args.out}")
+        return 0
+    probe = run(command_args)
+    if probe.returncode != 0:
+        print("Hugging Face private repo-create permission probe failed.", file=sys.stderr)
+        print(probe.stdout.strip(), file=sys.stderr)
+        return probe.returncode or 1
+    payload = build_payload(args.namespace, user, probe_repo, command_text)
+    validate_payload(payload)
+    if args.write:
+        args.out.parent.mkdir(parents=True, exist_ok=True)
+        args.out.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
+        if args.json:
+            print_json({"ready": True, "evidence_file": str(args.out), "evidence": payload})
+        else:
+            print(f"Wrote sanitized Hugging Face permission evidence: {args.out}")
+    else:
+        if args.json:
+            print_json({"ready": True, "evidence_file": None, "evidence": payload})
+        else:
+            print("Permission probe passed. Preview evidence:")
+            print(json.dumps(payload, indent=2))
+            print(f"Pass --write to write: {args.out}")
+    return 0
+if __name__ == "__main__":
+    raise SystemExit(main())
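
To illustrate the two output shapes `parse_whoami` accepts, here is a small self-contained sketch. The function body is copied from the diff above; the sample strings are made up for illustration and are not claimed to match real `hf auth whoami` output.

```python
def parse_whoami(text: str) -> str:
    # Prefer an explicit "user=<name>" token anywhere in the output.
    for part in text.replace("\n", " ").split():
        if part.startswith("user="):
            return part.split("=", 1)[1].strip()
    # Otherwise fall back to the first word of the first line.
    first = text.strip().splitlines()[0].strip() if text.strip() else ""
    return first.split()[0] if first else ""


# Hypothetical sample outputs, including the empty case.
samples = [
    "user=example-user orgs=example-org",
    "example-user\norgs: example-org",
    "",
]
for sample in samples:
    print(repr(parse_whoami(sample)))
```

An empty result is what makes `main` exit with code 2 before any probe repo is created.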

scripts/collect_paid_api_launch_evidence.py ADDED

	@@ -0,0 +1,286 @@

+#!/usr/bin/env python3
+"""Collect sanitized Kaiju Coder 7 paid API launch evidence.
+This script helps fill release/paid-api-launch-evidence.json without storing
+API keys, secret values, full prompts, or model responses. It is preview-only by
+default; pass --write to update the evidence file.
+"""
+from __future__ import annotations
+import argparse
+import json
+import os
+import re
+import statistics
+import sys
+import time
+import urllib.error
+import urllib.request
+import uuid
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any
+ROOT = Path(__file__).resolve().parents[1]
+DEFAULT_OUT = ROOT / "release/paid-api-launch-evidence.json"
+MODEL_ID = "kaiju-coder-7"
+DEFAULT_ROUTE = "/v1/chat/completions"
+SECRET_PATTERNS = [
+    ("openai_api_key", re.compile(r"\bsk-[A-Za-z0-9][A-Za-z0-9_-]{20,}\b")),
+    ("anthropic_api_key", re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}\b")),
+    ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
+    ("stripe_webhook_secret", re.compile(r"\bwhsec_[A-Za-z0-9]{16,}\b")),
+    ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
+    ("github_token", re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{22,})\b")),
+    ("google_api_key", re.compile(r"\bAIza[0-9A-Za-z_-]{20,}\b")),
+    ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{24,}={0,2}\b", re.IGNORECASE)),
+    ("private_key_block", re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC |DSA )?PRIVATE KEY-----")),
+]
+def utc_now() -> str:
+    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
+def load_existing(path: Path) -> dict[str, Any]:
+    if not path.is_file():
+        return {}
+    return json.loads(path.read_text(encoding="utf-8"))
+def secret_findings(text: str) -> list[str]:
+    return sorted({label for label, pattern in SECRET_PATTERNS if pattern.search(text)})
+def assert_sanitized(payload: dict[str, Any]) -> None:
+    rendered = json.dumps(payload, sort_keys=True)
+    findings = secret_findings(rendered)
+    if findings:
+        raise SystemExit("Refusing to write secret-looking evidence: " + ", ".join(findings))
+def api_url(base_url: str, path: str) -> str:
+    return base_url.rstrip("/") + path
+def request_json(url: str, payload: dict[str, Any], api_key: str, request_id: str, timeout: int) -> tuple[int, str, float]:
+    body = json.dumps(payload).encode("utf-8")
+    request = urllib.request.Request(
+        url,
+        data=body,
+        method="POST",
+        headers={
+            "authorization": f"Bearer {api_key}",
+            "content-type": "application/json",
+            "x-request-id": request_id,
+        },
+    )
+    start = time.perf_counter()
+    try:
+        with urllib.request.urlopen(request, timeout=timeout) as response:
+            content_type = response.headers.get("content-type", "")
+            response.read()
+            return response.status, content_type, (time.perf_counter() - start) * 1000
+    except urllib.error.HTTPError as exc:
+        exc.read()
+        return exc.code, exc.headers.get("content-type", ""), (time.perf_counter() - start) * 1000
+def probe_health(base_url: str, timeout: int) -> tuple[int, float] | None:
+    start = time.perf_counter()
+    try:
+        with urllib.request.urlopen(api_url(base_url, "/health"), timeout=timeout) as response:
+            response.read()
+            return response.status, (time.perf_counter() - start) * 1000
+    except Exception:
+        return None
+def percentile_95(values: list[float]) -> float:
+    if len(values) == 1:
+        return values[0]
+    try:
+        return statistics.quantiles(values, n=20, method="inclusive")[18]
+    except statistics.StatisticsError:
+        return max(values)
+def run_staging_samples(args: argparse.Namespace) -> tuple[dict[str, Any] | None, dict[str, Any] | None]:
+    if args.skip_live_request:
+        return None, None
+    if not args.api_base_url:
+        raise SystemExit("--api-base-url is required unless --skip-live-request is set")
+    api_key = os.environ.get(args.api_key_env)
+    if not api_key:
+        raise SystemExit(f"{args.api_key_env} is not set; refusing to read API keys from arguments")
+    latencies: list[float] = []
+    first_request_id = ""
+    first_status = 0
+    first_streamed = False
+    url = api_url(args.api_base_url, DEFAULT_ROUTE)
+    sample_count = max(args.live_samples, 1)
+    for index in range(sample_count):
+        request_id = f"kaiju-paid-staging-{uuid.uuid4()}"
+        payload = {
+            "model": MODEL_ID,
+            "stream": True,
+            "max_tokens": 48,
+            "messages": [
+                {
+                    "role": "user",
+                    "content": "Return a short Kaiju Coder 7 paid API staging smoke response.",
+                }
+            ],
+        }
+        status, content_type, latency_ms = request_json(url, payload, api_key, request_id, args.timeout)
+        if index == 0:
+            first_request_id = request_id
+            first_status = status
+            first_streamed = "event-stream" in content_type.lower()
+        if status == 200:
+            latencies.append(latency_ms)
+    request_evidence = {
+        "status": "pass" if first_status == 200 and first_streamed else "pending",
+        "checked_at": utc_now(),
+        "route": DEFAULT_ROUTE,
+        "model": MODEL_ID,
+        "http_status": first_status,
+        "streamed": first_streamed,
+        "request_id": first_request_id,
+    }
+    latency_evidence = {
+        "status": "pass" if len(latencies) >= 5 and percentile_95(latencies) <= args.max_acceptable_ms else "pending",
+        "checked_at": utc_now(),
+        "route": DEFAULT_ROUTE,
+        "sample_count": len(latencies),
+        "p95_ms": round(percentile_95(latencies), 2) if latencies else 0,
+        "max_acceptable_ms": args.max_acceptable_ms,
+    }
+    return request_evidence, latency_evidence
+def add_optional_manual_evidence(evidence: dict[str, Any], args: argparse.Namespace) -> None:
+    checked_at = args.checked_at or utc_now()
+    if args.public_route_ok:
+        health = probe_health(args.api_base_url, args.timeout) if args.api_base_url else None
+        evidence["public_route_mode"] = {
+            "status": "pass",
+            "checked_at": checked_at,
+            "exposure_mode": "custom_domain",
+            "route": args.api_base_url,
+            "result": "custom domain resolves to the intended Kaiju Worker"
+            + (f"; /health={health[0]} in {health[1]:.0f}ms" if health else ""),
+        }
+    if args.wrangler_secret_name:
+        evidence["wrangler_secrets_verified"] = {
+            "status": "pass",
+            "checked_at": checked_at,
+            "command": "wrangler secret list",
+            "observed_names": sorted(set(args.wrangler_secret_name)),
+        }
+    if args.d1_migration_result:
+        evidence["d1_migration_applied"] = {
+            "status": "pass",
+            "checked_at": checked_at,
+            "command": args.d1_migration_command,
+            "migration": "0001_paid_api.sql",
+            "result": args.d1_migration_result,
+        }
+    if args.stripe_checkout_topup_pass:
+        evidence["stripe_checkout_topup_staging"] = {
+            "status": "pass",
+            "checked_at": checked_at,
+            "mode": args.stripe_mode,
+            "webhook_event": "checkout.session.completed",
+            "credited_api_key_id": args.credited_api_key_id,
+            "idempotency_checked": args.stripe_idempotency_checked,
+        }
+    if args.rollback_result:
+        evidence["rollback_exercised"] = {
+            "status": "pass",
+            "checked_at": checked_at,
+            "command": args.rollback_command,
+            "result": args.rollback_result,
+        }
+    if args.staging_request_id:
+        evidence["worker_to_gojira_staging_request"] = {
+            "status": "pass",
+            "checked_at": checked_at,
+            "route": DEFAULT_ROUTE,
+            "model": MODEL_ID,
+            "http_status": args.staging_http_status,
+            "streamed": args.staging_streamed,
+            "request_id": args.staging_request_id,
+        }
+    if args.paid_route_p95_ms is not None:
+        evidence["paid_route_latency"] = {
+            "status": "pass",
+            "checked_at": checked_at,
+            "route": DEFAULT_ROUTE,
+            "sample_count": args.paid_route_sample_count,
+            "p95_ms": args.paid_route_p95_ms,
+            "max_acceptable_ms": args.max_acceptable_ms,
+        }
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--out", type=Path, default=DEFAULT_OUT)
+    parser.add_argument("--write", action="store_true", help="Write the evidence file. Default is preview only.")
+    parser.add_argument("--merge-existing", action="store_true", help="Merge with existing evidence at --out.")
+    parser.add_argument("--checked-at", help="Override checked_at timestamp for manual evidence.")
+    parser.add_argument("--api-base-url", default="", help="Public paid API base URL, for example https://api.example.com.")
+    parser.add_argument("--api-key-env", default="KAIJU_PAID_API_KEY", help="Environment variable containing the staging API key.")
+    parser.add_argument("--timeout", type=int, default=120)
+    parser.add_argument("--skip-live-request", action="store_true", help="Do not call the paid API.")
+    parser.add_argument("--live-samples", type=int, default=5)
+    parser.add_argument("--max-acceptable-ms", type=float, default=120_000)
+    parser.add_argument("--public-route-ok", action="store_true", help="Record public custom-domain route evidence.")
+    parser.add_argument("--wrangler-secret-name", action="append", default=[], help="Observed Wrangler secret name. Repeatable.")
+    parser.add_argument("--d1-migration-result", choices=["success", "already_applied"])
+    parser.add_argument(
+        "--d1-migration-command",
+        default="wrangler d1 migrations apply KAIJU_BILLING_DB --remote",
+    )
+    parser.add_argument("--stripe-checkout-topup-pass", action="store_true")
+    parser.add_argument("--stripe-mode", default="test")
+    parser.add_argument("--credited-api-key-id", default="key_staging_001")
+    parser.add_argument("--stripe-idempotency-checked", action="store_true")
+    parser.add_argument("--rollback-result", choices=["success"])
+    parser.add_argument("--rollback-command", default="wrangler rollback")
+    parser.add_argument("--staging-request-id", help="Sanitized request id from a separate staging request.")
+    parser.add_argument("--staging-http-status", type=int, default=200)
+    parser.add_argument("--staging-streamed", action="store_true")
+    parser.add_argument("--paid-route-p95-ms", type=float)
+    parser.add_argument("--paid-route-sample-count", type=int, default=5)
+    return parser.parse_args()
+def main() -> int:
+    args = parse_args()
+    evidence = load_existing(args.out) if args.merge_existing else {}
+    add_optional_manual_evidence(evidence, args)
+    request_evidence, latency_evidence = run_staging_samples(args)
+    if request_evidence:
+        evidence["worker_to_gojira_staging_request"] = request_evidence
+    if latency_evidence:
+        evidence["paid_route_latency"] = latency_evidence
+    assert_sanitized(evidence)
+    rendered = json.dumps(evidence, indent=2, sort_keys=True) + "\n"
+    if args.write:
+        args.out.parent.mkdir(parents=True, exist_ok=True)
+        args.out.write_text(rendered, encoding="utf-8")
+        print(f"Wrote sanitized paid API evidence to {args.out}")
+    else:
+        print(rendered, end="")
+        print("Preview only. Pass --write to update the evidence file.", file=sys.stderr)
+    return 0
+if __name__ == "__main__":
+    raise SystemExit(main())
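
The index `[18]` in `percentile_95` is easy to misread. `statistics.quantiles(values, n=20)` returns 19 cut points at 5% steps, so the last one (index 18) is the 95th percentile. A minimal sketch with made-up latencies (the values are illustrative, not measurements from this release):

```python
import statistics


def percentile_95(values: list[float]) -> float:
    # quantiles() needs at least two data points, so a single sample is returned as-is.
    if len(values) == 1:
        return values[0]
    # n=20 yields 19 cut points (5%, 10%, ..., 95%); index 18 is the 95% cut.
    return statistics.quantiles(values, n=20, method="inclusive")[18]


latencies_ms = [100.0, 200.0, 300.0, 400.0, 500.0]  # illustrative values only
cuts = statistics.quantiles(latencies_ms, n=20, method="inclusive")
print(len(cuts))
print(percentile_95(latencies_ms))
```

With the inclusive method the result is linearly interpolated between neighbouring samples, so with only five samples the p95 lands between the two largest values and close to the maximum.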

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f399b3cd12fa270d51457bb749fb30863521e8359b8a27059c71b6c2f7d6dd6c
+size 19989424

tokenizer_config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "add_prefix_space": false,
+  "audio_bos_token": "<|audio_start|>",
+  "audio_eos_token": "<|audio_end|>",
+  "audio_token": "<|audio_pad|>",
+  "backend": "tokenizers",
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "image_token": "<|image_pad|>",
+  "is_local": true,
+  "local_files_only": false,
+  "model_max_length": 262144,
+  "model_specific_special_tokens": {
+    "audio_bos_token": "<|audio_start|>",
+    "audio_eos_token": "<|audio_end|>",
+    "audio_token": "<|audio_pad|>",
+    "image_token": "<|image_pad|>",
+    "video_token": "<|video_pad|>",
+    "vision_bos_token": "<|vision_start|>",
+    "vision_eos_token": "<|vision_end|>"
+  },
+  "pad_token": "<|endoftext|>",
+  "pretokenize_regex": "(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?[\\p{L}\\p{M}]+|\\p{N}| ?[^\\s\\p{L}\\p{M}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null,
+  "video_token": "<|video_pad|>",
+  "vision_bos_token": "<|vision_start|>",
+  "vision_eos_token": "<|vision_end|>"
+}

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4134c1b08a1c0abe45425df32940332d5ab998f2811aab5fc7525f465f6ba60b
+size 5329

upstream/qwen3.6-27b/LICENSE ADDED

	@@ -0,0 +1,202 @@

+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+   1. Definitions.
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+   END OF TERMS AND CONDITIONS
+   APPENDIX: How to apply the Apache License to your work.
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+   Copyright 2026 Alibaba Cloud
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+       http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.