restokes92 committed
Commit d95f073 · verified · 1 Parent(s): c239309

Upload Kaiju Coder 7 adapter release package

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
COMPLETION_AUDIT.md ADDED
@@ -0,0 +1,84 @@
+ # Kaiju Business-Owner Release Completion Audit
+
+ Date: 2026-06-03
+
+ This audit maps the active goal to current evidence. It is intentionally
+ conservative: the product-path harness is release-candidate ready for local
+ testing, the fresh v1.8 Qwen 3.6 LoRA adapter exists, and a merged full-model
+ artifact serves locally on Gojira-B. Dynamic SGLang LoRA serving is not counted
+ as release evidence because the corrected LoRA selector crashes on this
+ adapter. Human review, website latency/SLA decisions, broader comparison evals,
+ and Hugging Face write permissions are still required before publishing
+ externally.
+
+ ## Requirement Status
+
+ | Requirement | Current evidence | Status |
+ |---|---|---|
+ | Continue from `RichardEchols/kaiju-coder`, not a restart | Branch `codex/kaiju-business-owner-rc` is based on `3d57eae92ad523519473f0ff3eca6661a9736de3`, matching `origin/main`. | Passed |
+ | GitHub and local source inventory for Kaiju, Kiyomi, RMDW, Makoto, Mezzal, and wiki sources | `release/SOURCE_INVENTORY.md` and `release/source-inventory.json` generated from GitHub metadata, `git ls-remote` SHAs, and the requested local `/Users/richardecholsai7/Documents/RMDW-Wiki` snapshot marked non-authoritative/selective-reference-only. | Passed |
+ | Legally reusable, provenance-preserving dataset update | `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl` adds reviewed RMDW-owned examples with `source_repos`, `source_paths`, and `provenance_notes`. | Passed |
+ | Dataset validation | `python3 scripts/validate_training_data.py --min-examples 350` passes with `1,689` reviewed examples across `14` files. | Passed |
+ | v1.7 business-owner SFT build | `python3 scripts/build_v17_business_owner_sft_dataset.py` writes `1,881` rows and `192` controlled business-owner repeats. | Passed |
+ | Hard evals for business-owner workflows | `evals/tasks/router-hard-harness.jsonl` includes `business_suite` prompts; latest local RC smoke run produced `23/23` static pass. | Passed |
+ | Local Kaiju product path runs | `python3 scripts/run_kaiju_business_owner_rc_smoke.py` validates data, builds SFT, smokes the local API harness, runs router hard eval, and runs static checks. | Passed |
+ | Complete Kiyomi 7.7.7 AI-company artifact generation | `business_suite` route writes a 19-file pack including launch kit, content engine, connector checklist, intake CRM, reporting, automations, operator handbook, leads, sales, ROI dashboard, and Workshop artifact. | Passed |
+ | Secret/private-data guardrails | Dataset validation scans common secret patterns; verifier checks `no_hardcoded_secrets`; source inventory excludes credentials, tokens, private client data, and raw logs. | Passed |
+ | Release artifacts | `release/MODEL_CARD_DRAFT.md`, `release/HF_ADAPTER_MODEL_CARD.md`, `release/DATA_PROVENANCE_DRAFT.md`, `release/EVAL_SCOREBOARD.md`, `release/LOCAL_TEST_INSTRUCTIONS.md`, `release/HUGGINGFACE_RELEASE_DRAFT.md`, `release/FINAL_RELEASE_REPORT.md`, `release/UPSTREAM_LICENSE_CHECK.md`, and this audit. | Passed |
+ | Fresh Qwen 3.6 v1.7 fine-tune | After clearing old ComfyUI/Ollama workloads from Gojira-B, training finished with `metrics.json`, train runtime `1663.7101s`, train loss `1.7260706673065822`, and an adapter directory. | Passed |
+ | Local inference against new v1.7 checkpoint | SGLang served `kaiju_v17_business_owner` over Tailscale at `http://100.109.109.14:18083/v1` with `context=4096` and `mem_fraction=0.90`; website and proposal smoke tasks returned non-empty outputs. | Passed |
+ | Stronger Qwen 3.6 v1.8 fine-tune | Gojira-B was cleared of ComfyUI/SGLang/Ollama GPU conflicts; v1.8 finished with `metrics.json`, train runtime `11666.7564s`, train loss `0.9281658741335074`, and an adapter directory. | Passed |
+ | v1.8 adapter merged into full model | `scripts/run-gojira-b-qwen36-lora-merge.sh` merged `/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter` into `/workspace/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`; remote artifact is `51G` with `14` safetensor shards and preserved base config/processor sidecars. | Passed |
+ | Local inference against v1.8 merged checkpoint | `scripts/start-qwen36-merged-sglang.sh` serves `kaiju-coder-7` over Tailscale at `http://100.109.109.14:18083/v1`; current restored live endpoint reports max model len `16384`. Prior benchmarks proved 12k/16k/24k/32k startup and smoke evidence, with 32k treated as the high-context target rather than the currently parked runtime. | Passed |
+ | v1.8 merged business-owner eval | Probe returned `1,155` visible chars in `60.17s`; proposal rerun scored `1/1`, `4.0/4.0`, `4,014` chars in `212.72s`; Jah credits backend scored `4.0/4.0`, `9,718` chars in `566.36s`. | Passed with latency caveat |
+ | OpenCode local run path | Local OpenCode provider/agent is installed for `kaiju/kaiju-coder-7` with 16k context and the scoped no-autocontinue plugin at `/Users/richardecholsai7/.config/opencode/kaiju-no-autocontinue.mjs`. Fresh public smoke wrote `hello.txt` with exactly `Kaiju Coder 7 fresh public smoke ok`; packaged public verifier `python3 scripts/run_kaiju_public_opencode_smoke.py --timeout 900 --keep-dir` passed `4/4` in `runs/public-opencode-smoke/20260603T182222Z/summary.md`, including wrong-directory leakage checks; loop-guard smoke wrote `loopguard.txt` with exactly `Kaiju Coder 7 loop guard installed`; latest harnessed customer-readiness pack `runs/opencode-customer-readiness/20260603T185835Z/summary.md` passed `4/4` with `28/28` required files, including release provenance and safety review. | Passed for harnessed/product path |
+ | Runtime-quantized local path | vLLM bitsandbytes runtime quantization passed identity/code/business-doc smokes at 8k/16k, reported about `17.8 GiB` model memory, and passed OpenCode one-file smoke with exact content `Kaiju Coder 7 quantized runtime ok`. Persisted quantized weights are still pending. | Runtime recipe passed; persisted weights pending |
+ | Paid API gateway scaffold | `cd gateway/cloudflare-worker && npm run check` passes `16/16` Worker tests covering bearer auth, inactive keys, insufficient credits, debit/refund, rate limit before debit, model `kaiju-coder-7` enforcement, streaming/thinking/token caps, secret-content rejection without logging, signed Stripe Checkout top-up idempotency, origin-only R2 artifact upload, and account-scoped artifact download. `python3 scripts/check_paid_api_readiness.py --mode scaffold` now passes `17` checks, including the guarded `npm run prepare:cloudflare` resource-prep path, Wrangler dry-run deploy wiring, artifact route controls, sanitized launch-evidence template, and reviewed Cloudflare bindings template. `scripts/apply_paid_api_cloudflare_bindings.py` previews/applies real D1/KV/R2 bindings while refusing placeholders and secret-looking input. `scripts/collect_paid_api_launch_evidence.py` can preview or write the remaining sanitized staging evidence without storing API keys, full prompts, or model responses. `--mode launch` fails by design until real D1/KV/R2 bindings, Wrangler secrets, Stripe webhook staging evidence, paid-route staging request, latency evidence, and rollback proof are attached through `release/paid-api-launch-evidence.json`. | Local scaffold passed; live deployment pending |
+ | Dynamic SGLang LoRA selector | Adapter-name-only serving can be base-equivalent; corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes with `LoRA buffer shape torch.Size([8192, 16]) does not match weight shape torch.Size([14336, 16])`. | Not release path |
+ | Hugging Face helper repo upload readiness | Adapter, OpenCode helper, and runtime-quantized recipe staging folders build under `/tmp/kaiju-coder-7-hf-staging`; upload script is dry-run safe and namespace-configurable. Apply mode now requires staged checksum/integrity validation and `check_human_release_review.py --mode public` before repo creation. Local `hf` CLI is installed and authenticated as `restokes92`, but private repo creation attempts under `RichardEchols`, `RMDWLLC`, and `restokes92` returned `403 Forbidden`. | Package ready; upload blocked by review/token permissions |
+ | Hugging Face merged model upload readiness | `scripts/prepare_hf_merged_model_metadata.sh` stages the model card, quickstarts, provenance, benchmarks, evals, paid API status, final report, upstream license, and `MERGED_MODEL_RELEASE_MANIFEST.json` for the remote merged-model directory. Latest apply-mode metadata sync passed on Gojira-B using passwordless sudo rsync for the root-owned folder. `scripts/upload_hf_merged_model_from_gojira_b.sh` refuses to preview or upload unless that metadata and the `51G`/`14`-shard merged model are present; latest dry run confirmed `Metadata: present` and printed the correct `hf upload-large-folder` command. Apply mode requires `check_human_release_review.py --mode public --require-merged-upload` before remote upload. Gojira-B has `hf` `1.17.0` and auth, but repo creation still needs a write-capable namespace. | Package ready; upload blocked by review/token permissions |
+ | Consolidated release readiness check | `python3 scripts/check_kaiju_public_release_readiness.py --mode local` reports local public-testing readiness while keeping Hugging Face namespace permission, paid API launch preflight, and human review as explicit manual blockers. `--mode public` remains red until those external gates pass. The local check calls `scripts/check_hf_staging_integrity.py` to validate staged files, public naming hygiene, raw secret-looking values, and checksums. It also requires `release/FINAL_RELEASE_REPORT.md`, generated by `scripts/generate_kaiju_final_report.py`, and the local `release/bundles/LATEST.json` archive checksum produced by `scripts/create_hf_release_bundle.py`, so the final release state has exact commands, blockers, changed files, first-test instructions, and a reviewable HF bundle. It also calls `scripts/check_human_release_review.py` so `release/HUMAN_RELEASE_REVIEW.md` is the structured human signoff gate. | Local mode passed; public mode pending |
+
+ ## Commands With Current Passing Evidence
+
+ ```bash
+ python3 -m unittest discover -s tests -p 'test_*.py'
+ python3 scripts/run_kaiju_business_owner_rc_smoke.py
+ python3 scripts/run_kaiju_opencode_customer_pack.py --mode harnessed
+ python3 scripts/install_kaiju_opencode_profile.py
+ mkdir -p /tmp/kaiju-opencode-fresh-public-smoke
+ opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-fresh-public-smoke --dangerously-skip-permissions 'Create hello.txt with exactly: Kaiju Coder 7 fresh public smoke ok'
+ opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-loopguard-smoke --dangerously-skip-permissions 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'
+ python3 scripts/check_paid_api_readiness.py --mode scaffold
+ python3 -m py_compile scripts/run_kaiju_api_harness_smoke.py scripts/run_kaiju_business_owner_rc_smoke.py scripts/build_v17_business_owner_sft_dataset.py kaiju_harness/business_suite.py kaiju_harness/router.py kaiju_harness/verification.py
+ git diff --check
+ bash scripts/upload_hf_merged_model_from_gojira_b.sh
+ KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh
+ python3 scripts/check_hf_staging_integrity.py
+ python3 scripts/check_human_release_review.py --mode local
+ python3 scripts/generate_kaiju_final_report.py
+ python3 scripts/create_hf_release_bundle.py
+ python3 scripts/check_kaiju_public_release_readiness.py --mode local
+ ```
+
+ ## Remaining Blockers
+
+ The fresh v1.8 adapter, merged full-model artifact, and direct merged-model inference path are proven. The current completed local release candidate is:
+
+ ```text
+ Kaiju Coder 7 merged model + deterministic business-owner harness + verifier + source-backed v1.7/v1.8 dataset/release package
+ ```
+
+ That must be described honestly until external release review confirms:
+
+ - human review of generated artifacts
+ - raw website latency/SLA positioning or explicit harness-first website positioning
+ - base Qwen and GLM comparison results
+ - final human review of upstream license/notice packaging
+ - Hugging Face write-capable token or namespace permission
+ - Hugging Face repo creation permission for the 51GB merged model upload from
+ Gojira-B
+ - final Hugging Face upload metadata and public/private release decision
+ - live Cloudflare D1/KV/R2 resources, Stripe products/webhook endpoint,
+ deployment secrets, staging end-to-end paid API requests, rollback, and
+ support boundaries if exposed commercially
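
As a rough illustration of how the blockers above could gate a public release, the following sketch models them as a checklist. The gate names are paraphrased from the list, and `release_ready` is a hypothetical helper, not the logic of `scripts/check_kaiju_public_release_readiness.py`. It assumes that local mode only reports pending gates while public mode requires every gate to be confirmed.

```python
# Hypothetical sketch: gate names are paraphrased from the blocker list above.
EXTERNAL_GATES = (
    "human_review",
    "website_latency_positioning",
    "base_qwen_and_glm_comparison",
    "license_notice_review",
    "hf_write_permission",
    "hf_merged_repo_creation",
    "hf_upload_metadata_and_visibility",
    "paid_api_live_resources",
)


def release_ready(mode, confirmed):
    """Return (ready, pending). Local mode only reports; public mode needs every gate."""
    pending = [gate for gate in EXTERNAL_GATES if gate not in confirmed]
    if mode == "local":
        return True, pending
    if mode == "public":
        return not pending, pending
    raise ValueError(f"unknown mode: {mode}")


local_ready, local_pending = release_ready("local", set())
public_ready, public_pending = release_ready("public", {"human_review"})
print(local_ready, len(local_pending))
print(public_ready, len(public_pending))
```

Keeping the pending list in the return value for both modes mirrors the audit's approach of passing locally while still naming each external blocker explicitly.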
DATA_PROVENANCE_DRAFT.md ADDED
@@ -0,0 +1,87 @@
+ # Kaiju Coder 7 by Kiyomi - Data Provenance Draft
+
+ This draft records the current data boundary for release review.
+
+ ## Policy
+
+ Kaiju Coder training data must be legally usable for a commercial derivative model.
+
+ Allowed:
+
+ - RMDW-authored examples.
+ - RMDW-owned repository diffs and documentation.
+ - Human-reviewed examples created specifically for Kaiju.
+ - Public permissive data only when license review confirms compatibility.
+
+ Not allowed:
+
+ - Closed-model answers from OpenAI, Anthropic, Gemini, or similar services as supervised completions.
+ - Unreviewed customer data.
+ - Private customer code without consent.
+ - Secrets, tokens, credentials, cookies, or private keys.
+ - Unlicensed scraped code.
+
+ ## v0.1 Dataset Snapshot
+
+ - Total reviewed examples: 575
+ - Dataset build: `datasets/build/kaiju-sft-v0.1.jsonl`
+ - Candidate sources:
+   - `datasets/candidates/rmdw-git-patches.jsonl`
+   - `datasets/candidates/v0.1-safe-git-backlog.jsonl`
+   - `datasets/candidates/v0.1-file-level-git.jsonl`
+   - `datasets/candidates/v0.1-wiki-strategy-business-identity.jsonl`
+
+ ## v1.7 Business-Owner Suite Addendum
+
+ - Date prepared: 2026-06-03
+ - Reviewed examples: 8
+ - Candidate file: `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl`
+ - Addendum-only SFT build: `datasets/build/kaiju-sft-v1.7-business-owner-suite.jsonl`
+ - Training SFT build: `datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl`
+ - Training config: `training/configs/qwen36-27b-lora-v1.7.example.json`
+ - v1.8 training config: `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json`
+ - New task type: `business_suite`
+ - Source inventory: `release/SOURCE_INVENTORY.md`, refreshed from GitHub source-of-truth repositories and the requested local RMDW wiki snapshot.
+
+ This addendum targets Kiyomi 7.7.7-style business-owner work: complete AI-company build packs, premium service websites, intake and CRM flows, sales follow-up, proposals, ROI dashboards, operator handbooks, and Workshop golden-run automations.
+
+ Every row includes:
+
+ - `source_repos`
+ - `source_paths`
+ - `provenance_notes`
+ - `reviewed: true`
+ - `license: RMDW-owned`
+
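
The per-row requirements listed above can be checked mechanically. The sketch below is illustrative only: the field names come from the list, while `missing_provenance` and the sample row are hypothetical and do not reproduce `scripts/validate_training_data.py`.

```python
import json

# Field names are taken from the list above; the checking logic is an assumption.
REQUIRED_KEYS = ("source_repos", "source_paths", "provenance_notes")


def missing_provenance(row):
    """Return the names of the provenance requirements this row fails."""
    problems = [key for key in REQUIRED_KEYS if not row.get(key)]
    if row.get("reviewed") is not True:
        problems.append("reviewed")
    if row.get("license") != "RMDW-owned":
        problems.append("license")
    return problems


# Hypothetical JSONL line; repo and path values are placeholders, not dataset content.
sample_line = json.dumps({
    "id": "example-0001",
    "source_repos": ["example-org/example-repo"],
    "source_paths": ["docs/example.md"],
    "provenance_notes": "placeholder note",
    "reviewed": True,
    "license": "RMDW-owned",
})

row = json.loads(sample_line)
print(missing_provenance(row))          # complete row
print(missing_provenance({"id": "x"}))  # row with no provenance fields
```

Returning the list of failed requirements, rather than a boolean, makes it easier for a reviewer to see which field a rejected row is missing.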
+ For the v1.7 LoRA run, the 8 reviewed business-owner rows are oversampled 24 times by `scripts/build_v17_business_owner_sft_dataset.py`. Repeated rows receive unique IDs ending in `__v17_business_repeat_NN` and preserve the original source repository, source path, and provenance metadata.
+
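
The oversampling arithmetic above can be sketched as follows. This is not the code in `scripts/build_v17_business_owner_sft_dataset.py`; `oversample` is a hypothetical helper, the sample rows are placeholders, and the two-digit counter starting at `01` is an assumption about the `NN` suffix. The figures are consistent with 192 repeated rows being appended to the 1,689 reviewed examples to give 1,881 rows.

```python
def oversample(rows, repeats=24):
    """Repeat each row `repeats` times, giving every copy a unique ID."""
    out = []
    for row in rows:
        for n in range(1, repeats + 1):
            copy = dict(row)  # shallow copy keeps the provenance fields unchanged
            # Assumption: NN is a zero-padded counter starting at 01.
            copy["id"] = f"{row['id']}__v17_business_repeat_{n:02d}"
            out.append(copy)
    return out


# Eight placeholder rows standing in for the reviewed business-owner examples.
base_rows = [
    {
        "id": f"business-{i}",
        "source_repos": ["placeholder/repo"],
        "source_paths": ["placeholder.md"],
        "provenance_notes": "placeholder",
    }
    for i in range(1, 9)
]

repeated_rows = oversample(base_rows)
print(len(repeated_rows))         # 8 rows x 24 repeats = 192
print(1689 + len(repeated_rows))  # 1,689 reviewed examples + 192 repeats = 1881
```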
+ Client-site repositories are used only as eval and generalized pattern sources unless a row is explicitly reviewed for training eligibility. Do not bulk-train on client-specific text, contact details, contracts, or private business data.
+
+ The local wiki path `/Users/richardecholsai7/Documents/RMDW-Wiki` is present but is not a git checkout. It is recorded as `RMDW-Wiki-local`, `selective-reference-only`, with `credentials.md`, `customers.md`, `customers/`, and `raw/` excluded. The GitHub `RichardEchols/rmdw-agent-wiki` repo remains the authoritative wiki source for training/eval provenance unless a reviewer documents a local exception.
+
+ ## Category Mix
+
+ The v0.1 category gate passed against these minimum counts:
+
+ - Website/UI: at least 75 examples
+ - Coding: at least 75 examples
+ - Debugging: at least 50 examples
+ - Automation: at least 50 examples
+ - Tool-use: at least 50 examples
+ - Strategy: at least 25 examples
+ - Business: at least 15 examples
+ - Identity: at least 10 examples
+
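
A category gate of this kind reduces to comparing per-category counts with the minimums above. The sketch below uses the thresholds from the list; `category_shortfalls` and the sample counts are hypothetical, since the draft does not report actual per-category totals. The minimums sum to 350, which happens to match the `--min-examples 350` flag used in validation, though the source does not say the two are linked.

```python
# Minimums are taken from the list above.
MINIMUMS = {
    "Website/UI": 75,
    "Coding": 75,
    "Debugging": 50,
    "Automation": 50,
    "Tool-use": 50,
    "Strategy": 25,
    "Business": 15,
    "Identity": 10,
}


def category_shortfalls(counts):
    """Return {category: examples still needed} for categories below their minimum."""
    return {
        name: minimum - counts.get(name, 0)
        for name, minimum in MINIMUMS.items()
        if counts.get(name, 0) < minimum
    }


# Hypothetical counts for illustration only.
sample_counts = dict(MINIMUMS, Identity=7)
print(sum(MINIMUMS.values()))
print(category_shortfalls(sample_counts))
```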
+ ## Release Review Checklist
+
+ Before public release:
+
+ - Re-run dataset validation.
+ - Re-run source inventory against the current GitHub source-of-truth SHAs.
+ - Spot-check examples for secrets and private data.
+ - Confirm client-site rows are generalized pattern examples or eval-only.
+ - Confirm closed-model outputs are not used as supervised completions.
+ - Record exact base model revision.
+ - Attach upstream license and notices.
+ - Attach eval summary.
+ - Document known limitations and unsafe use boundaries.
EVAL_SCOREBOARD.md ADDED
@@ -0,0 +1,70 @@
+ # Kaiju Coder 7 Business-Owner Eval Scoreboard
+
+ This scoreboard tracks the current release-candidate evidence. Do not publish weights or paid API claims until every required row has a dated result and reviewer.
+
+ ## Completed Local Gates
+
+ | Gate | Command | Result | Date |
+ |---|---|---:|---|
+ | Source inventory refresh | `python3 scripts/build_source_inventory.py` | Passed | 2026-06-03 |
+ | Candidate validation | `python3 scripts/validate_training_data.py --min-examples 350` | 1,689 examples / passed | 2026-06-03 |
+ | v1.7 category targets | `python3 scripts/check_dataset_targets.py --targets datasets/v1.7-targets.json` | Passed | 2026-06-03 |
+ | Business-owner SFT build | `python3 scripts/build_v17_business_owner_sft_dataset.py` | 1,881 rows / 192 repeats | 2026-06-03 |
+ | Router hard harness | `python3 evals/run_router_harness_eval.py --tasks evals/tasks/router-hard-harness.jsonl` | 23/23 | 2026-06-03 |
+ | Router static checks | `python3 evals/run_router_static_checks.py runs/evals/20260603T103915Z-kaiju_router_harness/results.jsonl` | 23/23 | 2026-06-03 |
+ | Business-suite prompts | Included in router hard harness | 2/2 | 2026-06-03 |
+ | Deterministic API harness smoke | `python3 scripts/run_kaiju_api_harness_smoke.py` | Passed: website + business-suite API artifacts | 2026-06-03 |
+ | Direct business-suite artifact | `python3 scripts/run_kaiju_router.py --prompt "...Kiyomi 7.7.7 AI company operating pack..." --print-manifest` | 19 files / passed | 2026-06-03 |
+ | Full local RC smoke gate | `python3 scripts/run_kaiju_business_owner_rc_smoke.py` | Passed; latest router/static run `20260603T103915Z-kaiju_router_harness` | 2026-06-03 |
+ | v1.7 LoRA train | `./scripts/run-gojira-b-qwen36-lora-train.sh` | Finished; runtime `1663.7101s`, train loss `1.7260706673065822`, adapter present | 2026-06-03 |
+ | v1.7 SGLang serve | `./scripts/start-qwen36-lora-sglang.sh` with `KAIJU_QWEN36_LORA_CONTEXT=4096`, `KAIJU_QWEN36_LORA_MEM_FRACTION=0.90` | `/v1/models` returned `kaiju_v17_business_owner` | 2026-06-03 |
+ | Raw served adapter smoke: website | `python3 evals/run_openai_compat_smoke.py --base-url http://100.109.109.14:18083/v1 --model kaiju_v17_business_owner --tasks evals/tasks/smoke.jsonl --max-tasks 1 --disable-thinking` | Passed; `20260603T031300Z-kaiju_v17_business_owner`, 2,726 chars in 174.49s | 2026-06-03 |
+ | Raw served adapter smoke: proposal | `python3 evals/run_openai_compat_smoke.py --base-url http://100.109.109.14:18083/v1 --model kaiju_v17_business_owner --tasks /tmp/kaiju-proposal-smoke.jsonl --system-prompt-file prompts/kaiju-coder-api-system.md --disable-thinking` | Passed; `20260603T032107Z-kaiju_v17_business_owner`, 4,306 chars in 232.27s | 2026-06-03 |
+ | Raw served adapter quality: website | `python3 evals/score_quality_gate.py runs/evals/20260603T033825Z-kaiju_v17_business_owner/results.jsonl` | Failed paid-ready: `3.71/4.0`, missing complete HTML after 12,706 chars / 793.96s | 2026-06-03 |
+ | Raw served adapter quality: proposal | `python3 evals/score_quality_gate.py runs/evals/20260603T032107Z-kaiju_v17_business_owner/results.jsonl` | Passed paid-ready: `4.0/4.0` | 2026-06-03 |
+ | Raw served adapter quality: Jah credits | `python3 evals/score_quality_gate.py runs/evals/20260603T035612Z-kaiju_v17_business_owner/results.jsonl` | Passed paid-ready: `4.0/4.0` | 2026-06-03 |
+ | Base Qwen comparison: proposal | `python3 evals/compare_quality_runs.py runs/quality-gates/20260603T035200Z-qwen36-27b/scores.jsonl runs/quality-gates/20260603T032107Z-kaiju_v17_business_owner/scores.jsonl` | Tie: base `4.0/4.0`, Kaiju v1.7 `4.0/4.0` | 2026-06-03 |
+ | Base Qwen comparison: Jah credits | `python3 evals/compare_quality_runs.py runs/quality-gates/20260603T040140Z-qwen36-27b/scores.jsonl runs/quality-gates/20260603T035612Z-kaiju_v17_business_owner/scores.jsonl` | Tie: base `4.0/4.0`, Kaiju v1.7 `4.0/4.0`; deterministic outputs were byte-identical | 2026-06-03 |
+ | Raw adapter differentiation probe | Identity and Jah probes comparing `qwen36-27b` to `kaiju_v17_business_owner` | Current v1.7 SGLang outputs can be byte-identical to base on deterministic prompts; 24-step v1.7 is too weak as a raw-weight differentiator | 2026-06-03 |
+ | v1.8 stronger LoRA train | `KAIJU_LORA_CONFIG=training/configs/qwen36-27b-lora-v1.8-business-owner.example.json KAIJU_SFT_DATASET=datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl KAIJU_LORA_RUN_DIR=runs/qwen36-27b-lora-v1.8-business-owner KAIJU_MIN_TRAIN_EXAMPLES=350 KAIJU_SKIP_DATASET_BUILD=1 KAIJU_TRAIN_BACKGROUND=1 ./scripts/run-gojira-b-qwen36-lora-train.sh` | Finished; runtime `11666.7564s`, train loss `0.9281658741335074`, adapter present | 2026-06-03 |
+ | v1.8 SGLang dynamic LoRA serve | `./scripts/start-qwen36-lora-sglang.sh` with v1.8 adapter, `KAIJU_QWEN36_LORA_CONTEXT=8192`, `KAIJU_QWEN36_LORA_MEM_FRACTION=0.90` | Historical only: `/v1/models` listed `kaiju_v18_business_owner`, but adapter-name-only output can be base-equivalent; not release evidence | 2026-06-03 |
+ | Corrected v1.8 dynamic LoRA selector | Model selector `qwen36-27b:kaiju_v18_business_owner` under SGLang with fused target modules | Fails: `LoRA buffer shape torch.Size([8192, 16]) does not match weight shape torch.Size([14336, 16])`; dynamic LoRA is not the release path | 2026-06-03 |
+ | v1.8 LoRA merge | `KAIJU_LORA_ADAPTER=/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter ./scripts/run-gojira-b-qwen36-lora-merge.sh` | Passed; merged full model at `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`, `51G`, `14` shards | 2026-06-03 |
+ | Kaiju Coder 7 merged SGLang serve | `./scripts/start-qwen36-merged-sglang.sh` with `KAIJU_QWEN36_MERGED_CONTEXT=32768`, `KAIJU_QWEN36_MERGED_MEM_FRACTION=0.90` | `/v1/models` returned `kaiju-coder-7`, max model len `32768`; 12k/16k/24k/32k evidence is recorded in `release/SERVING_BENCHMARKS.md` | 2026-06-03 |
+ | Kaiju Coder 7 restored 32k direct API smoke | `python3 scripts/benchmark_kaiju_serving.py --contexts 32768 --prompts identity business_doc --max-tokens 768 --timeout 420` | Passed; `/v1/models` returned `kaiju-coder-7`, max model len `32768`; identity `2.92s`; business proposal `94.28s`, `1,737` chars | 2026-06-03 |
+ | Kaiju Coder 7 restored 32k OpenCode one-file smoke | `opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-32k-final-smoke 'Create hello.txt with exactly: Kaiju Coder 7 final 32k ok'` | Passed; wrote `hello.txt` with exactly `Kaiju Coder 7 final 32k ok` | 2026-06-03 |
+ | Kaiju Coder 7 current restored 16k direct API smoke | `python3 scripts/benchmark_kaiju_serving.py --contexts 16384 --prompts identity --max-tokens 64 --timeout 120` | Passed; latest run `runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`, identity `2.3s`, `26` chars | 2026-06-03 |
+ | Kaiju Coder 7 current restored 16k OpenCode one-file smoke | `mkdir -p /tmp/kaiju-opencode-fresh-public-smoke && opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-fresh-public-smoke --dangerously-skip-permissions 'Create hello.txt with exactly: Kaiju Coder 7 fresh public smoke ok'` | Passed; `/v1/models` returned `kaiju-coder-7`, max model len `16384`; wrote `hello.txt` with exactly `Kaiju Coder 7 fresh public smoke ok` | 2026-06-03 |
+ | Kaiju Coder 7 packaged public OpenCode smoke | `python3 scripts/run_kaiju_public_opencode_smoke.py --timeout 900 --keep-dir` | Passed; latest run `runs/public-opencode-smoke/20260603T182222Z/summary.md`, `4/4` checks passed; installer dry-run, OpenCode `1.15.13`, live 16k model, and file written only in the requested temp workspace | 2026-06-03 |
+ | Kaiju Coder 7 loop-guarded OpenCode install | `python3 scripts/install_kaiju_opencode_profile.py`; `opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-loopguard-smoke --dangerously-skip-permissions 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'` | Passed; config includes `/Users/richardecholsai7/.config/opencode/kaiju-no-autocontinue.mjs`; wrote `loopguard.txt` with exact requested content and exited cleanly | 2026-06-03 |
+ | Current harnessed OpenCode customer-readiness pack | `python3 scripts/run_kaiju_opencode_customer_pack.py --mode harnessed` | Passed; latest run `runs/opencode-customer-readiness/20260603T185835Z/summary.md`, `4/4` tasks passed and `28/28` required files written, including release provenance and safety review | 2026-06-03 |
+ | Paid API Worker scaffold | `cd gateway/cloudflare-worker && npm run check && npm run preflight` | Passed `16/16` Worker tests and `17` scaffold preflight checks; covers bearer auth, inactive keys, insufficient credits, debit/refund, rate limit before debit, model `kaiju-coder-7` enforcement, stream/thinking/token caps, secret-content rejection without logging, signed Stripe Checkout top-up idempotency, origin-only R2 artifact upload, account-scoped artifact download, guarded Cloudflare resource prep, Wrangler dry-run deploy, sanitized paid-launch evidence template packaging, reviewed Cloudflare bindings template, binding applier guardrails, and sanitized evidence collection helper | 2026-06-03 |
+ | Kaiju Coder 7 merged vLLM serve | `KAIJU_VLLM_CONTEXT=16384 ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed at 16k with Gojira nightly vLLM after `pandas` preinstall and `--language-model-only`; identity `19.99s`, code patch `28.8s`; not fast enough to replace SGLang | 2026-06-03 |
+ | Kaiju Coder 7 runtime-quantized vLLM serve | `KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_QUANTIZATION=bitsandbytes KAIJU_VLLM_LOAD_FORMAT=bitsandbytes ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed at 8k and 16k; 16k identity `19.51s`, code patch `11.3s`; vLLM log reported about `17.8 GiB` model memory | 2026-06-03 |
+ | Kaiju Coder 7 runtime-quantized business-doc smoke | `KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_QUANTIZATION=bitsandbytes KAIJU_VLLM_LOAD_FORMAT=bitsandbytes KAIJU_VLLM_PROMPTS=business_doc KAIJU_VLLM_MAX_TOKENS=768 KAIJU_VLLM_PROMPT_TIMEOUT=420 ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed; business proposal `53.44s`, `1,610` chars, `30.127` chars/s; wrapper restored SGLang after completion | 2026-06-03 |
+ | Kaiju Coder 7 runtime-quantized OpenCode one-file smoke | `bash scripts/run_kaiju_quantized_opencode_smoke.sh` | Passed at 16k after vLLM `--enable-auto-tool-choice`; OpenCode wrote `hello.txt` with exactly `Kaiju Coder 7 quantized runtime ok` | 2026-06-03 |
+ | Hugging Face CLI install/auth check | `hf version && hf auth whoami && hf auth list` | `hf` installed locally at version `1.17.0`; auth user `restokes92`; token name `gojirakiyomikode` | 2026-06-03 |
+ | Hugging Face private repo create attempt | `KAIJU_HF_UPLOAD_APPLY=1 bash scripts/upload_hf_release_staging.sh` with namespaces `RichardEchols`, `RMDWLLC`, and `restokes92` | Blocked by Hugging Face `403 Forbidden`; current token cannot create model repos in those namespaces | 2026-06-03 |
+ | Hugging Face merged-model metadata and upload boundary | `bash scripts/prepare_hf_merged_model_metadata.sh`; `KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh`; `bash scripts/upload_hf_merged_model_from_gojira_b.sh`; `KAIJU_HF_UPLOAD_APPLY=1 bash scripts/upload_hf_merged_model_from_gojira_b.sh` | Metadata prep synced model card, quickstarts, provenance, benchmarks, evals, paid API status, final report, upstream license, and `MERGED_MODEL_RELEASE_MANIFEST.json` to Gojira-B; sudo rsync handled the root-owned merged folder; upload dry run confirmed metadata plus the `51G`/`14`-shard merged model before printing `hf upload-large-folder`; apply remains blocked by human review and Hugging Face namespace permission before any large upload | 2026-06-03 |
+ | v1.8 merged endpoint probe | Direct OpenAI-compatible chat request with top-level `chat_template_kwargs` disabling thinking | Passed; `1,155` visible chars in `60.17s`, normal `content` response | 2026-06-03 |
+ | Kaiju Coder 7 merged focused proposal eval | `python3 evals/run_openai_compat_smoke.py --model kaiju-coder-7 --tasks evals/tasks/business-owner-v18-comparison.jsonl --max-tasks 1 --max-tokens 1800 ...` then `python3 evals/score_quality_gate.py <results.jsonl>` | Passed: `1/1` paid-ready, `4.0/4.0`, `4,014` chars, `212.72s` | 2026-06-03 |
+ | Kaiju Coder 7 merged focused Jah credits eval | `python3 evals/run_openai_compat_smoke.py --model kaiju-coder-7 --tasks evals/tasks/business-owner-v18-comparison.jsonl ...` then `python3 evals/score_quality_gate.py <results.jsonl>` | Passed: `4.0/4.0`, `9,718` chars, `566.36s` | 2026-06-03 |
+ | Full local RC smoke gate | `python3 scripts/run_kaiju_business_owner_rc_smoke.py` | Passed; latest router/static run `20260603T103915Z-kaiju_router_harness` | 2026-06-03 |
+
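
The "v1.8 merged endpoint probe" row in the table above describes a direct OpenAI-compatible chat request with a top-level `chat_template_kwargs` field that disables thinking. A minimal sketch of such a request body follows. The `enable_thinking` key, the prompt, and `max_tokens` are assumptions for illustration; the scoreboard does not give the exact payload. The function only builds the request and does not send it.

```python
import json
import urllib.request

BASE_URL = "http://100.109.109.14:18083/v1"  # endpoint named in the release audit


def build_probe_request(prompt, max_tokens=512):
    """Build (but do not send) a chat completion request with thinking disabled."""
    payload = {
        "model": "kaiju-coder-7",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # Assumption: the served chat template reads an `enable_thinking` flag.
        "chat_template_kwargs": {"enable_thinking": False},
    }
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return payload, request


payload, request = build_probe_request("Draft a one-paragraph service proposal.")
print(request.get_method(), request.full_url)
print(json.dumps(payload["chat_template_kwargs"]))
```

To run the probe against a live server, the request could be passed to `urllib.request.urlopen` with a generous timeout, given the latencies recorded in the table.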
+ ## Required Before Release
+
+ | Gate | Required result | Status |
+ |---|---|---|
+ | v1.7 LoRA train | Finished metrics and adapter under `runs/qwen36-27b-lora-v1.7-business-owner` | Passed |
+ | v1.8 stronger LoRA train | Finished metrics and adapter under `runs/qwen36-27b-lora-v1.8-business-owner` | Passed |
+ | v1.8 merged focused smoke | `python3 evals/run_openai_compat_smoke.py --tasks evals/tasks/business-owner-v18-comparison.jsonl --model kaiju-coder-7 ...` then `python3 evals/score_quality_gate.py` | Passed for proposal rerun and Jah credits backend; broader sweep pending |
+ | Direct commercial eval | No critical failures, scored summary attached | Passed for targeted high-value tasks when using the product harness plus 8k raw website mode; broader task sweep still pending |
62
+ | Base Qwen comparison | Kaiju beats base Qwen on RMDW/Kiyomi practical tasks | Not yet: raw deterministic identity still matches base; compare broader tasks before model-level improvement claims |
63
+ | GLM comparison | Kaiju is near or above GLM on highest-value business-owner tasks | Pending |
64
+ | Local inference smoke | OpenAI-compatible endpoint returns usable business-owner artifact | Passed for v1.8 merged SGLang endpoint and product harness |
65
+ | Human review | Richard reviews artifacts for usefulness, privacy, and sellability | Pending |
66
+ | Release package | Model card, provenance, license notes, eval summary, limitations, Hugging Face draft, completion audit, and run instructions complete | Staged and upload-scripted; upload blocked by HF token permissions and human/public-review decision |
67
+
68
+ ## Decision Rule
69
+
70
+ The v1.8 adapter is a completed local checkpoint and the merged full model is the current served raw-model path. The business-owner product should still be published honestly as merged model plus deterministic harness plus verifier. Raw merged v1.8 is useful on business documents and Jah credits but slow on this SGLang stack. Do not claim raw-weight superiority until broader base/GLM and raw website comparisons pass.
FINAL_RELEASE_REPORT.md ADDED
@@ -0,0 +1,269 @@
1
+ # Kaiju Coder 7 Final Release Report
2
+
3
+ Generated: `2026-06-03T20:03:02Z`
4
+
5
+ Product name: `Kaiju Coder 7`
6
+ Public model id: `kaiju-coder-7`
7
+ Current source branch: `codex/kaiju-business-owner-rc`
8
+ Current HEAD: `3d57eae92ad523519473f0ff3eca6661a9736de3`
9
+ Current `origin/main`: `3d57eae92ad523519473f0ff3eca6661a9736de3`
10
+
11
+ ## Current Verdict
12
+
13
+ Kaiju Coder 7 is a local public-testing release candidate, not a fully public
14
+ commercial launch yet. The local model path, OpenCode profile, harnessed
15
+ business-owner evals, Hugging Face staging package, runtime-quantized recipe,
16
+ and paid API scaffold are in place. Public release still requires human
17
+ approval, a write-capable Hugging Face namespace/token, and live paid API
18
+ resources before the hosted API can be sold.
19
+
20
+ ## Runtime
21
+
22
+ | Field | Value |
23
+ |---|---|
24
+ | Status | `pass` |
25
+ | Base URL | `http://100.109.109.14:18083/v1` |
26
+ | Model id | `kaiju-coder-7` |
27
+ | Max model length | `16384` |
28
+ | Detail | `` |
29
+
30
+ Recommended default today: `16k` context through `kaiju-coder-7`. Higher
31
+ context has benchmark evidence, but the currently parked default is 16k for
32
+ stability and speed.
33
+
34
+ ## Readiness Summary
35
+
36
+ | Area | Result |
37
+ |---|---|
38
+ | Local public-testing readiness | `ready=True pass=23 fail=0 manual=1 rc=0` |
39
+ | Hugging Face release readiness | `ready=True pass=23 fail=0 manual=1 rc=0` |
40
+ | Public launch readiness | `ready=False pass=23 fail=1 manual=0 rc=1` |
41
+ | Hugging Face staging integrity | `ready=True pass=6 fail=0 manual=0 rc=0` |
42
+ | Paid API launch readiness | `ready=False pass=17 fail=3 manual=7 rc=1` |
43
+
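The result strings in the readiness table share one `key=value` shape, so they can be compared mechanically. The sketch below parses one such string; the function name is hypothetical and the parser is not part of the release scripts. In the five rows above, `ready` is true exactly when `fail` is zero, but whether the checker defines readiness that way is an assumption.

```python
import re

# Matches tokens such as "pass=23" or "ready=True".
FIELD_RE = re.compile(r"(\w+)=(\S+)")


def parse_readiness(summary: str) -> dict:
    """Parse a string like 'ready=True pass=23 fail=0 manual=1 rc=0'."""
    fields = dict(FIELD_RE.findall(summary))
    return {
        "ready": fields["ready"] == "True",
        "pass": int(fields["pass"]),
        "fail": int(fields["fail"]),
        "manual": int(fields["manual"]),
        "rc": int(fields["rc"]),
    }


paid_api = parse_readiness("ready=False pass=17 fail=3 manual=7 rc=1")
print(paid_api)
```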
44
+ ## Hugging Face Release Blockers
45
+
46
+ | Status | Check | Detail |
47
+ |---|---|---|
48
+ | manual | paid API launch preflight | 17 pass, 3 fail, 7 manual |
49
+
50
+ ## Public Launch Blockers
51
+
52
+ | Status | Check | Detail |
53
+ |---|---|---|
54
+ | fail | paid API launch preflight | 17 pass, 3 fail, 7 manual |
55
+
56
+ ## Paid API Launch Blockers
57
+
58
+ | Status | Check | Detail |
59
+ |---|---|---|
60
+ | fail | live D1 binding | KAIJU_BILLING_DB is missing or still placeholder/commented |
61
+ | fail | live KV binding | KAIJU_RATE_LIMIT_KV is missing or still placeholder/commented |
62
+ | fail | artifact R2 binding | KAIJU_ARTIFACT_BUCKET is missing; artifact routes cannot launch |
63
+ | manual | public route mode | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `public_route_mode` |
64
+ | manual | wrangler secret list confirms KAIJU_ORIGIN_URL, KAIJU_ORIGIN_SECRET, and KAIJU_STRIPE_WEBHOOK_SECRET | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `wrangler_secrets_verified` |
65
+ | manual | D1 migration 0001_paid_api.sql applied to the live billing database | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `d1_migration_applied` |
66
+ | manual | Stripe Checkout top-up products and webhook endpoint tested with metadata.kaiju_api_key_id | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `stripe_checkout_topup_staging` |
67
+ | manual | staging request passed through Worker to Gojira-B origin with model=kaiju-coder-7 | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `worker_to_gojira_staging_request` |
68
+ | manual | rollback command or route switch was exercised and recorded | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `rollback_exercised` |
69
+ | manual | p95 latency for paid routes is recorded after staging traffic | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `paid_route_latency` |
70
+
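The manual rows above each name a key in `release/paid-api-launch-evidence.json`. A rough sketch of listing the keys that still lack evidence is shown below. It assumes the evidence file is a flat JSON object keyed by those names and that any non-empty value counts as attached; the real file layout and the checks performed by `scripts/check_paid_api_readiness.py` may differ.

```python
import json
from pathlib import Path

# Evidence keys named in the Paid API Launch Blockers table.
REQUIRED_KEYS = [
    "public_route_mode",
    "wrangler_secrets_verified",
    "d1_migration_applied",
    "stripe_checkout_topup_staging",
    "worker_to_gojira_staging_request",
    "rollback_exercised",
    "paid_route_latency",
]


def load_evidence(path: str) -> dict:
    """Read the evidence file, treating a missing file as no evidence."""
    file = Path(path)
    return json.loads(file.read_text()) if file.exists() else {}


def missing_evidence(evidence: dict) -> list:
    """Return required keys with no value or an empty value (assumed rule)."""
    return [key for key in REQUIRED_KEYS if not evidence.get(key)]


evidence = load_evidence("release/paid-api-launch-evidence.json")
print(missing_evidence(evidence))
```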
71
+ ## Evidence Paths
72
+
73
+ | Evidence | Path |
74
+ |---|---|
75
+ | Completion audit | `release/COMPLETION_AUDIT.md` |
76
+ | Goal completion audit | `release/GOAL_COMPLETION_AUDIT.md` |
77
+ | Release evidence refresh runner | `scripts/refresh_kaiju_release_evidence.py` |
78
+ | Eval scoreboard | `release/EVAL_SCOREBOARD.md` |
79
+ | Public testing quickstart | `release/PUBLIC_TESTING_QUICKSTART.md` |
80
+ | Serving benchmarks | `release/SERVING_BENCHMARKS.md` |
81
+ | Hugging Face release draft | `release/HUGGINGFACE_RELEASE_DRAFT.md` |
82
+ | Hugging Face release bundle | `release/bundles/LATEST.md` |
83
+ | Hugging Face bundle integrity checker | `scripts/check_hf_release_bundle_integrity.py` |
84
+ | Hugging Face permission evidence template | `release/hf-release-permission-evidence.example.json` |
85
+ | Hugging Face permission evidence collector | `scripts/collect_hf_release_permission_evidence.py` |
86
+ | Hugging Face permission evidence checker | `scripts/check_hf_release_permission_evidence.py` |
87
+ | Merged-model metadata prep | `scripts/prepare_hf_merged_model_metadata.sh` |
88
+ | Human release review gate | `release/HUMAN_RELEASE_REVIEW.md` |
89
+ | Paid API readiness | `release/PAID_API_READINESS.md` |
90
+ | Paid API evidence collector | `scripts/collect_paid_api_launch_evidence.py` |
91
+ | Paid API launch evidence template | `release/paid-api-launch-evidence.example.json` |
92
+ | Cloudflare bindings template | `release/cloudflare-bindings.example.json` |
93
+ | Cloudflare bindings applier | `scripts/apply_paid_api_cloudflare_bindings.py` |
94
+ | Latest direct API smoke | `runs/benchmarks/20260603T193000Z-kaiju-coder-7-serving/summary.md` |
95
+ | Latest OpenCode customer pack | `runs/opencode-customer-readiness/20260603T185835Z/summary.md` |
96
+ | Latest public OpenCode smoke | `runs/public-opencode-smoke` |
97
+
98
+ ## What Richard Should Test First
99
+
100
+ ```bash
101
+ python3 scripts/check_kaiju_public_release_readiness.py --mode local
102
+ python3 scripts/install_kaiju_opencode_profile.py
103
+ mkdir -p /tmp/kaiju-public-smoke
104
+ opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-public-smoke --dangerously-skip-permissions 'Create hello.txt with exactly: Kaiju Coder 7 public smoke ok'
105
+ python3 scripts/run_kaiju_public_opencode_smoke.py
106
+ python3 scripts/run_kaiju_opencode_customer_pack.py --mode harnessed
107
+ bash scripts/prepare_hf_merged_model_metadata.sh
108
+ bash scripts/prepare_hf_release_staging.sh
109
+ python3 scripts/check_hf_staging_integrity.py --require-checksums
110
+ python3 scripts/create_hf_release_bundle.py
111
+ python3 scripts/check_hf_release_bundle_integrity.py
112
+ python3 scripts/check_kaiju_goal_completion.py --write
113
+ python3 scripts/refresh_kaiju_release_evidence.py --skip-opencode-smoke
114
+ python3 scripts/collect_hf_release_permission_evidence.py
115
+ # After HF repo-create permission is fixed:
116
+ python3 scripts/collect_hf_release_permission_evidence.py --apply --write
117
+ python3 scripts/check_hf_release_permission_evidence.py
118
+ python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
119
+ python3 scripts/check_kaiju_public_release_readiness.py --mode public
120
+ cp release/cloudflare-bindings.example.json release/cloudflare-bindings.json
121
+ # Replace placeholder D1/KV IDs in release/cloudflare-bindings.json first.
122
+ python3 scripts/apply_paid_api_cloudflare_bindings.py --bindings-file release/cloudflare-bindings.json
123
+ cp release/paid-api-launch-evidence.example.json release/paid-api-launch-evidence.json
124
+ python3 scripts/collect_paid_api_launch_evidence.py --help
125
+ python3 scripts/check_paid_api_readiness.py --mode launch --evidence-file release/paid-api-launch-evidence.json
126
+ ```
127
+
128
+ Do not expose the paid hosted API until `python3
129
+ scripts/check_paid_api_readiness.py --mode launch` has no failures and the
130
+ human release review explicitly approves public paid API launch.
131
+
132
+ ## Changed Files
133
+
134
+ `git status --short` currently reports `112` changed paths.
135
+
136
+ | State | Path |
137
+ |---|---|
138
+ | M | `.gitignore` |
139
+ | M | `LICENSE_NOTES.md` |
140
+ | M | `README.md` |
141
+ | M | `datasets/schema.json` |
142
+ | M | `docs/custom-harness.md` |
143
+ | M | `evals/BAKEOFF_CURRENT.md` |
144
+ | M | `evals/run_openai_compat_smoke.py` |
145
+ | M | `evals/run_router_static_checks.py` |
146
+ | M | `evals/tasks/router-hard-harness.jsonl` |
147
+ | M | `gateway/README.md` |
148
+ | M | `gateway/cloudflare-worker/README.md` |
149
+ | M | `gateway/cloudflare-worker/migrations/0001_paid_api.sql` |
150
+ | M | `gateway/cloudflare-worker/package.json` |
151
+ | M | `gateway/cloudflare-worker/src/index.js` |
152
+ | M | `gateway/cloudflare-worker/test/index.test.js` |
153
+ | M | `kaiju_harness/router.py` |
154
+ | M | `kaiju_harness/verification.py` |
155
+ | D | `models/README.md` |
156
+ | D | `models/qwen3.6-27b-base.md` |
157
+ | D | `models/qwen3.6-27b-fp8.md` |
158
+ | M | `prompts/kaiju-coder-api-system.md` |
159
+ | M | `prompts/kaiju-coder-speed-system.md` |
160
+ | M | `release/DATA_PROVENANCE_DRAFT.md` |
161
+ | M | `release/MODEL_CARD_DRAFT.md` |
162
+ | M | `scripts/build_sft_dataset.py` |
163
+ | M | `scripts/check-gojira-b-capacity.sh` |
164
+ | M | `scripts/run-gojira-b-qwen36-lora-eval.sh` |
165
+ | M | `scripts/run-gojira-b-qwen36-lora-sglang-eval.sh` |
166
+ | M | `scripts/run-gojira-b-qwen36-lora-train.sh` |
167
+ | M | `scripts/run_kaiju_api_harness_smoke.py` |
168
+ | M | `scripts/start-qwen36-lora-sglang.sh` |
169
+ | M | `scripts/stop-qwen36-lora-sglang.sh` |
170
+ | M | `scripts/validate_training_data.py` |
171
+ | M | `scripts/watch-gojira-b-qwen36-lora-train.sh` |
172
+ | ?? | `.opencode/` |
173
+ | ?? | `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl` |
174
+ | ?? | `datasets/v1.7-targets.json` |
175
+ | ?? | `evals/tasks/business-owner-v18-comparison.jsonl` |
176
+ | ?? | `evals/tasks/business-owner-v18-smoke.jsonl` |
177
+ | ?? | `evals/tasks/opencode-customer-readiness.jsonl` |
178
+ | ?? | `kaiju_harness/business_suite.py` |
179
+ | ?? | `release/COMPLETION_AUDIT.md` |
180
+ | ?? | `release/EVAL_SCOREBOARD.md` |
181
+ | ?? | `release/FINAL_RELEASE_REPORT.md` |
182
+ | ?? | `release/GOAL_COMPLETION_AUDIT.md` |
183
+ | ?? | `release/HF_ADAPTER_MODEL_CARD.md` |
184
+ | ?? | `release/HUGGINGFACE_RELEASE_DRAFT.md` |
185
+ | ?? | `release/HUMAN_RELEASE_REVIEW.md` |
186
+ | ?? | `release/LOCAL_TEST_INSTRUCTIONS.md` |
187
+ | ?? | `release/PAID_API_READINESS.md` |
188
+ | ?? | `release/PUBLIC_TESTING_QUICKSTART.md` |
189
+ | ?? | `release/QUANTIZATION_PLAN.md` |
190
+ | ?? | `release/SERVING_BENCHMARKS.md` |
191
+ | ?? | `release/SOURCE_INVENTORY.md` |
192
+ | ?? | `release/UPSTREAM_LICENSE_CHECK.md` |
193
+ | ?? | `release/bundles/` |
194
+ | ?? | `release/cloudflare-bindings.example.json` |
195
+ | ?? | `release/hf-release-permission-evidence.example.json` |
196
+ | ?? | `release/hf-release-permission-evidence.json` |
197
+ | ?? | `release/huggingface/` |
198
+ | ?? | `release/opencode/` |
199
+ | ?? | `release/paid-api-launch-evidence.example.json` |
200
+ | ?? | `release/quantized-runtime/` |
201
+ | ?? | `release/source-inventory.json` |
202
+ | ?? | `release/upstream/` |
203
+ | ?? | `scripts/apply_paid_api_cloudflare_bindings.py` |
204
+ | ?? | `scripts/benchmark_kaiju_serving.py` |
205
+ | ?? | `scripts/build_source_inventory.py` |
206
+ | ?? | `scripts/build_v17_business_owner_sft_dataset.py` |
207
+ | ?? | `scripts/check_hf_release_bundle_integrity.py` |
208
+ | ?? | `scripts/check_hf_release_permission_evidence.py` |
209
+ | ?? | `scripts/check_hf_release_permissions.sh` |
210
+ | ?? | `scripts/check_hf_staging_integrity.py` |
211
+ | ?? | `scripts/check_hf_uploaded_release.py` |
212
+ | ?? | `scripts/check_human_release_review.py` |
213
+ | ?? | `scripts/check_kaiju_goal_completion.py` |
214
+ | ?? | `scripts/check_kaiju_public_release_readiness.py` |
215
+ | ?? | `scripts/check_kaiju_quantization_prereqs.py` |
216
+ | ?? | `scripts/check_paid_api_readiness.py` |
217
+ | ?? | `scripts/collect_hf_release_permission_evidence.py` |
218
+ | ?? | `scripts/collect_paid_api_launch_evidence.py` |
219
+ | ?? | `scripts/create_hf_release_bundle.py` |
220
+ | ?? | `scripts/generate_kaiju_final_report.py` |
221
+ | ?? | `scripts/gojira-b-ssh-lib.sh` |
222
+ | ?? | `scripts/install_kaiju_opencode_profile.py` |
223
+ | ?? | `scripts/opencode-kaiju-no-autocontinue.mjs` |
224
+ | ?? | `scripts/prepare_hf_merged_model_metadata.sh` |
225
+ | ?? | `scripts/prepare_hf_release_staging.sh` |
226
+ | ?? | `scripts/prepare_paid_api_cloudflare_resources.sh` |
227
+ | ?? | `scripts/probe-gojira-b-kaiju-quantization.sh` |
228
+ | ?? | `scripts/refresh_kaiju_release_evidence.py` |
229
+ | ?? | `scripts/run-gojira-b-qwen36-lora-merge.sh` |
230
+ | ?? | `scripts/run-gojira-b-vllm-serving-benchmark.sh` |
231
+ | ?? | `scripts/run_kaiju_business_owner_rc_smoke.py` |
232
+ | ?? | `scripts/run_kaiju_opencode_customer_pack.py` |
233
+ | ?? | `scripts/run_kaiju_public_opencode_smoke.py` |
234
+ | ?? | `scripts/run_kaiju_quantized_opencode_smoke.sh` |
235
+ | ?? | `scripts/start-qwen36-merged-sglang.sh` |
236
+ | ?? | `scripts/start-qwen36-merged-vllm.sh` |
237
+ | ?? | `scripts/stop-qwen36-merged-sglang.sh` |
238
+ | ?? | `scripts/stop-qwen36-merged-vllm.sh` |
239
+ | ?? | `scripts/upload_hf_merged_model_from_gojira_b.sh` |
240
+ | ?? | `scripts/upload_hf_release_staging.sh` |
241
+ | ?? | `tests/test_kiyomi_business_suite.py` |
242
+ | ?? | `tests/test_release_package.py` |
243
+ | ?? | `tests/test_source_inventory.py` |
244
+ | ?? | `tests/test_training_provenance.py` |
245
+ | ?? | `tests/test_v17_business_dataset.py` |
246
+ | ?? | `training/configs/qwen36-27b-lora-v1.7.example.json` |
247
+ | ?? | `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json` |
248
+ | ?? | `training/scripts/qwen36_lora_merge.py` |
249
+ | ?? | `training/v1.7-business-owner-runbook.md` |
250
+
251
+ ## Commands Run During Report Generation
252
+
253
+ | Label | Command | Return code |
254
+ |---|---|---|
255
+ | git branch | `git branch --show-current` | 0 |
256
+ | git HEAD | `git rev-parse HEAD` | 0 |
257
+ | git origin/main | `git rev-parse origin/main` | 0 |
258
+ | git status | `git status --short` | 0 |
259
+ | local readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode local --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 0 |
260
+ | HF release readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode hf-release --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 0 |
261
+ | public readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode public --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 1 |
262
+ | HF staging integrity | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_hf_staging_integrity.py --staging-dir /tmp/kaiju-coder-7-hf-staging --require-checksums --json` | 0 |
263
+ | paid API launch readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_paid_api_readiness.py --mode launch --json` | 1 |
264
+
265
+ ## Report Safety
266
+
267
+ This generator intentionally avoids secret-bearing commands such as auth token
268
+ lists, environment dumps, process command-line scans, Wrangler secret lists, and
269
+ payment-provider credential output.
GOAL_COMPLETION_AUDIT.md ADDED
@@ -0,0 +1,60 @@
1
+ # Kaiju Coder 7 Goal Completion Audit
2
+
3
+ Generated: `2026-06-03T20:03:23Z`
4
+
5
+ Overall: `not complete`
6
+ Summary: `16 passed / 1 blocked / 0 manual`
7
+
8
+ This audit maps the active Kaiju Coder 7 objective to current evidence. It is stricter than local readiness: local public testing can pass while Hugging Face upload, human review, and paid API launch remain blocked.
9
+
10
+ ## Readiness Commands
11
+
12
+ | Check | Ready | Return Code |
13
+ |---|---:|---:|
14
+ | Local public-testing readiness | `True` | `0` |
15
+ | Hugging Face release readiness | `True` | `0` |
16
+ | Public launch readiness | `False` | `1` |
17
+ | Paid API scaffold | `True` | `0` |
18
+ | Paid API launch | `False` | `1` |
19
+ | HF staging integrity | `True` | `0` |
20
+ | HF namespace permission evidence | `True` | `0` |
21
+ | Human public review | `True` | `0` |
22
+
23
+ ## Requirement Audit
24
+
25
+ | Area | Requirement | Status | Evidence | Blocker |
26
+ |---|---|---|---|---|
27
+ | Identity | Product name is Kaiju Coder 7 and public/API model id is kaiju-coder-7. | `passed` | scripts/check_kaiju_public_release_readiness.py --mode local; release/PUBLIC_TESTING_QUICKSTART.md | |
28
+ | OpenCode | Lean Kaiju-specific OpenCode config/agent minimizes prompt overhead and disables synthetic auto-continue loops. | `passed` | .opencode/agents/kaiju-coder-7.md; scripts/opencode-kaiju-no-autocontinue.mjs; scripts/install_kaiju_opencode_profile.py | |
29
+ | OpenCode | opencode -m kaiju/kaiju-coder-7 works from this Mac with the recommended config. | `passed` | runs/public-opencode-smoke latest passing summary; scripts/run_kaiju_public_opencode_smoke.py | |
30
+ | OpenCode | Customer-readiness pack passes without wrong-directory output, fake compaction completion, missing files, or secret leakage. | `passed` | runs/opencode-customer-readiness/20260603T185835Z/summary.md | |
31
+ | Runtime | Direct API smoke passes using model=kaiju-coder-7. | `passed` | runs/benchmarks/20260603T193000Z-kaiju-coder-7-serving/summary.md | |
32
+ | Runtime | 12k, 16k, 24k, and 32k context benchmarks are recorded with a recommended default. | `passed` | release/SERVING_BENCHMARKS.md records 12288, 16384, 24576, 32768 and recommends 16k live default | |
33
+ | Runtime | SGLang and vLLM/practical faster serving path are benchmarked honestly. | `passed` | release/SERVING_BENCHMARKS.md; release/quantized-runtime/README.md | |
34
+ | Runtime | At least one public-friendly quantized/local candidate is working or clearly documented as blocked with evidence. | `passed` | release/quantized-runtime/README.md documents vLLM bitsandbytes runtime candidate and persisted-weights limitation | |
35
+ | Hugging Face | Public-friendly HF release structure is staged with adapter, OpenCode helper, runtime-quantized helper, model cards, provenance, evals, and docs. | `passed` | python3 scripts/check_hf_staging_integrity.py --require-checksums | |
36
+ | Hugging Face | At least one public Hugging Face release path is ready to upload or uploaded. | `passed` | python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release | |
37
+ | Hugging Face | Merged 51GB model repo upload is guarded and ready after human review/namespace permission. | `passed` | scripts/prepare_hf_merged_model_metadata.sh; scripts/upload_hf_merged_model_from_gojira_b.sh dry run | |
38
+ | Quality | Customer-style evals cover website, proposal, Stripe/payment, CRM/reporting, CSV/parser, Kiyomi operating pack, and safety/provenance. | `passed` | evals/tasks/opencode-customer-readiness.jsonl; runs/opencode-customer-readiness/20260603T185835Z/summary.md | |
39
+ | Quality | Model/harness prompts produce file-oriented business-owner artifacts rather than vague advice. | `passed` | kaiju_harness/business_suite.py; release/EVAL_SCOREBOARD.md | |
40
+ | Provenance | Training/eval provenance is preserved and public docs avoid internal checkpoint naming except license/provenance attribution. | `passed` | release/SOURCE_INVENTORY.md; release/DATA_PROVENANCE_DRAFT.md; release/PUBLIC_TESTING_QUICKSTART.md | |
41
+ | Paid API | Paid API scaffold covers API keys, Stripe billing, rate limits, logging controls, abuse controls, rollback plan, and pricing assumptions. | `passed` | python3 scripts/check_paid_api_readiness.py --mode scaffold; gateway/cloudflare-worker tests | |
42
+ | Paid API | Paid API is ready for public charging. | `blocked` | python3 scripts/check_paid_api_readiness.py --mode launch | Requires live D1/KV/R2 bindings, Wrangler secrets, Stripe staging evidence, Worker-to-Gojira staging request, rollback proof, latency evidence, and human approval. |
43
+ | Final Report | Final report includes exact commands run, eval results, changed files, remaining risks, and what Richard should test first. | `passed` | release/FINAL_RELEASE_REPORT.md | |
44
+
45
+ ## Blocking Items
46
+
47
+ - Paid API: the requirement "Paid API is ready for public charging" is blocked. It requires live D1/KV/R2 bindings, Wrangler secrets, Stripe staging evidence, a Worker-to-Gojira staging request, rollback proof, latency evidence, and human approval.
48
+
49
+ ## Commands To Re-run
50
+
51
+ ```bash
52
+ python3 scripts/check_kaiju_public_release_readiness.py --mode local
53
+ python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
54
+ python3 scripts/check_kaiju_public_release_readiness.py --mode public
55
+ python3 scripts/check_paid_api_readiness.py --mode scaffold
56
+ python3 scripts/check_paid_api_readiness.py --mode launch
57
+ python3 scripts/check_hf_staging_integrity.py --require-checksums
58
+ python3 scripts/check_hf_release_permission_evidence.py
59
+ python3 scripts/check_human_release_review.py --mode public
60
+ ```
LOCAL_TEST_INSTRUCTIONS.md ADDED
@@ -0,0 +1,147 @@
1
+ # Kaiju Coder 7 Local Test Instructions
2
+
3
+ Use these commands from the repo root. The public release name is Kaiju Coder 7. Internally, this build is backed by the v1.8 adapter under `runs/qwen36-27b-lora-v1.8-business-owner/adapter`. The release-candidate raw model path is the merged full model on Gojira-B at `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`. The deterministic harness commands work locally now; the SGLang commands require Gojira-B over Tailscale.
4
+
5
+ ## Run The Local Release-Candidate Gate
6
+
7
+ ```bash
8
+ python3 scripts/run_kaiju_business_owner_rc_smoke.py
9
+ ```
10
+
11
+ This validates reviewed data, checks v1.7 targets, builds the oversampled business-owner SFT file, smokes the local OpenAI-compatible harness API, runs the hard router suite, and runs static artifact checks.
12
+
13
+ For release status, read `release/COMPLETION_AUDIT.md` and `release/HUGGINGFACE_RELEASE_DRAFT.md`.
14
+
15
+ ## Merge The v1.8 Adapter
16
+
17
+ Use this if the merged full model must be rebuilt:
18
+
19
+ ```bash
20
+ KAIJU_LORA_ADAPTER=/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter \
21
+ KAIJU_MERGED_MODEL_DIR=/workspace/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged \
22
+ ./scripts/run-gojira-b-qwen36-lora-merge.sh
23
+ ```
24
+
25
+ ## Start Kaiju Coder 7 Serving
26
+
27
+ Use this for the current model-side candidate:
28
+
29
+ ```bash
30
+ KAIJU_QWEN36_MERGED_PORT=18083 \
31
+ KAIJU_QWEN36_MERGED_SESSION=kaiju_qwen36_v18_merged_sglang \
32
+ KAIJU_QWEN36_MERGED_CONTEXT=16384 \
33
+ KAIJU_QWEN36_MERGED_MEM_FRACTION=0.85 \
34
+ ./scripts/start-qwen36-merged-sglang.sh
35
+ ```
36
+
37
+ Confirm readiness:
38
+
39
+ ```bash
40
+ curl http://100.109.109.14:18083/v1/models
41
+ ```
42
+
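The same readiness check can be scripted. The sketch below assumes the endpoint returns the usual OpenAI-compatible list shape (`{"data": [{"id": ...}]}`); the helper names are hypothetical, and the network call is wrapped in a function so nothing is contacted on import.

```python
import json
import urllib.request


def model_ids(models_response: dict) -> list:
    """Extract model ids from an OpenAI-compatible /v1/models response."""
    return [item["id"] for item in models_response.get("data", [])]


def endpoint_serves(base_url: str, model_id: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint lists model_id. Performs a network call."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return model_id in model_ids(json.load(resp))


# Offline illustration with a hand-written response body.
sample = {"object": "list", "data": [{"id": "kaiju-coder-7", "object": "model"}]}
print(model_ids(sample))
```

With Gojira-B reachable over Tailscale, `endpoint_serves("http://100.109.109.14:18083/v1", "kaiju-coder-7")` performs the live check.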
43
+ The high-context `32768` target has benchmark evidence in
44
+ `release/SERVING_BENCHMARKS.md`, but the current restored Gojira-B endpoint is
45
+ parked at `16384` for reliable local/OpenCode testing after the quantized-vLLM
46
+ smoke work.
47
+
48
+ ## Prepare Merged-Model Hugging Face Metadata
49
+
50
+ Use this before any full merged-model upload review. It syncs release metadata
51
+ into the Gojira-B model folder but does not upload or read Hugging Face tokens.
52
+ If the remote merged folder is root-owned, the helper automatically uses
53
+ passwordless sudo for rsync without changing model ownership:
54
+
55
+ ```bash
56
+ bash scripts/prepare_hf_merged_model_metadata.sh
57
+ KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh
58
+ bash scripts/upload_hf_merged_model_from_gojira_b.sh
59
+ ```
60
+
61
+ ## Install And Smoke OpenCode
62
+
63
+ ```bash
64
+ python3 scripts/install_kaiju_opencode_profile.py
65
+ opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
66
+ --dir /tmp/kaiju-opencode-loopguard-smoke \
67
+ --dangerously-skip-permissions \
68
+ 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'
69
+ ```
70
+
71
+ The installer writes the `kaiju` provider, the lean `kaiju-coder-7` agent, and
72
+ the scoped no-autocontinue plugin at
73
+ `~/.config/opencode/kaiju-no-autocontinue.mjs`.
74
+
75
+ ## Run The Deterministic Harness Smoke
76
+
77
+ ```bash
78
+ python3 scripts/run_kaiju_api_harness_smoke.py
79
+ ```
80
+
81
+ ## Run A Direct Model Eval
82
+
83
+ ```bash
84
+ python3 evals/run_openai_compat_smoke.py \
85
+ --base-url http://100.109.109.14:18083/v1 \
86
+ --model kaiju-coder-7 \
87
+ --tasks evals/tasks/smoke.jsonl \
88
+ --max-tasks 1 \
89
+ --timeout 300 \
90
+ --max-tokens 768 \
91
+ --temperature 0 \
92
+ --disable-thinking \
93
+ --system-prompt-file prompts/kaiju-coder-api-system.md
94
+ ```
95
+
96
+ For the selected final business-owner checkpoint, run the focused v1.8
97
+ business-owner pack and then score it. Raw merged model generation is slow, so
98
+ use the harness for practical paid website delivery until broader raw website
99
+ evals pass at acceptable latency:
100
+
101
+ ```bash
102
+ python3 evals/run_openai_compat_smoke.py \
103
+ --base-url http://100.109.109.14:18083/v1 \
104
+ --model kaiju-coder-7 \
105
+ --tasks evals/tasks/business-owner-v18-comparison.jsonl \
106
+ --timeout 900 \
107
+ --max-tokens 2500 \
108
+ --temperature 0 \
109
+ --disable-thinking \
110
+ --stream \
111
+ --system-prompt-file prompts/kaiju-coder-api-system.md
112
+
113
+ python3 evals/score_quality_gate.py runs/evals/<merged-v18-run>/results.jsonl
114
+ ```
115
+
116
+ Current merged evidence:
117
+
118
+ - Probe: `1,155` visible chars in `60.17s`.
119
+ - Proposal rerun: `1/1` paid-ready, `4.0/4.0`, `4,014` chars in `212.72s`.
120
+ - Jah credits backend: `4.0/4.0`, `9,718` chars in `566.36s`.
121
+
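To put the three measurements above on one scale, the sketch below converts each to visible characters per second. The figures are copied from the list; the dictionary labels are informal.

```python
# (visible characters, wall-clock seconds) from the merged evidence list.
RUNS = {
    "probe": (1155, 60.17),
    "proposal rerun": (4014, 212.72),
    "Jah credits backend": (9718, 566.36),
}


def chars_per_second(chars: int, seconds: float) -> float:
    """Visible output characters divided by elapsed seconds."""
    return chars / seconds


rates = {name: chars_per_second(c, s) for name, (c, s) in RUNS.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} chars/s")
```

All three runs land below 20 characters per second, which is consistent with the note that raw merged generation is slow on this stack.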
122
+ ## Dynamic LoRA Serving Caveat
123
+
124
+ Do not use dynamic SGLang LoRA serving as release evidence for v1.8. The adapter-name-only path can be base-equivalent, and the corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes this SGLang build with a fused-module LoRA buffer shape mismatch. Use the merged full-model path above.
125
+
126
+ ## Run The Business-Owner Harness
127
+
128
+ ```bash
129
+ python3 evals/run_router_harness_eval.py --tasks evals/tasks/router-hard-harness.jsonl
130
+ python3 evals/run_router_static_checks.py runs/evals/<router-run>/results.jsonl
131
+ ```
132
+
133
+ ## Manual Prompt To Try First
134
+
135
+ ```text
136
+ Build me the full Kiyomi 7.7.7 AI company operating pack for a local business owner. I need the launch kit, website, content engine, connector checklist, intake CRM, money report, automations, operator handbook, lead generator, sales closer, ROI dashboard, and Workshop golden run. Make it owner-ready with no developer setup required.
137
+ ```
138
+
139
+ Expected shape:
140
+
141
+ - A project folder with multiple files, not advice only.
142
+ - Complete HTML where HTML is requested.
143
+ - Lead/sales CSVs.
144
+ - Connector verification gates.
145
+ - ROI audit gate.
146
+ - Workshop golden-run gate.
147
+ - Clear owner commands such as `/kiyomi` and `/kiyomi-do`.
PAID_API_READINESS.md ADDED
@@ -0,0 +1,266 @@
1
+ # Kaiju Coder 7 Paid API Readiness
2
+
3
+ Do not sell the hosted API as generally available until the gates below pass.
4
+
5
+ ## Current Position
6
+
7
+ Kaiju Coder 7 can be served locally through an OpenAI-compatible SGLang
8
+ endpoint. The reliable commercial product path is:
9
+
10
+ ```text
11
+ Kaiju Coder 7 model + deterministic business-owner harness + verifier + gateway controls
12
+ ```
13
+
14
+ Raw multi-file OpenCode generation is not yet fast enough to be the paid API
15
+ promise by itself. The harnessed customer-readiness pack passes and should be
16
+ the paid-route baseline until raw-agent generation improves.
17
+
18
+ ## Required Gateway Behavior
19
+
20
+ - Use model id `kaiju-coder-7`.
21
+ - Disable hidden thinking where the serving stack supports it.
22
+ - Stream responses for long outputs.
23
+ - Cap max output by route.
24
+ - Reject requests with secret-looking prompt content when possible.
25
+ - Never log API keys, bearer tokens, OAuth tokens, payment credentials, or full
26
+ private customer prompts by default.
27
+ - Keep request ids, customer id, route, token counts, latency, status, and coarse
28
+ failure reason.
29
+
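Two of the behaviors above, rejecting secret-looking prompts and keeping logs free of prompt content, can be sketched as follows. This is an illustration in Python, not the Worker's implementation: the regular expressions are example patterns chosen for this sketch, and the function and field names are hypothetical.

```python
import re

# Illustrative patterns only; a real gateway would maintain its own list.
SECRET_PATTERNS = [
    re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]{8,}"),  # Stripe-style secret key
    re.compile(r"\bBearer\s+[A-Za-z0-9._\-]{16,}"),     # bearer token in text
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]


def looks_secret(text: str) -> bool:
    """Return True if any example secret pattern appears in the text."""
    return any(pattern.search(text) for pattern in SECRET_PATTERNS)


def log_record(request_id, customer_id, route, prompt_tokens,
               completion_tokens, latency_ms, status, failure_reason=None):
    """Build a log entry holding only the metadata listed above, never the prompt."""
    return {
        "request_id": request_id,
        "customer_id": customer_id,
        "route": route,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "latency_ms": latency_ms,
        "status": status,
        "failure_reason": failure_reason,
    }


print(looks_secret("my key is sk_live_abcdefgh12345678"))
print(log_record("req-1", "cust-1", "/v1/chat/completions", 120, 300, 4200, 200))
```

Pattern matching of this kind is best-effort, which matches the "when possible" wording in the list.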
30
+ ## Billing And Access
31
+
32
+ - API keys must be scoped per customer/account.
33
+ - Stripe subscription or prepaid credit balance must be checked before serving.
34
+ - Rate limits must be per key and per account.
35
+ - Failed auth and rate-limit events should be logged without prompt content.
36
+ - Admin override keys must be separate from customer keys.
37
+
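As a rough sketch of "per key and per account" limits, a fixed-window counter can be applied at both scopes. The class below is an in-memory stand-in written for illustration; `FixedWindowLimiter` and `allow_request` are hypothetical names, and a deployed gateway would keep the counters in a shared store rather than a Python dict.

```python
import time


class FixedWindowLimiter:
    """Count requests per (scope, window) bucket and deny once the limit is hit."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counts = {}

    def allow(self, scope):
        # All requests in the same window share one counter.
        window = int(self.clock() // self.window_seconds)
        bucket = (scope, window)
        used = self.counts.get(bucket, 0)
        if used >= self.limit:
            return False
        self.counts[bucket] = used + 1
        return True


def allow_request(key_limiter, account_limiter, api_key_id, account_id):
    """Require both the per-key and the per-account window to have room.

    The key check runs first, so a key-level denial does not consume
    account-level quota.
    """
    return (key_limiter.allow(f"key:{api_key_id}")
            and account_limiter.allow(f"account:{account_id}"))
```

Fixed windows are simple to reason about but allow short bursts at window boundaries; that tradeoff is a design choice to confirm against the real limits.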
38
+ ## Current Gateway Scaffold Evidence
39
+
40
+ Local Worker scaffold:
41
+
42
+ - `gateway/cloudflare-worker/src/index.js`
43
+ - `gateway/cloudflare-worker/migrations/0001_paid_api.sql`
44
+ - `gateway/cloudflare-worker/test/index.test.js`
45
+
46
+ Verified on 2026-06-03 with:
47
+
48
+ ```bash
49
+ cd gateway/cloudflare-worker
50
+ npm run check
51
+ npm run preflight
52
+ ```
53
+
54
+ Result: `16/16` Worker tests passed and `17` paid API scaffold preflight checks
55
+ passed.
56
+ The scaffold preflight also checks that the guarded Cloudflare resource-prep
57
+ script, `scripts/prepare_paid_api_cloudflare_resources.sh`, is wired through
58
+ `npm run prepare:cloudflare`, and that the reviewed binding template is present.
59
+
60
+ Covered locally:
61
+
62
+ - missing bearer token returns `401`
63
+ - inactive API key returns `403`
64
+ - insufficient credits return `402` before origin fetch
65
+ - successful chat request forwards `x-kaiju-origin-secret` and debits credits
66
+ - origin fetch failure refunds credits
67
+ - fixed-window rate limit blocks before debit
68
+ - public chat payload is forced to model `kaiju-coder-7` with streaming on,
69
+ thinking disabled, and a token cap
70
+ - unsupported model is rejected before debit
71
+ - secret-looking prompt content is rejected before debit, origin fetch, or logs
72
+ - signed Stripe Checkout webhook credits prepaid balance
73
+ - duplicate Stripe Checkout webhook does not double-credit
74
+ - invalid Stripe signature is rejected
75
+ - origin-only artifact upload stores bounded text artifacts in R2
76
+ - authenticated artifact download is scoped to the caller's account namespace
77
+ - unsafe artifact paths are rejected before R2 storage
78
+ - secret-looking artifact content is rejected before R2 storage
79
+
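The ordering that the covered cases imply (auth, then rate limit, then credits, then origin, with a refund on origin failure) can be sketched as below. This is an illustration only: `Account` and `handle_chat` are hypothetical, and the `429` and `502` status codes and the handling of unknown keys are assumptions, since the list above names only `401`, `403`, and `402`.

```python
from dataclasses import dataclass


@dataclass
class Account:
    active: bool = True
    credits: int = 0


def handle_chat(token, keys, rate_ok, cost, call_origin):
    """Return (status, body); `keys` maps bearer token -> Account."""
    if not token:
        return 401, "missing bearer token"
    account = keys.get(token)
    if account is None:
        return 401, "unknown API key"       # assumption: treated like missing auth
    if not account.active:
        return 403, "inactive API key"
    if not rate_ok():
        return 429, "rate limited"          # assumption: blocks before any debit
    if account.credits < cost:
        return 402, "insufficient credits"  # denied before the origin is contacted
    account.credits -= cost
    try:
        body = call_origin()
    except Exception:
        account.credits += cost             # refund when the origin fetch fails
        return 502, "origin unavailable"    # assumption
    return 200, body
```

The point of the sketch is the order of checks: every denial path returns before the debit, and the only path that debits without serving gives the credits back.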
80
+ Executable preflight:
81
+
82
+ ```bash
83
+ python3 scripts/check_kaiju_public_release_readiness.py --mode local
84
+ python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
85
+ python3 scripts/check_kaiju_public_release_readiness.py --mode public
86
+ python3 scripts/generate_kaiju_final_report.py
87
+ python3 scripts/check_kaiju_goal_completion.py --write
88
+ python3 scripts/refresh_kaiju_release_evidence.py --skip-opencode-smoke
89
+ python3 scripts/check_hf_staging_integrity.py
90
+ python3 scripts/check_hf_release_bundle_integrity.py
91
+ python3 scripts/collect_hf_release_permission_evidence.py
92
+ python3 scripts/check_hf_release_permission_evidence.py
93
+ python3 scripts/check_human_release_review.py --mode local
94
+ python3 scripts/check_human_release_review.py --mode public
95
+ cd gateway/cloudflare-worker
96
+ npm run prepare:cloudflare
97
+ cd ../..
98
+ cp release/cloudflare-bindings.example.json release/cloudflare-bindings.json
99
+ # Replace placeholder D1/KV IDs in release/cloudflare-bindings.json first.
100
+ python3 scripts/apply_paid_api_cloudflare_bindings.py --bindings-file release/cloudflare-bindings.json
101
+ python3 scripts/check_paid_api_readiness.py --mode scaffold
102
+ python3 scripts/check_paid_api_readiness.py --mode launch
103
+ ```
104
+
105
+ `check_kaiju_public_release_readiness.py --mode local` is the consolidated
106
+ public-testing readiness command. It can pass while public upload and paid API
107
+ launch remain manual blockers. `--mode hf-release` checks the downloadable
108
+ model/helper release and requires sanitized Hugging Face namespace permission
109
+ evidence plus human review while keeping paid API launch manual. `--mode public`
110
+ must remain red until Hugging Face write permissions, live Cloudflare resources,
111
+ Stripe staging evidence, rollback proof, and human review are complete.
112
+
113
+ `generate_kaiju_final_report.py` writes `release/FINAL_RELEASE_REPORT.md` with
114
+ the current local/public readiness summaries, launch blockers, changed files,
115
+ commands run, and first commands Richard should test. It is part of the release
116
+ packet and does not inspect tokens, environment variables, or process command
117
+ lines.
118
+
119
+ `check_kaiju_goal_completion.py --write` writes
120
+ `release/GOAL_COMPLETION_AUDIT.md`, a stricter objective-level audit. It should
121
+ remain red while Hugging Face upload, human review, or live paid API launch
122
+ evidence are missing.
123
+
124
+ `refresh_kaiju_release_evidence.py` is a safe local refresh runner. It updates
125
+ direct API smoke evidence, goal audit, final report, HF staging, local bundle,
126
+ merged-model metadata on Gojira-B, and dry-run upload previews without reading
127
+ tokens or uploading anything.
128
+
129
+ `check_hf_staging_integrity.py` validates the staged Hugging Face package for
130
+ required files, public naming hygiene, raw secret-looking values, and staging
131
+ checksums. It does not upload, create repos, or print matched secret values.
132
+
133
+ `check_hf_release_permission_evidence.py` validates sanitized Hugging Face
134
+ repo-create evidence in `release/hf-release-permission-evidence.json`. Start
135
+ from `release/hf-release-permission-evidence.example.json` only after the
136
+ private permission probe succeeds, or use
137
+ `scripts/collect_hf_release_permission_evidence.py --apply --write` to run the
138
+ probe and write the sanitized evidence automatically. Never include raw auth
139
+ output or tokens.
140
+
141
+ `check_human_release_review.py` reads `release/HUMAN_RELEASE_REVIEW.md`. Local
142
+ mode may pass with pending/manual review fields; public mode must fail until
143
+ Richard changes the signoff fields to approved decisions.
144
+
145
+ `npm run prepare:cloudflare` is dry-run safe by default. It prints the exact
146
+ Wrangler commands for creating `KAIJU_BILLING_DB`, `KAIJU_RATE_LIMIT_KV`, and
147
+ `KAIJU_ARTIFACT_BUCKET`, applying the D1 migration, setting required secrets,
148
+ deploying, listing deployments, and exercising rollback. `npm run check` also
149
+ runs `npx wrangler deploy --dry-run` so the current Worker build path is validated
150
+ without publishing. Set
151
+ `KAIJU_CF_RESOURCE_APPLY=1` only when the intended Cloudflare account is active.
152
+
153
+ After real D1/KV/R2 resources exist, copy
154
+ `release/cloudflare-bindings.example.json` to `release/cloudflare-bindings.json`,
155
+ replace the placeholder IDs, and preview the reviewed config update:
156
+
157
+ ```bash
158
+ python3 scripts/apply_paid_api_cloudflare_bindings.py \
159
+ --bindings-file release/cloudflare-bindings.json
160
+ ```
161
+
162
+ The applier refuses placeholder values and secret-looking input. Only after the
163
+ preview is reviewed should it update `gateway/cloudflare-worker/wrangler.jsonc`:
164
+
165
+ ```bash
166
+ python3 scripts/apply_paid_api_cloudflare_bindings.py \
167
+ --bindings-file release/cloudflare-bindings.json \
168
+ --write
169
+ ```
170
+
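To illustrate what "refuses placeholder values" could mean, here is a small sketch. It is not the logic of `apply_paid_api_cloudflare_bindings.py`, which is not shown here; the flat JSON shape, the field names in the usage note, and the placeholder pattern are all assumptions.

```python
import json
import re

# Hypothetical notion of "placeholder": obvious template words or <angle> markers.
PLACEHOLDER = re.compile(r"(?i)replace|placeholder|changeme|<[^>]+>")


def validate_bindings(raw_json):
    """Return a list of problems; an empty list means the values look real."""
    data = json.loads(raw_json)
    problems = []
    for name, value in sorted(data.items()):
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{name}: missing value")
        elif PLACEHOLDER.search(value):
            problems.append(f"{name}: placeholder value")
    return problems
```

For example, a file containing `{"d1_database_id": "REPLACE_WITH_D1_ID"}` (a made-up field name) would yield one problem, and the caller would stop before touching `wrangler.jsonc`.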
171
+ `--mode scaffold` verifies the local gateway implementation and should pass.
172
+ `--mode launch` is stricter and should fail until real Cloudflare bindings,
173
+ Wrangler secrets, Stripe webhook evidence, staging traffic, latency evidence,
174
+ and rollback proof are attached.
175
+
176
+ Launch evidence is attached through a sanitized JSON file:
177
+
178
+ ```bash
179
+ cp release/paid-api-launch-evidence.example.json release/paid-api-launch-evidence.json
180
+ python3 scripts/collect_paid_api_launch_evidence.py --help
181
+ python3 scripts/check_paid_api_readiness.py --mode launch \
182
+ --evidence-file release/paid-api-launch-evidence.json
183
+ ```
184
+
185
+ Use `scripts/collect_paid_api_launch_evidence.py` to preview or write sanitized
186
+ launch evidence after staging resources exist. It can read the staging API key
187
+ from an environment variable for live probes, but it never writes the key, full
188
+ prompt, or model response to the evidence file. By default it prints a preview;
189
+ pass `--write` only after reviewing the target file path.
190
+
191
+ Only record secret names, route names, request ids, coarse latency numbers, and
192
+ pass/fail facts. Do not put raw API keys, bearer tokens, OAuth tokens, Stripe
193
+ secret keys, webhook signing secrets, tunnel credentials, full private prompts,
194
+ or customer private data in the evidence file. The checker scans the evidence
195
+ file for common secret-looking values and fails launch readiness if it finds
196
+ them.
197
+
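A scan of this kind can be sketched as a recursive walk over the evidence JSON that reports only where a secret-looking value sits, never the value itself. The pattern and the `find_secret_paths` helper are hypothetical; the real checker's rules are not shown in this document.

```python
import re

# Hypothetical pattern: Stripe-style key prefixes, bearer tokens, PEM headers.
SECRET_LIKE = re.compile(
    r"(sk|rk|whsec)_[A-Za-z0-9_]{16,}"
    r"|Bearer\s+\S{20,}"
    r"|-----BEGIN [A-Z ]*PRIVATE KEY-----"
)


def find_secret_paths(node, path="$"):
    """Return JSON paths of string values that look like secrets."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += find_secret_paths(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits += find_secret_paths(value, f"{path}[{index}]")
    elif isinstance(node, str) and SECRET_LIKE.search(node):
        hits.append(path)  # record the location only, not the matched text
    return hits
```

A launch check built on this would fail whenever the returned list is non-empty and print the paths so the file can be cleaned without echoing the secret.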
198
+ ## Minimum API Gates
199
+
200
+ | Gate | Required Evidence |
201
+ | --- | --- |
202
+ | Auth | Unauthorized requests fail; valid test key works |
203
+ | Billing | Unpaid/suspended account is denied before model call |
204
+ | Rate limit | Burst and daily caps work per key |
205
+ | Logging | Logs omit secrets and full private prompts |
206
+ | Abuse control | Secret-looking payloads and obviously unsafe automation requests are rejected or redacted |
207
+ | Artifacts | Origin-only R2 upload and account-scoped artifact download pass |
208
+ | Rollback | One command can route traffic back to previous stable model/harness |
209
+ | Latency | p95 for paid routes is documented and acceptable |
210
+ | Quality | Business-owner eval pack passes with complete files/artifacts |
211
+
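For the latency gate, p95 can be computed from recorded request latencies with the nearest-rank method. This is one common definition, chosen here as an assumption; the document does not say which percentile method the launch evidence uses.

```python
def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("no latency samples recorded")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, which avoids float rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]
```

With only a handful of samples, the p95 is effectively the slowest request, so a documented p95 is only meaningful once each paid route has a reasonable number of staged requests behind it.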
212
+ Current quality evidence:
213
+
214
+ - Harnessed customer-readiness pack:
215
+ `runs/opencode-customer-readiness/20260603T185835Z/summary.md`, `4/4`
216
+ passed, `28/28` required files written, including the release provenance and
217
+ safety review task.
218
+ - Restored 32k SGLang direct API smoke:
219
+ `runs/benchmarks/20260603T155233Z-kaiju-coder-7-serving/summary.md`,
220
+ identity passed in `2.92s`; business proposal passed in `94.28s` with
221
+ `1,737` chars.
222
+ - Runtime-quantized vLLM OpenCode smoke:
223
+ `bash scripts/run_kaiju_quantized_opencode_smoke.sh` passed at 16k after
224
+ vLLM launched with `--enable-auto-tool-choice`; OpenCode wrote
225
+ `hello.txt` with exactly `Kaiju Coder 7 quantized runtime ok`.
226
+ - Current restored 16k SGLang direct API smoke:
227
+ `runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`,
228
+ identity passed in `2.3s`.
229
+ - Raw OpenCode multi-file pack remains a blocker for raw-agent claims.
230
+
231
+ ## Pricing Assumptions To Validate
232
+
233
+ - Raw model tokens are slow and expensive enough that per-token pricing alone is
234
+ not the right first product.
235
+ - Better first API product: priced business-owner routes such as website pack,
236
+ proposal pack, ROI/report pack, and Kiyomi operating pack.
237
+ - Charge for complete artifacts and verified workflow output, with token usage
238
+ as an internal cost-control metric.
239
+
240
+ ## Release Blockers
241
+
242
+ - Raw OpenCode customer-readiness task currently times out on multi-file work.
243
+ - Harnessed customer-readiness route passes; paid API must route through that
244
+ deterministic product path until a faster raw/quantized path passes.
245
+ - Context-size benchmarks passed at 12k, 16k, 24k, and 32k, but the current
246
+ parked Gojira-B/OpenCode profile is 16k. Treat 32k as the high-context target
247
+ to re-confirm after restart before using it as a public default.
248
+ - Restored 32k business-document direct API smoke passed, but the `94.28s`
249
+ latency is too slow for ungated paid API use without streaming, queueing,
250
+ and route-level caps.
251
+ - vLLM serving has been tested at 16k, but it is not clearly faster than SGLang
252
+ and needs the Gojira nightly image plus text-only launch flags.
253
+ - Runtime-quantized vLLM bitsandbytes has passed 8k and 16k identity/code
254
+ smoke tests, passed a 16k business-document smoke in `53.44s`, and reduces
255
+ model memory to about `17.8 GiB`; its OpenCode one-file smoke now passes.
256
+ - Persisted quantized public weights are still pending.
257
+ - Hosted gateway scaffold now has local-tested API key, D1 prepaid credits,
258
+ fixed-window rate limit, model enforcement, secret-content rejection, and
259
+ signed Stripe webhook top-up behavior. It also has a sanitized launch-evidence
260
+ collector for the remaining staging proof. It is not live-paid ready until real
261
+ Cloudflare resources, Stripe products/webhook endpoint, deployment secrets,
262
+ sanitized launch evidence, and staging end-to-end requests pass.
263
+ - `python3 scripts/check_paid_api_readiness.py --mode launch` currently fails
264
+ by design because live D1/KV/R2 bindings and manual launch evidence are not
265
+ attached. This prevents local scaffold readiness from being mistaken for
266
+ paid public launch approval.
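
The signed Stripe webhook top-up behavior listed above can be sketched as follows. The signature check follows Stripe's documented scheme (an HMAC-SHA256 over `timestamp.payload`, compared against the `v1` entries of the `Stripe-Signature` header); the `credit_once` helper, the in-memory event set, and the 300-second tolerance are assumptions for illustration and are not taken from the Worker source.

```python
import hashlib
import hmac
import time


def verify_stripe_signature(payload, header, secret, tolerance=300, now=None):
    """Check a Stripe-Signature header of the form 't=...,v1=...'."""
    parts = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((value for key, value in parts if key == "t"), None)
    candidates = [value for key, value in parts if key == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False  # stale or future-dated event
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in candidates
    )


def credit_once(event_id, account_id, credits, seen_events, balances):
    """Apply a top-up only the first time an event id is seen."""
    if event_id in seen_events:
        return False  # duplicate delivery: do not double-credit
    seen_events.add(event_id)
    balances[account_id] = balances.get(account_id, 0) + credits
    return True
```

In a real deployment the seen-event record and the balance update would need to be written atomically in the billing database; the set and dict here only show the intended behavior.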
PUBLIC_TESTING_QUICKSTART.md ADDED
@@ -0,0 +1,149 @@
1
+ # Kaiju Coder 7 Public Testing Quickstart
2
+
3
+ Kaiju Coder 7 is the public model name. The OpenAI-compatible model id is:
4
+
5
+ ```text
6
+ kaiju-coder-7
7
+ ```
8
+
9
+ Use this guide for serious public testing. It avoids internal checkpoint names
10
+ and keeps the current limitations clear.
11
+
12
+ ## Pick A Test Path
13
+
14
+ ### Path 1: OpenCode Against An Existing Endpoint
15
+
16
+ Use this if you already have Kaiju Coder 7 served at an OpenAI-compatible
17
+ `/v1` endpoint.
18
+
19
+ ```bash
20
+ git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
21
+ cd kaiju-coder-7-opencode
22
+ python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18083/v1
23
+ ```
24
+
25
+ Then run OpenCode inside the project you want to edit:
26
+
27
+ ```bash
28
+ opencode -m kaiju/kaiju-coder-7 --agent kaiju-coder-7
29
+ ```
30
+
31
+ For a bounded smoke test:
32
+
33
+ ```bash
34
+ mkdir -p /tmp/kaiju-public-smoke
35
+ opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
36
+ --dir /tmp/kaiju-public-smoke \
37
+ "Create hello.txt with exactly: Kaiju Coder 7 is ready"
38
+ ```
39
+
40
+ Or run the packaged verifier, which checks the installer, live model endpoint,
41
+ OpenCode binary, actual file creation, and wrong-directory behavior:
42
+
43
+ ```bash
44
+ python3 scripts/run_kaiju_public_opencode_smoke.py
45
+ ```
46
+
47
+ The helper installer adds:
48
+
49
+ - the `kaiju` OpenAI-compatible provider
50
+ - the lean `kaiju-coder-7` OpenCode agent
51
+ - a scoped no-autocontinue plugin that prevents false completion loops after
52
+ compaction or output limits
53
+
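If you want to confirm the endpoint outside OpenCode, a direct call can be sketched with the Python standard library. This assumes the server follows the usual OpenAI-compatible `/v1/models` and `/v1/chat/completions` shapes and needs no auth header locally; the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18083/v1"  # same base URL as the installer example
MODEL_ID = "kaiju-coder-7"


def build_chat_request(prompt, max_tokens=256):
    """Build a non-streaming chat completion request for the local endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_model_ids(timeout=10):
    """Return the model ids the server advertises."""
    with urllib.request.urlopen(f"{BASE_URL}/models", timeout=timeout) as resp:
        return [model["id"] for model in json.load(resp)["data"]]


def ask(prompt, timeout=300):
    """Send one prompt and return the assistant text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Calling `list_model_ids()` first is a cheap way to check that `kaiju-coder-7` is the served id before running a longer prompt through `ask`.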
54
+ ### Path 2: Full Local Weights
55
+
56
+ Use this if the full `RMDWLLC/kaiju-coder-7` Hugging Face repo has been
57
+ uploaded and you have suitable local GPU hardware.
58
+
59
+ ```bash
60
+ hf download RMDWLLC/kaiju-coder-7 --local-dir ./kaiju-coder-7
61
+ ```
62
+
63
+ Serve the downloaded folder with an OpenAI-compatible local server. Configure
64
+ the server to expose:
65
+
66
+ ```text
67
+ model id: kaiju-coder-7
68
+ base URL: http://127.0.0.1:18083/v1
69
+ context: 16384
70
+ ```
71
+
72
+ Then install the OpenCode helper with:
73
+
74
+ ```bash
75
+ git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
76
+ cd kaiju-coder-7-opencode
77
+ python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18083/v1
78
+ ```
79
+
80
+ ### Path 3: Runtime-Quantized Local Candidate
81
+
82
+ Use this only if you are comfortable with advanced serving setups. The current
83
+ working quantized option is a runtime bitsandbytes recipe, not a separate
84
+ persisted quantized weights repo.
85
+
86
+ ```bash
87
+ git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-quantized-runtime
88
+ cd kaiju-coder-7-quantized-runtime
89
+ ```
90
+
91
+ Read `README.md` in that repo before serving. This path can reduce model memory
92
+ at runtime, but it still depends on access to the full Kaiju Coder 7 weights.
93
+
94
+ ## Recommended Test Prompt
95
+
96
+ Run this from an empty project folder:
97
+
98
+ ```text
99
+ Build a launch-ready local service business website and operating pack. Include
100
+ index.html, a Stripe checkout safety plan, a CSV parser with tests, a simple CRM
101
+ schema, a weekly money report, and a safety/provenance note. Write the files,
102
+ not just advice.
103
+ ```
104
+
105
+ Expected result:
106
+
107
+ - files are written in the requested project folder
108
+ - `index.html` is complete HTML
109
+ - business docs start with Markdown H1 headings
110
+ - code includes a test or smoke-check command where practical
111
+ - no fake API keys, OAuth tokens, payment secrets, or private customer data
112
+
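The expected results above can be spot-checked with a small script. This is a rough sketch, not the packaged verifier: `check_project` and its pattern are hypothetical, "complete HTML" is approximated by the presence of opening and closing `html` tags, and every `.md` file is treated as a business doc.

```python
import re
from pathlib import Path

# Hypothetical pattern for secret-looking values in generated files.
SECRET_LIKE = re.compile(
    r"sk_(live|test)_[A-Za-z0-9]{16,}|-----BEGIN [A-Z ]*PRIVATE KEY-----"
)


def check_project(root):
    """Return a list of problems found in the generated project folder."""
    root = Path(root)
    problems = []
    index = root / "index.html"
    if not index.is_file():
        problems.append("index.html missing")
    else:
        html = index.read_text(encoding="utf-8", errors="replace").lower()
        if "<html" not in html or "</html>" not in html:
            problems.append("index.html is not complete HTML")
    for doc in sorted(root.rglob("*.md")):
        text = doc.read_text(encoding="utf-8", errors="replace")
        lines = [line for line in text.splitlines() if line.strip()]
        if not lines or not lines[0].startswith("# "):
            problems.append(f"{doc.name} does not start with an H1")
    for path in sorted(root.rglob("*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            if SECRET_LIKE.search(text):
                problems.append(f"{path.name} contains a secret-looking value")
    return problems
```

An empty list does not prove the output is good; it only rules out the most obvious failures before a human reads the files.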
113
+ ## Current Recommended Defaults
114
+
115
+ - Public model id: `kaiju-coder-7`
116
+ - OpenCode context: `16384`
117
+ - Output cap for public testing: `2500`
118
+ - Current reliable product path: model plus deterministic business-owner
119
+ harness plus verifier
120
+ - Raw multi-file OpenCode generation: still too slow for broad paid API claims
121
+ - Paid API: not public until launch preflight passes
122
+
123
+ ## What Not To Claim Yet
124
+
125
+ Do not claim:
126
+
127
+ - that raw model weights alone reliably build every business-owner artifact
128
+ - that a paid hosted API is generally available
129
+ - that persisted quantized weights exist
130
+ - that 32k context is the current live default
131
+
132
+ Do claim:
133
+
134
+ - Kaiju Coder 7 has a working local/OpenCode release candidate
135
+ - the current tested OpenCode default is 16k context
136
+ - the helper package includes a lean agent and compaction loop guard
137
+ - the paid API scaffold has tests and a launch preflight, but is not yet public
138
+ - the packaged public smoke verifies a fresh OpenCode one-file write before
139
+ public claims are refreshed
140
+
141
+ ## Current Blockers Before Public Release
142
+
143
+ - Hugging Face repo creation still requires a write-capable token or namespace.
144
+ - Full merged model upload has not completed; the merged folder must first have
145
+ the metadata packet synced by `prepare_hf_merged_model_metadata.sh`.
146
+ - Public paid API launch needs real Cloudflare D1/KV/R2 bindings, Wrangler
147
+ secret verification, Stripe webhook staging evidence, staging traffic, latency
148
+ evidence, and rollback proof.
149
+ - Human review is still required before public upload.
QUANTIZATION_PLAN.md ADDED
@@ -0,0 +1,96 @@
1
+ # Kaiju Coder 7 Quantization Plan
2
+
3
+ Kaiju Coder 7 needs a public-friendly quantized variant before broad local
4
+ OpenCode release. The merged full model is too large and slow for most users.
5
+
6
+ ## Current Prerequisite Check
7
+
8
+ Checked on 2026-06-03 with:
9
+
10
+ ```bash
11
+ python3 scripts/check_kaiju_quantization_prereqs.py
12
+ ```
13
+
14
+ - Local Mac default Python: no `torch`, `transformers`, `safetensors`,
15
+ `llmcompressor`, `auto_gptq`, `autoawq`, or `bitsandbytes`.
16
+ - Gojira-B default Python: same libraries unavailable outside model-serving
17
+ containers.
18
+ - SGLang container: has `torch 2.9.1+cu130`, `transformers 5.3.0`,
19
+ `safetensors 0.7.0`, and `huggingface_hub 1.10.2`; missing
20
+ `llmcompressor`, `autoawq`, `auto_gptq`, and `bitsandbytes`.
21
+ - HF CLI: unavailable in the current shell.
22
+
23
+ This does not block quantization, but it means persistent weight quantization
24
+ must run in a pinned container or a dedicated virtual environment, not the
25
+ default shell.
26
+
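A prerequisite check of this kind can be approximated with `importlib.metadata`, which reports installed distributions and their versions for the current interpreter. This sketch is not `check_kaiju_quantization_prereqs.py`; the function name is hypothetical, and the names are treated as pip distribution names, which can differ from import names.

```python
from importlib import metadata

# Distribution names taken from the list above.
LIBRARIES = [
    "torch", "transformers", "safetensors",
    "llmcompressor", "auto_gptq", "autoawq", "bitsandbytes",
]


def installed_versions(names=LIBRARIES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found
```

Running it inside each environment (default shell, serving container, dedicated virtual environment) shows which of them can host a persistent quantization run.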
27
+ ## Gojira-B Probe Evidence
28
+
29
+ Run:
30
+
31
+ ```bash
32
+ ./scripts/probe-gojira-b-kaiju-quantization.sh
33
+ ```
34
+
35
+ Findings on 2026-06-03:
36
+
37
+ - Merged model artifact is `51G`.
38
+ - Architecture is `qwen3_5` / `Qwen3_5ForConditionalGeneration`.
39
+ - Text config uses both `linear_attention` and `full_attention` layers.
40
+ - Standard `vllm/vllm-openai:latest` cannot load the config because its
41
+ Transformers build does not recognize `qwen3_5`.
42
+ - `gojira/vllm-openai-ray:nightly` can load the config.
43
+ - vLLM serving requires the text-only launch path for this checkpoint because
44
+ the public text-serving merge does not include visual encoder weights.
45
+ - vLLM bitsandbytes runtime quantization works at 8k and 16k with the Gojira
46
+ nightly image, `pandas` preinstall, `--language-model-only`,
47
+ `--quantization bitsandbytes`, and `--load-format bitsandbytes`.
48
+ - The 16k runtime bitsandbytes business-document smoke passed:
49
+ `runs/benchmarks/20260603T161316Z-kaiju-coder-7-serving/summary.md`,
50
+ `53.44s`, `1,610` chars, `30.127` chars/s.
51
+ - The 16k runtime bitsandbytes OpenCode one-file smoke passed after adding
52
+ vLLM `--enable-auto-tool-choice`:
53
+ `bash scripts/run_kaiju_quantized_opencode_smoke.sh` wrote
54
+ `/tmp/kaiju-opencode-quantized-smoke/hello.txt` with exactly
55
+ `Kaiju Coder 7 quantized runtime ok`.
56
+
57
+ ## Candidate Order
58
+
59
+ 1. **FP8/AWQ-style GPU serving candidate**
60
+ - Best for hosted API or serious local GPU users.
61
+ - Benchmark against current SGLang merged full model.
62
+ - Must keep model id `kaiju-coder-7` in serving docs.
63
+ - Current working runtime candidate: vLLM bitsandbytes at `8192` and
64
+ `16384`, documented in `release/quantized-runtime/README.md`.
65
+
66
+ 2. **GGUF/llama.cpp candidate**
67
+ - Best for broad local distribution if the architecture converts cleanly.
68
+ - Publish only if a real local smoke test passes.
69
+
70
+ 3. **MLX candidate**
71
+ - Best for Apple Silicon users if conversion supports this architecture.
72
+ - Useful for Richard's local testing and Kiyomi/OpenCode demos.
73
+
74
+ ## Quantization Success Gate
75
+
76
+ A quantized candidate is not release-ready until it passes:
77
+
78
+ - `/v1/models` or local runtime identifies it as Kaiju Coder 7.
79
+ - Direct identity and code smoke pass.
80
+ - At least one business-owner document task passes.
81
+ - OpenCode one-file write smoke passes.
82
+ - Latency and memory are recorded in `release/SERVING_BENCHMARKS.md`.
83
+ - Model card states exact quantization format, hardware tested, and known
84
+ quality tradeoffs.
85
+
86
+ The runtime bitsandbytes candidate has passed the direct identity and code smoke
87
+ at 8k and 16k, a 16k business-owner document task, and an OpenCode one-file
88
+ write smoke. It can be documented as an advanced runtime-quantized OpenCode
89
+ path, but not as a public quantized-weights release.
90
+
91
+ ## Next Concrete Step
92
+
93
+ Create a pinned Docker/UV quantization environment on Gojira-B with the
94
+ Qwen3.5-capable Transformers/runtime stack plus one persistent-weight
95
+ quantization package at a time. Do not upload a quantized-weights repo until a
96
+ smoke-tested persisted artifact exists.
README.md ADDED
@@ -0,0 +1,121 @@
1
+ # Kaiju Coder 7 by Kiyomi - Adapter Model Card
2
+
3
+ This model card is for the LoRA adapter package, not a standalone base model.
4
+
5
+ ## Summary
6
+
7
+ Kaiju Coder 7 by Kiyomi is an RMDW/Kiyomi business-owner coding adapter trained on reviewed, RMDW-owned or RMDW-authored examples. It is designed for practical small-business build work: websites, proposals, intake/CRM flows, Stripe/payment implementation planning, reports, ROI dashboards, automations, operator handbooks, lead generation, sales follow-up, repo patches, and Kiyomi 7.7.7 style AI-company setup packs.
8
+
9
+ The current release-candidate product path is:
10
+
11
+ ```text
12
+ Qwen3.6-27B base
13
+ -> Kaiju v1.8 LoRA adapter
14
+ -> merged full-model artifact for raw local serving
15
+ -> Kaiju system prompt
16
+ -> deterministic business-owner harnesses
17
+ -> verifier/static checks
18
+ ```
19
+
20
+ Do not describe this package as raw weights alone producing every final artifact. The deterministic harness is part of the tested product path.
21
+
22
+ ## Base Model
23
+
24
+ - Base model: `Qwen/Qwen3.6-27B`
25
+ - Checked upstream revision: `6a9e13bd6fc8f0983b9b99948120bc37f49c13e9`
26
+ - Upstream license metadata: `apache-2.0`
27
+ - Upstream license copy: `release/upstream/qwen3.6-27b/LICENSE`
28
+
29
+ Attribution wording:
30
+
31
+ ```text
32
+ Kaiju Coder 7 by Kiyomi is fine-tuned from Qwen under Apache 2.0.
33
+ ```
34
+
35
+ Do not imply endorsement by Qwen, Alibaba, or upstream authors.
36
+
37
+ ## Adapter
38
+
39
+ - Adapter path: `runs/qwen36-27b-lora-v1.8-business-owner/adapter`
40
+ - Adapter type: LoRA / PEFT
41
+ - LoRA rank: `16`
42
+ - LoRA alpha: `32`
43
+ - LoRA dropout: `0.02`
44
+ - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
45
+ - Trainable parameter count: approximately `79.7M`
46
+
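For readers less familiar with LoRA, the rank and alpha above determine how strongly the adapter update is scaled: in the standard LoRA formulation the low-rank update is multiplied by `alpha / rank`. The sketch below works through that arithmetic; the layer dimensions in the usage note are hypothetical and are not the real Qwen layer sizes.

```python
def lora_scaling(alpha, rank):
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_trainable_params(d_in, d_out, rank):
    """Parameters added by one adapted linear layer: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# With the values listed above (rank 16, alpha 32) the update is scaled by 2.0.
scaling = lora_scaling(32, 16)
```

For a made-up `1024 x 1024` projection, `lora_trainable_params(1024, 1024, 16)` gives `32768` added parameters; summing that over every targeted module in every layer is how a total such as the approximate `79.7M` figure would arise.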
47
+ ## Merged Local Artifact
48
+
49
+ - Remote merged path: `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`
50
+ - Size: `51G`
51
+ - Shards: `14` safetensor shards plus tokenizer/config sidecars
52
+ - Served model name: `kaiju-coder-7`
53
+ - Merge script: `scripts/run-gojira-b-qwen36-lora-merge.sh`
54
+ - Serving script: `scripts/start-qwen36-merged-sglang.sh`
55
+
56
+ ## Training
57
+
58
+ - Dataset build: `datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl`
59
+ - Reviewed candidate examples: `1,689`
60
+ - SFT rows after controlled business-owner oversampling: `1,881`
61
+ - Train examples: `1,769`
62
+ - Eval examples: `112`
63
+ - Training runtime: `11666.7564s`
64
+ - Training loss: `0.9281658741335074`
65
+ - Max training length: `2048`
66
+ - Training config: `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json`
67
+
68
+ ## Data Provenance
69
+
70
+ Training data is source-backed and RMDW-owned or RMDW-authored. Client-site repositories are used only as generalized pattern/eval sources unless explicitly reviewed for training eligibility.
71
+
72
+ Relevant release files:
73
+
74
+ - `release/SOURCE_INVENTORY.md`
75
+ - `release/source-inventory.json`
76
+ - `release/DATA_PROVENANCE_DRAFT.md`
77
+ - `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl`
78
+
79
+ Excluded from training:
80
+
81
+ - Raw secrets, API keys, OAuth tokens, private keys, cookies, and credentials.
82
+ - Closed-model answers from OpenAI, Anthropic, Gemini, or similar providers as supervised completions unless terms clearly allow it.
83
+ - Private client data, customer notes, contracts, raw support logs, and client-specific website copy without explicit review and consent.
84
+
85
+ ## Evaluation Snapshot
86
+
87
+ Local product-path evidence:
88
+
89
+ - Unit tests: `65` passing.
90
+ - Full local RC smoke: passed.
91
+ - Router hard harness: `23/23`.
92
+ - Router static checks: `23/23`.
93
+ - Business-suite prompts: `2/2`.
94
+ - Local API harness: website and business-suite artifacts pass.
95
+
96
+ Merged serving evidence:
97
+
98
+ - Endpoint: `http://100.109.109.14:18083/v1`
99
+ - Served model: `kaiju-coder-7`
100
+ - Tested context: `32768` on Gojira-B, with `16384` documented as the lower-load fallback.
101
+ - Probe: `1,155` visible chars in `60.17s`.
102
+ - Proposal rerun: `1/1` paid-ready, `4.0/4.0`, `4,014` chars in `212.72s`.
103
+ - Jah credits backend: `4.0/4.0`, `9,718` chars in `566.36s`.
104
+ - OpenCode customer-readiness harness: `4/4` tasks passed, `28/28` required files written, including source/provenance and release-claim safety review.
105
+ - vLLM nightly serving probe: passed at `16384` after `pandas` preinstall and
106
+ `--language-model-only`, but it was not fast enough to replace SGLang.
107
+ - Runtime-quantized vLLM bitsandbytes: passed at `8192` and `16384`; 16k code
108
+ patch completed in `11.3s`, and logs reported about `17.8 GiB` model memory.
109
+
110
+ Known comparison caveat:
111
+
112
+ - Dynamic SGLang LoRA serving is not release evidence for this adapter: adapter-name-only output can be base-equivalent, and corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes with a fused-module LoRA buffer shape mismatch.
113
+ - Do not claim raw-weight superiority until broader base-Qwen and GLM/current-production comparisons are complete.
114
+
115
+ ## Limitations
116
+
117
+ - Raw full-website generation has not yet passed the merged-model release sweep and should remain harness-first for paid delivery.
118
+ - The deterministic harness remains the practical paid website workflow.
119
+ - The adapter needs a strong app layer for file editing, tool use, auth, billing, rate limits, logging, and rollback.
120
+ - Human review is still required before any public upload or paid production claim.
121
+ - Not intended for high-risk medical, legal, financial, or safety-critical decisions without expert review.
SERVING_BENCHMARKS.md ADDED
@@ -0,0 +1,358 @@
1
+ # Kaiju Coder 7 Serving Benchmarks
2
+
3
+ This file records serving evidence for public download and paid API decisions.
4
+ The model id must remain `kaiju-coder-7`.
5
+
6
+ ## Current Live Runtime
7
+
8
+ - Host: Gojira-B over Tailscale
9
+ - Base URL: `http://100.109.109.14:18083/v1`
10
+ - Serving stack: SGLang merged full model
11
+ - Current verified post-quantization restored context: `16384`
12
+ - Tested high-context target: `32768`
13
+ - Current container: `qwen36-merged-sglang-18083`
14
+ - Current caveat: direct raw generation is slow for multi-file OpenCode work.
15
+
16
+ ## Benchmark Command
17
+
18
+ For current-context latency without restart:
19
+
20
+ ```bash
21
+ python3 scripts/benchmark_kaiju_serving.py \
22
+ --contexts 12288 \
23
+ --prompts identity business_doc code_patch \
24
+ --max-tokens 768 \
25
+ --timeout 420
26
+ ```
27
+
28
+ For context restart benchmarking:
29
+
30
+ ```bash
31
+ python3 scripts/benchmark_kaiju_serving.py \
32
+ --restart \
33
+ --contexts 12288 16384 24576 32768 \
34
+ --prompts identity business_doc \
35
+ --max-tokens 768 \
36
+ --timeout 420 \
37
+ --ready-timeout 1200
38
+ ```
39
+
40
+ Use `--contexts 16384` for the current restored Gojira-B endpoint. Use
41
+ `32768` when explicitly testing the high-context target; it has passed earlier
42
+ benchmarks but should be re-confirmed after a fresh restart before calling it
43
+ the live default.
44
+
45
+ ## Current 12k Direct API Benchmark
46
+
47
+ Command:
48
+
49
+ ```bash
50
+ python3 scripts/benchmark_kaiju_serving.py \
51
+ --contexts 12288 \
52
+ --prompts identity code_patch \
53
+ --max-tokens 256 \
54
+ --timeout 300
55
+ ```
56
+
57
+ Run: `runs/benchmarks/20260603T135017Z-kaiju-coder-7-serving/summary.md`
58
+
59
+ | Context | Prompt | OK | Seconds | Chars | Chars/s |
60
+ | --- | --- | --- | ---: | ---: | ---: |
61
+ | 12288 | identity | True | 2.41 | 26 | 10.788 |
62
+ | 12288 | code_patch | True | 57.61 | 860 | 14.928 |
63
+
64
+ Interpretation: direct API calls are usable for short tasks, but latency is too
65
+ high for a paid raw-code API unless outputs are streamed and route-specific
66
+ limits are enforced.
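The `Chars/s` column in these tables is consistent with response length divided by wall-clock seconds, rounded to three decimals. A minimal sketch of that arithmetic follows; `chars_per_second` is a hypothetical helper name used for illustration, not a function confirmed to exist in `scripts/benchmark_kaiju_serving.py`.

```python
def chars_per_second(chars: int, seconds: float) -> float:
    """Return throughput in characters per second, rounded to 3 decimals."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return round(chars / seconds, 3)


# Rows copied from the 12k direct API table above: (prompt, chars, seconds).
rows = [("identity", 26, 2.41), ("code_patch", 860, 57.61)]
for prompt, chars, seconds in rows:
    print(prompt, chars_per_second(chars, seconds))
```

Because the identity prompt returns only 26 characters, its `Chars/s` value is dominated by fixed request overhead and is a weak throughput signal compared with the longer prompts.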
67
+
68
+ ## 16k Context Benchmark
69
+
70
+ 16k was tested to reduce OpenCode compaction pressure.
71
+
72
+ Commands:
73
+
74
+ ```bash
75
+ python3 scripts/benchmark_kaiju_serving.py \
76
+ --restart \
77
+ --contexts 16384 \
78
+ --prompts identity \
79
+ --max-tokens 128 \
80
+ --timeout 300 \
81
+ --ready-timeout 1200
82
+
83
+ python3 scripts/benchmark_kaiju_serving.py \
84
+ --contexts 16384 \
85
+ --prompts code_patch \
86
+ --max-tokens 128 \
87
+ --timeout 300
88
+ ```
89
+
90
+ Runs:
91
+
92
+ - `runs/benchmarks/20260603T135651Z-kaiju-coder-7-serving/summary.md`
93
+ - `runs/benchmarks/20260603T140318Z-kaiju-coder-7-serving/summary.md`
94
+
95
+ | Context | Prompt | OK | Load Wait | Seconds | Chars | Chars/s |
96
+ | --- | --- | --- | ---: | ---: | ---: | ---: |
97
+ | 16384 | identity | True | 354.16 | 14.9 | 26 | 1.745 |
98
+ | 16384 | code_patch | True | n/a | 28.99 | 416 | 14.35 |
99
+
100
+ Interpretation: `16384` is a stable lower-load fallback and still leaves more
101
+ room above OpenCode's prompt/tool overhead than the original 12k setting.
102
+
103
+ ## 24k And 32k Context Benchmarks
104
+
105
+ 24k and 32k were tested after 16k proved stable. Both loaded and returned the
106
+ same code-patch latency profile as 16k on the short patch benchmark.
107
+
108
+ Commands:
109
+
110
+ ```bash
111
+ python3 scripts/benchmark_kaiju_serving.py \
112
+ --restart \
113
+ --contexts 24576 \
114
+ --prompts identity \
115
+ --max-tokens 128 \
116
+ --timeout 300 \
117
+ --ready-timeout 1200
118
+
119
+ python3 scripts/benchmark_kaiju_serving.py \
120
+ --contexts 24576 \
121
+ --prompts code_patch \
122
+ --max-tokens 128 \
123
+ --timeout 300
124
+
125
+ python3 scripts/benchmark_kaiju_serving.py \
126
+ --restart \
127
+ --contexts 32768 \
128
+ --prompts identity \
129
+ --max-tokens 64 \
130
+ --timeout 300 \
131
+ --ready-timeout 1200
132
+
133
+ python3 scripts/benchmark_kaiju_serving.py \
134
+ --contexts 32768 \
135
+ --prompts code_patch \
136
+ --max-tokens 128 \
137
+ --timeout 300
138
+ ```
139
+
140
+ Runs:
141
+
142
+ - `runs/benchmarks/20260603T141559Z-kaiju-coder-7-serving/summary.md`
143
+ - `runs/benchmarks/20260603T142354Z-kaiju-coder-7-serving/summary.md`
144
+ - `runs/benchmarks/20260603T142439Z-kaiju-coder-7-serving/summary.md`
145
+ - `runs/benchmarks/20260603T143256Z-kaiju-coder-7-serving/summary.md`
146
+
147
+ | Context | Prompt | OK | Load Wait | Seconds | Chars | Chars/s |
148
+ | --- | --- | --- | ---: | ---: | ---: | ---: |
149
+ | 24576 | identity | True | 439.54 | 16.84 | 26 | 1.544 |
150
+ | 24576 | code_patch | True | n/a | 29.03 | 416 | 14.33 |
151
+ | 32768 | identity | True | 386.53 | 16.27 | 26 | 1.598 |
152
+ | 32768 | code_patch | True | n/a | 28.99 | 416 | 14.35 |
153
+
154
+ Interpretation: `32768` is a proven high-context target from this benchmark set,
155
+ but it is not the currently parked live endpoint after the later
156
+ quantized-runtime testing. The current Gojira-B/OpenCode profile should stay at
157
+ `16384` until `32768` is freshly restarted and re-confirmed. Keep `12288` for
158
+ direct API smoke tests and constrained hardware.
159
+
160
+ Restored-service 32k direct API smoke after vLLM testing:
161
+
162
+ - Run: `runs/benchmarks/20260603T155233Z-kaiju-coder-7-serving/summary.md`
163
+ - `/v1/models`: `kaiju-coder-7`, max model len `32768`
164
+
165
+ | Context | Prompt | OK | Seconds | Chars | Chars/s |
166
+ | --- | --- | --- | ---: | ---: | ---: |
167
+ | 32768 | identity | True | 2.92 | 26 | 8.904 |
168
+ | 32768 | business_doc | True | 94.28 | 1737 | 18.424 |
169
+
170
+ Interpretation: the restored default endpoint is usable for business-owner
171
+ document work, but a long proposal response still takes about 94 seconds. Paid
172
+ routes must stream, cap output, queue carefully, and prefer verified
173
+ artifact routes over raw open-ended generation.
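The streaming and output-cap requirements above can be sketched against an OpenAI-compatible chat endpoint. This is a minimal sketch, assuming the server exposes `/chat/completions` under the base URL and emits OpenAI-style server-sent events; the helper names (`build_payload`, `iter_stream_text`, `stream_chat`) are hypothetical and are not part of the repository scripts.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def build_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Chat-completions body with streaming on and a hard output cap."""
    return {
        "model": "kaiju-coder-7",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": True,
    }


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style server-sent-event lines."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        choices = json.loads(data).get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta", {}).get("content")
        if delta:
            yield delta


def stream_chat(base_url: str, prompt: str, timeout: float = 420.0) -> Iterator[str]:
    """POST to {base_url}/chat/completions and yield text as it arrives."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        yield from iter_stream_text(response)
```

With streaming, the caller sees the first tokens long before a 90-second response completes, and `max_tokens` bounds the worst-case time a single request can hold the GPU.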
174
+
175
+ ## OpenCode Customer-Readiness Evidence
176
+
177
+ Final small OpenCode smoke on the restored 32k service:
178
+
179
+ ```bash
180
+ opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
181
+ --dir /tmp/kaiju-opencode-32k-final-smoke \
182
+ 'Create hello.txt with exactly: Kaiju Coder 7 final 32k ok'
183
+ ```
184
+
185
+ Result: passed. OpenCode wrote `hello.txt` with exactly
186
+ `Kaiju Coder 7 final 32k ok`.
187
+
188
+ Current restored 16k OpenCode smoke after quantized-vLLM testing:
189
+
190
+ ```bash
191
+ mkdir -p /tmp/kaiju-opencode-fresh-public-smoke
192
+ opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
193
+ --dir /tmp/kaiju-opencode-fresh-public-smoke \
194
+ --dangerously-skip-permissions \
195
+ 'Create hello.txt with exactly: Kaiju Coder 7 fresh public smoke ok'
196
+ ```
197
+
198
+ Result: passed. OpenCode wrote `hello.txt` with exactly
199
+ `Kaiju Coder 7 fresh public smoke ok` in
200
+ `/tmp/kaiju-opencode-fresh-public-smoke`, and `/v1/models` returned
201
+ `kaiju-coder-7` with max model len `16384`.
202
+
203
+ Current restored 16k direct API identity smoke:
204
+
205
+ - Run: `runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`
206
+ - `/v1/models`: `kaiju-coder-7`, max model len `16384`
207
+
208
+ | Context | Prompt | OK | Seconds | Chars | Chars/s |
209
+ | --- | --- | --- | ---: | ---: | ---: |
210
+ | 16384 | identity | True | 2.3 | 26 | 11.304 |
211
+
212
+ Command:
213
+
214
+ ```bash
215
+ python3 scripts/run_kaiju_opencode_customer_pack.py
216
+ ```
217
+
218
+ Latest harnessed product-path result on 2026-06-03:
219
+
220
+ - Run: `runs/opencode-customer-readiness/20260603T185835Z/summary.md`
221
+ - Mode: `harnessed`
222
+ - Status: `4/4` passed
223
+ - Tasks:
224
+ - `fade-flow-service-site`
225
+ - `kiyomi-owner-operating-pack`
226
+ - `paid-api-safety-scaffold`
227
+ - `release-provenance-safety-review`
228
+ - Required files written: `28/28`
229
+ - Forbidden secret-looking tokens: none found by verifier
230
+
231
+ Loop-guarded OpenCode install smoke:
232
+
233
+ - Command: `python3 scripts/install_kaiju_opencode_profile.py`, then
234
+ `opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-loopguard-smoke --dangerously-skip-permissions 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'`
235
+ - Result: passed. OpenCode wrote `loopguard.txt` in the requested directory with
236
+ exactly `Kaiju Coder 7 loop guard installed` and exited cleanly.
237
+ - Installed guard: `/Users/richardecholsai7/.config/opencode/kaiju-no-autocontinue.mjs`
238
+
239
+ Raw OpenCode-agent result on 2026-06-03:
240
+
241
+ - Task: `fade-flow-service-site`
242
+ - Status: timed out after `900s`
243
+ - Required files written: `0`
244
+ - Observed Gojira-B decode throughput while running: about `4.4` tokens/sec
245
+ - Follow-up runner fix: workspaces now run outside the repo and pass `opencode
246
+ run --dir <workspace>` explicitly.
247
+ - Structured follow-up run:
248
+ `runs/opencode-customer-readiness/20260603T135520Z/results.jsonl`
249
+ timed out after `60s`, wrote `0` files, and recorded `pwd` as the intended
250
+ temp workspace.
251
+ - 16k/stricter-agent follow-up runs:
252
+ - `runs/opencode-customer-readiness/20260603T140650Z/results.jsonl`
253
+ timed out after `120s`, wrote `0` files, and recorded the intended temp
254
+ workspace.
255
+ - `runs/opencode-customer-readiness/20260603T140908Z/results.jsonl`
256
+ timed out after `120s`, wrote `0` files after adding stricter "write first
257
+ file immediately" prompt guidance.
258
+ - Interpretation: the lean OpenCode agent fits and can write small files.
259
+ Harnessed file-plan delivery passes the customer pack. Current raw multi-file
260
+ OpenCode generation is still not public/API ready, so public and paid claims
261
+ must describe the reliable product path as model plus deterministic harness
262
+ and verifier.
263
+
264
+ ## Recommendation Until Faster Serving Is Proven
265
+
266
+ - Public local release can proceed only with clear speed/hardware caveats.
267
+ - Paid API should route business-owner deliverables through deterministic
268
+ harnesses and verifiers, not raw OpenCode multi-file generation.
269
+ - Quantized candidates and/or a smaller distilled variant are required for
270
+ broad public OpenCode usability.
271
+
272
+ ## vLLM Serving Probe
273
+
274
+ vLLM was tested as the practical alternative serving path after SGLang. The
275
+ standard `vllm/vllm-openai:latest` image cannot read the merged checkpoint's
276
+ `qwen3_5` config. The Gojira nightly image can read it, but needed two launch
277
+ fixes for this checkpoint:
278
+
279
+ - preinstall `pandas`, because the Qwen3.5 model path imports it in this image
280
+ - pass `--language-model-only`, because the merged text-serving checkpoint does
281
+ not include the visual encoder weights expected by the multimodal config
282
+
283
+ Guarded benchmark command:
284
+
285
+ ```bash
286
+ KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_READY_TIMEOUT=900 \
287
+ ./scripts/run-gojira-b-vllm-serving-benchmark.sh
288
+ ```
289
+
290
+ Run: `runs/benchmarks/20260603T151244Z-kaiju-coder-7-serving/summary.md`
291
+
292
+ | Stack | Context | Prompt | OK | Seconds | Chars | Chars/s |
293
+ | --- | ---: | --- | --- | ---: | ---: | ---: |
294
+ | vLLM nightly | 16384 | identity | True | 19.99 | 26 | 1.301 |
295
+ | vLLM nightly | 16384 | code_patch | True | 28.8 | 416 | 14.444 |
296
+
297
+ Interpretation: vLLM now runs Kaiju Coder 7 at 16k, but it is not clearly
298
+ faster than SGLang on the current smoke prompts. Keep SGLang as the recommended
299
+ runtime because it has stable OpenCode smoke evidence, a simpler launch path,
300
+ and historical 32k proof. Keep the live/default OpenCode profile at 16k until
301
+ 32k is freshly re-confirmed. Keep the vLLM scripts for future nightly-image or
302
+ quantized-weight testing.
303
+
304
+ ## vLLM bitsandbytes Runtime-Quantized Candidate
305
+
306
+ The first working quantized local variant is a runtime bitsandbytes vLLM path.
307
+ It does not create separate quantized weights yet; it loads the full merged
308
+ model through vLLM's bitsandbytes loader.
309
+
310
+ Command:
311
+
312
+ ```bash
313
+ KAIJU_VLLM_CONTEXT=16384 \
314
+ KAIJU_VLLM_READY_TIMEOUT=1200 \
315
+ KAIJU_VLLM_QUANTIZATION=bitsandbytes \
316
+ KAIJU_VLLM_LOAD_FORMAT=bitsandbytes \
317
+ ./scripts/run-gojira-b-vllm-serving-benchmark.sh
318
+ ```
319
+
320
+ Runs:
321
+
322
+ - `runs/benchmarks/20260603T153257Z-kaiju-coder-7-serving/summary.md`
323
+ - `runs/benchmarks/20260603T154450Z-kaiju-coder-7-serving/summary.md`
324
+ - `runs/benchmarks/20260603T161316Z-kaiju-coder-7-serving/summary.md`
325
+ - `runs/benchmarks/20260603T165512Z-kaiju-coder-7-serving/summary.md`
326
+
327
+ | Stack | Context | Prompt | OK | Seconds | Chars | Chars/s |
328
+ | --- | ---: | --- | --- | ---: | ---: | ---: |
329
+ | vLLM bitsandbytes | 8192 | identity | True | 21.19 | 26 | 1.227 |
330
+ | vLLM bitsandbytes | 8192 | code_patch | True | 11.31 | 424 | 37.489 |
331
+ | vLLM bitsandbytes | 16384 | identity | True | 19.51 | 26 | 1.333 |
332
+ | vLLM bitsandbytes | 16384 | code_patch | True | 11.3 | 416 | 36.814 |
333
+ | vLLM bitsandbytes | 16384 | business_doc | True | 53.44 | 1610 | 30.127 |
334
+ | vLLM bitsandbytes | 16384 | identity | True | 19.65 | 26 | 1.323 |
335
+
336
+ Gojira-B vLLM logs reported about `17.8 GiB` model memory for the bitsandbytes
337
+ load at both 8k and 16k, compared with about `50.22 GiB` for the unquantized
338
+ vLLM load. Code-patch latency improved materially on this smoke prompt.
339
+ Business-document latency improved versus the restored 32k SGLang business-doc
340
+ smoke (`53.44s` at 16k vLLM bitsandbytes versus `94.28s` at 32k SGLang).
341
+ Identity latency remains slower than SGLang.
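The comparison above can be reduced to a few ratios. The sketch below uses only figures already reported in this file; `ratio` is a hypothetical helper. The business-doc comparison mixes stacks and context sizes (16k vLLM bitsandbytes versus 32k SGLang), so treat that ratio as indicative rather than a controlled measurement.

```python
def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator rounded to 2 decimals."""
    return round(numerator / denominator, 2)


# Code-patch latency at 16k: unquantized vLLM nightly vs vLLM bitsandbytes.
code_patch_speedup = ratio(28.8, 11.3)

# Business-doc latency: 32k SGLang vs 16k vLLM bitsandbytes (not like-for-like).
business_doc_speedup = ratio(94.28, 53.44)

# Model memory: bitsandbytes load as a fraction of the unquantized vLLM load.
memory_fraction = ratio(17.8, 50.22)

print(code_patch_speedup, business_doc_speedup, memory_fraction)
```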
342
+
343
+ Quantized OpenCode one-file smoke passed after launching vLLM with
344
+ `--enable-auto-tool-choice` plus `--tool-call-parser qwen3_coder` and running:
345
+
346
+ ```bash
347
+ bash scripts/run_kaiju_quantized_opencode_smoke.sh
348
+ ```
349
+
350
+ Result: OpenCode wrote `/tmp/kaiju-opencode-quantized-smoke/hello.txt` with
351
+ exactly `Kaiju Coder 7 quantized runtime ok`.
352
+
353
+ Recommendation: keep SGLang as the default public/OpenCode runtime and keep the
354
+ currently installed OpenCode profile at 16k unless the 32k target has just been
355
+ restarted and re-confirmed. Treat vLLM bitsandbytes as the current working
356
+ quantized local candidate for advanced GPU users and future paid API speed
357
+ experiments. It now has direct identity/code/business-doc evidence plus an
358
+ OpenCode one-file smoke, but it is not a persisted quantized-weights repo.
SOURCE_INVENTORY.md ADDED
@@ -0,0 +1,41 @@
1
+ # Kaiju Source Inventory
2
+
3
+ Generated from GitHub source-of-truth repositories plus the requested local RMDW wiki snapshot. This inventory defines what may become Kaiju training data, what is eval-only, and what must stay excluded.
4
+
5
+ ## Global Training Rules
6
+
7
+ - Do not train on raw secrets, API keys, OAuth tokens, cookies, private keys, or credential files.
8
+ - Do not train on closed-model responses from OpenAI, Anthropic, Gemini, or similar providers unless the terms clearly allow it.
9
+ - Do not train on client-specific private data without explicit review and consent.
10
+ - Preserve repository name, commit SHA, source path, license, and reviewer status for every promoted dataset row.
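The provenance rule in the last bullet can be checked mechanically before a row is promoted. This is a minimal sketch under assumed field names (`repo`, `commit_sha`, `source_path`, `license`, `reviewer_status`); the real row schema and the behavior of `scripts/validate_training_data.py` are not shown in this file, so the names here are hypothetical.

```python
# Assumed field names for illustration; the actual dataset schema may differ.
REQUIRED_PROVENANCE_FIELDS = (
    "repo",
    "commit_sha",
    "source_path",
    "license",
    "reviewer_status",
)


def missing_provenance(row: dict) -> list[str]:
    """Return the provenance fields that are absent or blank in a dataset row."""
    return [
        field
        for field in REQUIRED_PROVENANCE_FIELDS
        if not str(row.get(field) or "").strip()
    ]


example_row = {
    "repo": "RichardEchols/kaiju-coder",
    "commit_sha": "3d57eae92ad5",
    "source_path": "docs/example.md",  # hypothetical path
    "license": "",
}
print(missing_provenance(example_row))
```

A row with any missing field would be held back rather than promoted, which keeps every training example traceable to a reviewed source.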
11
+
12
+ ## GitHub Repository Inventory
13
+
14
+ | Repo | SHA | Role | Training use | Required gates | Exclusions | Notes |
15
+ |---|---|---|---|---|---|---|
16
+ | [RichardEchols/kaiju-coder](https://github.com/RichardEchols/kaiju-coder) | `3d57eae92ad5` | model lab, harness, evals, training scripts | candidate-after-review | secret-scan, closed-model-output-check, license-review | runs, models, .secrets, private datasets, raw logs | Use repo-owned harnesses, evals, docs, scripts, and curated datasets. Exclude weights, generated runs, and local secrets. |
17
+ | [RichardEchols/Kiyomi-7.7.7](https://github.com/RichardEchols/Kiyomi-7.7.7) | `294b31008135` | business-owner AI-company module contracts | candidate-after-review | secret-scan, closed-model-output-check, private-data-review | credentials, tokens, private client state, closed-model transcripts | Use module contracts, templates, acceptance gates, and owner-facing task structure as high-signal business-owner curriculum. |
18
+ | [RichardEchols/kiyomi-agent](https://github.com/RichardEchols/kiyomi-agent) | `b192c910f3f7` | business OS wrapper and local-agent patterns | candidate-after-review | secret-scan, closed-model-output-check, private-data-review | credentials, tokens, local runtime state, private support logs | Use architecture, docs, scripts, and safe wrapper patterns. Do not train on runtime secrets or private logs. |
19
+ | [RichardEchols/rmdw-site](https://github.com/RichardEchols/rmdw-site) | `df089dc3b2d3` | public RMDW offer, site, and conversion surface | candidate-after-review | secret-scan, closed-model-output-check, public-copy-review | environment files, deployment secrets, analytics tokens | Use public offer copy, app structure, pricing/CTA patterns, and website implementation patterns. |
20
+ | [RichardEchols/makotoair](https://github.com/RichardEchols/makotoair) | `7568f07fea6e` | client website implementation pattern | eval-and-patterns-only | secret-scan, client-data-review, consent-review | client-specific, contact data, contracts, private business details | Use as eval/pattern inspiration for local service business sites. Do not bulk-train on client-specific text without explicit review. |
21
+ | [RichardEchols/Mezzal-Construction](https://github.com/RichardEchols/Mezzal-Construction) | `e8f2eede0405` | client website implementation pattern | eval-and-patterns-only | secret-scan, client-data-review, consent-review | client-specific, contact data, contracts, private business details | Use as eval/pattern inspiration for premium contractor site work. Do not bulk-train on client-specific text without explicit review. |
22
+ | [RichardEchols/rmdw-agent-wiki](https://github.com/RichardEchols/rmdw-agent-wiki) | `ae1b8e85d3fe` | RMDW/Kiyomi operational wiki | selective-reference-only | secret-scan, credentials-redaction, private-data-review, closed-model-output-check | credentials.md, customers.md, raw, contracts, private client notes, support logs | Use only redacted strategy/product notes and documented decisions. Never use raw credentials or private client data. |
23
+
24
+ ## Local Source Inventory
25
+
26
+ Local files are context snapshots, not the source of truth. Promote local wiki material into training only after explicit review, redaction, and either sync/diff against the GitHub wiki or a documented reviewer exception.
27
+
28
+ | Source | Path | Git repo | Files | Training use | Required gates | Excluded paths present | Safe reference candidates | Notes |
29
+ |---|---|---:|---:|---|---|---|---|---|
30
+ | RMDW-Wiki-local | `/Users/richardecholsai7/Documents/RMDW-Wiki` | no | 93 | selective-reference-only | secret-scan, credentials-redaction, private-data-review, sync-or-diff-against-github | credentials.md, customers.md, customers/, raw/ | README.md, kaiju-coder-build-log.md, kaiju-coder-business-plan.md, kaiju-coder-soul.md, kiyomi-agent-build-log.md, pricing-history.md, product/kiyomi-private-ai-workstation.md, ops/product-ops-automation.md, client-acquisition-engine/README.md | Use as a local context snapshot only after explicit row-level review. Do not treat unsynced local files as the authoritative training source. |
31
+
32
+ ## Training Eligibility Meaning
33
+
34
+ - `candidate-after-review`: source can produce training or eval examples only after secret scanning, closed-model-output review, and row-level provenance.
35
+ - `eval-and-patterns-only`: use for hard eval prompts, harness behavior, screenshots, or generalized patterns. Do not bulk-train on client-specific source text.
36
+ - `selective-reference-only`: use narrowly after redaction. Treat credentials, customer notes, and raw operational data as excluded by default.
37
+ - Local snapshots require review against the GitHub source of truth before promotion into dataset rows.
38
+
39
+ ## Next Dataset Step
40
+
41
+ Generate candidate examples only from reviewed paths, attach this inventory SHA or local snapshot data to each row, then run `scripts/validate_training_data.py` before any training run.
UPSTREAM_LICENSE_CHECK.md ADDED
@@ -0,0 +1,38 @@
1
+ # Upstream License Check
2
+
3
+ Date: 2026-06-03
4
+
5
+ This is an engineering release check, not legal advice.
6
+
7
+ ## Base Model
8
+
9
+ - Upstream model: `Qwen/Qwen3.6-27B`
10
+ - Hugging Face URL: `https://huggingface.co/Qwen/Qwen3.6-27B`
11
+ - Checked revision from Hugging Face API: `6a9e13bd6fc8f0983b9b99948120bc37f49c13e9`
12
+ - Hugging Face license metadata: `apache-2.0`
13
+ - Local license copy: `release/upstream/qwen3.6-27b/LICENSE`
14
+ - Common upstream notice files checked: `NOTICE`, `NOTICE.txt`, `NOTICE.md`
15
+ - Notice result: no common notice file found at the checked upstream paths
16
+
17
+ ## Release Obligations To Preserve
18
+
19
+ - Include the upstream Apache 2.0 license with the adapter release package.
20
+ - Keep the upstream base model name and revision in the model card.
21
+ - State that Kaiju Coder is fine-tuned from Qwen; do not imply Qwen, Alibaba, or upstream-author endorsement.
22
+ - Include a modification note for the LoRA adapter and RMDW/Kiyomi training/eval package.
23
+ - Retain warranty and limitation language through the included Apache 2.0 license.
24
+
25
+ ## Current Packaging Status
26
+
27
+ Passed for release review:
28
+
29
+ - Upstream license file copied locally.
30
+ - Upstream revision recorded.
31
+ - Upstream license metadata recorded.
32
+ - Notice check recorded.
33
+
34
+ Still requires human release review:
35
+
36
+ - Confirm no upstream files changed before upload.
37
+ - Confirm the final Hugging Face repository includes the copied license file and model card.
38
+ - Confirm public wording avoids endorsement or trademark confusion.
adapter_config.json ADDED
@@ -0,0 +1,48 @@
1
+ {
2
+ "alora_invocation_tokens": null,
3
+ "alpha_pattern": {},
4
+ "arrow_config": null,
5
+ "auto_mapping": null,
6
+ "base_model_name_or_path": "/workspace/kaiju-coder/models/Qwen3.6-27B",
7
+ "bias": "none",
8
+ "corda_config": null,
9
+ "ensure_weight_tying": false,
10
+ "eva_config": null,
11
+ "exclude_modules": null,
12
+ "fan_in_fan_out": false,
13
+ "inference_mode": true,
14
+ "init_lora_weights": true,
15
+ "layer_replication": null,
16
+ "layers_pattern": null,
17
+ "layers_to_transform": null,
18
+ "loftq_config": {},
19
+ "lora_alpha": 32,
20
+ "lora_bias": false,
21
+ "lora_dropout": 0.02,
22
+ "lora_ga_config": null,
23
+ "megatron_config": null,
24
+ "megatron_core": "megatron.core",
25
+ "modules_to_save": null,
26
+ "peft_type": "LORA",
27
+ "peft_version": "0.19.1",
28
+ "qalora_group_size": 16,
29
+ "r": 16,
30
+ "rank_pattern": {},
31
+ "revision": null,
32
+ "target_modules": [
33
+ "up_proj",
34
+ "v_proj",
35
+ "q_proj",
36
+ "o_proj",
37
+ "down_proj",
38
+ "gate_proj",
39
+ "k_proj"
40
+ ],
41
+ "target_parameters": null,
42
+ "task_type": "CAUSAL_LM",
43
+ "trainable_token_indices": null,
44
+ "use_bdlora": null,
45
+ "use_dora": false,
46
+ "use_qalora": false,
47
+ "use_rslora": false
48
+ }
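For readers interpreting this config: with `use_rslora` set to false, standard LoRA scales the low-rank update by `lora_alpha / r`, while rank-stabilized LoRA would use `lora_alpha / sqrt(r)`. The sketch below computes that factor from the three relevant fields; `lora_scaling` is a hypothetical helper written for illustration, not a PEFT API.

```python
import json
import math

# Only the fields that determine the scaling factor, copied from the config above.
config = json.loads('{"lora_alpha": 32, "r": 16, "use_rslora": false}')


def lora_scaling(cfg: dict) -> float:
    """Return the multiplier applied to the low-rank update B @ A."""
    if cfg.get("use_rslora"):
        return cfg["lora_alpha"] / math.sqrt(cfg["r"])
    return cfg["lora_alpha"] / cfg["r"]


print(lora_scaling(config))
```

For this adapter the factor is 32 / 16, so the learned update is applied at twice its raw magnitude.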
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6e4c8842d3e7cc98303a0eab68c6c24cd0bb526e95834395e828d2440a929c85
3
+ size 318835672
chat_template.jinja ADDED
@@ -0,0 +1,154 @@
1
+ {%- set image_count = namespace(value=0) %}
2
+ {%- set video_count = namespace(value=0) %}
3
+ {%- macro render_content(content, do_vision_count, is_system_content=false) %}
4
+ {%- if content is string %}
5
+ {{- content }}
6
+ {%- elif content is iterable and content is not mapping %}
7
+ {%- for item in content %}
8
+ {%- if 'image' in item or 'image_url' in item or item.type == 'image' %}
9
+ {%- if is_system_content %}
10
+ {{- raise_exception('System message cannot contain images.') }}
11
+ {%- endif %}
12
+ {%- if do_vision_count %}
13
+ {%- set image_count.value = image_count.value + 1 %}
14
+ {%- endif %}
15
+ {%- if add_vision_id %}
16
+ {{- 'Picture ' ~ image_count.value ~ ': ' }}
17
+ {%- endif %}
18
+ {{- '<|vision_start|><|image_pad|><|vision_end|>' }}
19
+ {%- elif 'video' in item or item.type == 'video' %}
20
+ {%- if is_system_content %}
21
+ {{- raise_exception('System message cannot contain videos.') }}
22
+ {%- endif %}
23
+ {%- if do_vision_count %}
24
+ {%- set video_count.value = video_count.value + 1 %}
25
+ {%- endif %}
26
+ {%- if add_vision_id %}
27
+ {{- 'Video ' ~ video_count.value ~ ': ' }}
28
+ {%- endif %}
29
+ {{- '<|vision_start|><|video_pad|><|vision_end|>' }}
30
+ {%- elif 'text' in item %}
31
+ {{- item.text }}
32
+ {%- else %}
33
+ {{- raise_exception('Unexpected item type in content.') }}
34
+ {%- endif %}
35
+ {%- endfor %}
36
+ {%- elif content is none or content is undefined %}
37
+ {{- '' }}
38
+ {%- else %}
39
+ {{- raise_exception('Unexpected content type.') }}
40
+ {%- endif %}
41
+ {%- endmacro %}
42
+ {%- if not messages %}
43
+ {{- raise_exception('No messages provided.') }}
44
+ {%- endif %}
45
+ {%- if tools and tools is iterable and tools is not mapping %}
46
+ {{- '<|im_start|>system\n' }}
47
+ {{- "# Tools\n\nYou have access to the following functions:\n\n<tools>" }}
48
+ {%- for tool in tools %}
49
+ {{- "\n" }}
50
+ {{- tool | tojson }}
51
+ {%- endfor %}
52
+ {{- "\n</tools>" }}
53
+ {{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
54
+ {%- if messages[0].role == 'system' %}
55
+ {%- set content = render_content(messages[0].content, false, true)|trim %}
56
+ {%- if content %}
57
+ {{- '\n\n' + content }}
58
+ {%- endif %}
59
+ {%- endif %}
60
+ {{- '<|im_end|>\n' }}
61
+ {%- else %}
62
+ {%- if messages[0].role == 'system' %}
63
+ {%- set content = render_content(messages[0].content, false, true)|trim %}
64
+ {{- '<|im_start|>system\n' + content + '<|im_end|>\n' }}
65
+ {%- endif %}
66
+ {%- endif %}
67
+ {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
68
+ {%- for message in messages[::-1] %}
69
+ {%- set index = (messages|length - 1) - loop.index0 %}
70
+ {%- if ns.multi_step_tool and message.role == "user" %}
71
+ {%- set content = render_content(message.content, false)|trim %}
72
+ {%- if not(content.startswith('<tool_response>') and content.endswith('</tool_response>')) %}
73
+ {%- set ns.multi_step_tool = false %}
74
+ {%- set ns.last_query_index = index %}
75
+ {%- endif %}
76
+ {%- endif %}
77
+ {%- endfor %}
78
+ {%- if ns.multi_step_tool %}
79
+ {{- raise_exception('No user query found in messages.') }}
80
+ {%- endif %}
81
+ {%- for message in messages %}
82
+ {%- set content = render_content(message.content, true)|trim %}
83
+ {%- if message.role == "system" %}
84
+ {%- if not loop.first %}
85
+ {{- raise_exception('System message must be at the beginning.') }}
86
+ {%- endif %}
87
+ {%- elif message.role == "user" %}
88
+ {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
89
+ {%- elif message.role == "assistant" %}
90
+ {%- set reasoning_content = '' %}
91
+ {%- if message.reasoning_content is string %}
92
+ {%- set reasoning_content = message.reasoning_content %}
93
+ {%- else %}
94
+ {%- if '</think>' in content %}
95
+ {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
96
+ {%- set content = content.split('</think>')[-1].lstrip('\n') %}
97
+ {%- endif %}
98
+ {%- endif %}
99
+ {%- set reasoning_content = reasoning_content|trim %}
100
+ {%- if (preserve_thinking is defined and preserve_thinking is true) or (loop.index0 > ns.last_query_index) %}
101
+ {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n\n' + content }}
102
+ {%- else %}
103
+ {{- '<|im_start|>' + message.role + '\n' + content }}
104
+ {%- endif %}
105
+ {%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %}
106
+ {%- for tool_call in message.tool_calls %}
107
+ {%- if tool_call.function is defined %}
108
+ {%- set tool_call = tool_call.function %}
109
+ {%- endif %}
110
+ {%- if loop.first %}
111
+ {%- if content|trim %}
112
+ {{- '\n\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
113
+ {%- else %}
114
+ {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
115
+ {%- endif %}
116
+ {%- else %}
117
+ {{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
118
+ {%- endif %}
119
+ {%- if tool_call.arguments is defined %}
120
+ {%- for args_name, args_value in tool_call.arguments|items %}
121
+ {{- '<parameter=' + args_name + '>\n' }}
122
+ {%- set args_value = args_value | string if args_value is string else args_value | tojson | safe %}
123
+ {{- args_value }}
124
+ {{- '\n</parameter>\n' }}
125
+ {%- endfor %}
126
+ {%- endif %}
127
+ {{- '</function>\n</tool_call>' }}
128
+ {%- endfor %}
129
+ {%- endif %}
130
+ {{- '<|im_end|>\n' }}
131
+ {%- elif message.role == "tool" %}
132
+ {%- if loop.previtem and loop.previtem.role != "tool" %}
133
+ {{- '<|im_start|>user' }}
134
+ {%- endif %}
135
+ {{- '\n<tool_response>\n' }}
136
+ {{- content }}
137
+ {{- '\n</tool_response>' }}
138
+ {%- if not loop.last and loop.nextitem.role != "tool" %}
139
+ {{- '<|im_end|>\n' }}
140
+ {%- elif loop.last %}
141
+ {{- '<|im_end|>\n' }}
142
+ {%- endif %}
143
+ {%- else %}
144
+ {{- raise_exception('Unexpected message role.') }}
145
+ {%- endif %}
146
+ {%- endfor %}
147
+ {%- if add_generation_prompt %}
148
+ {{- '<|im_start|>assistant\n' }}
149
+ {%- if enable_thinking is defined and enable_thinking is false %}
150
+ {{- '<think>\n\n</think>\n\n' }}
151
+ {%- else %}
152
+ {{- '<think>\n' }}
153
+ {%- endif %}
154
+ {%- endif %}
cloudflare-bindings.example.json ADDED
@@ -0,0 +1,16 @@
1
+ {
2
+ "d1_database": {
3
+ "binding": "KAIJU_BILLING_DB",
4
+ "database_name": "kaiju_api_billing",
5
+ "database_id": "replace_with_real_d1_database_id"
6
+ },
7
+ "kv_namespace": {
8
+ "binding": "KAIJU_RATE_LIMIT_KV",
9
+ "id": "replace_with_real_kv_namespace_id"
10
+ },
11
+ "r2_bucket": {
12
+ "binding": "KAIJU_ARTIFACT_BUCKET",
13
+ "bucket_name": "kaiju-api-artifacts"
14
+ },
15
+ "workers_dev": false
16
+ }
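Before a copy of this example is used for real, the `replace_with_` placeholders need to be filled in. A minimal sketch of a placeholder scan follows; `find_placeholders` is a hypothetical helper, and it is not the check implemented in `scripts/apply_paid_api_cloudflare_bindings.py`, whose logic may differ.

```python
from typing import Any, Iterator

PLACEHOLDER_PREFIX = "replace_with_"


def find_placeholders(node: Any, path: str = "") -> Iterator[str]:
    """Yield dotted paths of string values that still hold a placeholder."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and node.startswith(PLACEHOLDER_PREFIX):
        yield path


# Values copied from the example bindings file above.
bindings = {
    "d1_database": {
        "binding": "KAIJU_BILLING_DB",
        "database_name": "kaiju_api_billing",
        "database_id": "replace_with_real_d1_database_id",
    },
    "kv_namespace": {
        "binding": "KAIJU_RATE_LIMIT_KV",
        "id": "replace_with_real_kv_namespace_id",
    },
    "r2_bucket": {
        "binding": "KAIJU_ARTIFACT_BUCKET",
        "bucket_name": "kaiju-api-artifacts",
    },
    "workers_dev": False,
}
print(list(find_placeholders(bindings)))
```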
hf-release-permission-evidence.example.json ADDED
@@ -0,0 +1,9 @@
1
+ {
2
+ "status": "pending",
3
+ "checked_at": "replace_with_utc_timestamp",
4
+ "namespace": "RMDWLLC",
5
+ "authenticated_user": "replace_with_hf_username",
6
+ "probe_repo": "RMDWLLC/kaiju-coder-7-permission-probe",
7
+ "command": "KAIJU_HF_PERMISSION_PROBE_APPLY=1 bash scripts/check_hf_release_permissions.sh",
8
+ "result": "replace_with_private_model_repo_create_result"
9
+ }
paid-api-launch-evidence.example.json ADDED
@@ -0,0 +1,61 @@
+ {
+   "public_route_mode": {
+     "status": "pending",
+     "checked_at": "2026-06-03T00:00:00Z",
+     "exposure_mode": "custom_domain",
+     "route": "https://api.example.com",
+     "result": "custom domain resolves to the intended Kaiju Worker"
+   },
+   "wrangler_secrets_verified": {
+     "status": "pending",
+     "checked_at": "2026-06-03T00:00:00Z",
+     "command": "wrangler secret list",
+     "observed_names": [
+       "KAIJU_ORIGIN_URL",
+       "KAIJU_ORIGIN_SECRET",
+       "KAIJU_STRIPE_WEBHOOK_SECRET"
+     ],
+     "notes": "Record only secret names. Never include secret values."
+   },
+   "d1_migration_applied": {
+     "status": "pending",
+     "checked_at": "2026-06-03T00:00:00Z",
+     "command": "wrangler d1 migrations apply kaiju-billing --remote",
+     "migration": "0001_paid_api.sql",
+     "result": "success"
+   },
+   "stripe_checkout_topup_staging": {
+     "status": "pending",
+     "checked_at": "2026-06-03T00:00:00Z",
+     "mode": "test",
+     "webhook_event": "checkout.session.completed",
+     "credited_api_key_id": "key_staging_001",
+     "idempotency_checked": true,
+     "notes": "Do not include Stripe secret keys or webhook signing secrets."
+   },
+   "worker_to_gojira_staging_request": {
+     "status": "pending",
+     "checked_at": "2026-06-03T00:00:00Z",
+     "route": "/v1/chat/completions",
+     "model": "kaiju-coder-7",
+     "http_status": 200,
+     "streamed": true,
+     "request_id": "req_staging_001",
+     "notes": "Do not include bearer tokens or full private prompts."
+   },
+   "rollback_exercised": {
+     "status": "pending",
+     "checked_at": "2026-06-03T00:00:00Z",
+     "command": "wrangler rollback",
+     "result": "success"
+   },
+   "paid_route_latency": {
+     "status": "pending",
+     "checked_at": "2026-06-03T00:00:00Z",
+     "route": "/v1/chat/completions",
+     "sample_count": 5,
+     "p95_ms": 90000,
+     "max_acceptable_ms": 120000,
+     "notes": "Use staging traffic. Record coarse metrics only."
+   }
+ }
scripts/apply_paid_api_cloudflare_bindings.py ADDED
@@ -0,0 +1,162 @@
+ #!/usr/bin/env python3
+ """Apply reviewed Cloudflare D1/KV/R2 bindings to the Kaiju Worker config.
+
+ The script is preview-only by default. It accepts resource IDs/names, not
+ secrets, and refuses placeholder or secret-looking values.
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import re
+ import sys
+ from pathlib import Path
+ from typing import Any
+
+
+ ROOT = Path(__file__).resolve().parents[1]
+ DEFAULT_BINDINGS = ROOT / "release/cloudflare-bindings.json"
+ DEFAULT_WRANGLER = ROOT / "gateway/cloudflare-worker/wrangler.jsonc"
+ SECRET_PATTERNS = [
+     ("openai_api_key", re.compile(r"\bsk-[A-Za-z0-9][A-Za-z0-9_-]{20,}\b")),
+     ("anthropic_api_key", re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}\b")),
+     ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
+     ("stripe_webhook_secret", re.compile(r"\bwhsec_[A-Za-z0-9]{16,}\b")),
+     ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
+     ("github_token", re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{22,})\b")),
+     ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{24,}={0,2}\b", re.IGNORECASE)),
+     ("private_key_block", re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC |DSA )?PRIVATE KEY-----")),
+ ]
+
+
+ def strip_jsonc(text: str) -> str:
+     text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
+     lines = []
+     for line in text.splitlines():
+         in_string = False
+         escaped = False
+         output = []
+         index = 0
+         while index < len(line):
+             char = line[index]
+             nxt = line[index + 1] if index + 1 < len(line) else ""
+             if char == "\\" and in_string:
+                 escaped = not escaped
+                 output.append(char)
+             elif char == '"' and not escaped:
+                 in_string = not in_string
+                 output.append(char)
+             elif char == "/" and nxt == "/" and not in_string:
+                 break
+             else:
+                 escaped = False
+                 output.append(char)
+             index += 1
+         lines.append("".join(output))
+     return re.sub(r",\s*([}\]])", r"\1", "\n".join(lines))
+
+
+ def load_json_or_jsonc(path: Path) -> dict[str, Any]:
+     text = path.read_text(encoding="utf-8")
+     try:
+         return json.loads(strip_jsonc(text))
+     except json.JSONDecodeError as exc:
+         raise SystemExit(f"{path} is not valid JSON/JSONC: {exc}") from exc
+
+
+ def secret_findings(payload: Any) -> list[str]:
+     rendered = json.dumps(payload, sort_keys=True)
+     return sorted({label for label, pattern in SECRET_PATTERNS if pattern.search(rendered)})
+
+
+ def safe_value(value: Any, *, name: str, pattern: str, allow_placeholder: bool = False) -> str:
+     text = str(value or "").strip()
+     if not text:
+         raise SystemExit(f"Missing required Cloudflare binding value: {name}")
+     if not allow_placeholder and text.startswith("replace_with_"):
+         raise SystemExit(f"Refusing placeholder Cloudflare binding value for {name}: {text}")
+     if not re.fullmatch(pattern, text):
+         raise SystemExit(f"Unsafe Cloudflare binding value for {name}: {text!r}")
+     return text
+
+
+ def build_bindings(raw: dict[str, Any]) -> dict[str, Any]:
+     findings = secret_findings(raw)
+     if findings:
+         raise SystemExit("Refusing secret-looking Cloudflare binding input: " + ", ".join(findings))
+
+     d1 = raw.get("d1_database") or {}
+     kv = raw.get("kv_namespace") or {}
+     r2 = raw.get("r2_bucket") or {}
+
+     result: dict[str, Any] = {
+         "d1_databases": [
+             {
+                 "binding": safe_value(d1.get("binding", "KAIJU_BILLING_DB"), name="d1_database.binding", pattern=r"[A-Z0-9_]{3,64}"),
+                 "database_name": safe_value(d1.get("database_name", "kaiju_api_billing"), name="d1_database.database_name", pattern=r"[A-Za-z0-9._-]{3,128}"),
+                 "database_id": safe_value(d1.get("database_id"), name="d1_database.database_id", pattern=r"[A-Za-z0-9_-]{12,128}"),
+             }
+         ],
+         "kv_namespaces": [
+             {
+                 "binding": safe_value(kv.get("binding", "KAIJU_RATE_LIMIT_KV"), name="kv_namespace.binding", pattern=r"[A-Z0-9_]{3,64}"),
+                 "id": safe_value(kv.get("id"), name="kv_namespace.id", pattern=r"[A-Za-z0-9_-]{12,128}"),
+             }
+         ],
+         "r2_buckets": [
+             {
+                 "binding": safe_value(r2.get("binding", "KAIJU_ARTIFACT_BUCKET"), name="r2_bucket.binding", pattern=r"[A-Z0-9_]{3,64}"),
+                 "bucket_name": safe_value(r2.get("bucket_name", "kaiju-api-artifacts"), name="r2_bucket.bucket_name", pattern=r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]"),
+             }
+         ],
+     }
+     if "workers_dev" in raw:
+         if not isinstance(raw["workers_dev"], bool):
+             raise SystemExit("workers_dev must be true or false")
+         result["workers_dev"] = raw["workers_dev"]
+     return result
+
+
+ def apply_bindings(config: dict[str, Any], bindings: dict[str, Any]) -> dict[str, Any]:
+     updated = dict(config)
+     for key in ["d1_databases", "kv_namespaces", "r2_buckets"]:
+         updated[key] = bindings[key]
+     if "workers_dev" in bindings:
+         updated["workers_dev"] = bindings["workers_dev"]
+     return updated
+
+
+ def parse_args() -> argparse.Namespace:
+     parser = argparse.ArgumentParser(description=__doc__)
+     parser.add_argument("--bindings-file", type=Path, default=DEFAULT_BINDINGS)
+     parser.add_argument("--wrangler-config", type=Path, default=DEFAULT_WRANGLER)
+     parser.add_argument("--out", type=Path, help="Write preview to this file instead of stdout. With --write, defaults to --wrangler-config.")
+     parser.add_argument("--write", action="store_true", help="Update wrangler.jsonc. Default is preview only.")
+     return parser.parse_args()
+
+
+ def main() -> int:
+     args = parse_args()
+     raw = load_json_or_jsonc(args.bindings_file)
+     config = load_json_or_jsonc(args.wrangler_config)
+     updated = apply_bindings(config, build_bindings(raw))
+     rendered = json.dumps(updated, indent=2, sort_keys=False) + "\n"
+
+     if args.write:
+         target = args.out or args.wrangler_config
+         target.parent.mkdir(parents=True, exist_ok=True)
+         target.write_text(rendered, encoding="utf-8")
+         print(f"Wrote reviewed Cloudflare bindings to {target}")
+     elif args.out:
+         args.out.parent.mkdir(parents=True, exist_ok=True)
+         args.out.write_text(rendered, encoding="utf-8")
+         print(f"Wrote preview Cloudflare config to {args.out}")
+     else:
+         print(rendered, end="")
+         print("Preview only. Pass --write to update wrangler.jsonc.", file=sys.stderr)
+     return 0
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
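Reviewer note (not part of the commit): the ordering of the checks in `safe_value` matters, because a placeholder such as `replace_with_real_kv_namespace_id` would otherwise satisfy the ID pattern. The sketch below is a simplified, hypothetical stand-in that raises `ValueError` instead of `SystemExit`; the names `check_binding_value`, `try_value`, and `ID_PATTERN` are illustrative assumptions, not functions from the script.

```python
import re

# Same character class the script uses for D1/KV IDs.
ID_PATTERN = r"[A-Za-z0-9_-]{12,128}"


def check_binding_value(value, *, name, pattern):
    """Hypothetical stand-in for safe_value; raises ValueError, not SystemExit."""
    text = str(value or "").strip()
    if not text:
        raise ValueError(f"missing value: {name}")
    # The placeholder check runs before the pattern check on purpose:
    # "replace_with_..." strings are valid under ID_PATTERN.
    if text.startswith("replace_with_"):
        raise ValueError(f"placeholder value for {name}: {text}")
    if not re.fullmatch(pattern, text):
        raise ValueError(f"unsafe value for {name}: {text!r}")
    return text


def try_value(value):
    """Return the accepted value, or a short rejection message."""
    try:
        return check_binding_value(value, name="kv_namespace.id", pattern=ID_PATTERN)
    except ValueError as exc:
        return f"rejected: {exc}"


if __name__ == "__main__":
    for candidate in ("0123456789abcdef0123456789abcdef", "replace_with_real_kv_namespace_id", "short", None):
        print(repr(candidate), "->", try_value(candidate))
```

Raising an exception rather than exiting is only a convenience for experimenting in a REPL; the script itself exits on the first unsafe value.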
scripts/check_hf_release_permission_evidence.py ADDED
@@ -0,0 +1,164 @@
+ #!/usr/bin/env python3
+ """Validate sanitized Hugging Face repo-create permission evidence.
+
+ The evidence file must not contain tokens or credentials. It records only that
+ the authenticated account successfully created a private model repo probe for
+ the intended namespace.
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import re
+ from dataclasses import asdict, dataclass
+ from pathlib import Path
+ from typing import Any
+
+
+ ROOT = Path(__file__).resolve().parents[1]
+ DEFAULT_EVIDENCE = ROOT / "release/hf-release-permission-evidence.json"
+ EXAMPLE_EVIDENCE = ROOT / "release/hf-release-permission-evidence.example.json"
+ SECRET_PATTERNS = [
+     ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
+     ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{24,}={0,2}\b", re.IGNORECASE)),
+     ("github_token", re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{22,})\b")),
+     ("openai_api_key", re.compile(r"\bsk-[A-Za-z0-9][A-Za-z0-9_-]{20,}\b")),
+     ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
+     ("private_key_block", re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC |DSA )?PRIVATE KEY-----")),
+ ]
+
+
+ @dataclass
+ class Check:
+     name: str
+     status: str
+     detail: str
+
+
+ def read_json(path: Path) -> tuple[dict[str, Any], Check | None]:
+     if not path.is_file():
+         return {}, Check("HF permission evidence file", "fail", f"missing file: {path}")
+     text = path.read_text(encoding="utf-8")
+     findings = sorted({label for label, pattern in SECRET_PATTERNS if pattern.search(text)})
+     if findings:
+         return {}, Check("HF permission evidence file", "fail", "secret-looking values found: " + ", ".join(findings))
+     try:
+         payload = json.loads(text)
+     except json.JSONDecodeError as exc:
+         return {}, Check("HF permission evidence file", "fail", f"invalid JSON: {exc}")
+     if not isinstance(payload, dict):
+         return {}, Check("HF permission evidence file", "fail", f"{path} must contain a JSON object")
+     return payload, Check("HF permission evidence file", "pass", f"loaded sanitized evidence from {path}")
+
+
+ def text_field(payload: dict[str, Any], field: str) -> str:
+     return str(payload.get(field) or "").strip()
+
+
+ def has_placeholder(value: str) -> bool:
+     return "replace_with_" in value.lower()
+
+
+ def validate(path: Path) -> list[Check]:
+     payload, file_check = read_json(path)
+     checks = [file_check] if file_check else []
+     if not payload:
+         return checks
+
+     required = ["status", "checked_at", "namespace", "authenticated_user", "probe_repo", "command", "result"]
+     missing = [field for field in required if not text_field(payload, field)]
+     if missing:
+         checks.append(Check("HF permission evidence fields", "fail", "missing fields: " + ", ".join(missing)))
+         return checks
+     checks.append(Check("HF permission evidence fields", "pass", f"{len(required)} required fields present"))
+
+     placeholders = [field for field in required if has_placeholder(text_field(payload, field))]
+     if placeholders:
+         checks.append(Check("HF permission placeholders", "fail", "replace placeholder values: " + ", ".join(placeholders)))
+     else:
+         checks.append(Check("HF permission placeholders", "pass", "no placeholder values remain"))
+
+     namespace = text_field(payload, "namespace")
+     user = text_field(payload, "authenticated_user")
+     probe_repo = text_field(payload, "probe_repo")
+     command = text_field(payload, "command")
+     result = text_field(payload, "result").lower()
+     checked_at = text_field(payload, "checked_at")
+
+     if payload.get("status") != "pass":
+         checks.append(Check("HF permission status", "fail", f"status is {payload.get('status')!r}; expected pass"))
+     else:
+         checks.append(Check("HF permission status", "pass", "repo-create permission probe passed"))
+
+     if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9_.-]{1,95}", namespace):
+         checks.append(Check("HF namespace format", "fail", f"unsafe namespace: {namespace!r}"))
+     elif probe_repo != f"{namespace}/kaiju-coder-7-permission-probe":
+         checks.append(Check("HF probe repo", "fail", f"probe_repo must be {namespace}/kaiju-coder-7-permission-probe"))
+     else:
+         checks.append(Check("HF probe repo", "pass", probe_repo))
+
+     if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9_.-]{1,95}", user):
+         checks.append(Check("HF authenticated user format", "fail", f"unsafe authenticated_user: {user!r}"))
+     else:
+         checks.append(Check("HF authenticated user format", "pass", user))
+
+     if not re.match(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$", checked_at):
+         checks.append(Check("HF permission checked_at", "fail", "checked_at must be UTC like 2026-06-03T19:00:00Z"))
+     else:
+         checks.append(Check("HF permission checked_at", "pass", checked_at))
+
+     if "hf repos create" not in command and "check_hf_release_permissions.sh" not in command:
+         checks.append(Check("HF permission command", "fail", "command must name the repo-create probe command"))
+     elif "auth list" in command:
+         checks.append(Check("HF permission command", "fail", "command must not include auth list output"))
+     else:
+         checks.append(Check("HF permission command", "pass", "repo-create probe command recorded"))
+
+     if "succeeded" not in result and "passed" not in result:
+         checks.append(Check("HF permission result", "fail", "result must record that private model repo creation succeeded"))
+     else:
+         checks.append(Check("HF permission result", "pass", "private model repo-create permission recorded"))
+
+     return checks
+
+
+ def summarize(checks: list[Check]) -> dict[str, Any]:
+     return {
+         "ready": not any(check.status == "fail" for check in checks),
+         "summary": {
+             "pass": sum(1 for check in checks if check.status == "pass"),
+             "fail": sum(1 for check in checks if check.status == "fail"),
+             "manual": sum(1 for check in checks if check.status == "manual"),
+         },
+         "checks": [asdict(check) for check in checks],
+     }
+
+
+ def print_text(result: dict[str, Any]) -> None:
+     print(f"Kaiju Coder 7 HF permission evidence: ready={result['ready']}")
+     print(
+         "Summary: "
+         f"{result['summary']['pass']} pass, "
+         f"{result['summary']['fail']} fail, "
+         f"{result['summary']['manual']} manual"
+     )
+     for check in result["checks"]:
+         print(f"[{check['status']}] {check['name']} - {check['detail']}")
+
+
+ def main() -> int:
+     parser = argparse.ArgumentParser(description=__doc__)
+     parser.add_argument("--evidence-file", type=Path, default=DEFAULT_EVIDENCE)
+     parser.add_argument("--json", action="store_true")
+     args = parser.parse_args()
+     result = summarize(validate(args.evidence_file))
+     if args.json:
+         print(json.dumps(result, indent=2))
+     else:
+         print_text(result)
+     return 0 if result["ready"] else 1
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
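Reviewer note (not part of the commit): the shipped example evidence file is expected to fail this validator until its placeholders are replaced. The sketch below condenses a few of the `validate()` rules into a hypothetical helper, `evidence_problems`, to show which fields trip. The helper name, the `filled` payload, and its timestamp are illustrative assumptions rather than recorded evidence.

```python
import re

UTC_TIMESTAMP = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z"
PROBE_SUFFIX = "kaiju-coder-7-permission-probe"


def evidence_problems(payload: dict) -> list[str]:
    """Hypothetical condensed subset of validate(); returns failing field names."""
    problems = []
    namespace = str(payload.get("namespace") or "").strip()
    if payload.get("status") != "pass":
        problems.append("status")
    if str(payload.get("probe_repo") or "").strip() != f"{namespace}/{PROBE_SUFFIX}":
        problems.append("probe_repo")
    if not re.fullmatch(UTC_TIMESTAMP, str(payload.get("checked_at") or "").strip()):
        problems.append("checked_at")
    result = str(payload.get("result") or "").lower()
    if "succeeded" not in result and "passed" not in result:
        problems.append("result")
    return problems


# Mirrors a subset of the example file, placeholders included.
example = {
    "status": "pending",
    "checked_at": "replace_with_utc_timestamp",
    "namespace": "RMDWLLC",
    "probe_repo": "RMDWLLC/kaiju-coder-7-permission-probe",
    "result": "replace_with_private_model_repo_create_result",
}

# Illustrative values only; real evidence must come from an actual probe run.
filled = dict(
    example,
    status="pass",
    checked_at="2026-06-03T19:00:00Z",
    result="private model repo creation succeeded",
)

if __name__ == "__main__":
    print(evidence_problems(example))
    print(evidence_problems(filled))
```

The real validator additionally scans for secret-looking strings, checks the `command` and `authenticated_user` fields, and reports each rule as a separate `Check`.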
scripts/check_hf_uploaded_release.py ADDED
@@ -0,0 +1,384 @@
1
+ #!/usr/bin/env python3
2
+ """Verify uploaded Kaiju Coder 7 Hugging Face repos after private upload.
3
+
4
+ The default mode is a dry run that prints the exact checks without downloading
5
+ or reading auth tokens. Pass --apply after Hugging Face namespace permission and
6
+ human review are complete. Private repos are verified through the existing HF
7
+ CLI login; tokens are never accepted as arguments or printed.
8
+ """
9
+
10
+ from __future__ import annotations
11
+
12
+ import argparse
13
+ import json
14
+ import shutil
15
+ import subprocess
16
+ import sys
17
+ import tempfile
18
+ import urllib.error
19
+ import urllib.request
20
+ from dataclasses import asdict, dataclass
21
+ from pathlib import Path
22
+ from typing import Any
23
+
24
+
25
+ MODEL_ID = "kaiju-coder-7"
26
+ DEFAULT_NAMESPACE = "RMDWLLC"
27
+ DEFAULT_BASE_URL = "http://100.109.109.14:18083/v1"
28
+
29
+
30
+ @dataclass(frozen=True)
31
+ class RepoSpec:
32
+ key: str
33
+ suffix: str
34
+ label: str
35
+ required_files: tuple[str, ...]
36
+ marker_files: tuple[tuple[str, tuple[str, ...]], ...]
37
+
38
+ def repo_id(self, namespace: str) -> str:
39
+ return f"{namespace}/{self.suffix}"
40
+
41
+
42
+ @dataclass
43
+ class Check:
44
+ name: str
45
+ status: str
46
+ detail: str
47
+
48
+
49
+ REPOS: tuple[RepoSpec, ...] = (
50
+ RepoSpec(
51
+ key="adapter",
52
+ suffix="kaiju-coder-7-adapter",
53
+ label="adapter repo",
54
+ required_files=(
55
+ "README.md",
56
+ "adapter_config.json",
57
+ "adapter_model.safetensors",
58
+ "DATA_PROVENANCE_DRAFT.md",
59
+ "SOURCE_INVENTORY.md",
60
+ "EVAL_SCOREBOARD.md",
61
+ "SERVING_BENCHMARKS.md",
62
+ "PAID_API_READINESS.md",
63
+ "PUBLIC_TESTING_QUICKSTART.md",
64
+ "FINAL_RELEASE_REPORT.md",
65
+ "GOAL_COMPLETION_AUDIT.md",
66
+ "UPSTREAM_LICENSE_CHECK.md",
67
+ "upstream/qwen3.6-27b/LICENSE",
68
+ "scripts/check_hf_uploaded_release.py",
69
+ "scripts/check_hf_release_permission_evidence.py",
70
+ ),
71
+ marker_files=(
72
+ ("README.md", ("Kaiju Coder 7", MODEL_ID)),
73
+ ("PUBLIC_TESTING_QUICKSTART.md", ("Kaiju Coder 7 Public Testing Quickstart", MODEL_ID)),
74
+ ("FINAL_RELEASE_REPORT.md", ("Kaiju Coder 7 Final Release Report", "Public Launch Blockers")),
75
+ ),
76
+ ),
77
+ RepoSpec(
78
+ key="opencode",
79
+ suffix="kaiju-coder-7-opencode",
80
+ label="OpenCode helper repo",
81
+ required_files=(
82
+ "README.md",
83
+ "PUBLIC_TESTING_QUICKSTART.md",
84
+ "opencode.kaiju-coder-7.jsonc",
85
+ ".opencode/agents/kaiju-coder-7.md",
86
+ "scripts/install_kaiju_opencode_profile.py",
87
+ "scripts/opencode-kaiju-no-autocontinue.mjs",
88
+ "scripts/run_kaiju_public_opencode_smoke.py",
89
+ "scripts/run_kaiju_opencode_customer_pack.py",
90
+ "scripts/check_hf_uploaded_release.py",
91
+ "evals/tasks/opencode-customer-readiness.jsonl",
92
+ ),
93
+ marker_files=(
94
+ ("README.md", ("Kaiju Coder 7", "opencode -m kaiju/kaiju-coder-7")),
95
+ ("opencode.kaiju-coder-7.jsonc", (MODEL_ID, '"context": 16384')),
96
+ (".opencode/agents/kaiju-coder-7.md", ("You are Kaiju Coder 7", "Confirm the current working directory")),
97
+ ("scripts/opencode-kaiju-no-autocontinue.mjs", ("experimental.compaction.autocontinue", MODEL_ID)),
98
+ ),
99
+ ),
100
+ RepoSpec(
101
+ key="quantized-runtime",
102
+ suffix="kaiju-coder-7-quantized-runtime",
103
+ label="runtime quantization helper repo",
104
+ required_files=(
105
+ "README.md",
106
+ "PUBLIC_TESTING_QUICKSTART.md",
107
+ "scripts/start-qwen36-merged-vllm.sh",
108
+ "scripts/stop-qwen36-merged-vllm.sh",
109
+ "scripts/run-gojira-b-vllm-serving-benchmark.sh",
110
+ ),
111
+ marker_files=(
112
+ ("README.md", ("Runtime-Quantized Local Candidate", "bitsandbytes", "Kaiju Coder 7")),
113
+ ("PUBLIC_TESTING_QUICKSTART.md", ("Kaiju Coder 7 Public Testing Quickstart", MODEL_ID)),
114
+ ),
115
+ ),
116
+ )
117
+
118
+
119
+ def shell_join(args: list[str]) -> str:
120
+ import shlex
121
+
122
+ return " ".join(shlex.quote(arg) for arg in args)
123
+
124
+
125
+ def run_command(args: list[str], *, cwd: Path | None = None, timeout: int) -> subprocess.CompletedProcess[str]:
126
+ return subprocess.run(
127
+ args,
128
+ cwd=cwd,
129
+ check=False,
130
+ text=True,
131
+ stdout=subprocess.PIPE,
132
+ stderr=subprocess.STDOUT,
133
+ timeout=timeout,
134
+ )
135
+
136
+
137
+ def read_text(path: Path) -> str:
138
+ return path.read_text(encoding="utf-8", errors="replace")
139
+
140
+
141
+ def selected_repos(args: argparse.Namespace) -> list[RepoSpec]:
142
+ skipped = {
143
+ "adapter": args.skip_adapter,
144
+ "opencode": args.skip_opencode,
145
+ "quantized-runtime": args.skip_quantized_runtime,
146
+ }
147
+ return [spec for spec in REPOS if not skipped[spec.key]]
148
+
149
+
150
+ def add_dry_run_checks(checks: list[Check], repos: list[RepoSpec], namespace: str, download_dir: Path) -> None:
151
+ checks.append(Check("HF uploaded release mode", "manual", "dry run only; pass --apply to download and verify repos"))
152
+ for spec in repos:
153
+ target = download_dir / spec.suffix
154
+ command = ["hf", "download", spec.repo_id(namespace), "--repo-type", "model", "--local-dir", str(target)]
155
+ checks.append(Check(f"{spec.label} download command", "manual", shell_join(command)))
156
+
157
+
158
+ def add_hf_cli_check(checks: list[Check], timeout: int) -> bool:
159
+ hf_bin = shutil.which("hf")
160
+ if not hf_bin:
161
+ checks.append(Check("HF CLI", "fail", "`hf` is not on PATH"))
162
+ return False
163
+ result = run_command([hf_bin, "auth", "whoami"], timeout=timeout)
164
+ if result.returncode == 0 and "user=" in result.stdout:
165
+ checks.append(Check("HF CLI", "pass", result.stdout.strip().replace("\n", "; ")))
166
+ return True
167
+ checks.append(Check("HF CLI", "fail", result.stdout.strip()[:800]))
168
+ return False
169
+
170
+
171
+ def download_repo(checks: list[Check], spec: RepoSpec, namespace: str, download_root: Path, timeout: int) -> Path | None:
172
+ target = download_root / spec.suffix
173
+ target.mkdir(parents=True, exist_ok=True)
174
+ command = ["hf", "download", spec.repo_id(namespace), "--repo-type", "model", "--local-dir", str(target)]
175
+ result = run_command(command, timeout=timeout)
176
+ if result.returncode == 0:
177
+ checks.append(Check(f"{spec.label} download", "pass", f"{spec.repo_id(namespace)} downloaded to {target}"))
178
+ return target
179
+ checks.append(Check(f"{spec.label} download", "fail", result.stdout.strip()[-1200:]))
180
+ return None
181
+
182
+
183
+ def check_required_files(checks: list[Check], spec: RepoSpec, root: Path) -> None:
184
+ missing = [name for name in spec.required_files if not (root / name).is_file()]
185
+ if missing:
186
+ checks.append(Check(f"{spec.label} required files", "fail", "missing: " + ", ".join(missing)))
187
+ else:
188
+ checks.append(Check(f"{spec.label} required files", "pass", f"{len(spec.required_files)} files present"))
189
+
190
+
191
+ def check_markers(checks: list[Check], spec: RepoSpec, root: Path) -> None:
192
+ failures: list[str] = []
193
+ for file_name, markers in spec.marker_files:
194
+ path = root / file_name
195
+ if not path.is_file():
196
+ failures.append(f"{file_name} missing")
197
+ continue
198
+ text = read_text(path)
199
+ missing = [marker for marker in markers if marker not in text]
200
+ if missing:
201
+ failures.append(f"{file_name} missing {', '.join(missing)}")
202
+ if failures:
203
+ checks.append(Check(f"{spec.label} content markers", "fail", "; ".join(failures)))
204
+ else:
205
+ checks.append(Check(f"{spec.label} content markers", "pass", "expected Kaiju Coder 7 markers found"))
206
+
207
+
208
+ def check_public_quickstart_naming(checks: list[Check], spec: RepoSpec, root: Path) -> None:
209
+ path = root / "PUBLIC_TESTING_QUICKSTART.md"
210
+ if not path.is_file():
211
+ return
212
+ lowered = read_text(path).lower()
213
+ forbidden = [term for term in ("qwen", "v1.8") if term in lowered]
214
+ if forbidden:
215
+ checks.append(Check(f"{spec.label} public naming hygiene", "fail", "contains: " + ", ".join(forbidden)))
216
+ else:
217
+ checks.append(Check(f"{spec.label} public naming hygiene", "pass", "public quickstart avoids internal upstream/checkpoint naming"))
218
+
219
+
220
+ def check_opencode_installer(checks: list[Check], opencode_root: Path, timeout: int) -> None:
221
+ installer = opencode_root / "scripts/install_kaiju_opencode_profile.py"
222
+ if not installer.is_file():
223
+ checks.append(Check("uploaded OpenCode installer dry-run", "fail", f"missing {installer}"))
224
+ return
225
+ with tempfile.TemporaryDirectory(prefix="kaiju-uploaded-opencode-config-") as tmp:
226
+ result = run_command(
227
+ [sys.executable, str(installer), "--config-dir", tmp, "--dry-run"],
228
+ cwd=opencode_root,
229
+ timeout=timeout,
230
+ )
231
+ if result.returncode == 0 and "kaiju-no-autocontinue.mjs" in result.stdout and MODEL_ID in result.stdout:
232
+ checks.append(Check("uploaded OpenCode installer dry-run", "pass", "staged helper installs provider, agent, and loop guard"))
233
+ else:
234
+ checks.append(Check("uploaded OpenCode installer dry-run", "fail", result.stdout.strip()[:1000]))
235
+
236
+
237
+ def run_opencode_smoke(checks: list[Check], opencode_root: Path, base_url: str, timeout: int) -> None:
238
+ script = opencode_root / "scripts/run_kaiju_public_opencode_smoke.py"
239
+ if not script.is_file():
240
+ checks.append(Check("uploaded OpenCode smoke", "fail", f"missing {script}"))
241
+ return
242
+ result = run_command([sys.executable, str(script), "--base-url", base_url, "--timeout", str(timeout)], cwd=opencode_root, timeout=timeout + 120)
243
+ if result.returncode == 0:
244
+ checks.append(Check("uploaded OpenCode smoke", "pass", "downloaded helper completed live public OpenCode smoke"))
245
+ else:
246
+ checks.append(Check("uploaded OpenCode smoke", "fail", result.stdout.strip()[-1200:]))
247
+
248
+
249
+ def check_public_visibility(checks: list[Check], spec: RepoSpec, namespace: str, timeout: int) -> None:
250
+ repo_id = spec.repo_id(namespace)
251
+ url = f"https://huggingface.co/api/models/{repo_id}"
252
+ request = urllib.request.Request(url, headers={"User-Agent": "kaiju-coder-7-release-check"})
253
+ try:
254
+ with urllib.request.urlopen(request, timeout=timeout) as response:
255
+ if response.status == 200:
256
+ checks.append(Check(f"{spec.label} public visibility", "pass", f"{repo_id} is publicly readable"))
257
+ return
258
+ checks.append(Check(f"{spec.label} public visibility", "fail", f"{url} returned HTTP {response.status}"))
259
+ except urllib.error.HTTPError as exc:
260
+ checks.append(Check(f"{spec.label} public visibility", "fail", f"{url} returned HTTP {exc.code}"))
261
+ except Exception as exc: # noqa: BLE001 - report network failures clearly.
262
+ checks.append(Check(f"{spec.label} public visibility", "fail", f"{url} failed: {exc!r}"))
263
+
264
+
265
+ def verify_downloaded_repo(checks: list[Check], spec: RepoSpec, root: Path, *, installer_timeout: int) -> None:
266
+ check_required_files(checks, spec, root)
267
+ check_markers(checks, spec, root)
268
+ check_public_quickstart_naming(checks, spec, root)
269
+ if spec.key == "opencode":
270
+ check_opencode_installer(checks, root, timeout=installer_timeout)
271
+
272
+
273
+ def summarize(checks: list[Check], *, applied: bool) -> dict[str, Any]:
274
+ return {
275
+ "ready": applied and not any(check.status in {"fail", "manual"} for check in checks),
276
+ "applied": applied,
277
+ "summary": {
278
+ "pass": sum(1 for check in checks if check.status == "pass"),
279
+ "fail": sum(1 for check in checks if check.status == "fail"),
280
+ "manual": sum(1 for check in checks if check.status == "manual"),
281
+ },
282
+ "checks": [asdict(check) for check in checks],
283
+ }
284
+
285
+
286
+ def print_text(result: dict[str, Any]) -> None:
287
+ print(f"Kaiju Coder 7 uploaded HF release verification: ready={result['ready']} applied={result['applied']}")
288
+ print(
289
+ "Summary: "
290
+ f"{result['summary']['pass']} pass, "
291
+ f"{result['summary']['fail']} fail, "
292
+ f"{result['summary']['manual']} manual"
293
+ )
294
+ for check in result["checks"]:
295
+ print(f"[{check['status']}] {check['name']} - {check['detail']}")
296
+
297
+
298
+ def parse_args() -> argparse.Namespace:
299
+ parser = argparse.ArgumentParser(description=__doc__)
300
+ parser.add_argument("--namespace", default=DEFAULT_NAMESPACE)
301
+ parser.add_argument("--download-dir", type=Path, default=None)
302
+ parser.add_argument("--apply", action="store_true", help="Download uploaded repos and verify contents.")
303
+ parser.add_argument("--require-public", action="store_true", help="Require repos to be publicly readable without auth.")
304
+ parser.add_argument("--run-opencode-smoke", action="store_true", help="Run the downloaded OpenCode helper live smoke.")
305
+ parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
306
+ parser.add_argument("--download-timeout", type=int, default=900)
307
+ parser.add_argument("--installer-timeout", type=int, default=60)
308
+ parser.add_argument("--public-timeout", type=int, default=15)
309
+ parser.add_argument("--opencode-timeout", type=int, default=900)
310
+ parser.add_argument("--skip-adapter", action="store_true")
311
+ parser.add_argument("--skip-opencode", action="store_true")
312
+ parser.add_argument("--skip-quantized-runtime", action="store_true")
313
+ parser.add_argument("--json", action="store_true")
314
+ return parser.parse_args()
315
+
316
+
317
+ def main() -> int:
318
+ args = parse_args()
319
+ repos = selected_repos(args)
320
+ checks: list[Check] = []
321
+ if not repos:
322
+ checks.append(Check("repo selection", "fail", "all repos were skipped"))
+         result = summarize(checks, applied=args.apply)
+         if args.json:
+             print(json.dumps(result, indent=2))
+         else:
+             print_text(result)
+         return 1
+
+     if args.download_dir:
+         download_root = args.download_dir
+         download_root.mkdir(parents=True, exist_ok=True)
+         temp_context: Any = None
+     else:
+         temp_context = tempfile.TemporaryDirectory(prefix="kaiju-hf-uploaded-")
+         download_root = Path(temp_context.name)
+
+     try:
+         if not args.apply:
+             add_dry_run_checks(checks, repos, args.namespace, download_root)
+             result = summarize(checks, applied=False)
+             if args.json:
+                 print(json.dumps(result, indent=2))
+             else:
+                 print_text(result)
+             return 0
+
+         if not add_hf_cli_check(checks, timeout=30):
+             result = summarize(checks, applied=True)
+             if args.json:
+                 print(json.dumps(result, indent=2))
+             else:
+                 print_text(result)
+             return 1
+
+         downloaded: dict[str, Path] = {}
+         for spec in repos:
+             if args.require_public:
+                 check_public_visibility(checks, spec, args.namespace, timeout=args.public_timeout)
+             root = download_repo(checks, spec, args.namespace, download_root, timeout=args.download_timeout)
+             if root:
+                 downloaded[spec.key] = root
+                 verify_downloaded_repo(checks, spec, root, installer_timeout=args.installer_timeout)
+
+         if args.run_opencode_smoke:
+             opencode_root = downloaded.get("opencode")
+             if opencode_root:
+                 run_opencode_smoke(checks, opencode_root, base_url=args.base_url, timeout=args.opencode_timeout)
+             else:
+                 checks.append(Check("uploaded OpenCode smoke", "fail", "OpenCode helper repo was not downloaded"))
+
+         result = summarize(checks, applied=True)
+         if args.json:
+             print(json.dumps(result, indent=2))
+         else:
+             print_text(result)
+         return 0 if result["ready"] else 1
+     finally:
+         if temp_context is not None:
+             temp_context.cleanup()
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
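The download-directory handling in `main()` above (use `--download-dir` when given, otherwise create a `TemporaryDirectory` and clean it up in `finally`) could also be expressed as a context manager. This is a minimal sketch of that alternative, not part of the commit; `download_root_dir` is a hypothetical helper name.

```python
import contextlib
import tempfile
from pathlib import Path
from typing import Iterator, Optional


@contextlib.contextmanager
def download_root_dir(download_dir: Optional[Path]) -> Iterator[Path]:
    """Yield a directory for downloads (hypothetical helper).

    A caller-supplied directory is created if needed and left in place.
    Otherwise a temporary directory is created and removed on exit.
    """
    if download_dir is not None:
        download_dir.mkdir(parents=True, exist_ok=True)
        yield download_dir
        return
    with tempfile.TemporaryDirectory(prefix="kaiju-hf-uploaded-") as tmp:
        yield Path(tmp)


if __name__ == "__main__":
    with download_root_dir(None) as root:
        # Inside the block the temporary directory exists.
        print(root, root.is_dir())
    # After the block the temporary directory has been removed.
    print(root.exists())
```

With this shape the `try`/`finally` and the `temp_context is not None` test would be unnecessary, at the cost of indenting the body of `main()` one more level.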
scripts/check_paid_api_readiness.py ADDED
@@ -0,0 +1,518 @@
+ #!/usr/bin/env python3
+ """Check Kaiju Coder 7 paid API readiness without reading secrets.
+
+ The scaffold mode should pass for the local Worker implementation. The launch
+ mode is intentionally stricter and should fail until real Cloudflare bindings,
+ Stripe webhook evidence, staging requests, and rollback proof are attached.
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import re
+ import sys
+ from dataclasses import asdict, dataclass
+ from pathlib import Path
+ from typing import Any
+
+
+ ROOT = Path(__file__).resolve().parents[1]
+ WORKER = ROOT / "gateway/cloudflare-worker"
+ WRANGLER = WORKER / "wrangler.jsonc"
+ SOURCE = WORKER / "src/index.js"
+ TESTS = WORKER / "test/index.test.js"
+ MIGRATION = WORKER / "migrations/0001_paid_api.sql"
+ PACKAGE = WORKER / "package.json"
+ PAID_DOC = ROOT / "release/PAID_API_READINESS.md"
+ RESOURCE_SCRIPT = ROOT / "scripts/prepare_paid_api_cloudflare_resources.sh"
+ DEFAULT_EVIDENCE = ROOT / "release/paid-api-launch-evidence.json"
+ EVIDENCE_EXAMPLE = ROOT / "release/paid-api-launch-evidence.example.json"
+ CF_BINDINGS_EXAMPLE = ROOT / "release/cloudflare-bindings.example.json"
+ SECRET_PATTERNS = [
+     ("openai_api_key", re.compile(r"\bsk-[A-Za-z0-9][A-Za-z0-9_-]{20,}\b")),
+     ("anthropic_api_key", re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}\b")),
+     ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
+     ("stripe_webhook_secret", re.compile(r"\bwhsec_[A-Za-z0-9]{16,}\b")),
+     ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
+     ("github_token", re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{22,})\b")),
+     ("google_api_key", re.compile(r"\bAIza[0-9A-Za-z_-]{20,}\b")),
+     ("private_key_block", re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC |DSA )?PRIVATE KEY-----")),
+     ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{24,}={0,2}\b", re.IGNORECASE)),
+ ]
+
+
+ @dataclass
+ class Check:
+     name: str
+     status: str
+     detail: str
+
+
+ def file_text(path: Path) -> str:
+     return path.read_text(encoding="utf-8")
+
+
+ def strip_jsonc(text: str) -> str:
+     text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
+     lines = []
+     for line in text.splitlines():
+         in_string = False
+         escaped = False
+         output = []
+         index = 0
+         while index < len(line):
+             char = line[index]
+             nxt = line[index + 1] if index + 1 < len(line) else ""
+             if char == "\\" and in_string:
+                 escaped = not escaped
+                 output.append(char)
+             elif char == '"' and not escaped:
+                 in_string = not in_string
+                 output.append(char)
+             elif char == "/" and nxt == "/" and not in_string:
+                 break
+             else:
+                 escaped = False
+                 output.append(char)
+             index += 1
+         lines.append("".join(output))
+     stripped = "\n".join(lines)
+     return re.sub(r",\s*([}\]])", r"\1", stripped)
+
+
+ def load_wrangler(path: Path = WRANGLER) -> dict[str, Any]:
+     return json.loads(strip_jsonc(file_text(path)))
+
+
+ def has_real_binding(bindings: list[dict[str, Any]], binding_name: str, id_field: str) -> bool:
+     for binding in bindings:
+         if binding.get("binding") != binding_name:
+             continue
+         value = str(binding.get(id_field, "")).strip()
+         return bool(value) and not value.startswith("replace_with_")
+     return False
+
+
+ def load_launch_evidence(path: Path) -> tuple[dict[str, Any], Check | None]:
+     if not path.is_file():
+         return {}, None
+     text = file_text(path)
+     findings = [label for label, pattern in SECRET_PATTERNS if pattern.search(text)]
+     if findings:
+         return {}, Check(
+             "paid API launch evidence file",
+             "fail",
+             f"{path} appears to contain secret-looking values: {', '.join(sorted(set(findings)))}",
+         )
+     try:
+         evidence = json.loads(text)
+     except json.JSONDecodeError as exc:
+         return {}, Check("paid API launch evidence file", "fail", f"{path} is invalid JSON: {exc}")
+     if not isinstance(evidence, dict):
+         return {}, Check("paid API launch evidence file", "fail", f"{path} must contain a JSON object")
+     return evidence, Check("paid API launch evidence file", "pass", f"loaded sanitized launch evidence from {path}")
+
+
+ def evidence_item(
+     checks: list[Check],
+     evidence: dict[str, Any],
+     path: Path,
+     key: str,
+     name: str,
+     required_fields: list[str],
+     validator: Any | None = None,
+ ) -> None:
+     item = evidence.get(key)
+     if not item:
+         checks.append(Check(name, "manual", f"attach sanitized evidence in {path} key `{key}`"))
+         return
+     if not isinstance(item, dict):
+         checks.append(Check(name, "fail", f"{path} key `{key}` must be an object"))
+         return
+     if item.get("status") != "pass":
+         checks.append(Check(name, "manual", f"{path} key `{key}` status is not pass"))
+         return
+     missing = [field for field in required_fields if item.get(field) in (None, "", [])]
+     if missing:
+         checks.append(Check(name, "manual", f"{path} key `{key}` missing fields: {', '.join(missing)}"))
+         return
+     if validator:
+         validation = validator(item)
+         if validation:
+             checks.append(Check(name, validation[0], validation[1]))
+             return
+     checks.append(Check(name, "pass", f"{path} key `{key}` has required sanitized evidence"))
+
+
+ def validate_public_route_mode(item: dict[str, Any]) -> tuple[str, str] | None:
+     if item.get("exposure_mode") != "custom_domain":
+         return ("manual", "public route evidence must use exposure_mode=custom_domain before paid launch")
+     route = str(item.get("route", ""))
+     if not route.startswith("https://"):
+         return ("manual", "public route evidence route must be an https URL")
+     return None
+
+
+ def validate_secrets_verified(item: dict[str, Any]) -> tuple[str, str] | None:
+     required = {"KAIJU_ORIGIN_URL", "KAIJU_ORIGIN_SECRET", "KAIJU_STRIPE_WEBHOOK_SECRET"}
+     observed = set(item.get("observed_names") or [])
+     missing = sorted(required - observed)
+     if missing:
+         return ("manual", "secret-name evidence missing: " + ", ".join(missing))
+     return None
+
+
+ def validate_d1_migration(item: dict[str, Any]) -> tuple[str, str] | None:
+     if item.get("migration") != "0001_paid_api.sql":
+         return ("manual", "D1 migration evidence must name 0001_paid_api.sql")
+     if item.get("result") not in {"success", "already_applied"}:
+         return ("manual", "D1 migration result must be success or already_applied")
+     return None
+
+
+ def validate_stripe_staging(item: dict[str, Any]) -> tuple[str, str] | None:
+     if item.get("webhook_event") != "checkout.session.completed":
+         return ("manual", "Stripe evidence must include checkout.session.completed")
+     if item.get("idempotency_checked") is not True:
+         return ("manual", "Stripe evidence must confirm duplicate webhook idempotency")
+     return None
+
+
+ def validate_staging_request(item: dict[str, Any]) -> tuple[str, str] | None:
+     if item.get("model") != "kaiju-coder-7":
+         return ("fail", "staging request evidence must use model=kaiju-coder-7")
+     if int(item.get("http_status") or 0) != 200:
+         return ("manual", "staging request evidence must show HTTP 200")
+     if item.get("streamed") is not True:
+         return ("manual", "staging request evidence must confirm streaming")
+     return None
+
+
+ def validate_rollback(item: dict[str, Any]) -> tuple[str, str] | None:
+     if item.get("result") != "success":
+         return ("manual", "rollback evidence must be a successful exercised rollback or route switch")
+     return None
+
+
+ def validate_latency(item: dict[str, Any]) -> tuple[str, str] | None:
+     p95_ms = float(item.get("p95_ms") or 0)
+     sample_count = int(item.get("sample_count") or 0)
+     max_acceptable_ms = float(item.get("max_acceptable_ms") or 0)
+     if sample_count < 5:
+         return ("manual", "latency evidence needs at least 5 staging samples")
+     if max_acceptable_ms <= 0:
+         return ("manual", "latency evidence must set max_acceptable_ms")
+     if p95_ms <= 0 or p95_ms > max_acceptable_ms:
+         return ("manual", f"p95_ms={p95_ms:g} must be positive and at most max_acceptable_ms={max_acceptable_ms:g}")
+     return None
+
+
+ def add_marker_check(checks: list[Check], name: str, text: str, markers: list[str], path: Path) -> None:
+     missing = [marker for marker in markers if marker not in text]
+     if missing:
+         checks.append(Check(name, "fail", f"{path} missing markers: {', '.join(missing)}"))
+     else:
+         checks.append(Check(name, "pass", f"{path} contains required markers"))
+
+
+ def scaffold_checks(wrangler_path: Path = WRANGLER) -> list[Check]:
+     checks: list[Check] = []
+     source = file_text(SOURCE)
+     tests = file_text(TESTS)
+     migration = file_text(MIGRATION)
+     package = json.loads(file_text(PACKAGE))
+     paid_doc = file_text(PAID_DOC)
+     resource_script = file_text(RESOURCE_SCRIPT)
+     wrangler = load_wrangler(wrangler_path)
+
+     add_marker_check(
+         checks,
+         "model id enforcement",
+         source,
+         [
+             'const DEFAULT_MODEL_ID = "kaiju-coder-7"',
+             "Unsupported model. Use",
+             "payload.model = modelId",
+         ],
+         SOURCE,
+     )
+     add_marker_check(
+         checks,
+         "streaming and thinking controls",
+         source,
+         ["payload.stream = true", "enable_thinking: false", "thinking: false", "streamHeaders"],
+         SOURCE,
+     )
+     add_marker_check(
+         checks,
+         "billing and debit/refund controls",
+         source,
+         ["KAIJU_BILLING_DB", "reserveCredit", "refundCredit", "markUsageDebited"],
+         SOURCE,
+     )
+     add_marker_check(
+         checks,
+         "rate limit controls",
+         source,
+         ["KAIJU_RATE_LIMIT_KV", "rateLimit", "Rate limit exceeded"],
+         SOURCE,
+     )
+     add_marker_check(
+         checks,
+         "secret-like prompt rejection",
+         source,
+         ["SECRET_PATTERNS", "secret_like_content", "Remove them before using Kaiju Coder 7"],
+         SOURCE,
+     )
+     add_marker_check(
+         checks,
+         "stripe top-up webhook",
+         source,
+         ["verifyStripeSignature", "checkout.session.completed", "stripe_topup_credited"],
+         SOURCE,
+     )
+     add_marker_check(
+         checks,
+         "artifact route controls",
+         source,
+         [
+             "KAIJU_ARTIFACT_BUCKET",
+             "uploadArtifact",
+             "downloadArtifact",
+             "artifact_stored",
+             "Artifact appears to contain secrets or credentials",
+         ],
+         SOURCE,
+     )
+     add_marker_check(
+         checks,
+         "paid API tests",
+         tests,
+         [
+             "rejects inactive paid API keys",
+             "rejects paid API requests with insufficient credits before origin fetch",
+             "rate limits authenticated paid API keys before debit",
+             "credits paid API balance from signed Stripe checkout webhook",
+             "rejects secret-looking prompt content before debit",
+         ],
+         TESTS,
+     )
+     add_marker_check(
+         checks,
+         "artifact route tests",
+         tests,
+         [
+             "stores origin-uploaded artifacts in the account/request R2 namespace",
+             "serves customer artifacts only through the authenticated account namespace",
+             "rejects unsafe artifact paths before R2 storage",
+             "rejects secret-looking artifact content before R2 storage",
+         ],
+         TESTS,
+     )
+     add_marker_check(
+         checks,
+         "D1 schema",
+         migration,
+         ["kaiju_api_keys", "kaiju_credit_ledger", "kaiju_usage_events"],
+         MIGRATION,
+     )
+     add_marker_check(
+         checks,
+         "paid readiness docs",
+         paid_doc,
+         [
+             "Do not sell the hosted API",
+             "Harnessed customer-readiness pack",
+             "Raw OpenCode multi-file pack remains a blocker",
+         ],
+         PAID_DOC,
+     )
+
+     if (
+         package.get("scripts", {}).get("check")
+         == "node --check src/index.js && node --check scripts/create-api-key.mjs && npm test && npm run check:deploy"
+         and package.get("scripts", {}).get("check:deploy") == "npx wrangler deploy --dry-run"
+     ):
+         checks.append(Check("gateway check command", "pass", "npm run check covers syntax, Worker tests, and Wrangler dry-run deploy"))
+     else:
+         checks.append(Check("gateway check command", "fail", "package.json check/check:deploy scripts changed or missing"))
+
+     if package.get("scripts", {}).get("prepare:cloudflare") == "bash ../../scripts/prepare_paid_api_cloudflare_resources.sh":
+         checks.append(Check("Cloudflare resource prep command", "pass", "npm run prepare:cloudflare is wired"))
+     else:
+         checks.append(Check("Cloudflare resource prep command", "fail", "package.json prepare:cloudflare script is missing"))
+
+     add_marker_check(
+         checks,
+         "Cloudflare resource prep script",
+         resource_script,
+         [
+             "KAIJU_CF_RESOURCE_APPLY",
+             "wrangler d1 create",
+             "wrangler kv namespace create",
+             "wrangler r2 bucket create",
+             "wrangler d1 migrations apply",
+             "wrangler rollback",
+             "preflight:launch",
+         ],
+         RESOURCE_SCRIPT,
+     )
+     if EVIDENCE_EXAMPLE.is_file():
+         checks.append(Check("paid API launch evidence template", "pass", f"{EVIDENCE_EXAMPLE} exists"))
+     else:
+         checks.append(Check("paid API launch evidence template", "fail", f"missing {EVIDENCE_EXAMPLE}"))
+
+     if CF_BINDINGS_EXAMPLE.is_file():
+         checks.append(Check("Cloudflare bindings template", "pass", f"{CF_BINDINGS_EXAMPLE} exists"))
+     else:
+         checks.append(Check("Cloudflare bindings template", "fail", f"missing {CF_BINDINGS_EXAMPLE}"))
+
+     if wrangler.get("name") == "kaiju-api-gateway" and wrangler.get("main") == "src/index.js":
+         checks.append(Check("wrangler scaffold config", "pass", "Worker name and entrypoint are present"))
+     else:
+         checks.append(Check("wrangler scaffold config", "fail", "wrangler name or entrypoint is missing"))
+
+     return checks
+
+
+ def launch_checks(evidence_path: Path, wrangler_path: Path = WRANGLER) -> list[Check]:
+     checks = scaffold_checks(wrangler_path)
+     wrangler = load_wrangler(wrangler_path)
+     evidence, evidence_check = load_launch_evidence(evidence_path)
+     if evidence_check and evidence_check.status == "fail":
+         checks.append(evidence_check)
+
+     if has_real_binding(wrangler.get("d1_databases", []), "KAIJU_BILLING_DB", "database_id"):
+         checks.append(Check("live D1 binding", "pass", "KAIJU_BILLING_DB has a non-placeholder database_id"))
+     else:
+         checks.append(Check("live D1 binding", "fail", "KAIJU_BILLING_DB is missing or still placeholder/commented"))
+
+     if has_real_binding(wrangler.get("kv_namespaces", []), "KAIJU_RATE_LIMIT_KV", "id"):
+         checks.append(Check("live KV binding", "pass", "KAIJU_RATE_LIMIT_KV has a non-placeholder id"))
+     else:
+         checks.append(Check("live KV binding", "fail", "KAIJU_RATE_LIMIT_KV is missing or still placeholder/commented"))
+
+     if has_real_binding(wrangler.get("r2_buckets", []), "KAIJU_ARTIFACT_BUCKET", "bucket_name"):
+         checks.append(Check("artifact R2 binding", "pass", "KAIJU_ARTIFACT_BUCKET is configured"))
+     else:
+         checks.append(Check("artifact R2 binding", "fail", "KAIJU_ARTIFACT_BUCKET is missing; artifact routes cannot launch"))
+
+     if wrangler.get("workers_dev") is False:
+         checks.append(Check("public route mode", "pass", "workers_dev is disabled for custom-domain launch"))
+     else:
+         evidence_item(
+             checks,
+             evidence,
+             evidence_path,
+             "public_route_mode",
+             "public route mode",
+             ["checked_at", "exposure_mode", "route", "result"],
+             validate_public_route_mode,
+         )
+
+     evidence_item(
+         checks,
+         evidence,
+         evidence_path,
+         "wrangler_secrets_verified",
+         "wrangler secret list confirms KAIJU_ORIGIN_URL, KAIJU_ORIGIN_SECRET, and KAIJU_STRIPE_WEBHOOK_SECRET",
+         ["checked_at", "command", "observed_names"],
+         validate_secrets_verified,
+     )
+     evidence_item(
+         checks,
+         evidence,
+         evidence_path,
+         "d1_migration_applied",
+         "D1 migration 0001_paid_api.sql applied to the live billing database",
+         ["checked_at", "command", "migration", "result"],
+         validate_d1_migration,
+     )
+     evidence_item(
+         checks,
+         evidence,
+         evidence_path,
+         "stripe_checkout_topup_staging",
+         "Stripe Checkout top-up products and webhook endpoint tested with metadata.kaiju_api_key_id",
+         ["checked_at", "mode", "webhook_event", "credited_api_key_id", "idempotency_checked"],
+         validate_stripe_staging,
+     )
+     evidence_item(
+         checks,
+         evidence,
+         evidence_path,
+         "worker_to_gojira_staging_request",
+         "staging request passed through Worker to Gojira-B origin with model=kaiju-coder-7",
+         ["checked_at", "route", "model", "http_status", "streamed", "request_id"],
+         validate_staging_request,
+     )
+     evidence_item(
+         checks,
+         evidence,
+         evidence_path,
+         "rollback_exercised",
+         "rollback command or route switch was exercised and recorded",
+         ["checked_at", "command", "result"],
+         validate_rollback,
+     )
+     evidence_item(
+         checks,
+         evidence,
+         evidence_path,
+         "paid_route_latency",
+         "p95 latency for paid routes is recorded after staging traffic",
+         ["checked_at", "route", "sample_count", "p95_ms", "max_acceptable_ms"],
+         validate_latency,
+     )
+
+     return checks
+
+
+ def summarize(checks: list[Check], mode: str) -> dict[str, Any]:
+     hard_fail = any(check.status == "fail" for check in checks)
+     manual = any(check.status == "manual" for check in checks)
+     ready = not hard_fail and (mode == "scaffold" or not manual)
+     return {
+         "mode": mode,
+         "ready": ready,
+         "summary": {
+             "pass": sum(1 for check in checks if check.status == "pass"),
+             "fail": sum(1 for check in checks if check.status == "fail"),
+             "manual": sum(1 for check in checks if check.status == "manual"),
+         },
+         "checks": [asdict(check) for check in checks],
+     }
+
+
+ def print_text(result: dict[str, Any]) -> None:
+     print(f"Kaiju Coder 7 paid API readiness: mode={result['mode']} ready={result['ready']}")
+     print(
+         "Summary: "
+         f"{result['summary']['pass']} pass, "
+         f"{result['summary']['fail']} fail, "
+         f"{result['summary']['manual']} manual"
+     )
+     for check in result["checks"]:
+         print(f"[{check['status']}] {check['name']} - {check['detail']}")
+
+
+ def main() -> int:
+     parser = argparse.ArgumentParser(description=__doc__)
+     parser.add_argument("--mode", choices=["scaffold", "launch"], default="scaffold")
+     parser.add_argument("--evidence-file", type=Path, default=DEFAULT_EVIDENCE)
+     parser.add_argument("--wrangler-config", type=Path, default=WRANGLER)
+     parser.add_argument("--json", action="store_true", help="Print machine-readable JSON.")
+     args = parser.parse_args()
+
+     checks = scaffold_checks(args.wrangler_config) if args.mode == "scaffold" else launch_checks(args.evidence_file, args.wrangler_config)
+     result = summarize(checks, args.mode)
+     if args.json:
+         print(json.dumps(result, indent=2))
+     else:
+         print_text(result)
+     return 0 if result["ready"] else 1
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
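The readiness gate in `summarize()` above treats the three check statuses differently per mode: any `fail` blocks both modes, while `manual` (evidence not yet attached) blocks only `launch`. The sketch below condenses that rule for illustration; `is_ready` is a hypothetical name and the sample checks are invented, not output from the script.

```python
from dataclasses import dataclass


@dataclass
class Check:
    name: str
    status: str  # "pass", "fail", or "manual"
    detail: str


def is_ready(checks: list[Check], mode: str) -> bool:
    """Condensed form of the gate in summarize() (hypothetical helper)."""
    hard_fail = any(check.status == "fail" for check in checks)
    manual = any(check.status == "manual" for check in checks)
    # Scaffold mode tolerates missing evidence; launch mode does not.
    return not hard_fail and (mode == "scaffold" or not manual)


# Invented sample: the code markers are present, rollback evidence is missing.
checks = [
    Check("model id enforcement", "pass", "markers present"),
    Check("rollback exercised", "manual", "evidence not attached yet"),
]
print(is_ready(checks, "scaffold"))  # True
print(is_ready(checks, "launch"))  # False
```

This is why the module docstring expects scaffold mode to pass locally while launch mode keeps failing until every evidence entry is attached with `status` set to `pass`.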
scripts/collect_hf_release_permission_evidence.py ADDED
@@ -0,0 +1,156 @@
+ #!/usr/bin/env python3
+ """Create sanitized Hugging Face repo-create permission evidence.
+
+ This helper never reads or writes Hugging Face tokens. In apply mode it runs a
+ private model repo-create probe for the intended namespace, then writes only
+ the sanitized facts required by check_hf_release_permission_evidence.py.
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import shutil
+ import subprocess
+ import sys
+ import tempfile
+ from datetime import datetime, timezone
+ from pathlib import Path
+
+
+ ROOT = Path(__file__).resolve().parents[1]
+ DEFAULT_OUT = ROOT / "release/hf-release-permission-evidence.json"
+ DEFAULT_NAMESPACE = "RMDWLLC"
+ MODEL_ID = "kaiju-coder-7"
+
+
+ def run(args: list[str]) -> subprocess.CompletedProcess[str]:
+     return subprocess.run(
+         args,
+         cwd=ROOT,
+         check=False,
+         text=True,
+         stdout=subprocess.PIPE,
+         stderr=subprocess.STDOUT,
+     )
+
+
+ def parse_whoami(text: str) -> str:
+     for part in text.replace("\n", " ").split():
+         if part.startswith("user="):
+             return part.split("=", 1)[1].strip()
+     first = text.strip().splitlines()[0].strip() if text.strip() else ""
+     return first.split()[0] if first else ""
+
+
+ def validate_payload(payload: dict[str, str]) -> None:
+     with tempfile.TemporaryDirectory() as tmp:
+         evidence_path = Path(tmp) / "hf-release-permission-evidence.json"
+         evidence_path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
+         result = run(
+             [
+                 sys.executable,
+                 str(ROOT / "scripts/check_hf_release_permission_evidence.py"),
+                 "--evidence-file",
+                 str(evidence_path),
+                 "--json",
+             ]
+         )
+     if result.returncode != 0:
+         raise RuntimeError("generated evidence did not validate:\n" + result.stdout)
+
+
+ def build_payload(namespace: str, user: str, probe_repo: str, command: str) -> dict[str, str]:
+     return {
+         "status": "pass",
+         "checked_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
+         "namespace": namespace,
+         "authenticated_user": user,
+         "probe_repo": probe_repo,
+         "command": command,
+         "result": "private model repo creation succeeded",
+     }
+
+
+ def print_json(payload: dict[str, object]) -> None:
+     print(json.dumps(payload, indent=2))
+
+
+ def parse_args() -> argparse.Namespace:
+     parser = argparse.ArgumentParser(description=__doc__)
+     parser.add_argument("--namespace", default=DEFAULT_NAMESPACE)
+     parser.add_argument("--out", type=Path, default=DEFAULT_OUT)
+     parser.add_argument("--apply", action="store_true", help="create the private permission probe repo")
+     parser.add_argument("--write", action="store_true", help="write the sanitized evidence file after the probe passes")
+     parser.add_argument("--json", action="store_true")
+     return parser.parse_args()
+
+
+ def main() -> int:
+     args = parse_args()
+     if shutil.which("hf") is None:
+         print("Missing Hugging Face CLI: hf", file=sys.stderr)
+         print("Install: curl -LsSf https://hf.co/cli/install.sh | bash -s", file=sys.stderr)
+         return 2
+
+     whoami = run(["hf", "auth", "whoami"])
+     if whoami.returncode != 0:
+         print("hf auth whoami failed. Run `hf auth login` with a write-capable token.", file=sys.stderr)
+         print(whoami.stdout.strip(), file=sys.stderr)
+         return whoami.returncode or 1
+     user = parse_whoami(whoami.stdout)
+     if not user:
+         print("Could not parse authenticated Hugging Face username from `hf auth whoami`.", file=sys.stderr)
+         return 2
+
+     probe_repo = f"{args.namespace}/{MODEL_ID}-permission-probe"
+     command_args = ["hf", "repos", "create", probe_repo, "--type", "model", "--private", "--exist-ok"]
+     command_text = " ".join(command_args)
+
+     if not args.apply:
+         preview = {
+             "ready": False,
+             "authenticated_user": user,
+             "namespace": args.namespace,
+             "probe_repo": probe_repo,
+             "next_command": command_text,
+             "detail": "dry run only; pass --apply --write after the intended namespace/token is active",
+         }
+         if args.json:
+             print_json(preview)
+         else:
+             print("Dry run. No repo was created and no evidence file was written.")
+             print(f"Authenticated user: {user}")
+             print(f"Namespace: {args.namespace}")
+             print(f"Probe command: {command_text}")
+             print(f"Write evidence after a successful probe: {args.out}")
+         return 0
+
+     probe = run(command_args)
+     if probe.returncode != 0:
+         print("Hugging Face private repo-create permission probe failed.", file=sys.stderr)
+         print(probe.stdout.strip(), file=sys.stderr)
+         return probe.returncode or 1
+
+     payload = build_payload(args.namespace, user, probe_repo, command_text)
+     validate_payload(payload)
+
+     if args.write:
+         args.out.parent.mkdir(parents=True, exist_ok=True)
+         args.out.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
+         if args.json:
+             print_json({"ready": True, "evidence_file": str(args.out), "evidence": payload})
+         else:
+             print(f"Wrote sanitized Hugging Face permission evidence: {args.out}")
+     else:
+         if args.json:
+             print_json({"ready": True, "evidence_file": None, "evidence": payload})
+         else:
+             print("Permission probe passed. Preview evidence:")
+             print(json.dumps(payload, indent=2))
+             print(f"Pass --write to write: {args.out}")
+     return 0
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
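`parse_whoami()` above accepts two shapes of text: a `user=<name>` token anywhere in the output, or otherwise the first word of the first non-empty line. The sketch below copies the function and runs it on sample strings. Those strings are invented for illustration and are not claimed to match what `hf auth whoami` actually prints.

```python
def parse_whoami(text: str) -> str:
    """Copy of the helper above, repeated so the example is self-contained."""
    for part in text.replace("\n", " ").split():
        if part.startswith("user="):
            return part.split("=", 1)[1].strip()
    first = text.strip().splitlines()[0].strip() if text.strip() else ""
    return first.split()[0] if first else ""


# Hypothetical outputs; the real CLI format may differ.
samples = [
    "user=alice orgs=example-org",  # key=value style
    "alice\norgs: example-org",  # bare username on the first line
    "",  # empty output
]
for sample in samples:
    print(repr(parse_whoami(sample)))
```

An empty result is what makes `main()` return 2 with the "Could not parse authenticated Hugging Face username" message, so the fallback fails closed rather than guessing.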
scripts/collect_paid_api_launch_evidence.py ADDED
@@ -0,0 +1,286 @@
+ #!/usr/bin/env python3
+ """Collect sanitized Kaiju Coder 7 paid API launch evidence.
+
+ This script helps fill release/paid-api-launch-evidence.json without storing
+ API keys, secret values, full prompts, or model responses. It is preview-only by
+ default; pass --write to update the evidence file.
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import os
+ import re
+ import statistics
+ import sys
+ import time
+ import urllib.error
+ import urllib.request
+ import uuid
+ from datetime import datetime, timezone
+ from pathlib import Path
+ from typing import Any
+
+
+ ROOT = Path(__file__).resolve().parents[1]
+ DEFAULT_OUT = ROOT / "release/paid-api-launch-evidence.json"
+ MODEL_ID = "kaiju-coder-7"
+ DEFAULT_ROUTE = "/v1/chat/completions"
+ SECRET_PATTERNS = [
+     ("openai_api_key", re.compile(r"\bsk-[A-Za-z0-9][A-Za-z0-9_-]{20,}\b")),
+     ("anthropic_api_key", re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}\b")),
+     ("stripe_secret_key", re.compile(r"\b[rs]k_(?:live|test)_[A-Za-z0-9]{16,}\b")),
+     ("stripe_webhook_secret", re.compile(r"\bwhsec_[A-Za-z0-9]{16,}\b")),
+     ("huggingface_token", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
+     ("github_token", re.compile(r"\b(?:ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{22,})\b")),
+     ("google_api_key", re.compile(r"\bAIza[0-9A-Za-z_-]{20,}\b")),
+     ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{24,}={0,2}\b", re.IGNORECASE)),
+     ("private_key_block", re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC |DSA )?PRIVATE KEY-----")),
+ ]
+
+
+ def utc_now() -> str:
+     return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
+
+
+ def load_existing(path: Path) -> dict[str, Any]:
+     if not path.is_file():
+         return {}
+     return json.loads(path.read_text(encoding="utf-8"))
+
+
+ def secret_findings(text: str) -> list[str]:
+     return sorted({label for label, pattern in SECRET_PATTERNS if pattern.search(text)})
+
+
+ def assert_sanitized(payload: dict[str, Any]) -> None:
+     rendered = json.dumps(payload, sort_keys=True)
+     findings = secret_findings(rendered)
+     if findings:
+         raise SystemExit("Refusing to write secret-looking evidence: " + ", ".join(findings))
+
+
+ def api_url(base_url: str, path: str) -> str:
+     return base_url.rstrip("/") + path
+
+
+ def request_json(url: str, payload: dict[str, Any], api_key: str, request_id: str, timeout: int) -> tuple[int, str, float]:
+     body = json.dumps(payload).encode("utf-8")
+     request = urllib.request.Request(
+         url,
+         data=body,
+         method="POST",
+         headers={
+             "authorization": f"Bearer {api_key}",
+             "content-type": "application/json",
+             "x-request-id": request_id,
+         },
+     )
+     start = time.perf_counter()
+     try:
+         with urllib.request.urlopen(request, timeout=timeout) as response:
+             content_type = response.headers.get("content-type", "")
+             response.read()
+             return response.status, content_type, (time.perf_counter() - start) * 1000
+     except urllib.error.HTTPError as exc:
+         exc.read()
+         return exc.code, exc.headers.get("content-type", ""), (time.perf_counter() - start) * 1000
+
+
+ def probe_health(base_url: str, timeout: int) -> tuple[int, float] | None:
+     start = time.perf_counter()
+     try:
+         with urllib.request.urlopen(api_url(base_url, "/health"), timeout=timeout) as response:
+             response.read()
+             return response.status, (time.perf_counter() - start) * 1000
+     except Exception:
+         return None
+
+
+ def percentile_95(values: list[float]) -> float:
+     if len(values) == 1:
+         return values[0]
+     try:
+         return statistics.quantiles(values, n=20, method="inclusive")[18]
+     except statistics.StatisticsError:
+         return max(values)
+
+
+ def run_staging_samples(args: argparse.Namespace) -> tuple[dict[str, Any] | None, dict[str, Any] | None]:
+     if args.skip_live_request:
+         return None, None
+     if not args.api_base_url:
+         raise SystemExit("--api-base-url is required unless --skip-live-request is set")
+     api_key = os.environ.get(args.api_key_env)
+     if not api_key:
+         raise SystemExit(f"{args.api_key_env} is not set; refusing to read API keys from arguments")
+
+     latencies: list[float] = []
+     first_request_id = ""
+     first_status = 0
+     first_streamed = False
+     url = api_url(args.api_base_url, DEFAULT_ROUTE)
+     sample_count = max(args.live_samples, 1)
+     for index in range(sample_count):
+         request_id = f"kaiju-paid-staging-{uuid.uuid4()}"
+         payload = {
+             "model": MODEL_ID,
+             "stream": True,
+             "max_tokens": 48,
+             "messages": [
+                 {
+                     "role": "user",
+                     "content": "Return a short Kaiju Coder 7 paid API staging smoke response.",
+                 }
+             ],
+         }
+         status, content_type, latency_ms = request_json(url, payload, api_key, request_id, args.timeout)
+         if index == 0:
+             first_request_id = request_id
+             first_status = status
+             first_streamed = "event-stream" in content_type.lower()
+         if status == 200:
+             latencies.append(latency_ms)
+
+     request_evidence = {
+         "status": "pass" if first_status == 200 and first_streamed else "pending",
+         "checked_at": utc_now(),
+         "route": DEFAULT_ROUTE,
+         "model": MODEL_ID,
+         "http_status": first_status,
+         "streamed": first_streamed,
+         "request_id": first_request_id,
+     }
+     latency_evidence = {
+         "status": "pass" if len(latencies) >= 5 else "pending",
+         "checked_at": utc_now(),
+         "route": DEFAULT_ROUTE,
+         "sample_count": len(latencies),
+         "p95_ms": round(percentile_95(latencies), 2) if latencies else 0,
+         "max_acceptable_ms": args.max_acceptable_ms,
+     }
+     return request_evidence, latency_evidence
+
+
+ def add_optional_manual_evidence(evidence: dict[str, Any], args: argparse.Namespace) -> None:
+     checked_at = args.checked_at or utc_now()
+     if args.public_route_ok:
+         health = probe_health(args.api_base_url, args.timeout) if args.api_base_url else None
+         evidence["public_route_mode"] = {
+             "status": "pass",
+             "checked_at": checked_at,
+             "exposure_mode": "custom_domain",
+             "route": args.api_base_url,
+             "result": "custom domain resolves to the intended Kaiju Worker"
+             + (f"; /health={health[0]} in {health[1]:.0f}ms" if health else ""),
+         }
+     if args.wrangler_secret_name:
+         evidence["wrangler_secrets_verified"] = {
+             "status": "pass",
+             "checked_at": checked_at,
+             "command": "wrangler secret list",
+             "observed_names": sorted(set(args.wrangler_secret_name)),
+         }
+     if args.d1_migration_result:
+         evidence["d1_migration_applied"] = {
+             "status": "pass",
+             "checked_at": checked_at,
+             "command": args.d1_migration_command,
+             "migration": "0001_paid_api.sql",
+             "result": args.d1_migration_result,
+         }
+     if args.stripe_checkout_topup_pass:
+         evidence["stripe_checkout_topup_staging"] = {
+             "status": "pass",
+             "checked_at": checked_at,
+             "mode": args.stripe_mode,
+             "webhook_event": "checkout.session.completed",
+             "credited_api_key_id": args.credited_api_key_id,
+             "idempotency_checked": args.stripe_idempotency_checked,
+         }
+     if args.rollback_result:
+         evidence["rollback_exercised"] = {
+             "status": "pass",
+             "checked_at": checked_at,
+             "command": args.rollback_command,
+             "result": args.rollback_result,
+         }
+     if args.staging_request_id:
+         evidence["worker_to_gojira_staging_request"] = {
+             "status": "pass",
+             "checked_at": checked_at,
+             "route": DEFAULT_ROUTE,
+             "model": MODEL_ID,
+             "http_status": args.staging_http_status,
+             "streamed": args.staging_streamed,
+             "request_id": args.staging_request_id,
+         }
+     if args.paid_route_p95_ms is not None:
+         evidence["paid_route_latency"] = {
+             "status": "pass",
+             "checked_at": checked_at,
+             "route": DEFAULT_ROUTE,
+             "sample_count": args.paid_route_sample_count,
+             "p95_ms": args.paid_route_p95_ms,
+             "max_acceptable_ms": args.max_acceptable_ms,
+         }
+
+
+ def parse_args() -> argparse.Namespace:
+     parser = argparse.ArgumentParser(description=__doc__)
+     parser.add_argument("--out", type=Path, default=DEFAULT_OUT)
+     parser.add_argument("--write", action="store_true", help="Write the evidence file. Default is preview only.")
+     parser.add_argument("--merge-existing", action="store_true", help="Merge with existing evidence at --out.")
+     parser.add_argument("--checked-at", help="Override checked_at timestamp for manual evidence.")
+     parser.add_argument("--api-base-url", default="", help="Public paid API base URL, for example https://api.example.com.")
+     parser.add_argument("--api-key-env", default="KAIJU_PAID_API_KEY", help="Environment variable containing the staging API key.")
+     parser.add_argument("--timeout", type=int, default=120)
+     parser.add_argument("--skip-live-request", action="store_true", help="Do not call the paid API.")
+     parser.add_argument("--live-samples", type=int, default=5)
+     parser.add_argument("--max-acceptable-ms", type=float, default=120_000)
+     parser.add_argument("--public-route-ok", action="store_true", help="Record public custom-domain route evidence.")
+     parser.add_argument("--wrangler-secret-name", action="append", default=[], help="Observed Wrangler secret name. Repeatable.")
+     parser.add_argument("--d1-migration-result", choices=["success", "already_applied"])
+     parser.add_argument(
+         "--d1-migration-command",
+         default="wrangler d1 migrations apply KAIJU_BILLING_DB --remote",
+     )
+     parser.add_argument("--stripe-checkout-topup-pass", action="store_true")
+     parser.add_argument("--stripe-mode", default="test")
+     parser.add_argument("--credited-api-key-id", default="key_staging_001")
+     parser.add_argument("--stripe-idempotency-checked", action="store_true")
+     parser.add_argument("--rollback-result", choices=["success"])
+     parser.add_argument("--rollback-command", default="wrangler rollback")
+     parser.add_argument("--staging-request-id", help="Sanitized request id from a separate staging request.")
+     parser.add_argument("--staging-http-status", type=int, default=200)
+     parser.add_argument("--staging-streamed", action="store_true")
+     parser.add_argument("--paid-route-p95-ms", type=float)
+     parser.add_argument("--paid-route-sample-count", type=int, default=5)
+     return parser.parse_args()
+
+
+ def main() -> int:
+     args = parse_args()
+     evidence = load_existing(args.out) if args.merge_existing else {}
+     add_optional_manual_evidence(evidence, args)
+     request_evidence, latency_evidence = run_staging_samples(args)
+     if request_evidence:
+         evidence["worker_to_gojira_staging_request"] = request_evidence
+     if latency_evidence:
+         evidence["paid_route_latency"] = latency_evidence
+
+     assert_sanitized(evidence)
+     rendered = json.dumps(evidence, indent=2, sort_keys=True) + "\n"
+     if args.write:
+         args.out.parent.mkdir(parents=True, exist_ok=True)
+         args.out.write_text(rendered, encoding="utf-8")
+         print(f"Wrote sanitized paid API evidence to {args.out}")
+     else:
+         print(rendered, end="")
+         print("Preview only. Pass --write to update the evidence file.", file=sys.stderr)
+     return 0
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
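Review note: the `percentile_95` helper in this script takes index 18 of `statistics.quantiles(values, n=20, method="inclusive")` as the 95th percentile. The standalone sketch below shows how that call behaves on a small sample. The `p95` name, the empty-input guard, and the latency values are illustrative assumptions, not part of the committed script.

```python
import statistics


def p95(values: list[float]) -> float:
    """Hypothetical stand-in that mirrors the script's percentile_95 logic."""
    if not values:
        # Assumption: the committed helper is only called with a non-empty list.
        raise ValueError("need at least one sample")
    if len(values) == 1:
        return values[0]
    # n=20 yields 19 cut points; index 18 is the 95% cut point.
    # "inclusive" treats the data as the full population, so the
    # result never exceeds max(values).
    return statistics.quantiles(values, n=20, method="inclusive")[18]


sample_ms = [100.0, 200.0, 300.0, 400.0, 500.0]  # made-up latencies
print(p95(sample_ms))
```

With only five samples, the 95% cut point is interpolated between the two largest values, so a single slow request dominates the reported p95. That may be worth keeping in mind when reading `paid_route_latency` evidence built from the default `--live-samples` of 5.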
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f399b3cd12fa270d51457bb749fb30863521e8359b8a27059c71b6c2f7d6dd6c
+ size 19989424
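Review note: `tokenizer.json` is committed as a Git LFS pointer, a small text file of `key value` lines (`version`, `oid`, `size`) rather than the tokenizer itself. A minimal sketch of reading those fields follows. The `parse_lfs_pointer` helper is hypothetical and written only for this illustration; the pointer text is copied from the diff above.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f399b3cd12fa270d51457bb749fb30863521e8359b8a27059c71b6c2f7d6dd6c\n"
    "size 19989424\n"
)
pointer = parse_lfs_pointer(pointer_text)
algo, _, digest = pointer["oid"].partition(":")
size_bytes = int(pointer["size"])
print(algo, len(digest), size_bytes)
```

The `size` field is the byte length of the real file, so the actual tokenizer is roughly 20 MB and must be fetched through LFS before the adapter can be loaded.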
tokenizer_config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "add_prefix_space": false,
+   "audio_bos_token": "<|audio_start|>",
+   "audio_eos_token": "<|audio_end|>",
+   "audio_token": "<|audio_pad|>",
+   "backend": "tokenizers",
+   "bos_token": null,
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|im_end|>",
+   "errors": "replace",
+   "image_token": "<|image_pad|>",
+   "is_local": true,
+   "local_files_only": false,
+   "model_max_length": 262144,
+   "model_specific_special_tokens": {
+     "audio_bos_token": "<|audio_start|>",
+     "audio_eos_token": "<|audio_end|>",
+     "audio_token": "<|audio_pad|>",
+     "image_token": "<|image_pad|>",
+     "video_token": "<|video_pad|>",
+     "vision_bos_token": "<|vision_start|>",
+     "vision_eos_token": "<|vision_end|>"
+   },
+   "pad_token": "<|endoftext|>",
+   "pretokenize_regex": "(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?[\\p{L}\\p{M}]+|\\p{N}| ?[^\\s\\p{L}\\p{M}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+",
+   "split_special_tokens": false,
+   "tokenizer_class": "Qwen2Tokenizer",
+   "unk_token": null,
+   "video_token": "<|video_pad|>",
+   "vision_bos_token": "<|vision_start|>",
+   "vision_eos_token": "<|vision_end|>"
+ }
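Review note: in `tokenizer_config.json` each entry under `model_specific_special_tokens` also appears as a top-level key with the same value. A quick consistency check along these lines could catch drift between the two copies. This is only a sketch: the `mismatched_special_tokens` helper is hypothetical, and the embedded JSON is a trimmed fragment of the file, not the full config.

```python
import json

# Trimmed fragment of the committed config, used only for illustration.
config_fragment = json.loads("""
{
  "audio_bos_token": "<|audio_start|>",
  "image_token": "<|image_pad|>",
  "video_token": "<|video_pad|>",
  "model_specific_special_tokens": {
    "audio_bos_token": "<|audio_start|>",
    "image_token": "<|image_pad|>",
    "video_token": "<|video_pad|>"
  }
}
""")


def mismatched_special_tokens(config: dict) -> list[str]:
    """Return nested token names whose top-level value is missing or differs."""
    nested = config.get("model_specific_special_tokens", {})
    return sorted(name for name, token in nested.items() if config.get(name) != token)


print(mismatched_special_tokens(config_fragment))
```

An empty list means the nested and top-level copies agree for every token in the fragment.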
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4134c1b08a1c0abe45425df32940332d5ab998f2811aab5fc7525f465f6ba60b
+ size 5329
upstream/qwen3.6-27b/LICENSE ADDED
@@ -0,0 +1,202 @@
+
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright 2026 Alibaba Cloud
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.