Kaiju Coder 7 Goal Completion Audit

Generated: 2026-06-04T02:44:04Z

Overall: complete

Summary: 18 passed / 0 blocked / 0 manual

This audit maps the active Kaiju Coder 7 objective to current evidence across local runtime, Hugging Face release, OpenCode, paid API preflight, and remaining honest caveats.

Readiness Commands

| Check | Ready | Return Code |
| --- | --- | --- |
| Local public-testing readiness | True | 0 |
| Hugging Face release readiness | True | 0 |
| Public launch readiness | True | 0 |
| Paid API scaffold | True | 0 |
| Paid API launch | True | 0 |
| HF staging integrity | True | 0 |
| HF namespace permission evidence | True | 0 |
| Human public review | True | 0 |
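
Every ready check in the table also reports return code 0. A minimal sketch of that convention follows, assuming (the audit does not confirm this) that each readiness script signals success through a zero exit status. `ReadinessRow`, `row_from_return_code`, and `format_row` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadinessRow:
    """One row of the readiness table (hypothetical structure)."""
    check: str
    ready: bool
    return_code: int


def row_from_return_code(check: str, return_code: int) -> ReadinessRow:
    # Assumption: a zero exit status means the check is ready.
    return ReadinessRow(check=check, ready=(return_code == 0), return_code=return_code)


def format_row(row: ReadinessRow) -> str:
    # Follows the Check / Ready / Return Code column order of the table.
    return f"{row.check} {row.ready} {row.return_code}"


print(format_row(row_from_return_code("Paid API launch", 0)))
```

Under this assumption, any non-zero return code would produce a row with Ready set to False.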

Requirement Audit

| Area | Requirement | Status | Evidence | Blocker |
| --- | --- | --- | --- | --- |
| Identity | Product name is Kaiju Coder 7 and public/API model id is kaiju-coder-7. | passed | scripts/check_kaiju_public_release_readiness.py --mode local; release/PUBLIC_TESTING_QUICKSTART.md | |
| OpenCode | Kaiju-specific OpenCode config installs the model, default agent, hidden artifact routing, and no-autocontinue loop guard. | passed | .opencode/agents/kaiju-coder-7.md; scripts/opencode-kaiju-no-autocontinue.mjs; scripts/install_kaiju_opencode_profile.py | |
| OpenCode | After install, plain opencode/opencode run works from this Mac with Kaiju as the selected/default model. | passed | runs/public-opencode-smoke latest passing summary; scripts/run_kaiju_public_opencode_smoke.py | |
| OpenCode | Customer-readiness pack passes without wrong-directory output, fake compaction completion, missing files, or secret leakage. | passed | runs/opencode-customer-readiness/20260603T185835Z/summary.md | |
| Runtime | Direct API smoke passes using model=kaiju-coder-7. | passed | runs/benchmarks/20260603T223337Z-kaiju-coder-7-serving/summary.md | |
| Runtime | 12k, 16k, 24k, and 32k context benchmarks are recorded with a recommended default. | passed | release/SERVING_BENCHMARKS.md records 12288, 16384, 24576, 32768 and recommends 16k live default | |
| Runtime | SGLang and vLLM/practical faster serving path are benchmarked honestly. | passed | release/SERVING_BENCHMARKS.md; release/quantized-runtime/README.md | |
| Runtime | At least one public-friendly quantized/local candidate is working or clearly documented as blocked with evidence. | passed | release/quantized-runtime/README.md documents vLLM bitsandbytes runtime candidate and persisted-weights limitation | |
| Hugging Face | Public-friendly HF release structure is staged with adapter, OpenCode helper, runtime-quantized helper, model cards, provenance, evals, and docs. | passed | python3 scripts/check_hf_staging_integrity.py --require-checksums | |
| Hugging Face | At least one public Hugging Face release path is ready to upload or uploaded. | passed | python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release | |
| Hugging Face | Merged 51GB model repo upload is complete and public, or guarded with explicit evidence. | passed | release/HF_UPLOAD_EVIDENCE.md; scripts/prepare_hf_merged_model_metadata.sh; scripts/upload_hf_merged_model_from_gojira_b.sh | |
| Hugging Face | Uploaded Hugging Face repos are downloadable by intended users. | passed | release/HF_UPLOAD_EVIDENCE.md; python3 scripts/check_hf_uploaded_release.py --namespace RMDWLLC --apply | |
| Quality | Customer-style evals cover website, proposal, Stripe/payment, CRM/reporting, CSV/parser, Kiyomi operating pack, and safety/provenance. | passed | evals/tasks/opencode-customer-readiness.jsonl; runs/opencode-customer-readiness/20260603T185835Z/summary.md | |
| Quality | Model/harness prompts produce file-oriented business-owner artifacts rather than vague advice. | passed | kaiju_harness/business_suite.py; release/EVAL_SCOREBOARD.md | |
| Provenance | Training/eval provenance is preserved and public docs avoid internal checkpoint naming except license/provenance attribution. | passed | release/SOURCE_INVENTORY.md; release/DATA_PROVENANCE_DRAFT.md; release/PUBLIC_TESTING_QUICKSTART.md | |
| Paid API | Paid API scaffold covers API keys, Stripe billing, rate limits, logging controls, abuse controls, rollback plan, and pricing assumptions. | passed | python3 scripts/check_paid_api_readiness.py --mode scaffold; gateway/cloudflare-worker tests | |
| Paid API | Paid API is ready for public charging. | passed | python3 scripts/check_paid_api_readiness.py --mode launch | |
| Final Report | Final report includes exact commands run, eval results, changed files, remaining risks, and what Richard should test first. | passed | release/FINAL_RELEASE_REPORT.md | |
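
The summary line at the top of this audit can be recomputed from the Status column. A minimal sketch follows, assuming the only status values are passed, blocked, and manual (the three named in the summary) and assuming the overall result is "complete" only when nothing is blocked or manual. The `summarize` helper and the "incomplete" label are hypothetical and not taken from the repository.

```python
from collections import Counter

KNOWN_STATUSES = {"passed", "blocked", "manual"}


def summarize(statuses):
    """Return (overall, summary_line) for a list of requirement statuses."""
    counts = Counter(statuses)
    unknown = set(counts) - KNOWN_STATUSES
    if unknown:
        raise ValueError(f"unexpected status values: {sorted(unknown)}")
    summary = (
        f"{counts['passed']} passed / "
        f"{counts['blocked']} blocked / "
        f"{counts['manual']} manual"
    )
    # Assumption: any blocked or manual row keeps the audit from being complete.
    # "incomplete" is an illustrative label, not one used by the audit.
    done = bool(statuses) and counts["blocked"] == 0 and counts["manual"] == 0
    return ("complete" if done else "incomplete"), summary


# The requirement table lists 18 rows, all marked passed.
print(summarize(["passed"] * 18))
```

With the 18 passed rows above, this reproduces the reported "18 passed / 0 blocked / 0 manual".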

Blocking Items

  • No blocking items.

Commands To Re-run

python3 scripts/check_kaiju_public_release_readiness.py --mode local
python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
python3 scripts/check_kaiju_public_release_readiness.py --mode public
python3 scripts/check_paid_api_readiness.py --mode scaffold
python3 scripts/check_paid_api_readiness.py --mode launch
python3 scripts/check_hf_staging_integrity.py --require-checksums
python3 scripts/check_hf_release_permission_evidence.py
python3 scripts/check_hf_uploaded_release.py --namespace RMDWLLC --apply
python3 scripts/check_human_release_review.py --mode public
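
To re-run the checks in one pass, a small wrapper can execute each command in order and record its exit status. The sketch below is an illustration under stated assumptions, not a script from the repository: `run_checks` and `all_ready` are hypothetical helpers, the commands are assumed to be run from the repository root, and a zero return code is assumed to mean success.

```python
import shlex
import subprocess

# Commands copied from the list above; run them from the repository root.
COMMANDS = [
    "python3 scripts/check_kaiju_public_release_readiness.py --mode local",
    "python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release",
    "python3 scripts/check_kaiju_public_release_readiness.py --mode public",
    "python3 scripts/check_paid_api_readiness.py --mode scaffold",
    "python3 scripts/check_paid_api_readiness.py --mode launch",
    "python3 scripts/check_hf_staging_integrity.py --require-checksums",
    "python3 scripts/check_hf_release_permission_evidence.py",
    "python3 scripts/check_hf_uploaded_release.py --namespace RMDWLLC --apply",
    "python3 scripts/check_human_release_review.py --mode public",
]


def run_checks(commands, runner=subprocess.run):
    """Run each command in order and return (command, return_code) pairs."""
    results = []
    for command in commands:
        # shlex.split avoids invoking a shell; check=False continues after a failure.
        completed = runner(shlex.split(command), check=False)
        results.append((command, completed.returncode))
    return results


def all_ready(results):
    # Assumption: a zero return code means the check passed.
    return all(code == 0 for _, code in results)
```

Calling `all_ready(run_checks(COMMANDS))` from the repository root would give a single pass/fail answer. The audit does not describe what the `--apply` flag does, so it may be worth confirming its effect before running the batch unattended.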