Text Generation
Transformers
Safetensors
English
qwen3_5
image-text-to-text
kaiju-coder-7
coding
local-ai
business
opencode
tool-use
conversational
Instructions to use RMDWLLC/kaiju-coder-7 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use RMDWLLC/kaiju-coder-7 with Transformers:
# Use a pipeline as a high-level helper
# The messages below carry an image, so the image-text-to-text task is used here;
# the text-generation pipeline does not accept the text= keyword or image content.
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="RMDWLLC/kaiju-coder-7")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("RMDWLLC/kaiju-coder-7")
model = AutoModelForImageTextToText.from_pretrained("RMDWLLC/kaiju-coder-7")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Decode only the newly generated tokens, skipping the prompt
outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use RMDWLLC/kaiju-coder-7 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "RMDWLLC/kaiju-coder-7"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "RMDWLLC/kaiju-coder-7",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/RMDWLLC/kaiju-coder-7
- SGLang
How to use RMDWLLC/kaiju-coder-7 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "RMDWLLC/kaiju-coder-7" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "RMDWLLC/kaiju-coder-7",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "RMDWLLC/kaiju-coder-7" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "RMDWLLC/kaiju-coder-7",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use RMDWLLC/kaiju-coder-7 with Docker Model Runner:
docker model run hf.co/RMDWLLC/kaiju-coder-7
Add files using upload-large-folder tool
- .gitattributes +1 -0
- DATA_PROVENANCE_DRAFT.md +87 -0
- EVAL_SCOREBOARD.md +70 -0
- FINAL_RELEASE_REPORT.md +269 -0
- GOAL_COMPLETION_AUDIT.md +60 -0
- LOCAL_TEST_INSTRUCTIONS.md +147 -0
- MERGED_MODEL_RELEASE_MANIFEST.json +11 -0
- PAID_API_READINESS.md +266 -0
- PUBLIC_TESTING_QUICKSTART.md +149 -0
- README.md +160 -0
- SERVING_BENCHMARKS.md +358 -0
- SOURCE_INVENTORY.md +41 -0
- UPSTREAM_LICENSE_CHECK.md +38 -0
- chat_template.jinja +154 -0
- config.json +140 -0
- configuration.json +1 -0
- generation_config.json +13 -0
- kaiju-merge-manifest.json +17 -0
- merges.txt +0 -0
- model-00001-of-00014.safetensors +3 -0
- model-00002-of-00014.safetensors +3 -0
- model-00003-of-00014.safetensors +3 -0
- model-00004-of-00014.safetensors +3 -0
- model-00005-of-00014.safetensors +3 -0
- model-00006-of-00014.safetensors +3 -0
- model-00007-of-00014.safetensors +3 -0
- model-00008-of-00014.safetensors +3 -0
- model-00009-of-00014.safetensors +3 -0
- model-00010-of-00014.safetensors +3 -0
- model-00011-of-00014.safetensors +3 -0
- model-00012-of-00014.safetensors +3 -0
- model-00013-of-00014.safetensors +3 -0
- model-00014-of-00014.safetensors +3 -0
- model.safetensors.index.json +859 -0
- preprocessor_config.json +21 -0
- tokenizer.json +3 -0
- tokenizer_config.json +36 -0
- upstream/qwen3.6-27b/LICENSE +202 -0
- video_preprocessor_config.json +21 -0
- vocab.json +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
DATA_PROVENANCE_DRAFT.md
ADDED
@@ -0,0 +1,87 @@
# Kaiju Coder 7 by Kiyomi - Data Provenance Draft

This draft records the current data boundary for release review.

## Policy

Kaiju Coder training data must be legally usable for a commercial derivative model.

Allowed:

- RMDW-authored examples.
- RMDW-owned repository diffs and documentation.
- Human-reviewed examples created specifically for Kaiju.
- Public permissive data only when license review confirms compatibility.

Not allowed:

- Closed-model answers from OpenAI, Anthropic, Gemini, or similar services as supervised completions.
- Unreviewed customer data.
- Private customer code without consent.
- Secrets, tokens, credentials, cookies, or private keys.
- Unlicensed scraped code.

## v0.1 Dataset Snapshot

- Total reviewed examples: 575
- Dataset build: `datasets/build/kaiju-sft-v0.1.jsonl`
- Candidate sources:
  - `datasets/candidates/rmdw-git-patches.jsonl`
  - `datasets/candidates/v0.1-safe-git-backlog.jsonl`
  - `datasets/candidates/v0.1-file-level-git.jsonl`
  - `datasets/candidates/v0.1-wiki-strategy-business-identity.jsonl`

## v1.7 Business-Owner Suite Addendum

- Date prepared: 2026-06-03
- Reviewed examples: 8
- Candidate file: `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl`
- Addendum-only SFT build: `datasets/build/kaiju-sft-v1.7-business-owner-suite.jsonl`
- Training SFT build: `datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl`
- Training config: `training/configs/qwen36-27b-lora-v1.7.example.json`
- v1.8 training config: `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json`
- New task type: `business_suite`
- Source inventory: `release/SOURCE_INVENTORY.md`, refreshed from GitHub source-of-truth repositories and the requested local RMDW wiki snapshot.

This addendum targets Kiyomi 7.7.7 style business-owner work: complete AI-company build packs, premium service websites, intake and CRM flows, sales follow-up, proposals, ROI dashboards, operator handbooks, and Workshop golden-run automations.

Every row includes:

- `source_repos`
- `source_paths`
- `provenance_notes`
- `reviewed: true`
- `license: RMDW-owned`

For the v1.7 LoRA run, the 8 reviewed business-owner rows are oversampled 24 times by `scripts/build_v17_business_owner_sft_dataset.py`. Repeated rows receive unique IDs ending in `__v17_business_repeat_NN` and preserve the original source repository, source path, and provenance metadata.
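
The oversampling step described above can be sketched roughly as follows. This is not the content of `scripts/build_v17_business_owner_sft_dataset.py`, which is not shown in this commit; the `id` field name, the two-digit numbering starting at `01`, and the stand-in rows are assumptions made for illustration. With 8 rows and 24 repeats it yields 8 × 24 = 192 repeated rows, which is the repeat count the eval scoreboard reports for the business-owner SFT build.

```python
import copy

REPEATS = 24  # from the draft: each reviewed row is oversampled 24 times


def oversample(rows, repeats=REPEATS):
    """Return `repeats` copies of every row, each with a unique suffixed ID."""
    out = []
    for row in rows:
        for n in range(1, repeats + 1):
            # Deep copy so source_repos, source_paths, and provenance_notes
            # are carried over unchanged and never shared between copies.
            clone = copy.deepcopy(row)
            # Assumed ID scheme: original ID plus the documented suffix.
            clone["id"] = f"{row['id']}__v17_business_repeat_{n:02d}"
            out.append(clone)
    return out


# Eight stand-in rows (hypothetical values); real rows live in the candidate JSONL.
base_rows = [
    {
        "id": f"biz-{i}",
        "source_repos": ["example/repo"],
        "source_paths": ["docs/example.md"],
        "provenance_notes": "illustrative row",
        "reviewed": True,
        "license": "RMDW-owned",
    }
    for i in range(8)
]

repeated_rows = oversample(base_rows)
print(len(repeated_rows))  # 8 rows x 24 repeats
```

Copying the whole row before changing only the ID is what keeps the provenance fields identical across repeats.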

Client-site repositories are used only as eval and generalized pattern sources unless a row is explicitly reviewed for training eligibility. Do not bulk-train on client-specific text, contact details, contracts, or private business data.

The local wiki path `/Users/richardecholsai7/Documents/RMDW-Wiki` is present but is not a git checkout. It is recorded as `RMDW-Wiki-local`, `selective-reference-only`, with `credentials.md`, `customers.md`, `customers/`, and `raw/` excluded. The GitHub `RichardEchols/rmdw-agent-wiki` repo remains the authoritative wiki source for training/eval provenance unless a reviewer documents a local exception.

## Category Mix

The v0.1 category gate passed:

- Website/UI: at least 75 examples
- Coding: at least 75 examples
- Debugging: at least 50 examples
- Automation: at least 50 examples
- Tool-use: at least 50 examples
- Strategy: at least 25 examples
- Business: at least 15 examples
- Identity: at least 10 examples
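
A category gate like the one listed above could be checked with a few lines of Python. The sketch below is only an illustration: the `category` field name and the row shape are assumptions, and it is not the project's `scripts/check_dataset_targets.py`. The minimums are copied from the list, and they sum to 350.

```python
from collections import Counter

# Minimum example counts per category, copied from the v0.1 gate above.
MINIMUMS = {
    "Website/UI": 75,
    "Coding": 75,
    "Debugging": 50,
    "Automation": 50,
    "Tool-use": 50,
    "Strategy": 25,
    "Business": 15,
    "Identity": 10,
}


def category_shortfalls(rows, minimums=MINIMUMS):
    """Map each under-filled category to (actual_count, required_count)."""
    # "category" is an assumed field name for this sketch.
    counts = Counter(row["category"] for row in rows)
    return {
        name: (counts[name], required)
        for name, required in minimums.items()
        if counts[name] < required
    }


# Build a synthetic dataset that meets every minimum exactly.
rows = [{"category": name} for name, n in MINIMUMS.items() for _ in range(n)]
print(category_shortfalls(rows))       # empty dict: gate passes
print(category_shortfalls(rows[:-1]))  # one Identity row short
```

Returning the shortfalls instead of a bare boolean makes a failed gate easier to act on, since it names the category and the gap.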

## Release Review Checklist

Before public release:

- Re-run dataset validation.
- Re-run source inventory against the current GitHub source-of-truth SHAs.
- Spot-check examples for secrets and private data.
- Confirm client-site rows are generalized pattern examples or eval-only.
- Confirm closed-model outputs are not used as supervised completions.
- Record exact base model revision.
- Attach upstream license and notices.
- Attach eval summary.
- Document known limitations and unsafe use boundaries.
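
For the secrets spot-check in the checklist above, a first automated pass might look like the sketch below. The three patterns are illustrative and far from exhaustive, the `prompt` and `completion` field names are assumptions, and a pattern scan only supplements the human review the checklist asks for.

```python
import re

# Illustrative patterns only; a real scan would need a much broader rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "pem_private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "bearer_token": re.compile(r"(?i)\bauthorization:\s*bearer\s+\S+"),
}


def find_secrets(text):
    """Return the sorted names of every pattern that matches `text`."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)
    )


def flag_rows(rows, text_fields=("prompt", "completion")):
    """Yield (row_id, hits) for rows whose text fields match a secret pattern."""
    for row in rows:
        # Field names are assumptions; adjust to the real dataset schema.
        blob = "\n".join(str(row.get(field, "")) for field in text_fields)
        hits = find_secrets(blob)
        if hits:
            yield row.get("id"), hits


sample_rows = [
    {"id": "a", "prompt": "key AKIAABCDEFGHIJKLMNOP here", "completion": "ok"},
    {"id": "b", "prompt": "hello", "completion": "world"},
]
flagged = list(flag_rows(sample_rows))
print(flagged)
```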
EVAL_SCOREBOARD.md
ADDED
@@ -0,0 +1,70 @@
# Kaiju Coder 7 Business-Owner Eval Scoreboard

This scoreboard tracks the current release-candidate evidence. Do not publish weights or paid API claims until every required row has a dated result and reviewer.

## Completed Local Gates

| Gate | Command | Result | Date |
|---|---|---:|---|
| Source inventory refresh | `python3 scripts/build_source_inventory.py` | Passed | 2026-06-03 |
| Candidate validation | `python3 scripts/validate_training_data.py --min-examples 350` | 1,689 examples / passed | 2026-06-03 |
| v1.7 category targets | `python3 scripts/check_dataset_targets.py --targets datasets/v1.7-targets.json` | Passed | 2026-06-03 |
| Business-owner SFT build | `python3 scripts/build_v17_business_owner_sft_dataset.py` | 1,881 rows / 192 repeats | 2026-06-03 |
| Router hard harness | `python3 evals/run_router_harness_eval.py --tasks evals/tasks/router-hard-harness.jsonl` | 23/23 | 2026-06-03 |
| Router static checks | `python3 evals/run_router_static_checks.py runs/evals/20260603T103915Z-kaiju_router_harness/results.jsonl` | 23/23 | 2026-06-03 |
| Business-suite prompts | Included in router hard harness | 2/2 | 2026-06-03 |
| Deterministic API harness smoke | `python3 scripts/run_kaiju_api_harness_smoke.py` | Passed: website + business-suite API artifacts | 2026-06-03 |
| Direct business-suite artifact | `python3 scripts/run_kaiju_router.py --prompt "...Kiyomi 7.7.7 AI company operating pack..." --print-manifest` | 19 files / passed | 2026-06-03 |
| Full local RC smoke gate | `python3 scripts/run_kaiju_business_owner_rc_smoke.py` | Passed; latest router/static run `20260603T103915Z-kaiju_router_harness` | 2026-06-03 |
| v1.7 LoRA train | `./scripts/run-gojira-b-qwen36-lora-train.sh` | Finished; runtime `1663.7101s`, train loss `1.7260706673065822`, adapter present | 2026-06-03 |
| v1.7 SGLang serve | `./scripts/start-qwen36-lora-sglang.sh` with `KAIJU_QWEN36_LORA_CONTEXT=4096`, `KAIJU_QWEN36_LORA_MEM_FRACTION=0.90` | `/v1/models` returned `kaiju_v17_business_owner` | 2026-06-03 |
| Raw served adapter smoke: website | `python3 evals/run_openai_compat_smoke.py --base-url http://100.109.109.14:18083/v1 --model kaiju_v17_business_owner --tasks evals/tasks/smoke.jsonl --max-tasks 1 --disable-thinking` | Passed; `20260603T031300Z-kaiju_v17_business_owner`, 2,726 chars in 174.49s | 2026-06-03 |
| Raw served adapter smoke: proposal | `python3 evals/run_openai_compat_smoke.py --base-url http://100.109.109.14:18083/v1 --model kaiju_v17_business_owner --tasks /tmp/kaiju-proposal-smoke.jsonl --system-prompt-file prompts/kaiju-coder-api-system.md --disable-thinking` | Passed; `20260603T032107Z-kaiju_v17_business_owner`, 4,306 chars in 232.27s | 2026-06-03 |
| Raw served adapter quality: website | `python3 evals/score_quality_gate.py runs/evals/20260603T033825Z-kaiju_v17_business_owner/results.jsonl` | Failed paid-ready: `3.71/4.0`, missing complete HTML after 12,706 chars / 793.96s | 2026-06-03 |
| Raw served adapter quality: proposal | `python3 evals/score_quality_gate.py runs/evals/20260603T032107Z-kaiju_v17_business_owner/results.jsonl` | Passed paid-ready: `4.0/4.0` | 2026-06-03 |
| Raw served adapter quality: Jah credits | `python3 evals/score_quality_gate.py runs/evals/20260603T035612Z-kaiju_v17_business_owner/results.jsonl` | Passed paid-ready: `4.0/4.0` | 2026-06-03 |
| Base Qwen comparison: proposal | `python3 evals/compare_quality_runs.py runs/quality-gates/20260603T035200Z-qwen36-27b/scores.jsonl runs/quality-gates/20260603T032107Z-kaiju_v17_business_owner/scores.jsonl` | Tie: base `4.0/4.0`, Kaiju v1.7 `4.0/4.0` | 2026-06-03 |
| Base Qwen comparison: Jah credits | `python3 evals/compare_quality_runs.py runs/quality-gates/20260603T040140Z-qwen36-27b/scores.jsonl runs/quality-gates/20260603T035612Z-kaiju_v17_business_owner/scores.jsonl` | Tie: base `4.0/4.0`, Kaiju v1.7 `4.0/4.0`; deterministic outputs were byte-identical | 2026-06-03 |
| Raw adapter differentiation probe | Identity and Jah probes comparing `qwen36-27b` to `kaiju_v17_business_owner` | Current v1.7 SGLang outputs can be byte-identical to base on deterministic prompts; 24-step v1.7 is too weak as a raw-weight differentiator | 2026-06-03 |
| v1.8 stronger LoRA train | `KAIJU_LORA_CONFIG=training/configs/qwen36-27b-lora-v1.8-business-owner.example.json KAIJU_SFT_DATASET=datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl KAIJU_LORA_RUN_DIR=runs/qwen36-27b-lora-v1.8-business-owner KAIJU_MIN_TRAIN_EXAMPLES=350 KAIJU_SKIP_DATASET_BUILD=1 KAIJU_TRAIN_BACKGROUND=1 ./scripts/run-gojira-b-qwen36-lora-train.sh` | Finished; runtime `11666.7564s`, train loss `0.9281658741335074`, adapter present | 2026-06-03 |
| v1.8 SGLang dynamic LoRA serve | `./scripts/start-qwen36-lora-sglang.sh` with v1.8 adapter, `KAIJU_QWEN36_LORA_CONTEXT=8192`, `KAIJU_QWEN36_LORA_MEM_FRACTION=0.90` | Historical only: `/v1/models` listed `kaiju_v18_business_owner`, but adapter-name-only output can be base-equivalent; not release evidence | 2026-06-03 |
| Corrected v1.8 dynamic LoRA selector | Model selector `qwen36-27b:kaiju_v18_business_owner` under SGLang with fused target modules | Fails: `LoRA buffer shape torch.Size([8192, 16]) does not match weight shape torch.Size([14336, 16])`; dynamic LoRA is not the release path | 2026-06-03 |
| v1.8 LoRA merge | `KAIJU_LORA_ADAPTER=/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter ./scripts/run-gojira-b-qwen36-lora-merge.sh` | Passed; merged full model at `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`, `51G`, `14` shards | 2026-06-03 |
| Kaiju Coder 7 merged SGLang serve | `./scripts/start-qwen36-merged-sglang.sh` with `KAIJU_QWEN36_MERGED_CONTEXT=32768`, `KAIJU_QWEN36_MERGED_MEM_FRACTION=0.90` | `/v1/models` returned `kaiju-coder-7`, max model len `32768`; 12k/16k/24k/32k evidence is recorded in `release/SERVING_BENCHMARKS.md` | 2026-06-03 |
| Kaiju Coder 7 restored 32k direct API smoke | `python3 scripts/benchmark_kaiju_serving.py --contexts 32768 --prompts identity business_doc --max-tokens 768 --timeout 420` | Passed; `/v1/models` returned `kaiju-coder-7`, max model len `32768`; identity `2.92s`; business proposal `94.28s`, `1,737` chars | 2026-06-03 |
| Kaiju Coder 7 restored 32k OpenCode one-file smoke | `opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-32k-final-smoke 'Create hello.txt with exactly: Kaiju Coder 7 final 32k ok'` | Passed; wrote `hello.txt` with exactly `Kaiju Coder 7 final 32k ok` | 2026-06-03 |
| Kaiju Coder 7 current restored 16k direct API smoke | `python3 scripts/benchmark_kaiju_serving.py --contexts 16384 --prompts identity --max-tokens 64 --timeout 120` | Passed; latest run `runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`, identity `2.3s`, `26` chars | 2026-06-03 |
| Kaiju Coder 7 current restored 16k OpenCode one-file smoke | `mkdir -p /tmp/kaiju-opencode-fresh-public-smoke && opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-fresh-public-smoke --dangerously-skip-permissions 'Create hello.txt with exactly: Kaiju Coder 7 fresh public smoke ok'` | Passed; `/v1/models` returned `kaiju-coder-7`, max model len `16384`; wrote `hello.txt` with exactly `Kaiju Coder 7 fresh public smoke ok` | 2026-06-03 |
| Kaiju Coder 7 packaged public OpenCode smoke | `python3 scripts/run_kaiju_public_opencode_smoke.py --timeout 900 --keep-dir` | Passed; latest run `runs/public-opencode-smoke/20260603T182222Z/summary.md`, `4/4` checks passed; installer dry-run, OpenCode `1.15.13`, live 16k model, and file written only in the requested temp workspace | 2026-06-03 |
| Kaiju Coder 7 loop-guarded OpenCode install | `python3 scripts/install_kaiju_opencode_profile.py`; `opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-loopguard-smoke --dangerously-skip-permissions 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'` | Passed; config includes `/Users/richardecholsai7/.config/opencode/kaiju-no-autocontinue.mjs`; wrote `loopguard.txt` with exact requested content and exited cleanly | 2026-06-03 |
| Current harnessed OpenCode customer-readiness pack | `python3 scripts/run_kaiju_opencode_customer_pack.py --mode harnessed` | Passed; latest run `runs/opencode-customer-readiness/20260603T185835Z/summary.md`, `4/4` tasks passed and `28/28` required files written, including release provenance and safety review | 2026-06-03 |
| Paid API Worker scaffold | `cd gateway/cloudflare-worker && npm run check && npm run preflight` | Passed `16/16` Worker tests and `17` scaffold preflight checks; covers bearer auth, inactive keys, insufficient credits, debit/refund, rate limit before debit, model `kaiju-coder-7` enforcement, stream/thinking/token caps, secret-content rejection without logging, signed Stripe Checkout top-up idempotency, origin-only R2 artifact upload, account-scoped artifact download, guarded Cloudflare resource prep, Wrangler dry-run deploy, sanitized paid-launch evidence template packaging, reviewed Cloudflare bindings template, binding applier guardrails, and sanitized evidence collection helper | 2026-06-03 |
| Kaiju Coder 7 merged vLLM serve | `KAIJU_VLLM_CONTEXT=16384 ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed at 16k with Gojira nightly vLLM after `pandas` preinstall and `--language-model-only`; identity `19.99s`, code patch `28.8s`; not sufficiently faster to replace SGLang | 2026-06-03 |
| Kaiju Coder 7 runtime-quantized vLLM serve | `KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_QUANTIZATION=bitsandbytes KAIJU_VLLM_LOAD_FORMAT=bitsandbytes ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed at 8k and 16k; 16k identity `19.51s`, code patch `11.3s`; vLLM log reported about `17.8 GiB` model memory | 2026-06-03 |
| Kaiju Coder 7 runtime-quantized business-doc smoke | `KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_QUANTIZATION=bitsandbytes KAIJU_VLLM_LOAD_FORMAT=bitsandbytes KAIJU_VLLM_PROMPTS=business_doc KAIJU_VLLM_MAX_TOKENS=768 KAIJU_VLLM_PROMPT_TIMEOUT=420 ./scripts/run-gojira-b-vllm-serving-benchmark.sh` | Passed; business proposal `53.44s`, `1,610` chars, `30.127` chars/s; wrapper restored SGLang after completion | 2026-06-03 |
| Kaiju Coder 7 runtime-quantized OpenCode one-file smoke | `bash scripts/run_kaiju_quantized_opencode_smoke.sh` | Passed at 16k after vLLM `--enable-auto-tool-choice`; OpenCode wrote `hello.txt` with exactly `Kaiju Coder 7 quantized runtime ok` | 2026-06-03 |
| Hugging Face CLI install/auth check | `hf version && hf auth whoami && hf auth list` | `hf` installed locally at version `1.17.0`; auth user `restokes92`; token name `gojirakiyomikode` | 2026-06-03 |
| Hugging Face private repo create attempt | `KAIJU_HF_UPLOAD_APPLY=1 bash scripts/upload_hf_release_staging.sh` with namespaces `RichardEchols`, `RMDWLLC`, and `restokes92` | Blocked by Hugging Face `403 Forbidden`; current token cannot create model repos in those namespaces | 2026-06-03 |
| Hugging Face merged-model metadata and upload boundary | `bash scripts/prepare_hf_merged_model_metadata.sh`; `KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh`; `bash scripts/upload_hf_merged_model_from_gojira_b.sh`; `KAIJU_HF_UPLOAD_APPLY=1 bash scripts/upload_hf_merged_model_from_gojira_b.sh` | Metadata prep synced model card, quickstarts, provenance, benchmarks, evals, paid API status, final report, upstream license, and `MERGED_MODEL_RELEASE_MANIFEST.json` to Gojira-B; sudo rsync handled the root-owned merged folder; upload dry run confirmed metadata plus the `51G`/`14`-shard merged model before printing `hf upload-large-folder`; apply remains blocked by human review and Hugging Face namespace permission before any large upload | 2026-06-03 |
| v1.8 merged endpoint probe | Direct OpenAI-compatible chat request with top-level `chat_template_kwargs` disabling thinking | Passed; `1,155` visible chars in `60.17s`, normal `content` response | 2026-06-03 |
| Kaiju Coder 7 merged focused proposal eval | `python3 evals/run_openai_compat_smoke.py --model kaiju-coder-7 --tasks evals/tasks/business-owner-v18-comparison.jsonl --max-tasks 1 --max-tokens 1800 ...` then `python3 evals/score_quality_gate.py <results.jsonl>` | Passed: `1/1` paid-ready, `4.0/4.0`, `4,014` chars, `212.72s` | 2026-06-03 |
| Kaiju Coder 7 merged focused Jah credits eval | `python3 evals/run_openai_compat_smoke.py --model kaiju-coder-7 --tasks evals/tasks/business-owner-v18-comparison.jsonl ...` then `python3 evals/score_quality_gate.py <results.jsonl>` | Passed: `4.0/4.0`, `9,718` chars, `566.36s` | 2026-06-03 |
| Full local RC smoke gate | `python3 scripts/run_kaiju_business_owner_rc_smoke.py` | Passed; latest router/static run `20260603T103915Z-kaiju_router_harness` | 2026-06-03 |
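
The "v1.8 merged endpoint probe" row mentions a direct OpenAI-compatible chat request with a top-level `chat_template_kwargs` that disables thinking. A minimal sketch of such a request follows. The `enable_thinking` key is an assumption based on the Qwen3 chat-template convention, the localhost URL is a placeholder, and this is not the project's `evals/run_openai_compat_smoke.py`. The small `chars_per_second` helper reproduces the throughput arithmetic reported in the business-doc row (1,610 chars in 53.44s).

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt, max_tokens=256):
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # Assumed key name: Qwen3-style templates read `enable_thinking`.
        "chat_template_kwargs": {"enable_thinking": False},
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chars_per_second(chars, seconds):
    """Visible characters generated per second of wall-clock time."""
    return chars / seconds


# Placeholder base URL; point it at whichever server is actually running.
request = build_chat_request("http://localhost:30000/v1", "kaiju-coder-7", "Who are you?")
# To send it: urllib.request.urlopen(request, timeout=120)
print(request.full_url)
print(round(chars_per_second(1610, 53.44), 3))
```

Building the request separately from sending it keeps the payload easy to inspect and test without a live server.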

## Required Before Release

| Gate | Required result | Status |
|---|---|---|
| v1.7 LoRA train | Finished metrics and adapter under `runs/qwen36-27b-lora-v1.7-business-owner` | Passed |
| v1.8 stronger LoRA train | Finished metrics and adapter under `runs/qwen36-27b-lora-v1.8-business-owner` | Passed |
| v1.8 merged focused smoke | `python3 evals/run_openai_compat_smoke.py --tasks evals/tasks/business-owner-v18-comparison.jsonl --model kaiju-coder-7 ...` then `python3 evals/score_quality_gate.py` | Passed for proposal rerun and Jah credits backend; broader sweep pending |
| Direct commercial eval | No critical failures, scored summary attached | Passed for targeted high-value tasks when using the product harness plus 8k raw website mode; broader task sweep still pending |
| Base Qwen comparison | Kaiju beats base Qwen on RMDW/Kiyomi practical tasks | Not yet: raw deterministic identity still matches base; compare broader tasks before model-level improvement claims |
| GLM comparison | Kaiju is near or above GLM on highest-value business-owner tasks | Pending |
| Local inference smoke | OpenAI-compatible endpoint returns usable business-owner artifact | Passed for v1.8 merged SGLang endpoint and product harness |
| Human review | Richard reviews artifacts for usefulness, privacy, and sellability | Pending |
| Release package | Model card, provenance, license notes, eval summary, limitations, Hugging Face draft, completion audit, and run instructions complete | Staged and upload-scripted; upload blocked by HF token permissions and human/public-review decision |
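
The scoreboard's rule that nothing is published until every required row has a dated result and a reviewer can be expressed as a small check. The sketch below is a hypothetical illustration: the `Gate` structure, the strict `"passed"` comparison, and the reviewer placeholder are assumptions, not part of the repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gate:
    name: str
    status: str
    result_date: Optional[str] = None  # ISO date of the recorded result
    reviewer: Optional[str] = None     # who signed off on the result


def blocking_gates(gates):
    """Return names of gates that lack a clean pass, a dated result, or a reviewer."""
    return [
        gate.name
        for gate in gates
        if gate.status.strip().lower() != "passed"
        or not gate.result_date
        or not gate.reviewer
    ]


# Statuses echo the table above; dates and reviewer names are placeholders.
gates = [
    Gate("v1.7 LoRA train", "Passed", "2026-06-03", "reviewer-placeholder"),
    Gate("GLM comparison", "Pending"),
    Gate("Human review", "Pending"),
]
blockers = blocking_gates(gates)
print(blockers)
print("release allowed:", not blockers)
```

Treating a qualified status such as "Passed for proposal rerun ...; broader sweep pending" as blocking is a deliberately strict choice in this sketch.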

## Decision Rule

The v1.8 adapter is a completed local checkpoint and the merged full model is the current served raw-model path. The business-owner product should still be published honestly as merged model plus deterministic harness plus verifier. Raw merged v1.8 is useful on business documents and Jah credits but slow on this SGLang stack. Do not claim raw-weight superiority until broader base/GLM and raw website comparisons pass.
FINAL_RELEASE_REPORT.md
ADDED
@@ -0,0 +1,269 @@
# Kaiju Coder 7 Final Release Report

Generated: `2026-06-03T20:03:02Z`

Product name: `Kaiju Coder 7`
Public model id: `kaiju-coder-7`
Current source branch: `codex/kaiju-business-owner-rc`
Current HEAD: `3d57eae92ad523519473f0ff3eca6661a9736de3`
Current `origin/main`: `3d57eae92ad523519473f0ff3eca6661a9736de3`

## Current Verdict

Kaiju Coder 7 is a local public-testing release candidate, not a fully public
commercial launch yet. The local model path, OpenCode profile, harnessed
business-owner evals, Hugging Face staging package, runtime-quantized recipe,
and paid API scaffold are in place. Public release still requires human
approval, a write-capable Hugging Face namespace/token, and live paid API
resources before the hosted API can be sold.

## Runtime

| Field | Value |
|---|---|
| Status | `pass` |
| Base URL | `http://100.109.109.14:18083/v1` |
| Model id | `kaiju-coder-7` |
| Max model length | `16384` |
| Detail | `` |

Recommended default today: `16k` context through `kaiju-coder-7`. Higher
context has benchmark evidence, but the currently parked default is 16k for
stability and speed.
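
A client can confirm that the served context matches the recommended 16k default by reading the `/v1/models` response. The helper below is a sketch that works on an already-decoded response; the `max_model_len` field name is an assumption about what the serving stack returns, and the sample payload is made up for illustration.

```python
import json

EXPECTED_CONTEXT = 16384  # the recommended default from the Runtime table


def served_max_model_len(models_response, model_id):
    """Return the advertised context length for `model_id`, or None if absent."""
    for entry in models_response.get("data", []):
        if entry.get("id") == model_id:
            # Assumed field name; some servers may omit it.
            return entry.get("max_model_len")
    return None


# Illustrative payload, not a captured response from the real endpoint.
sample_response = json.loads(
    '{"object": "list", "data": ['
    '{"id": "kaiju-coder-7", "object": "model", "max_model_len": 16384}]}'
)

context = served_max_model_len(sample_response, "kaiju-coder-7")
print(context == EXPECTED_CONTEXT)
```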

## Readiness Summary

| Area | Result |
|---|---|
| Local public-testing readiness | `ready=True pass=23 fail=0 manual=1 rc=0` |
| Hugging Face release readiness | `ready=True pass=23 fail=0 manual=1 rc=0` |
| Public launch readiness | `ready=False pass=23 fail=1 manual=0 rc=1` |
| Hugging Face staging integrity | `ready=True pass=6 fail=0 manual=0 rc=0` |
| Paid API launch readiness | `ready=False pass=17 fail=3 manual=7 rc=1` |
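
The result strings in the table above share a fixed `key=value` layout, so they are easy to parse and cross-check. The sketch below parses each string and tests one relationship that happens to hold for all five rows shown: `ready` is true exactly when `fail` is zero, and `rc` is 0 when ready and 1 otherwise. That rule is inferred from these rows only and may not be how the project's checkers actually compute readiness.

```python
import re

PAIR = re.compile(r"(\w+)=(\w+)")


def parse_readiness(line):
    """Parse 'ready=True pass=23 fail=0 manual=1 rc=0' into typed fields."""
    raw = dict(PAIR.findall(line))
    return {
        "ready": raw["ready"] == "True",
        "pass": int(raw["pass"]),
        "fail": int(raw["fail"]),
        "manual": int(raw["manual"]),
        "rc": int(raw["rc"]),
    }


def is_consistent(summary):
    """Check the inferred rule: ready iff no failures, rc mirrors readiness."""
    expected_ready = summary["fail"] == 0
    expected_rc = 0 if expected_ready else 1
    return summary["ready"] == expected_ready and summary["rc"] == expected_rc


# Strings copied from the Readiness Summary table.
SUMMARIES = {
    "Local public-testing readiness": "ready=True pass=23 fail=0 manual=1 rc=0",
    "Hugging Face release readiness": "ready=True pass=23 fail=0 manual=1 rc=0",
    "Public launch readiness": "ready=False pass=23 fail=1 manual=0 rc=1",
    "Hugging Face staging integrity": "ready=True pass=6 fail=0 manual=0 rc=0",
    "Paid API launch readiness": "ready=False pass=17 fail=3 manual=7 rc=1",
}

parsed = {area: parse_readiness(line) for area, line in SUMMARIES.items()}
not_ready = [area for area, summary in parsed.items() if not summary["ready"]]
print(not_ready)
```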
|
| 43 |
+
|
| 44 |
+
## Hugging Face Release Blockers
|
| 45 |
+
|
| 46 |
+
| Status | Check | Detail |
|
| 47 |
+
|---|---|---|
|
| 48 |
+
| manual | paid API launch preflight | 17 pass, 3 fail, 7 manual |
|
| 49 |
+
|
| 50 |
+
## Public Launch Blockers
|
| 51 |
+
|
| 52 |
+
| Status | Check | Detail |
|
| 53 |
+
|---|---|---|
|
| 54 |
+
| fail | paid API launch preflight | 17 pass, 3 fail, 7 manual |
|
| 55 |
+
|
| 56 |
+
## Paid API Launch Blockers
|
| 57 |
+
|
| 58 |
+
| Status | Check | Detail |
|
| 59 |
+
|---|---|---|
|
| 60 |
+
| fail | live D1 binding | KAIJU_BILLING_DB is missing or still placeholder/commented |
|
| 61 |
+
| fail | live KV binding | KAIJU_RATE_LIMIT_KV is missing or still placeholder/commented |
|
| 62 |
+
| fail | artifact R2 binding | KAIJU_ARTIFACT_BUCKET is missing; artifact routes cannot launch |
|
| 63 |
+
| manual | public route mode | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `public_route_mode` |
|
| 64 |
+
| manual | wrangler secret list confirms KAIJU_ORIGIN_URL, KAIJU_ORIGIN_SECRET, and KAIJU_STRIPE_WEBHOOK_SECRET | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `wrangler_secrets_verified` |
|
| 65 |
+
| manual | D1 migration 0001_paid_api.sql applied to the live billing database | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `d1_migration_applied` |
|
| 66 |
+
| manual | Stripe Checkout top-up products and webhook endpoint tested with metadata.kaiju_api_key_id | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `stripe_checkout_topup_staging` |
|
| 67 |
+
| manual | staging request passed through Worker to Gojira-B origin with model=kaiju-coder-7 | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `worker_to_gojira_staging_request` |
|
| 68 |
+
| manual | rollback command or route switch was exercised and recorded | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `rollback_exercised` |
|
| 69 |
+
| manual | p95 latency for paid routes is recorded after staging traffic | attach sanitized evidence in /Users/richardecholsai7/Apps/kaiju-coder/release/paid-api-launch-evidence.json key `paid_route_latency` |
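
Each manual check above names one key in the evidence file. As a rough illustration of how that file could be screened before running the real checker, the sketch below reports which of those keys are still absent or empty. It is a hypothetical helper and not the logic of `scripts/check_paid_api_readiness.py`: the function names, the assumption that the evidence file is a flat JSON object, and the rule that an empty value counts as missing are all assumptions made for this example.

```python
import json

# Evidence keys named in the "Paid API Launch Blockers" table above.
REQUIRED_EVIDENCE_KEYS = [
    "public_route_mode",
    "wrangler_secrets_verified",
    "d1_migration_applied",
    "stripe_checkout_topup_staging",
    "worker_to_gojira_staging_request",
    "rollback_exercised",
    "paid_route_latency",
]


def missing_evidence_keys(evidence: dict) -> list:
    """Return required keys that are absent or empty (assumed rule)."""
    return [key for key in REQUIRED_EVIDENCE_KEYS if not evidence.get(key)]


def load_evidence(path: str) -> dict:
    """Read the evidence file; a flat JSON object is assumed."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# In-memory example instead of a real evidence file.
sample = {"public_route_mode": "staging-only", "rollback_exercised": ""}
print(missing_evidence_keys(sample))
```

An empty list would mean every manual item has something attached; it says nothing about whether the attached evidence is acceptable, which remains a job for the real checker and the human review.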

## Evidence Paths

| Evidence | Path |
|---|---|
| Completion audit | `release/COMPLETION_AUDIT.md` |
| Goal completion audit | `release/GOAL_COMPLETION_AUDIT.md` |
| Release evidence refresh runner | `scripts/refresh_kaiju_release_evidence.py` |
| Eval scoreboard | `release/EVAL_SCOREBOARD.md` |
| Public testing quickstart | `release/PUBLIC_TESTING_QUICKSTART.md` |
| Serving benchmarks | `release/SERVING_BENCHMARKS.md` |
| Hugging Face release draft | `release/HUGGINGFACE_RELEASE_DRAFT.md` |
| Hugging Face release bundle | `release/bundles/LATEST.md` |
| Hugging Face bundle integrity checker | `scripts/check_hf_release_bundle_integrity.py` |
| Hugging Face permission evidence template | `release/hf-release-permission-evidence.example.json` |
| Hugging Face permission evidence collector | `scripts/collect_hf_release_permission_evidence.py` |
| Hugging Face permission evidence checker | `scripts/check_hf_release_permission_evidence.py` |
| Merged-model metadata prep | `scripts/prepare_hf_merged_model_metadata.sh` |
| Human release review gate | `release/HUMAN_RELEASE_REVIEW.md` |
| Paid API readiness | `release/PAID_API_READINESS.md` |
| Paid API evidence collector | `scripts/collect_paid_api_launch_evidence.py` |
| Paid API launch evidence template | `release/paid-api-launch-evidence.example.json` |
| Cloudflare bindings template | `release/cloudflare-bindings.example.json` |
| Cloudflare bindings applier | `scripts/apply_paid_api_cloudflare_bindings.py` |
| Latest direct API smoke | `runs/benchmarks/20260603T193000Z-kaiju-coder-7-serving/summary.md` |
| Latest OpenCode customer pack | `runs/opencode-customer-readiness/20260603T185835Z/summary.md` |
| Latest public OpenCode smoke | `runs/public-opencode-smoke` |

## What Richard Should Test First

```bash
python3 scripts/check_kaiju_public_release_readiness.py --mode local
python3 scripts/install_kaiju_opencode_profile.py
mkdir -p /tmp/kaiju-public-smoke
opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-public-smoke --dangerously-skip-permissions 'Create hello.txt with exactly: Kaiju Coder 7 public smoke ok'
python3 scripts/run_kaiju_public_opencode_smoke.py
python3 scripts/run_kaiju_opencode_customer_pack.py --mode harnessed
bash scripts/prepare_hf_merged_model_metadata.sh
bash scripts/prepare_hf_release_staging.sh
python3 scripts/check_hf_staging_integrity.py --require-checksums
python3 scripts/create_hf_release_bundle.py
python3 scripts/check_hf_release_bundle_integrity.py
python3 scripts/check_kaiju_goal_completion.py --write
python3 scripts/refresh_kaiju_release_evidence.py --skip-opencode-smoke
python3 scripts/collect_hf_release_permission_evidence.py
# After HF repo-create permission is fixed:
python3 scripts/collect_hf_release_permission_evidence.py --apply --write
python3 scripts/check_hf_release_permission_evidence.py
python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
python3 scripts/check_kaiju_public_release_readiness.py --mode public
cp release/cloudflare-bindings.example.json release/cloudflare-bindings.json
# Replace placeholder D1/KV IDs in release/cloudflare-bindings.json first.
python3 scripts/apply_paid_api_cloudflare_bindings.py --bindings-file release/cloudflare-bindings.json
cp release/paid-api-launch-evidence.example.json release/paid-api-launch-evidence.json
python3 scripts/collect_paid_api_launch_evidence.py --help
python3 scripts/check_paid_api_readiness.py --mode launch --evidence-file release/paid-api-launch-evidence.json
```

Do not expose the paid hosted API until
`python3 scripts/check_paid_api_readiness.py --mode launch` has no failures and
the human release review explicitly approves public paid API launch.

## Changed Files

`git status --short` currently reports `112` changed paths.

| State | Path |
|---|---|
| M | `.gitignore` |
| M | `LICENSE_NOTES.md` |
| M | `README.md` |
| M | `datasets/schema.json` |
| M | `docs/custom-harness.md` |
| M | `evals/BAKEOFF_CURRENT.md` |
| M | `evals/run_openai_compat_smoke.py` |
| M | `evals/run_router_static_checks.py` |
| M | `evals/tasks/router-hard-harness.jsonl` |
| M | `gateway/README.md` |
| M | `gateway/cloudflare-worker/README.md` |
| M | `gateway/cloudflare-worker/migrations/0001_paid_api.sql` |
| M | `gateway/cloudflare-worker/package.json` |
| M | `gateway/cloudflare-worker/src/index.js` |
| M | `gateway/cloudflare-worker/test/index.test.js` |
| M | `kaiju_harness/router.py` |
| M | `kaiju_harness/verification.py` |
| D | `models/README.md` |
| D | `models/qwen3.6-27b-base.md` |
| D | `models/qwen3.6-27b-fp8.md` |
| M | `prompts/kaiju-coder-api-system.md` |
| M | `prompts/kaiju-coder-speed-system.md` |
| M | `release/DATA_PROVENANCE_DRAFT.md` |
| M | `release/MODEL_CARD_DRAFT.md` |
| M | `scripts/build_sft_dataset.py` |
| M | `scripts/check-gojira-b-capacity.sh` |
| M | `scripts/run-gojira-b-qwen36-lora-eval.sh` |
| M | `scripts/run-gojira-b-qwen36-lora-sglang-eval.sh` |
| M | `scripts/run-gojira-b-qwen36-lora-train.sh` |
| M | `scripts/run_kaiju_api_harness_smoke.py` |
| M | `scripts/start-qwen36-lora-sglang.sh` |
| M | `scripts/stop-qwen36-lora-sglang.sh` |
| M | `scripts/validate_training_data.py` |
| M | `scripts/watch-gojira-b-qwen36-lora-train.sh` |
| ?? | `.opencode/` |
| ?? | `datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl` |
| ?? | `datasets/v1.7-targets.json` |
| ?? | `evals/tasks/business-owner-v18-comparison.jsonl` |
| ?? | `evals/tasks/business-owner-v18-smoke.jsonl` |
| ?? | `evals/tasks/opencode-customer-readiness.jsonl` |
| ?? | `kaiju_harness/business_suite.py` |
| ?? | `release/COMPLETION_AUDIT.md` |
| ?? | `release/EVAL_SCOREBOARD.md` |
| ?? | `release/FINAL_RELEASE_REPORT.md` |
| ?? | `release/GOAL_COMPLETION_AUDIT.md` |
| ?? | `release/HF_ADAPTER_MODEL_CARD.md` |
| ?? | `release/HUGGINGFACE_RELEASE_DRAFT.md` |
| ?? | `release/HUMAN_RELEASE_REVIEW.md` |
| ?? | `release/LOCAL_TEST_INSTRUCTIONS.md` |
| ?? | `release/PAID_API_READINESS.md` |
| ?? | `release/PUBLIC_TESTING_QUICKSTART.md` |
| ?? | `release/QUANTIZATION_PLAN.md` |
| ?? | `release/SERVING_BENCHMARKS.md` |
| ?? | `release/SOURCE_INVENTORY.md` |
| ?? | `release/UPSTREAM_LICENSE_CHECK.md` |
| ?? | `release/bundles/` |
| ?? | `release/cloudflare-bindings.example.json` |
| ?? | `release/hf-release-permission-evidence.example.json` |
| ?? | `release/hf-release-permission-evidence.json` |
| ?? | `release/huggingface/` |
| ?? | `release/opencode/` |
| ?? | `release/paid-api-launch-evidence.example.json` |
| ?? | `release/quantized-runtime/` |
| ?? | `release/source-inventory.json` |
| ?? | `release/upstream/` |
| ?? | `scripts/apply_paid_api_cloudflare_bindings.py` |
| ?? | `scripts/benchmark_kaiju_serving.py` |
| ?? | `scripts/build_source_inventory.py` |
| ?? | `scripts/build_v17_business_owner_sft_dataset.py` |
| ?? | `scripts/check_hf_release_bundle_integrity.py` |
| ?? | `scripts/check_hf_release_permission_evidence.py` |
| ?? | `scripts/check_hf_release_permissions.sh` |
| ?? | `scripts/check_hf_staging_integrity.py` |
| ?? | `scripts/check_hf_uploaded_release.py` |
| ?? | `scripts/check_human_release_review.py` |
| ?? | `scripts/check_kaiju_goal_completion.py` |
| ?? | `scripts/check_kaiju_public_release_readiness.py` |
| ?? | `scripts/check_kaiju_quantization_prereqs.py` |
| ?? | `scripts/check_paid_api_readiness.py` |
| ?? | `scripts/collect_hf_release_permission_evidence.py` |
| ?? | `scripts/collect_paid_api_launch_evidence.py` |
| ?? | `scripts/create_hf_release_bundle.py` |
| ?? | `scripts/generate_kaiju_final_report.py` |
| ?? | `scripts/gojira-b-ssh-lib.sh` |
| ?? | `scripts/install_kaiju_opencode_profile.py` |
| ?? | `scripts/opencode-kaiju-no-autocontinue.mjs` |
| ?? | `scripts/prepare_hf_merged_model_metadata.sh` |
| ?? | `scripts/prepare_hf_release_staging.sh` |
| ?? | `scripts/prepare_paid_api_cloudflare_resources.sh` |
| ?? | `scripts/probe-gojira-b-kaiju-quantization.sh` |
| ?? | `scripts/refresh_kaiju_release_evidence.py` |
| ?? | `scripts/run-gojira-b-qwen36-lora-merge.sh` |
| ?? | `scripts/run-gojira-b-vllm-serving-benchmark.sh` |
| ?? | `scripts/run_kaiju_business_owner_rc_smoke.py` |
| ?? | `scripts/run_kaiju_opencode_customer_pack.py` |
| ?? | `scripts/run_kaiju_public_opencode_smoke.py` |
| ?? | `scripts/run_kaiju_quantized_opencode_smoke.sh` |
| ?? | `scripts/start-qwen36-merged-sglang.sh` |
| ?? | `scripts/start-qwen36-merged-vllm.sh` |
| ?? | `scripts/stop-qwen36-merged-sglang.sh` |
| ?? | `scripts/stop-qwen36-merged-vllm.sh` |
| ?? | `scripts/upload_hf_merged_model_from_gojira_b.sh` |
| ?? | `scripts/upload_hf_release_staging.sh` |
| ?? | `tests/test_kiyomi_business_suite.py` |
| ?? | `tests/test_release_package.py` |
| ?? | `tests/test_source_inventory.py` |
| ?? | `tests/test_training_provenance.py` |
| ?? | `tests/test_v17_business_dataset.py` |
| ?? | `training/configs/qwen36-27b-lora-v1.7.example.json` |
| ?? | `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json` |
| ?? | `training/scripts/qwen36_lora_merge.py` |
| ?? | `training/v1.7-business-owner-runbook.md` |

## Commands Run During Report Generation

| Label | Command | Return code |
|---|---|---|
| git branch | `git branch --show-current` | 0 |
| git HEAD | `git rev-parse HEAD` | 0 |
| git origin/main | `git rev-parse origin/main` | 0 |
| git status | `git status --short` | 0 |
| local readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode local --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 0 |
| HF release readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode hf-release --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 0 |
| public readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_kaiju_public_release_readiness.py --mode public --json --base-url http://100.109.109.14:18083/v1 --live-timeout 5 --staging-dir /tmp/kaiju-coder-7-hf-staging` | 1 |
| HF staging integrity | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_hf_staging_integrity.py --staging-dir /tmp/kaiju-coder-7-hf-staging --require-checksums --json` | 0 |
| paid API launch readiness | `/opt/homebrew/opt/python@3.14/bin/python3.14 scripts/check_paid_api_readiness.py --mode launch --json` | 1 |
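
A table like the one above can be produced by running each labeled command and recording only its return code. The sketch below shows one way to do that; it is an illustration under assumptions, not the code in `scripts/generate_kaiju_final_report.py`. The names `run_labeled` and `to_markdown` are hypothetical, and the demo uses stand-in commands so it runs on any machine.

```python
import subprocess
import sys


def run_labeled(commands: dict) -> list:
    """Run each command and return (label, command text, return code) rows."""
    rows = []
    for label, argv in commands.items():
        # Output is captured but never echoed, so nothing a command prints
        # ends up in the report.
        completed = subprocess.run(argv, capture_output=True, text=True)
        rows.append((label, " ".join(argv), completed.returncode))
    return rows


def to_markdown(rows: list) -> str:
    """Render rows in the same three-column layout as the table above."""
    lines = ["| Label | Command | Return code |", "|---|---|---|"]
    lines += [f"| {label} | `{cmd}` | {rc} |" for label, cmd, rc in rows]
    return "\n".join(lines)


# Stand-in commands; replace with the real labeled commands when adapting.
demo = {
    "always ok": [sys.executable, "-c", "raise SystemExit(0)"],
    "always failing": [sys.executable, "-c", "raise SystemExit(1)"],
}
print(to_markdown(run_labeled(demo)))
```

Recording the return code rather than raising on failure keeps non-zero results, such as the two `1` rows above, visible in the report instead of aborting it.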

## Report Safety

This generator intentionally avoids secret-bearing commands such as auth token
lists, environment dumps, process command-line scans, Wrangler secret lists, and
payment-provider credential output.
GOAL_COMPLETION_AUDIT.md
ADDED
|
@@ -0,0 +1,60 @@
# Kaiju Coder 7 Goal Completion Audit

Generated: `2026-06-03T20:03:23Z`

Overall: `not complete`

Summary: `16 passed / 1 blocked / 0 manual`

This audit maps the active Kaiju Coder 7 objective to current evidence. It is stricter than local readiness: local public testing can pass while Hugging Face upload, human review, and paid API launch remain blocked.

## Readiness Commands

| Check | Ready | Return Code |
|---|---:|---:|
| Local public-testing readiness | `True` | `0` |
| Hugging Face release readiness | `True` | `0` |
| Public launch readiness | `False` | `1` |
| Paid API scaffold | `True` | `0` |
| Paid API launch | `False` | `1` |
| HF staging integrity | `True` | `0` |
| HF namespace permission evidence | `True` | `0` |
| Human public review | `True` | `0` |

## Requirement Audit

| Area | Requirement | Status | Evidence | Blocker |
|---|---|---|---|---|
| Identity | Product name is Kaiju Coder 7 and public/API model id is kaiju-coder-7. | `passed` | scripts/check_kaiju_public_release_readiness.py --mode local; release/PUBLIC_TESTING_QUICKSTART.md | |
| OpenCode | Lean Kaiju-specific OpenCode config/agent minimizes prompt overhead and disables synthetic auto-continue loops. | `passed` | .opencode/agents/kaiju-coder-7.md; scripts/opencode-kaiju-no-autocontinue.mjs; scripts/install_kaiju_opencode_profile.py | |
| OpenCode | opencode -m kaiju/kaiju-coder-7 works from this Mac with the recommended config. | `passed` | runs/public-opencode-smoke latest passing summary; scripts/run_kaiju_public_opencode_smoke.py | |
| OpenCode | Customer-readiness pack passes without wrong-directory output, fake compaction completion, missing files, or secret leakage. | `passed` | runs/opencode-customer-readiness/20260603T185835Z/summary.md | |
| Runtime | Direct API smoke passes using model=kaiju-coder-7. | `passed` | runs/benchmarks/20260603T193000Z-kaiju-coder-7-serving/summary.md | |
| Runtime | 12k, 16k, 24k, and 32k context benchmarks are recorded with a recommended default. | `passed` | release/SERVING_BENCHMARKS.md records 12288, 16384, 24576, 32768 and recommends 16k live default | |
| Runtime | SGLang and vLLM/practical faster serving path are benchmarked honestly. | `passed` | release/SERVING_BENCHMARKS.md; release/quantized-runtime/README.md | |
| Runtime | At least one public-friendly quantized/local candidate is working or clearly documented as blocked with evidence. | `passed` | release/quantized-runtime/README.md documents vLLM bitsandbytes runtime candidate and persisted-weights limitation | |
| Hugging Face | Public-friendly HF release structure is staged with adapter, OpenCode helper, runtime-quantized helper, model cards, provenance, evals, and docs. | `passed` | python3 scripts/check_hf_staging_integrity.py --require-checksums | |
| Hugging Face | At least one public Hugging Face release path is ready to upload or uploaded. | `passed` | python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release | |
| Hugging Face | Merged 51GB model repo upload is guarded and ready after human review/namespace permission. | `passed` | scripts/prepare_hf_merged_model_metadata.sh; scripts/upload_hf_merged_model_from_gojira_b.sh dry run | |
| Quality | Customer-style evals cover website, proposal, Stripe/payment, CRM/reporting, CSV/parser, Kiyomi operating pack, and safety/provenance. | `passed` | evals/tasks/opencode-customer-readiness.jsonl; runs/opencode-customer-readiness/20260603T185835Z/summary.md | |
| Quality | Model/harness prompts produce file-oriented business-owner artifacts rather than vague advice. | `passed` | kaiju_harness/business_suite.py; release/EVAL_SCOREBOARD.md | |
| Provenance | Training/eval provenance is preserved and public docs avoid internal checkpoint naming except license/provenance attribution. | `passed` | release/SOURCE_INVENTORY.md; release/DATA_PROVENANCE_DRAFT.md; release/PUBLIC_TESTING_QUICKSTART.md | |
| Paid API | Paid API scaffold covers API keys, Stripe billing, rate limits, logging controls, abuse controls, rollback plan, and pricing assumptions. | `passed` | python3 scripts/check_paid_api_readiness.py --mode scaffold; gateway/cloudflare-worker tests | |
| Paid API | Paid API is ready for public charging. | `blocked` | python3 scripts/check_paid_api_readiness.py --mode launch | Requires live D1/KV/R2 bindings, Wrangler secrets, Stripe staging evidence, Worker-to-Gojira staging request, rollback proof, latency evidence, and human approval. |
| Final Report | Final report includes exact commands run, eval results, changed files, remaining risks, and what Richard should test first. | `passed` | release/FINAL_RELEASE_REPORT.md | |

## Blocking Items

- Paid API: the requirement "Paid API is ready for public charging" is blocked. It requires live D1/KV/R2 bindings, Wrangler secrets, Stripe staging evidence, a Worker-to-Gojira staging request, rollback proof, latency evidence, and human approval.

## Commands To Re-run

```bash
python3 scripts/check_kaiju_public_release_readiness.py --mode local
python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
python3 scripts/check_kaiju_public_release_readiness.py --mode public
python3 scripts/check_paid_api_readiness.py --mode scaffold
python3 scripts/check_paid_api_readiness.py --mode launch
python3 scripts/check_hf_staging_integrity.py --require-checksums
python3 scripts/check_hf_release_permission_evidence.py
python3 scripts/check_human_release_review.py --mode public
```
LOCAL_TEST_INSTRUCTIONS.md
ADDED
|
@@ -0,0 +1,147 @@
# Kaiju Coder 7 Local Test Instructions

Use these commands from the repo root. The public release name is Kaiju Coder 7. Internally, this build is backed by the v1.8 adapter under `runs/qwen36-27b-lora-v1.8-business-owner/adapter`. The release-candidate raw model path is the merged full model on Gojira B at `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`. The deterministic harness commands work locally now; the SGLang commands require Gojira B over Tailscale.

## Run The Local Release-Candidate Gate

```bash
python3 scripts/run_kaiju_business_owner_rc_smoke.py
```

This validates reviewed data, checks v1.7 targets, builds the oversampled business-owner SFT file, smokes the local OpenAI-compatible harness API, runs the hard router suite, and runs static artifact checks.

For release status, read `release/COMPLETION_AUDIT.md` and `release/HUGGINGFACE_RELEASE_DRAFT.md`.

## Merge The v1.8 Adapter

Use this if the merged full model must be rebuilt:

```bash
KAIJU_LORA_ADAPTER=/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter \
KAIJU_MERGED_MODEL_DIR=/workspace/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged \
./scripts/run-gojira-b-qwen36-lora-merge.sh
```

## Start Kaiju Coder 7 Serving

Use this for the current model-side candidate:

```bash
KAIJU_QWEN36_MERGED_PORT=18083 \
KAIJU_QWEN36_MERGED_SESSION=kaiju_qwen36_v18_merged_sglang \
KAIJU_QWEN36_MERGED_CONTEXT=16384 \
KAIJU_QWEN36_MERGED_MEM_FRACTION=0.85 \
./scripts/start-qwen36-merged-sglang.sh
```

Confirm readiness:

```bash
curl http://100.109.109.14:18083/v1/models
```
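
The same readiness check can be scripted. The sketch below is a minimal illustration that assumes the endpoint returns an OpenAI-style model list of the form `{"data": [{"id": ...}]}` and that the served model is advertised as `kaiju-coder-7`; the helper names are hypothetical and none of this is taken from the repo's own scripts.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list:
    """Extract ids from an OpenAI-style model list: {"data": [{"id": ...}]}."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def fetch_model_ids(base_url: str, timeout: float = 5.0) -> list:
    """GET <base_url>/models and return the advertised model ids."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return model_ids(json.load(resp))


def is_ready(ids: list, expected: str = "kaiju-coder-7") -> bool:
    """Treat the endpoint as ready when the expected id is listed (assumed rule)."""
    return expected in ids


# Offline example with a hand-written payload rather than a live call.
print(is_ready(model_ids({"object": "list", "data": [{"id": "kaiju-coder-7"}]})))
```

For a live check, pass the base URL used elsewhere in these instructions, for example `fetch_model_ids("http://100.109.109.14:18083/v1")`, which needs the Tailscale route to Gojira B.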

The high-context `32768` target has benchmark evidence in
`release/SERVING_BENCHMARKS.md`, but the current restored Gojira-B endpoint is
parked at `16384` for reliable local/OpenCode testing after the quantized-vLLM
smoke work.

## Prepare Merged-Model Hugging Face Metadata

Use this before any full merged-model upload review. It syncs release metadata
into the Gojira-B model folder but does not upload or read Hugging Face tokens.
If the remote merged folder is root-owned, the helper automatically uses
passwordless sudo for rsync without changing model ownership:

```bash
bash scripts/prepare_hf_merged_model_metadata.sh
KAIJU_MERGED_METADATA_APPLY=1 bash scripts/prepare_hf_merged_model_metadata.sh
bash scripts/upload_hf_merged_model_from_gojira_b.sh
```

## Install And Smoke OpenCode

```bash
python3 scripts/install_kaiju_opencode_profile.py
opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
  --dir /tmp/kaiju-opencode-loopguard-smoke \
  --dangerously-skip-permissions \
  'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'
```

The installer writes the `kaiju` provider, the lean `kaiju-coder-7` agent, and
the scoped no-autocontinue plugin at
`~/.config/opencode/kaiju-no-autocontinue.mjs`.

## Run The Deterministic Harness Smoke

```bash
python3 scripts/run_kaiju_api_harness_smoke.py
```

## Run A Direct Model Eval

```bash
python3 evals/run_openai_compat_smoke.py \
  --base-url http://100.109.109.14:18083/v1 \
  --model kaiju-coder-7 \
  --tasks evals/tasks/smoke.jsonl \
  --max-tasks 1 \
  --timeout 300 \
  --max-tokens 768 \
  --temperature 0 \
  --disable-thinking \
  --system-prompt-file prompts/kaiju-coder-api-system.md
```

For the selected final business-owner checkpoint, run the focused v1.8
business-owner pack and then score it. Raw merged model generation is slow, so
use the harness for practical paid website delivery until broader raw website
evals pass at acceptable latency:

```bash
python3 evals/run_openai_compat_smoke.py \
  --base-url http://100.109.109.14:18083/v1 \
  --model kaiju-coder-7 \
  --tasks evals/tasks/business-owner-v18-comparison.jsonl \
  --timeout 900 \
  --max-tokens 2500 \
  --temperature 0 \
  --disable-thinking \
  --stream \
  --system-prompt-file prompts/kaiju-coder-api-system.md

python3 evals/score_quality_gate.py runs/evals/<merged-v18-run>/results.jsonl
```

Current merged evidence:

- Probe: `1,155` visible chars in `60.17s`.
- Proposal rerun: `1/1` paid-ready, `4.0/4.0`, `4,014` chars in `212.72s`.
- Jah credits backend: `4.0/4.0`, `9,718` chars in `566.36s`.
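
To put the "slow" claim in concrete terms, the figures above can be turned into a visible-characters-per-second rate. The short calculation below uses only the three character counts and durations listed; characters per second is a derived convenience figure for this example, not a metric reported by the eval scripts, and it is not a token throughput.

```python
# (label, visible characters, wall-clock seconds) from the evidence list above.
RUNS = [
    ("probe", 1155, 60.17),
    ("proposal rerun", 4014, 212.72),
    ("Jah credits backend", 9718, 566.36),
]


def chars_per_second(chars: int, seconds: float) -> float:
    """Average visible output rate over the whole request."""
    return chars / seconds


for label, chars, seconds in RUNS:
    print(f"{label}: {chars_per_second(chars, seconds):.1f} chars/s")
```

All three runs land at roughly 17 to 19 visible characters per second, which is why the longest run of under ten thousand characters took more than nine minutes.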

## Dynamic LoRA Serving Caveat

Do not use dynamic SGLang LoRA serving as release evidence for v1.8. The adapter-name-only path can be base-equivalent, and the corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes this SGLang build with a fused-module LoRA buffer shape mismatch. Use the merged full-model path above.

## Run The Business-Owner Harness

```bash
python3 evals/run_router_harness_eval.py --tasks evals/tasks/router-hard-harness.jsonl
python3 evals/run_router_static_checks.py runs/evals/<router-run>/results.jsonl
```

## Manual Prompt To Try First

```text
Build me the full Kiyomi 7.7.7 AI company operating pack for a local business owner. I need the launch kit, website, content engine, connector checklist, intake CRM, money report, automations, operator handbook, lead generator, sales closer, ROI dashboard, and Workshop golden run. Make it owner-ready with no developer setup required.
```

Expected shape:

- A project folder with multiple files, not advice only.
- Complete HTML where HTML is requested.
- Lead/sales CSVs.
- Connector verification gates.
- ROI audit gate.
- Workshop golden-run gate.
- Clear owner commands such as `/kiyomi` and `/kiyomi-do`.
MERGED_MODEL_RELEASE_MANIFEST.json
ADDED
|
@@ -0,0 +1,11 @@
{
  "product": "Kaiju Coder 7",
  "model_id": "kaiju-coder-7",
  "remote_model_dir": "/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged",
  "metadata_status": "prepared_for_huggingface_review",
  "notes": [
    "Local metadata sync only; no Hugging Face upload performed.",
    "Qwen attribution belongs in README/provenance/license notes, not the product model id.",
    "Public paid API launch remains blocked until live launch preflight and human review pass."
  ]
}
PAID_API_READINESS.md
ADDED
|
@@ -0,0 +1,266 @@
# Kaiju Coder 7 Paid API Readiness

Do not sell the hosted API as generally available until the gates below pass.

## Current Position

Kaiju Coder 7 can be served locally through an OpenAI-compatible SGLang
endpoint. The reliable commercial product path is:

```text
Kaiju Coder 7 model + deterministic business-owner harness + verifier + gateway controls
```

Raw multi-file OpenCode generation is not yet fast enough to be the paid API
promise by itself. The harnessed customer-readiness pack passes and should be
the paid-route baseline until raw-agent generation improves.

## Required Gateway Behavior

- Use model id `kaiju-coder-7`.
- Disable hidden thinking where the serving stack supports it.
- Stream responses for long outputs.
- Cap max output by route.
- Reject requests with secret-looking prompt content when possible.
- Never log API keys, bearer tokens, OAuth tokens, payment credentials, or full
  private customer prompts by default.
- Keep request ids, customer id, route, token counts, latency, status, and coarse
  failure reason.
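
The logging and secret-rejection rules above can be sketched as an allow-list plus a small pattern screen. This is a Python illustration only; the actual gateway is the Worker in `gateway/cloudflare-worker/src/index.js`, whose implementation is not shown here. The field names, the helper names, and the three regular expressions are assumptions chosen for the example and are not a reviewed detection list.

```python
import re

# Fields the list above says to keep; anything else is dropped by default.
ALLOWED_LOG_FIELDS = (
    "request_id", "customer_id", "route",
    "prompt_tokens", "completion_tokens",
    "latency_ms", "status", "failure_reason",
)

# Illustrative patterns only; a real gateway needs a reviewed list.
SECRET_PATTERNS = [
    re.compile(r"sk_(live|test)_[A-Za-z0-9]{8,}"),      # Stripe-style secret key
    re.compile(r"Bearer\s+[A-Za-z0-9._\-]{16,}"),       # bearer token
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]


def looks_secret(prompt: str) -> bool:
    """True when the prompt matches any of the illustrative secret patterns."""
    return any(pattern.search(prompt) for pattern in SECRET_PATTERNS)


def sanitized_log_record(event: dict) -> dict:
    """Keep only allow-listed fields so prompts and credentials never reach logs."""
    return {key: event[key] for key in ALLOWED_LOG_FIELDS if key in event}


event = {
    "request_id": "req-1", "customer_id": "cust-9", "route": "/v1/chat/completions",
    "status": 200, "latency_ms": 1840,
    "prompt": "private customer text", "authorization": "Bearer abc",
}
print(sanitized_log_record(event))
```

An allow-list is used rather than a deny-list so that a newly added request field stays out of the logs until someone deliberately adds it.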

## Billing And Access

- API keys must be scoped per customer/account.
- Stripe subscription or prepaid credit balance must be checked before serving.
- Rate limits must be per key and per account.
- Failed auth and rate-limit events should be logged without prompt content.
- Admin override keys must be separate from customer keys.
|
| 37 |
+
|
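A fixed-window limiter that is enforced per key and per account could look like the sketch below. It is an in-memory illustration only: the class and function names are hypothetical, the limits are placeholders, and a deployed gateway would keep the counters in a shared store with expiry rather than a Python dict.

```python
import time
from typing import Optional


class FixedWindowLimiter:
    """Count requests per identifier inside fixed, aligned time windows."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = {}  # (identifier, window index) -> requests seen

    def _slot(self, ident: str, now: float):
        return (ident, int(now // self.window_seconds))

    def remaining(self, ident: str, now: float) -> int:
        return self.limit - self._counts.get(self._slot(ident, now), 0)

    def hit(self, ident: str, now: float) -> None:
        slot = self._slot(ident, now)
        self._counts[slot] = self._counts.get(slot, 0) + 1


def allow_request(key_limiter, account_limiter, key_id, account_id,
                  now: Optional[float] = None) -> bool:
    """Allow only if both the key and the account still have budget."""
    now = time.time() if now is None else now
    if key_limiter.remaining(key_id, now) <= 0:
        return False
    if account_limiter.remaining(account_id, now) <= 0:
        return False
    # Record the request on both counters only after both checks pass.
    key_limiter.hit(key_id, now)
    account_limiter.hit(account_id, now)
    return True
```

Checking both budgets before recording the hit means a request blocked by the key limit does not also consume account budget.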
| 38 |
+
## Current Gateway Scaffold Evidence
|
| 39 |
+
|
| 40 |
+
Local Worker scaffold:
|
| 41 |
+
|
| 42 |
+
- `gateway/cloudflare-worker/src/index.js`
|
| 43 |
+
- `gateway/cloudflare-worker/migrations/0001_paid_api.sql`
|
| 44 |
+
- `gateway/cloudflare-worker/test/index.test.js`
|
| 45 |
+
|
| 46 |
+
Verified on 2026-06-03 with:
|
| 47 |
+
|
| 48 |
+
```bash
|
| 49 |
+
cd gateway/cloudflare-worker
|
| 50 |
+
npm run check
|
| 51 |
+
npm run preflight
|
| 52 |
+
```
|
| 53 |
+
|
| 54 |
+
Result: `16/16` Worker tests passed and `17` paid API scaffold preflight checks
|
| 55 |
+
passed.
|
| 56 |
+
The scaffold preflight also checks that the guarded Cloudflare resource-prep
|
| 57 |
+
script, `scripts/prepare_paid_api_cloudflare_resources.sh`, is wired through
|
| 58 |
+
`npm run prepare:cloudflare`, and that the reviewed binding template is present.
|
| 59 |
+
|
| 60 |
+
Covered locally:
|
| 61 |
+
|
| 62 |
+
- missing bearer token returns `401`
|
| 63 |
+
- inactive API key returns `403`
|
| 64 |
+
- insufficient credits return `402` before origin fetch
|
| 65 |
+
- successful chat request forwards `x-kaiju-origin-secret` and debits credits
|
| 66 |
+
- origin fetch failure refunds credits
|
| 67 |
+
- fixed-window rate limit blocks before debit
|
| 68 |
+
- public chat payload is forced to model `kaiju-coder-7`, streaming, thinking
|
| 69 |
+
disabled, and token capped
|
| 70 |
+
- unsupported model is rejected before debit
|
| 71 |
+
- secret-looking prompt content is rejected before debit, origin fetch, or logs
|
| 72 |
+
- signed Stripe Checkout webhook credits prepaid balance
|
| 73 |
+
- duplicate Stripe Checkout webhook does not double-credit
|
| 74 |
+
- invalid Stripe signature is rejected
|
| 75 |
+
- origin-only artifact upload stores bounded text artifacts in R2
|
| 76 |
+
- authenticated artifact download is scoped to the caller's account namespace
|
| 77 |
+
- unsafe artifact paths are rejected before R2 storage
|
| 78 |
+
- secret-looking artifact content is rejected before R2 storage
|
| 79 |
+
|
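The ordering implied by the checks above (auth, rate limit before debit, credit check before origin fetch, refund on origin failure) can be sketched as follows. This is a simplified model, not the Worker implementation: `handle_chat`, the `Account` type, the `429` and `502` status codes, the treatment of unknown keys, and the relative order of the rate-limit and credit checks are assumptions that the list does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Account:
    active: bool
    credits: int


def handle_chat(
    bearer: Optional[str],
    accounts: Dict[str, Account],  # keyed by API key here; a real store would hash keys
    cost: int,
    rate_limit_ok: Callable[[str], bool],
    call_origin: Callable[[], str],
) -> Tuple[int, str]:
    """Return (status, body), debiting only when the request will reach the origin."""
    if not bearer:
        return 401, "missing bearer token"
    account = accounts.get(bearer)
    if account is None:
        return 401, "unknown api key"  # assumed status for unknown keys
    if not account.active:
        return 403, "inactive api key"
    if not rate_limit_ok(bearer):
        return 429, "rate limited"  # assumed status; blocks before debit
    if account.credits < cost:
        return 402, "insufficient credits"  # no origin fetch happens
    account.credits -= cost  # debit before calling the origin
    try:
        body = call_origin()
    except Exception:
        account.credits += cost  # refund when the origin fetch fails
        return 502, "origin error"  # assumed status
    return 200, body
```

Note that none of the failure bodies include prompt content, which keeps the failure path compatible with the logging rules.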
| 80 |
+
Executable preflight:
|
| 81 |
+
|
| 82 |
+
```bash
|
| 83 |
+
python3 scripts/check_kaiju_public_release_readiness.py --mode local
|
| 84 |
+
python3 scripts/check_kaiju_public_release_readiness.py --mode hf-release
|
| 85 |
+
python3 scripts/check_kaiju_public_release_readiness.py --mode public
|
| 86 |
+
python3 scripts/generate_kaiju_final_report.py
|
| 87 |
+
python3 scripts/check_kaiju_goal_completion.py --write
|
| 88 |
+
python3 scripts/refresh_kaiju_release_evidence.py --skip-opencode-smoke
|
| 89 |
+
python3 scripts/check_hf_staging_integrity.py
|
| 90 |
+
python3 scripts/check_hf_release_bundle_integrity.py
|
| 91 |
+
python3 scripts/collect_hf_release_permission_evidence.py
|
| 92 |
+
python3 scripts/check_hf_release_permission_evidence.py
|
| 93 |
+
python3 scripts/check_human_release_review.py --mode local
|
| 94 |
+
python3 scripts/check_human_release_review.py --mode public
|
| 95 |
+
cd gateway/cloudflare-worker
|
| 96 |
+
npm run prepare:cloudflare
|
| 97 |
+
cd ../..
|
| 98 |
+
cp release/cloudflare-bindings.example.json release/cloudflare-bindings.json
|
| 99 |
+
# Replace placeholder D1/KV IDs in release/cloudflare-bindings.json first.
|
| 100 |
+
python3 scripts/apply_paid_api_cloudflare_bindings.py --bindings-file release/cloudflare-bindings.json
|
| 101 |
+
python3 scripts/check_paid_api_readiness.py --mode scaffold
|
| 102 |
+
python3 scripts/check_paid_api_readiness.py --mode launch
|
| 103 |
+
```
|
| 104 |
+
|
| 105 |
+
`check_kaiju_public_release_readiness.py --mode local` is the consolidated
|
| 106 |
+
public-testing readiness command. It can pass while public upload and paid API
|
| 107 |
+
launch remain manual blockers. `--mode hf-release` checks the downloadable
|
| 108 |
+
model/helper release and requires sanitized Hugging Face namespace permission
|
| 109 |
+
evidence plus human review while keeping paid API launch manual. `--mode public`
|
| 110 |
+
must remain red until Hugging Face write permissions, live Cloudflare resources,
|
| 111 |
+
Stripe staging evidence, rollback proof, and human review are complete.
|
| 112 |
+
|
| 113 |
+
`generate_kaiju_final_report.py` writes `release/FINAL_RELEASE_REPORT.md` with
|
| 114 |
+
the current local/public readiness summaries, launch blockers, changed files,
|
| 115 |
+
commands run, and first commands Richard should test. It is part of the release
|
| 116 |
+
packet and does not inspect tokens, environment variables, or process command
|
| 117 |
+
lines.
|
| 118 |
+
|
| 119 |
+
`check_kaiju_goal_completion.py --write` writes
|
| 120 |
+
`release/GOAL_COMPLETION_AUDIT.md`, a stricter objective-level audit. It should
|
| 121 |
+
remain red while Hugging Face upload, human review, or live paid API launch
|
| 122 |
+
evidence is missing.
|
| 123 |
+
|
| 124 |
+
`refresh_kaiju_release_evidence.py` is a safe local refresh runner. It updates
|
| 125 |
+
direct API smoke evidence, goal audit, final report, HF staging, local bundle,
|
| 126 |
+
merged-model metadata on Gojira-B, and dry-run upload previews without reading
|
| 127 |
+
tokens or uploading anything.
|
| 128 |
+
|
| 129 |
+
`check_hf_staging_integrity.py` validates the staged Hugging Face package for
|
| 130 |
+
required files, public naming hygiene, raw secret-looking values, and staging
|
| 131 |
+
checksums. It does not upload, create repos, or print matched secret values.
|
| 132 |
+
|
| 133 |
+
`check_hf_release_permission_evidence.py` validates sanitized Hugging Face
|
| 134 |
+
repo-create evidence in `release/hf-release-permission-evidence.json`. Start
|
| 135 |
+
from `release/hf-release-permission-evidence.example.json` only after the
|
| 136 |
+
private permission probe succeeds, or use
|
| 137 |
+
`scripts/collect_hf_release_permission_evidence.py --apply --write` to run the
|
| 138 |
+
probe and write the sanitized evidence automatically. Never include raw auth
|
| 139 |
+
output or tokens.
|
| 140 |
+
|
| 141 |
+
`check_human_release_review.py` reads `release/HUMAN_RELEASE_REVIEW.md`. Local
|
| 142 |
+
mode may pass with pending/manual review fields; public mode must fail until
|
| 143 |
+
Richard changes the signoff fields to approved decisions.
|
| 144 |
+
|
| 145 |
+
`npm run prepare:cloudflare` is dry-run safe by default. It prints the exact
|
| 146 |
+
Wrangler commands for creating `KAIJU_BILLING_DB`, `KAIJU_RATE_LIMIT_KV`, and
|
| 147 |
+
`KAIJU_ARTIFACT_BUCKET`, applying the D1 migration, setting required secrets,
|
| 148 |
+
deploying, listing deployments, and exercising rollback. `npm run check` also
|
| 149 |
+
runs `npx wrangler deploy --dry-run` so the current Worker build path is validated
|
| 150 |
+
without publishing. Set
|
| 151 |
+
`KAIJU_CF_RESOURCE_APPLY=1` only when the intended Cloudflare account is active.
|
| 152 |
+
|
| 153 |
+
After real D1/KV/R2 resources exist, copy
|
| 154 |
+
`release/cloudflare-bindings.example.json` to `release/cloudflare-bindings.json`,
|
| 155 |
+
replace the placeholder IDs, and preview the reviewed config update:
|
| 156 |
+
|
| 157 |
+
```bash
|
| 158 |
+
python3 scripts/apply_paid_api_cloudflare_bindings.py \
|
| 159 |
+
--bindings-file release/cloudflare-bindings.json
|
| 160 |
+
```
|
| 161 |
+
|
| 162 |
+
The applier refuses placeholder values and secret-looking input. Only after the
|
| 163 |
+
preview is reviewed should it update `gateway/cloudflare-worker/wrangler.jsonc`:
|
| 164 |
+
|
| 165 |
+
```bash
|
| 166 |
+
python3 scripts/apply_paid_api_cloudflare_bindings.py \
|
| 167 |
+
--bindings-file release/cloudflare-bindings.json \
|
| 168 |
+
--write
|
| 169 |
+
```
|
| 170 |
+
|
| 171 |
+
`--mode scaffold` verifies the local gateway implementation and should pass.
|
| 172 |
+
`--mode launch` is stricter and should fail until real Cloudflare bindings,
|
| 173 |
+
Wrangler secrets, Stripe webhook evidence, staging traffic, latency evidence,
|
| 174 |
+
and rollback proof are attached.
|
| 175 |
+
|
| 176 |
+
Launch evidence is attached through a sanitized JSON file:
|
| 177 |
+
|
| 178 |
+
```bash
|
| 179 |
+
cp release/paid-api-launch-evidence.example.json release/paid-api-launch-evidence.json
|
| 180 |
+
python3 scripts/collect_paid_api_launch_evidence.py --help
|
| 181 |
+
python3 scripts/check_paid_api_readiness.py --mode launch \
|
| 182 |
+
--evidence-file release/paid-api-launch-evidence.json
|
| 183 |
+
```
|
| 184 |
+
|
| 185 |
+
Use `scripts/collect_paid_api_launch_evidence.py` to preview or write sanitized
|
| 186 |
+
launch evidence after staging resources exist. It can read the staging API key
|
| 187 |
+
from an environment variable for live probes, but it never writes the key, full
|
| 188 |
+
prompt, or model response to the evidence file. By default it prints a preview;
|
| 189 |
+
pass `--write` only after reviewing the target file path.
|
| 190 |
+
|
| 191 |
+
Only record secret names, route names, request ids, coarse latency numbers, and
|
| 192 |
+
pass/fail facts. Do not put raw API keys, bearer tokens, OAuth tokens, Stripe
|
| 193 |
+
secret keys, webhook signing secrets, tunnel credentials, full private prompts,
|
| 194 |
+
or customer private data in the evidence file. The checker scans the evidence
|
| 195 |
+
file for common secret-looking values and fails launch readiness if it finds
|
| 196 |
+
them.
|
| 197 |
+
|
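A scan of the evidence file for secret-looking values might look like the sketch below. The patterns and the function names are illustrative assumptions, not the checker's real rule set; the point is that the scan reports where a suspicious value sits without ever echoing the value itself.

```python
import json
import re

# Illustrative patterns only; a real checker would carry a longer, reviewed list.
SECRET_PATTERNS = [
    re.compile(r"sk_(live|test)_[0-9A-Za-z]{16,}"),  # Stripe-style secret key
    re.compile(r"whsec_[0-9A-Za-z]{16,}"),           # Stripe-style webhook secret
    re.compile(r"hf_[0-9A-Za-z]{20,}"),              # Hugging Face-style token
    re.compile(r"Bearer\s+[0-9A-Za-z._\-]{20,}"),    # bearer header value
]


def iter_string_values(node, path="$"):
    """Yield (json_path, value) for every string value in the document."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_string_values(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_string_values(value, f"{path}[{index}]")
    elif isinstance(node, str):
        yield path, node


def find_secret_paths(evidence_text: str) -> list:
    """Return JSON paths of secret-looking values; never return the matched text."""
    document = json.loads(evidence_text)
    return [
        path
        for path, value in iter_string_values(document)
        if any(pattern.search(value) for pattern in SECRET_PATTERNS)
    ]
```

A launch check built on this would fail whenever the returned list is non-empty, while secret names such as an environment variable name pass through untouched.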
| 198 |
+
## Minimum API Gates
|
| 199 |
+
|
| 200 |
+
| Gate | Required Evidence |
|
| 201 |
+
| --- | --- |
|
| 202 |
+
| Auth | Unauthorized requests fail; valid test key works |
|
| 203 |
+
| Billing | Unpaid/suspended account is denied before model call |
|
| 204 |
+
| Rate limit | Burst and daily caps work per key and per account |
|
| 205 |
+
| Logging | Logs omit secrets and full private prompts |
|
| 206 |
+
| Abuse control | Secret-looking payloads and obviously unsafe automation requests are rejected or redacted |
|
| 207 |
+
| Artifacts | Origin-only R2 upload and account-scoped artifact download pass |
|
| 208 |
+
| Rollback | One command can route traffic back to the previous stable model/harness |
|
| 209 |
+
| Latency | p95 for paid routes is documented and acceptable |
|
| 210 |
+
| Quality | Business-owner eval pack passes with complete files/artifacts |
|
| 211 |
+
|
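For the latency gate, one simple way to document p95 is the nearest-rank percentile over recorded request latencies. The sketch below assumes that method and a hypothetical helper name; other percentile definitions interpolate and can give slightly different numbers, so the chosen method should be stated next to the figure.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number ranks stay exact in floating point.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]
```

With only a handful of samples, p95 collapses to the slowest request, so the gate is only meaningful once each paid route has enough staging traffic behind it.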
| 212 |
+
Current quality evidence:
|
| 213 |
+
|
| 214 |
+
- Harnessed customer-readiness pack:
|
| 215 |
+
`runs/opencode-customer-readiness/20260603T185835Z/summary.md`, `4/4`
|
| 216 |
+
passed, `28/28` required files written, including the release provenance and
|
| 217 |
+
safety review task.
|
| 218 |
+
- Restored 32k SGLang direct API smoke:
|
| 219 |
+
`runs/benchmarks/20260603T155233Z-kaiju-coder-7-serving/summary.md`,
|
| 220 |
+
identity passed in `2.92s`; business proposal passed in `94.28s` with
|
| 221 |
+
`1,737` chars.
|
| 222 |
+
- Runtime-quantized vLLM OpenCode smoke:
|
| 223 |
+
`bash scripts/run_kaiju_quantized_opencode_smoke.sh` passed at 16k after
|
| 224 |
+
vLLM launched with `--enable-auto-tool-choice`; OpenCode wrote
|
| 225 |
+
`hello.txt` with exactly `Kaiju Coder 7 quantized runtime ok`.
|
| 226 |
+
- Current restored 16k SGLang direct API smoke:
|
| 227 |
+
`runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`,
|
| 228 |
+
identity passed in `2.3s`.
|
| 229 |
+
- Raw OpenCode multi-file pack remains a blocker for raw-agent claims.
|
| 230 |
+
|
| 231 |
+
## Pricing Assumptions To Validate
|
| 232 |
+
|
| 233 |
+
- Raw model tokens are slow and expensive enough that per-token pricing alone is
|
| 234 |
+
not the right first product.
|
| 235 |
+
- Better first API product: priced business-owner routes such as website pack,
|
| 236 |
+
proposal pack, ROI/report pack, and Kiyomi operating pack.
|
| 237 |
+
- Charge for complete artifacts and verified workflow output, with token usage
|
| 238 |
+
as an internal cost-control metric.
|
| 239 |
+
|
| 240 |
+
## Release Blockers
|
| 241 |
+
|
| 242 |
+
- Raw OpenCode customer-readiness task currently times out on multi-file work.
|
| 243 |
+
- Harnessed customer-readiness route passes; paid API must route through that
|
| 244 |
+
deterministic product path until a faster raw/quantized path passes.
|
| 245 |
+
- Context-size benchmarks passed at 12k, 16k, 24k, and 32k, but the current
|
| 246 |
+
parked Gojira-B/OpenCode profile is 16k. Treat 32k as the high-context target
|
| 247 |
+
to re-confirm after restart before using it as a public default.
|
| 248 |
+
- Restored 32k business-document direct API smoke passed, but the `94.28s`
|
| 249 |
+
latency is too slow for ungated paid API use without streaming, queueing,
|
| 250 |
+
and route-level caps.
|
| 251 |
+
- vLLM serving has been tested at 16k, but it is not clearly faster than SGLang
|
| 252 |
+
and needs the Gojira nightly image plus text-only launch flags.
|
| 253 |
+
- Runtime-quantized vLLM bitsandbytes has passed 8k and 16k identity/code
|
| 254 |
+
smoke tests, passed a 16k business-document smoke in `53.44s`, and reduces
|
| 255 |
+
model memory to about `17.8 GiB`; its OpenCode one-file smoke now passes.
|
| 256 |
+
- Persisted quantized public weights are still pending.
|
| 257 |
+
- Hosted gateway scaffold now has locally tested API key checks, D1 prepaid credits,
|
| 258 |
+
fixed-window rate limit, model enforcement, secret-content rejection, and
|
| 259 |
+
signed Stripe webhook top-up behavior. It also has a sanitized launch-evidence
|
| 260 |
+
collector for the remaining staging proof. It is not live-paid ready until real
|
| 261 |
+
Cloudflare resources, Stripe products/webhook endpoint, deployment secrets,
|
| 262 |
+
sanitized launch evidence, and staging end-to-end requests pass.
|
| 263 |
+
- `python3 scripts/check_paid_api_readiness.py --mode launch` currently fails
|
| 264 |
+
by design because live D1/KV/R2 bindings and manual launch evidence are not
|
| 265 |
+
attached. This prevents local scaffold readiness from being mistaken for
|
| 266 |
+
paid public launch approval.
|
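The signed Stripe webhook top-up behavior mentioned above rests on two ideas: verifying the `Stripe-Signature` header and crediting each event at most once. The sketch below follows Stripe's documented scheme (HMAC-SHA256 over `<timestamp>.<raw body>` compared against the `v1` values); the function names, the 300-second tolerance, and the in-memory `seen` set are assumptions for illustration, and a deployed gateway would persist seen event ids.

```python
import hashlib
import hmac
import time


def verify_stripe_signature(raw_body: bytes, header: str, secret: str,
                            tolerance: int = 300, now=None) -> bool:
    """Check a header of the form 't=<unix>,v1=<hex>[,v1=<hex>...]'."""
    timestamp = None
    candidates = []
    for part in header.split(","):
        name, _, value = part.strip().partition("=")
        if name == "t":
            timestamp = value
        elif name == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > tolerance:
        return False  # stale or future-dated event
    signed = timestamp.encode() + b"." + raw_body
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 candidate.
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)


def credit_once(event_id: str, account: dict, amount: int, seen: set) -> bool:
    """Apply a top-up only the first time an event id is seen."""
    if event_id in seen:
        return False
    seen.add(event_id)
    account["credits"] = account.get("credits", 0) + amount
    return True
```

The signature must be computed over the raw request bytes; re-serializing parsed JSON before verifying changes the bytes and makes valid events fail.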
PUBLIC_TESTING_QUICKSTART.md
ADDED
|
@@ -0,0 +1,149 @@
|
| 1 |
+
# Kaiju Coder 7 Public Testing Quickstart
|
| 2 |
+
|
| 3 |
+
Kaiju Coder 7 is the public model name. The OpenAI-compatible model id is:
|
| 4 |
+
|
| 5 |
+
```text
|
| 6 |
+
kaiju-coder-7
|
| 7 |
+
```
|
| 8 |
+
|
| 9 |
+
Use this guide for serious public testing. It avoids internal checkpoint names
|
| 10 |
+
and keeps the current limitations clear.
|
| 11 |
+
|
| 12 |
+
## Pick A Test Path
|
| 13 |
+
|
| 14 |
+
### Path 1: OpenCode Against An Existing Endpoint
|
| 15 |
+
|
| 16 |
+
Use this if you already have Kaiju Coder 7 served at an OpenAI-compatible
|
| 17 |
+
`/v1` endpoint.
|
| 18 |
+
|
| 19 |
+
```bash
|
| 20 |
+
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
|
| 21 |
+
cd kaiju-coder-7-opencode
|
| 22 |
+
python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18083/v1
|
| 23 |
+
```
|
| 24 |
+
|
| 25 |
+
Then run OpenCode inside the project you want to edit:
|
| 26 |
+
|
| 27 |
+
```bash
|
| 28 |
+
opencode -m kaiju/kaiju-coder-7 --agent kaiju-coder-7
|
| 29 |
+
```
|
| 30 |
+
|
| 31 |
+
For a bounded smoke test:
|
| 32 |
+
|
| 33 |
+
```bash
|
| 34 |
+
mkdir -p /tmp/kaiju-public-smoke
|
| 35 |
+
opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
|
| 36 |
+
--dir /tmp/kaiju-public-smoke \
|
| 37 |
+
"Create hello.txt with exactly: Kaiju Coder 7 is ready"
|
| 38 |
+
```
|
| 39 |
+
|
| 40 |
+
Or run the packaged verifier, which checks the installer, live model endpoint,
|
| 41 |
+
OpenCode binary, actual file creation, and wrong-directory behavior:
|
| 42 |
+
|
| 43 |
+
```bash
|
| 44 |
+
python3 scripts/run_kaiju_public_opencode_smoke.py
|
| 45 |
+
```
|
| 46 |
+
|
| 47 |
+
The helper installer adds:
|
| 48 |
+
|
| 49 |
+
- the `kaiju` OpenAI-compatible provider
|
| 50 |
+
- the lean `kaiju-coder-7` OpenCode agent
|
| 51 |
+
- a scoped no-autocontinue plugin that prevents false completion loops after
|
| 52 |
+
compaction or output limits
|
| 53 |
+
|
| 54 |
+
### Path 2: Full Local Weights
|
| 55 |
+
|
| 56 |
+
Use this if the full `RMDWLLC/kaiju-coder-7` Hugging Face repo has been
|
| 57 |
+
uploaded and you have suitable local GPU hardware.
|
| 58 |
+
|
| 59 |
+
```bash
|
| 60 |
+
hf download RMDWLLC/kaiju-coder-7 --local-dir ./kaiju-coder-7
|
| 61 |
+
```
|
| 62 |
+
|
| 63 |
+
Serve the downloaded folder with an OpenAI-compatible local server. Configure
|
| 64 |
+
the server to expose:
|
| 65 |
+
|
| 66 |
+
```text
|
| 67 |
+
model id: kaiju-coder-7
|
| 68 |
+
base URL: http://127.0.0.1:18083/v1
|
| 69 |
+
context: 16384
|
| 70 |
+
```
|
| 71 |
+
|
| 72 |
+
Then install the OpenCode helper with:
|
| 73 |
+
|
| 74 |
+
```bash
|
| 75 |
+
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
|
| 76 |
+
cd kaiju-coder-7-opencode
|
| 77 |
+
python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18083/v1
|
| 78 |
+
```
|
| 79 |
+
|
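To confirm that the endpoint configured above answers under the expected model id, a direct chat request is a useful check independent of OpenCode. This is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are hypothetical, and it assumes the server listens on the base URL shown above without requiring an API key.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18083/v1"
MODEL_ID = "kaiju-coder-7"


def build_chat_request(prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for the local server."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the assistant text from the first choice."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

Calling `ask("Reply with the single word: ready")` against a running server should return a short reply; a connection error or a model-not-found response points at the server configuration rather than at OpenCode.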
| 80 |
+
### Path 3: Runtime-Quantized Local Candidate
|
| 81 |
+
|
| 82 |
+
Use this only if you are comfortable with advanced serving setups. The current
|
| 83 |
+
working quantized option is a runtime bitsandbytes recipe, not a separate
|
| 84 |
+
persisted quantized weights repo.
|
| 85 |
+
|
| 86 |
+
```bash
|
| 87 |
+
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-quantized-runtime
|
| 88 |
+
cd kaiju-coder-7-quantized-runtime
|
| 89 |
+
```
|
| 90 |
+
|
| 91 |
+
Read `README.md` in that repo before serving. This path can reduce model memory
|
| 92 |
+
at runtime, but it still depends on access to the full Kaiju Coder 7 weights.
|
| 93 |
+
|
| 94 |
+
## Recommended Test Prompt
|
| 95 |
+
|
| 96 |
+
Run this from an empty project folder:
|
| 97 |
+
|
| 98 |
+
```text
|
| 99 |
+
Build a launch-ready local service business website and operating pack. Include
|
| 100 |
+
index.html, a Stripe checkout safety plan, a CSV parser with tests, a simple CRM
|
| 101 |
+
schema, a weekly money report, and a safety/provenance note. Write the files,
|
| 102 |
+
not just advice.
|
| 103 |
+
```
|
| 104 |
+
|
| 105 |
+
Expected result:
|
| 106 |
+
|
| 107 |
+
- files are written in the requested project folder
|
| 108 |
+
- `index.html` is complete HTML
|
| 109 |
+
- business docs start with Markdown H1 headings
|
| 110 |
+
- code includes a test or smoke-check command where practical
|
| 111 |
+
- no fake API keys, OAuth tokens, payment secrets, or private customer data
|
| 112 |
+
|
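The expected results above lend themselves to a small automated check of the project folder. The sketch below is an assumption-laden illustration, not the packaged verifier: `check_project` is a hypothetical helper, it treats every `.md` file as a business doc, and its secret pattern covers only two Stripe-style prefixes.

```python
import re
from pathlib import Path

# Illustrative pattern; extend it to match your own secret formats.
SECRET_HINT = re.compile(r"(sk_(live|test)_[0-9A-Za-z]{16,}|whsec_[0-9A-Za-z]{16,})")


def check_project(folder: str) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    root = Path(folder)
    problems = []

    index = root / "index.html"
    if not index.is_file():
        problems.append("index.html is missing")
    else:
        html = index.read_text(encoding="utf-8", errors="replace").lower()
        if "<html" not in html or "</html>" not in html:
            problems.append("index.html is not a complete HTML document")

    for doc in sorted(root.rglob("*.md")):
        text = doc.read_text(encoding="utf-8", errors="replace")
        first = next((line for line in text.splitlines() if line.strip()), "")
        if not first.startswith("# "):
            problems.append(f"{doc.relative_to(root)} does not start with an H1")

    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        text = path.read_text(encoding="utf-8", errors="replace")
        if SECRET_HINT.search(text):
            problems.append(f"{path.relative_to(root)} contains a secret-looking value")

    return problems
```

Running it right after the test prompt finishes gives a quick pass/fail signal before any manual review of the generated files.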
| 113 |
+
## Current Recommended Defaults
|
| 114 |
+
|
| 115 |
+
- Public model id: `kaiju-coder-7`
|
| 116 |
+
- OpenCode context: `16384`
|
| 117 |
+
- Output cap for public testing: `2500`
|
| 118 |
+
- Current reliable product path: model plus deterministic business-owner
|
| 119 |
+
harness plus verifier
|
| 120 |
+
- Raw multi-file OpenCode generation: still too slow for broad paid API claims
|
| 121 |
+
- Paid API: not public until launch preflight passes
|
| 122 |
+
|
| 123 |
+
## What Not To Claim Yet
|
| 124 |
+
|
| 125 |
+
Do not claim:
|
| 126 |
+
|
| 127 |
+
- that raw model weights alone reliably build every business-owner artifact
|
| 128 |
+
- that a paid hosted API is generally available
|
| 129 |
+
- that persisted quantized weights exist
|
| 130 |
+
- that 32k context is the current live default
|
| 131 |
+
|
| 132 |
+
Do claim:
|
| 133 |
+
|
| 134 |
+
- Kaiju Coder 7 has a working local/OpenCode release candidate
|
| 135 |
+
- the current tested OpenCode default is 16k context
|
| 136 |
+
- the helper package includes a lean agent and compaction loop guard
|
| 137 |
+
- the paid API scaffold has tests and a launch preflight, but is not yet public
|
| 138 |
+
- the packaged public smoke verifies a fresh OpenCode one-file write before
|
| 139 |
+
public claims are refreshed
|
| 140 |
+
|
| 141 |
+
## Current Blockers Before Public Release
|
| 142 |
+
|
| 143 |
+
- Hugging Face repo creation still requires a write-capable token or namespace.
|
| 144 |
+
- Full merged model upload has not completed; the merged folder must first have
|
| 145 |
+
the metadata packet synced by `prepare_hf_merged_model_metadata.sh`.
|
| 146 |
+
- Public paid API launch needs real Cloudflare D1/KV/R2 bindings, Wrangler
|
| 147 |
+
secret verification, Stripe webhook staging evidence, staging traffic, latency
|
| 148 |
+
evidence, and rollback proof.
|
| 149 |
+
- Human review is still required before public upload.
|
README.md
ADDED
|
@@ -0,0 +1,160 @@
|
| 1 |
+
# Kaiju Coder 7 by Kiyomi - Model Card Draft
|
| 2 |
+
|
| 3 |
+
This is a draft. Do not publish until the current eval gate passes, final license files are attached, and the exact trained checkpoint is recorded.
|
| 4 |
+
|
| 5 |
+
## Model Summary
|
| 6 |
+
|
| 7 |
+
Kaiju Coder 7 by Kiyomi is an RMDW fine-tuned coding and builder model for solo entrepreneurs and local-first AI users.
|
| 8 |
+
|
| 9 |
+
Primary intended work:
|
| 10 |
+
|
| 11 |
+
- Build complete websites and landing pages.
|
| 12 |
+
- Build Kiyomi-style AI-company launch packs for business owners.
|
| 13 |
+
- Write scripts, small apps, and automation flows.
|
| 14 |
+
- Reason about Stripe, licensing, auth proxies, and release workflows.
|
| 15 |
+
- Draft practical business documents such as proposals, launch plans, support notes, operator handbooks, and follow-up sequences.
|
| 16 |
+
- Produce intake/CRM schemas, lead lists, ROI dashboards, and reporting artifacts.
|
| 17 |
+
- Help builders avoid overbuilt architecture and ship useful artifacts.
|
| 18 |
+
|
| 19 |
+
## Base Model
|
| 20 |
+
|
| 21 |
+
- Base candidate: `Qwen/Qwen3.6-27B`
|
| 22 |
+
- Base model URL: `https://huggingface.co/Qwen/Qwen3.6-27B`
|
| 23 |
+
- Checked revision: `6a9e13bd6fc8f0983b9b99948120bc37f49c13e9`
|
| 24 |
+
- License tag checked: `apache-2.0` on 2026-06-03
|
| 25 |
+
- Upstream license copy: `release/upstream/qwen3.6-27b/LICENSE`
|
| 26 |
+
- Upstream license check: `release/UPSTREAM_LICENSE_CHECK.md`
|
| 27 |
+
|
| 28 |
+
Required before release:
|
| 29 |
+
|
| 30 |
+
- Include upstream Apache 2.0 license.
|
| 31 |
+
- Include upstream notices if present.
|
| 32 |
+
- Do not imply Qwen or Alibaba endorsement.
|
| 33 |
+
- Use attribution language only: "Fine-tuned from Qwen under Apache 2.0."
|
| 34 |
+
|
| 35 |
+
## Fine-Tuning
|
| 36 |
+
|
| 37 |
+
- Method: LoRA
|
| 38 |
+
- Existing full run lineage: `qwen36-27b-lora-v0.1` through current Kaiju adapters
|
| 39 |
+
- Training hardware: Gojira-B, 128 GB NVIDIA Spark
|
| 40 |
+
- v0.1 training examples: 575 reviewed examples
|
| 41 |
+
- v1.7 training file: `datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl`
|
| 42 |
+
- v1.7 raw reviewed examples: 1,689
|
| 43 |
+
- v1.7 training rows after business-owner oversampling: 1,881
|
| 44 |
+
- v1.7 business-owner addendum: 8 reviewed examples, oversampled 24 times for the next run
|
| 45 |
+
- v1.7 config: `training/configs/qwen36-27b-lora-v1.7.example.json`
|
| 46 |
+
- v1.7 run scope: 1,024-token context, 24 steps, intended as a testable business-owner adapter rather than a final long-context bakeoff
|
| 47 |
+
- v1.7 train runtime: `1663.7101s`
|
| 48 |
+
- v1.7 train loss: `1.7260706673065822`
|
| 49 |
+
- v1.7 train/eval examples: `1,769` / `112`
|
| 50 |
+
- v1.7 adapter path: `runs/qwen36-27b-lora-v1.7-business-owner/adapter`
|
| 51 |
+
- v1.8 config: `training/configs/qwen36-27b-lora-v1.8-business-owner.example.json`
|
| 52 |
+
- v1.8 scope: 2,048-token context, 96 max steps, same reviewed/oversampled v1.7 business-owner SFT rows
|
| 53 |
+
- v1.8 train runtime: `11666.7564s`
|
| 54 |
+
- v1.8 train loss: `0.9281658741335074`
|
| 55 |
+
- v1.8 train/eval examples: `1,769` / `112`
|
| 56 |
+
- v1.8 adapter path: `runs/qwen36-27b-lora-v1.8-business-owner/adapter`
|
| 57 |
+
- v1.8 status: completed on 2026-06-03 and merged into a full local model for serving; do not publish externally until human review, upstream notices, broader comparison evals, and raw website limitation language are complete
|
| 58 |
+
- Trainable parameters: approximately 79.7M
|
| 59 |
+
- Base parameters: approximately 27.0B
|
| 60 |
+
- Merged full-model artifact: `/home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged`, `51G`, `14` safetensor shards plus tokenizer/config sidecars
|
| 61 |
+
|
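The v1.7 counts above can be cross-checked with a few lines of arithmetic. The sketch reads "oversampled 24 times" as 24 added copies of each of the 8 addendum examples, which is an interpretation rather than something the list states; under that reading the reported totals are mutually consistent.

```python
# Figures copied from the fine-tuning list above.
raw_reviewed = 1_689
addendum_examples = 8
extra_copies_each = 24  # assumed meaning of "oversampled 24 times"

training_rows = raw_reviewed + addendum_examples * extra_copies_each
train_examples, eval_examples = 1_769, 112

trainable_params = 79.7e6  # approximate, as reported
base_params = 27.0e9       # approximate, as reported
trainable_share = trainable_params / base_params

v17_minutes = 1663.7101 / 60
v18_hours = 11666.7564 / 3600

print(f"training rows after oversampling: {training_rows}")
print(f"train + eval examples:            {train_examples + eval_examples}")
print(f"trainable share of base params:   {trainable_share:.3%}")
print(f"v1.7 runtime: {v17_minutes:.1f} min, v1.8 runtime: {v18_hours:.2f} h")
```

Both the oversampled row count and the train/eval split come to 1,881, and the LoRA adapter trains roughly 0.3% of the base parameter count.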
| 62 |
+
Release note: Kaiju's current product path may combine a compact model planner with deterministic harnesses and verifier checks. If the shipped experience uses that harness path, release copy must say so plainly instead of implying the raw model weights alone create every artifact.
|
| 63 |
+
|
| 64 |
+
## Data
|
| 65 |
+
|
| 66 |
+
The dataset is source-backed and RMDW-owned or RMDW-authored. The current source inventory is tracked in `release/SOURCE_INVENTORY.md`.
|
| 67 |
+
|
| 68 |
+
High-level categories:
|
| 69 |
+
|
| 70 |
+
- Website/UI
|
| 71 |
+
- Coding
|
| 72 |
+
- Debugging
|
| 73 |
+
- Automation
|
| 74 |
+
- Tool-use
|
| 75 |
+
- Strategy
|
| 76 |
+
- Business
|
| 77 |
+
- Business-suite
|
| 78 |
+
- Identity
|
| 79 |
+
|
| 80 |
+
Excluded data:
|
| 81 |
+
|
| 82 |
+
- Closed-model outputs from OpenAI, Anthropic, Gemini, or similar providers as supervised training completions.
|
| 83 |
+
- Customer private code without explicit permission.
|
| 84 |
+
- Client-specific website text, contact details, contracts, or private business details unless explicitly reviewed and approved.
|
| 85 |
+
- Secrets, credentials, private keys, tokens, cookies, and raw support logs containing personal data.
|
| 86 |
+
|
| 87 |
+
## Evaluation
|
| 88 |
+
|
| 89 |
+
Required bakeoff before release:
|
| 90 |
+
|
| 91 |
+
- Base Qwen 3.6 27B
|
| 92 |
+
- Kaiju Coder LoRA
|
| 93 |
+
- GLM 4.7 production baseline
|
| 94 |
+
|
| 95 |
+
Current local harness evidence:
|
| 96 |
+
|
| 97 |
+
- 2026-06-03 Kiyomi business-suite router hard gate: `23/23` passed.
|
| 98 |
+
- Business-suite prompts: `2/2` passed.
|
| 99 |
+
- Static artifact checks: `23/23` passed.
|
| 100 |
+
- Dataset validation: `1,689` reviewed candidate examples across `14` files.
|
| 101 |
+
- v1.7 target gate: all category minimums met, including `business_suite` and `proposal`.
|
| 102 |
+
- v1.7 served adapter smoke:
|
| 103 |
+
- Website task `website-barber-001`: passed, 2,726 chars in 174.49s.
|
| 104 |
+
- Proposal task `proposal-001` with Kaiju API system prompt: passed, 4,306 chars in 232.27s.
|
| 105 |
+
- v1.7 serving config: SGLang over Tailscale at `http://100.109.109.14:18083/v1`, model `kaiju_v17_business_owner`, context `4096`, memory fraction `0.90`.
|
| 106 |
+
- v1.8 training metrics: runtime `11666.7564s`, train loss `0.9281658741335074`, adapter present.
|
| 107 |
+
- v1.8 dynamic SGLang LoRA caveat:
|
| 108 |
+
- Adapter-name-only serving can be base-equivalent.
|
| 109 |
+
- Corrected selector `qwen36-27b:kaiju_v18_business_owner` crashes with `LoRA buffer shape torch.Size([8192, 16]) does not match weight shape torch.Size([14336, 16])`.
|
| 110 |
+
- Dynamic LoRA is not the release serving path for this checkpoint.
|
| 111 |
+
- Kaiju Coder 7 serving config: SGLang over Tailscale at `http://100.109.109.14:18083/v1`, model `kaiju-coder-7`, current parked Gojira-B/OpenCode context `16384`, tested high-context target `32768`, memory fraction `0.90`.
|
| 112 |
+
- v1.8 merged endpoint probe: `1,155` visible chars in `60.17s`.
|
| 113 |
+
- v1.8 merged focused eval:
|
| 114 |
+
- Proposal rerun: `1/1` paid-ready, `4.0/4.0`, `4,014` chars in `212.72s`.
|
| 115 |
+
- Jah credits backend: `4.0/4.0`, `9,718` chars in `566.36s`.
|
| 116 |
+
- Broader base-Qwen, GLM, and raw website comparisons are still pending.
|
| 117 |
+
|
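The character counts and timings above can be turned into a rough throughput figure for comparison between runs. This is a back-of-the-envelope sketch: it divides output characters by wall-clock seconds, so it folds prompt processing and network time into the rate, and the run labels are shorthand for the entries in the list.

```python
# (label, output characters, wall-clock seconds) copied from the evidence list above.
runs = [
    ("v1.7 website-barber-001", 2_726, 174.49),
    ("v1.7 proposal-001", 4_306, 232.27),
    ("v1.8 merged endpoint probe", 1_155, 60.17),
    ("v1.8 proposal rerun", 4_014, 212.72),
    ("v1.8 Jah credits backend", 9_718, 566.36),
]


def chars_per_second(chars: int, seconds: float) -> float:
    """Average output rate over the whole request."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return chars / seconds


rates = {label: chars_per_second(chars, seconds) for label, chars, seconds in runs}

for label, rate in rates.items():
    print(f"{label}: {rate:.1f} chars/s")
```

All five runs land between roughly 15 and 20 characters per second, which is the scale behind the limitation that raw merged serving is slow on this stack.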
| 118 |
+
Sellable-candidate gate:
|
| 119 |
+
|
| 120 |
+
- Beats base Qwen on RMDW practical evals.
|
| 121 |
+
- Near or above GLM 4.7 on highest-value customer tasks.
|
| 122 |
+
- No critical safety failures.
|
| 123 |
+
- Produces complete artifacts instead of plans only.
|
| 124 |
+
- Produces owner-ready Kiyomi/RMDW artifacts for websites, connector packs, CRM, reporting, leads, sales, ROI, and operator training.
|
| 125 |
+
- Distinct useful voice without becoming gimmicky.
|
| 126 |
+
|
| 127 |
+
## Limitations
|
| 128 |
+
|
| 129 |
+
Known limitations:
|
| 130 |
+
|
| 131 |
+
- Not a general frontier model.
|
| 132 |
+
- May be weaker than large cloud frontier models on broad reasoning and uncommon programming domains.
|
| 133 |
+
- Needs a strong harness for tool use, file editing, and long-running work.
|
| 134 |
+
- Raw merged serving is slow on this SGLang stack.
|
| 135 |
+
- Dynamic SGLang LoRA serving is not release-quality for this adapter; use the merged model path.
|
| 136 |
+
- Business-owner performance depends on source-backed evals, provenance controls, and deterministic artifact verification.
|
| 137 |
+
- Hosted API release requires billing, rate limits, abuse controls, logs, and rollback.
|
| 138 |
+
|
| 139 |
+
## Intended Use
|
| 140 |
+
|
| 141 |
+
Good fit:
|
| 142 |
+
|
| 143 |
+
- Solo-founder product work.
|
| 144 |
+
- Small-business websites and automations.
|
| 145 |
+
- Kiyomi-style local AI product workflows.
|
| 146 |
+
- Practical coding and deployment assistance.
|
| 147 |
+
|
| 148 |
+
Not a fit:
|
| 149 |
+
|
| 150 |
+
- High-risk medical, legal, financial, or safety-critical decisions without expert review.
|
| 151 |
+
- Secret handling without a secure app layer.
|
| 152 |
+
- Claims of guaranteed correctness.
|
| 153 |
+
|
| 154 |
+
## Release Status
|
| 155 |
+
|
| 156 |
+
Current status: business-owner release-candidate preparation.
|
| 157 |
+
|
| 158 |
+
Fresh v1.7 and v1.8 LoRA training finished on 2026-06-03 after clearing old ComfyUI/Ollama workloads from Gojira-B. The current completed testable product path is the v1.8 merged model plus the deterministic business-owner harness and verifier. Raw merged model testing works for focused business-owner documents and backend automations, but the paid website path remains harness-first until broader raw website evals pass.
|
| 159 |
+
|
| 160 |
+
Do not publish weights or sell hosted API access until the eval and release checklist pass.
|
SERVING_BENCHMARKS.md
ADDED
|
@@ -0,0 +1,358 @@
|
| 1 |
+
# Kaiju Coder 7 Serving Benchmarks
|
| 2 |
+
|
| 3 |
+
This file records serving evidence for public download and paid API decisions.
|
| 4 |
+
The model id must remain `kaiju-coder-7`.
|
| 5 |
+
|
| 6 |
+
## Current Live Runtime
|
| 7 |
+
|
| 8 |
+
- Host: Gojira-B over Tailscale
|
| 9 |
+
- Base URL: `http://100.109.109.14:18083/v1`
|
| 10 |
+
- Serving stack: SGLang merged full model
|
| 11 |
+
- Current verified post-quantization restored context: `16384`
|
| 12 |
+
- Tested high-context target: `32768`
|
| 13 |
+
- Current container: `qwen36-merged-sglang-18083`
|
| 14 |
+
- Current caveat: direct raw generation is slow for multi-file OpenCode work.
|
| 15 |
+
|
| 16 |
+
## Benchmark Command
|
| 17 |
+
|
| 18 |
+
For current-context latency without restart:
|
| 19 |
+
|
| 20 |
+
```bash
|
| 21 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 22 |
+
--contexts 12288 \
|
| 23 |
+
--prompts identity business_doc code_patch \
|
| 24 |
+
--max-tokens 768 \
|
| 25 |
+
--timeout 420
|
| 26 |
+
```
|
| 27 |
+
|
| 28 |
+
For context restart benchmarking:
|
| 29 |
+
|
| 30 |
+
```bash
|
| 31 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 32 |
+
--restart \
|
| 33 |
+
--contexts 12288 16384 24576 32768 \
|
| 34 |
+
--prompts identity business_doc \
|
| 35 |
+
--max-tokens 768 \
|
| 36 |
+
--timeout 420 \
|
| 37 |
+
--ready-timeout 1200
|
| 38 |
+
```
|
| 39 |
+
|
| 40 |
+
Use `--contexts 16384` for the current restored Gojira-B endpoint. Use
|
| 41 |
+
`32768` when explicitly testing the high-context target; it has passed earlier
|
| 42 |
+
benchmarks but should be re-confirmed after a fresh restart before calling it
|
| 43 |
+
the live default.
|
| 44 |
+
|
| 45 |
+
## Current 12k Direct API Benchmark
|
| 46 |
+
|
| 47 |
+
Command:
|
| 48 |
+
|
| 49 |
+
```bash
|
| 50 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 51 |
+
--contexts 12288 \
|
| 52 |
+
--prompts identity code_patch \
|
| 53 |
+
--max-tokens 256 \
|
| 54 |
+
--timeout 300
|
| 55 |
+
```
|
| 56 |
+
|
| 57 |
+
Run: `runs/benchmarks/20260603T135017Z-kaiju-coder-7-serving/summary.md`
|
| 58 |
+
|
| 59 |
+
| Context | Prompt | OK | Seconds | Chars | Chars/s |
| --- | --- | --- | ---: | ---: | ---: |
| 12288 | identity | True | 2.41 | 26 | 10.788 |
| 12288 | code_patch | True | 57.61 | 860 | 14.928 |
|
| 63 |
+
|
| 64 |
+
Interpretation: direct API calls are usable for short tasks, but latency is too
|
| 65 |
+
high for a paid raw-code API unless outputs are streamed and route-specific
|
| 66 |
+
limits are enforced.
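The `Chars/s` column in these tables is consistent with output characters divided by wall-clock seconds, rounded to three decimals. The sketch below reproduces the two rows above under that assumption; the helper name `chars_per_second` is illustrative and is not taken from `scripts/benchmark_kaiju_serving.py`.

```python
def chars_per_second(chars: int, seconds: float) -> float:
    """Return output characters per wall-clock second, rounded to 3 places."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return round(chars / seconds, 3)


# (prompt, chars, seconds) copied from the 12k table above.
rows = [("identity", 26, 2.41), ("code_patch", 860, 57.61)]

for prompt, chars, seconds in rows:
    print(f"{prompt}: {chars_per_second(chars, seconds)} chars/s")
```

Because the metric counts characters rather than tokens, it is only comparable across runs that use the same prompt and a similar output style.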
|
| 67 |
+
|
| 68 |
+
## 16k Context Benchmark
|
| 69 |
+
|
| 70 |
+
16k was tested to reduce OpenCode compaction pressure.
|
| 71 |
+
|
| 72 |
+
Commands:
|
| 73 |
+
|
| 74 |
+
```bash
|
| 75 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 76 |
+
--restart \
|
| 77 |
+
--contexts 16384 \
|
| 78 |
+
--prompts identity \
|
| 79 |
+
--max-tokens 128 \
|
| 80 |
+
--timeout 300 \
|
| 81 |
+
--ready-timeout 1200
|
| 82 |
+
|
| 83 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 84 |
+
--contexts 16384 \
|
| 85 |
+
--prompts code_patch \
|
| 86 |
+
--max-tokens 128 \
|
| 87 |
+
--timeout 300
|
| 88 |
+
```
|
| 89 |
+
|
| 90 |
+
Runs:
|
| 91 |
+
|
| 92 |
+
- `runs/benchmarks/20260603T135651Z-kaiju-coder-7-serving/summary.md`
|
| 93 |
+
- `runs/benchmarks/20260603T140318Z-kaiju-coder-7-serving/summary.md`
|
| 94 |
+
|
| 95 |
+
| Context | Prompt | OK | Load Wait | Seconds | Chars | Chars/s |
| --- | --- | --- | ---: | ---: | ---: | ---: |
| 16384 | identity | True | 354.16 | 14.9 | 26 | 1.745 |
| 16384 | code_patch | True | n/a | 28.99 | 416 | 14.35 |
|
| 99 |
+
|
| 100 |
+
Interpretation: `16384` is a stable lower-load fallback and still leaves more
|
| 101 |
+
room above OpenCode's prompt/tool overhead than the original 12k setting.
|
| 102 |
+
|
| 103 |
+
## 24k And 32k Context Benchmarks
|
| 104 |
+
|
| 105 |
+
24k and 32k were tested after 16k proved stable. Both loaded and returned the
|
| 106 |
+
same code-patch latency profile as 16k on the short patch benchmark.
|
| 107 |
+
|
| 108 |
+
Commands:
|
| 109 |
+
|
| 110 |
+
```bash
|
| 111 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 112 |
+
--restart \
|
| 113 |
+
--contexts 24576 \
|
| 114 |
+
--prompts identity \
|
| 115 |
+
--max-tokens 128 \
|
| 116 |
+
--timeout 300 \
|
| 117 |
+
--ready-timeout 1200
|
| 118 |
+
|
| 119 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 120 |
+
--contexts 24576 \
|
| 121 |
+
--prompts code_patch \
|
| 122 |
+
--max-tokens 128 \
|
| 123 |
+
--timeout 300
|
| 124 |
+
|
| 125 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 126 |
+
--restart \
|
| 127 |
+
--contexts 32768 \
|
| 128 |
+
--prompts identity \
|
| 129 |
+
--max-tokens 64 \
|
| 130 |
+
--timeout 300 \
|
| 131 |
+
--ready-timeout 1200
|
| 132 |
+
|
| 133 |
+
python3 scripts/benchmark_kaiju_serving.py \
|
| 134 |
+
--contexts 32768 \
|
| 135 |
+
--prompts code_patch \
|
| 136 |
+
--max-tokens 128 \
|
| 137 |
+
--timeout 300
|
| 138 |
+
```
|
| 139 |
+
|
| 140 |
+
Runs:
|
| 141 |
+
|
| 142 |
+
- `runs/benchmarks/20260603T141559Z-kaiju-coder-7-serving/summary.md`
|
| 143 |
+
- `runs/benchmarks/20260603T142354Z-kaiju-coder-7-serving/summary.md`
|
| 144 |
+
- `runs/benchmarks/20260603T142439Z-kaiju-coder-7-serving/summary.md`
|
| 145 |
+
- `runs/benchmarks/20260603T143256Z-kaiju-coder-7-serving/summary.md`
|
| 146 |
+
|
| 147 |
+
| Context | Prompt | OK | Load Wait | Seconds | Chars | Chars/s |
| --- | --- | --- | ---: | ---: | ---: | ---: |
| 24576 | identity | True | 439.54 | 16.84 | 26 | 1.544 |
| 24576 | code_patch | True | n/a | 29.03 | 416 | 14.33 |
| 32768 | identity | True | 386.53 | 16.27 | 26 | 1.598 |
| 32768 | code_patch | True | n/a | 28.99 | 416 | 14.35 |
|
| 153 |
+
|
| 154 |
+
Interpretation: `32768` is a proven high-context target from this benchmark set,
|
| 155 |
+
but it is not the currently parked live endpoint after the later
|
| 156 |
+
quantized-runtime testing. The current Gojira-B/OpenCode profile should stay at
|
| 157 |
+
`16384` until `32768` is freshly restarted and re-confirmed. Keep `12288` for
|
| 158 |
+
direct API smoke tests and constrained hardware.
|
| 159 |
+
|
| 160 |
+
Restored-service 32k direct API smoke after vLLM testing:
|
| 161 |
+
|
| 162 |
+
- Run: `runs/benchmarks/20260603T155233Z-kaiju-coder-7-serving/summary.md`
|
| 163 |
+
- `/v1/models`: `kaiju-coder-7`, max model len `32768`
|
| 164 |
+
|
| 165 |
+
| Context | Prompt | OK | Seconds | Chars | Chars/s |
| --- | --- | --- | ---: | ---: | ---: |
| 32768 | identity | True | 2.92 | 26 | 8.904 |
| 32768 | business_doc | True | 94.28 | 1737 | 18.424 |
|
| 169 |
+
|
| 170 |
+
Interpretation: the restored default endpoint is usable for business-owner
|
| 171 |
+
document work, but a long proposal response still takes about 94 seconds. Paid
|
| 172 |
+
routes must stream, cap output, queue carefully, and prefer verified
|
| 173 |
+
artifact routes over raw open-ended generation.
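A minimal sketch of the "stream and cap output" requirement, assuming the paid route forwards to the OpenAI-compatible `/v1/chat/completions` endpoint. The route names reuse the benchmark prompt names, and the per-route caps are illustrative values borrowed from the `--max-tokens` settings in the benchmark commands; neither is a documented production limit.

```python
# Hypothetical per-route output caps; the source does not define real limits.
ROUTE_MAX_TOKENS = {
    "identity": 64,
    "code_patch": 256,
    "business_doc": 768,
}


def build_chat_request(route: str, user_text: str, model: str = "kaiju-coder-7") -> dict:
    """Build a streaming chat-completions payload with a route-specific cap."""
    if route not in ROUTE_MAX_TOKENS:
        raise KeyError(f"no output cap configured for route {route!r}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": ROUTE_MAX_TOKENS[route],  # hard cap on generated tokens
        "stream": True,  # send tokens as they are produced
    }


payload = build_chat_request("business_doc", "Draft a one-page proposal outline.")
print(payload["max_tokens"], payload["stream"])
```

Rejecting unknown routes, rather than falling back to an uncapped default, keeps a misconfigured route from triggering a 90-second open-ended generation.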
|
| 174 |
+
|
| 175 |
+
## OpenCode Customer-Readiness Evidence
|
| 176 |
+
|
| 177 |
+
Final restored-service small OpenCode smoke:
|
| 178 |
+
|
| 179 |
+
```bash
|
| 180 |
+
opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
|
| 181 |
+
--dir /tmp/kaiju-opencode-32k-final-smoke \
|
| 182 |
+
'Create hello.txt with exactly: Kaiju Coder 7 final 32k ok'
|
| 183 |
+
```
|
| 184 |
+
|
| 185 |
+
Result: passed. OpenCode wrote `hello.txt` with exactly
|
| 186 |
+
`Kaiju Coder 7 final 32k ok`.
|
| 187 |
+
|
| 188 |
+
Current restored 16k OpenCode smoke after quantized-vLLM testing:
|
| 189 |
+
|
| 190 |
+
```bash
|
| 191 |
+
mkdir -p /tmp/kaiju-opencode-fresh-public-smoke
|
| 192 |
+
opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 \
|
| 193 |
+
--dir /tmp/kaiju-opencode-fresh-public-smoke \
|
| 194 |
+
--dangerously-skip-permissions \
|
| 195 |
+
'Create hello.txt with exactly: Kaiju Coder 7 fresh public smoke ok'
|
| 196 |
+
```
|
| 197 |
+
|
| 198 |
+
Result: passed. OpenCode wrote `hello.txt` with exactly
|
| 199 |
+
`Kaiju Coder 7 fresh public smoke ok` in
|
| 200 |
+
`/tmp/kaiju-opencode-fresh-public-smoke`, and `/v1/models` returned
|
| 201 |
+
`kaiju-coder-7` with max model len `16384`.
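The `/v1/models` check quoted in these smoke results can be scripted. The sketch below assumes the endpoint returns an OpenAI-style list whose entries carry a `max_model_len` field; that field name is an assumption and should be confirmed against the actual response from the running server. The function names are illustrative.

```python
import json
from urllib.request import urlopen


def fetch_models(base_url: str, timeout: float = 10.0) -> dict:
    """GET <base_url>/models and decode the JSON body."""
    with urlopen(f"{base_url.rstrip('/')}/models", timeout=timeout) as response:
        return json.load(response)


def served_context(models_payload: dict, model_id: str = "kaiju-coder-7"):
    """Return the advertised context length for model_id, or None if absent."""
    for entry in models_payload.get("data", []):
        if entry.get("id") == model_id:
            # "max_model_len" is an assumed field name, not confirmed by the source.
            return entry.get("max_model_len")
    return None


# Offline example using a payload shaped like the result described above.
sample = {"object": "list", "data": [{"id": "kaiju-coder-7", "max_model_len": 16384}]}
print(served_context(sample))
```

Against the live endpoint this would be called as `served_context(fetch_models("http://100.109.109.14:18083/v1"))`, which only works from a machine on the same Tailscale network.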
|
| 202 |
+
|
| 203 |
+
Current restored 16k direct API identity smoke:
|
| 204 |
+
|
| 205 |
+
- Run: `runs/benchmarks/20260603T174545Z-kaiju-coder-7-serving/summary.md`
|
| 206 |
+
- `/v1/models`: `kaiju-coder-7`, max model len `16384`
|
| 207 |
+
|
| 208 |
+
| Context | Prompt | OK | Seconds | Chars | Chars/s |
| --- | --- | --- | ---: | ---: | ---: |
| 16384 | identity | True | 2.3 | 26 | 11.304 |
|
| 211 |
+
|
| 212 |
+
Command:
|
| 213 |
+
|
| 214 |
+
```bash
|
| 215 |
+
python3 scripts/run_kaiju_opencode_customer_pack.py
|
| 216 |
+
```
|
| 217 |
+
|
| 218 |
+
Latest harnessed product-path result on 2026-06-03:
|
| 219 |
+
|
| 220 |
+
- Run: `runs/opencode-customer-readiness/20260603T185835Z/summary.md`
|
| 221 |
+
- Mode: `harnessed`
|
| 222 |
+
- Status: `4/4` passed
|
| 223 |
+
- Tasks:
|
| 224 |
+
- `fade-flow-service-site`
|
| 225 |
+
- `kiyomi-owner-operating-pack`
|
| 226 |
+
- `paid-api-safety-scaffold`
|
| 227 |
+
- `release-provenance-safety-review`
|
| 228 |
+
- Required files written: `28/28`
|
| 229 |
+
- Forbidden secret-looking tokens: none found by verifier
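The two verifier facts reported above (required files written, no secret-looking tokens) can be illustrated with a small deterministic check. This is a sketch only: the regular expressions and the function name are assumptions for illustration and do not describe the verifier that `scripts/run_kaiju_opencode_customer_pack.py` actually uses.

```python
import re
from pathlib import Path

# Illustrative patterns only; the real verifier's rules are not in the source.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def verify_workspace(workspace: Path, required_files: list[str]) -> dict:
    """Check that required files exist and no file contains a secret-like token."""
    missing = [name for name in required_files if not (workspace / name).is_file()]
    flagged = []
    for path in sorted(workspace.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if any(pattern.search(text) for pattern in SECRET_PATTERNS):
            flagged.append(str(path.relative_to(workspace)))
    return {
        "written": len(required_files) - len(missing),
        "required": len(required_files),
        "missing": missing,
        "flagged": flagged,
        "passed": not missing and not flagged,
    }
```

A result with `written == required` and an empty `flagged` list corresponds to the `28/28` and "none found" lines in the summary above.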
|
| 230 |
+
|
| 231 |
+
Loop-guarded OpenCode install smoke:
|
| 232 |
+
|
| 233 |
+
- Command: `python3 scripts/install_kaiju_opencode_profile.py`, then
|
| 234 |
+
`opencode run -m kaiju/kaiju-coder-7 --agent kaiju-coder-7 --dir /tmp/kaiju-opencode-loopguard-smoke --dangerously-skip-permissions 'Create loopguard.txt with exactly: Kaiju Coder 7 loop guard installed'`
|
| 235 |
+
- Result: passed. OpenCode wrote `loopguard.txt` in the requested directory with
|
| 236 |
+
exactly `Kaiju Coder 7 loop guard installed` and exited cleanly.
|
| 237 |
+
- Installed guard: `/Users/richardecholsai7/.config/opencode/kaiju-no-autocontinue.mjs`
|
| 238 |
+
|
| 239 |
+
Raw OpenCode-agent result on 2026-06-03:
|
| 240 |
+
|
| 241 |
+
- Task: `fade-flow-service-site`
|
| 242 |
+
- Status: timed out after `900s`
|
| 243 |
+
- Required files written: `0`
|
| 244 |
+
- Observed Gojira-B decode throughput while running: about `4.4` tokens/sec
|
| 245 |
+
- Follow-up runner fix: workspaces now run outside the repo and pass `opencode
|
| 246 |
+
run --dir <workspace>` explicitly.
|
| 247 |
+
- Structured follow-up run:
|
| 248 |
+
`runs/opencode-customer-readiness/20260603T135520Z/results.jsonl`
|
| 249 |
+
timed out after `60s`, wrote `0` files, and recorded `pwd` as the intended
|
| 250 |
+
temp workspace.
|
| 251 |
+
- 16k/stricter-agent follow-up runs:
|
| 252 |
+
- `runs/opencode-customer-readiness/20260603T140650Z/results.jsonl`
|
| 253 |
+
timed out after `120s`, wrote `0` files, and recorded the intended temp
|
| 254 |
+
workspace.
|
| 255 |
+
- `runs/opencode-customer-readiness/20260603T140908Z/results.jsonl`
|
| 256 |
+
timed out after `120s`, wrote `0` files after adding stricter "write first
|
| 257 |
+
file immediately" prompt guidance.
|
| 258 |
+
- Interpretation: the lean OpenCode agent fits and can write small files.
|
| 259 |
+
Harnessed file-plan delivery passes the customer pack. Current raw multi-file
|
| 260 |
+
OpenCode generation is still not public/API ready, so public and paid claims
|
| 261 |
+
must describe the reliable product path as model plus deterministic harness
|
| 262 |
+
and verifier.
|
| 263 |
+
|
| 264 |
+
## Recommendation Until Faster Serving Is Proven
|
| 265 |
+
|
| 266 |
+
- Public local release can proceed only with clear speed/hardware caveats.
|
| 267 |
+
- Paid API should route business-owner deliverables through deterministic
|
| 268 |
+
harnesses and verifiers, not raw OpenCode multi-file generation.
|
| 269 |
+
- Quantized candidates and/or a smaller distilled variant are required for
|
| 270 |
+
broad public OpenCode usability.
|
| 271 |
+
|
| 272 |
+
## vLLM Serving Probe
|
| 273 |
+
|
| 274 |
+
vLLM was tested as the practical alternative serving path after SGLang. The
|
| 275 |
+
standard `vllm/vllm-openai:latest` image cannot read the merged checkpoint's
|
| 276 |
+
`qwen3_5` config. The Gojira nightly image can read it, but needed two launch
|
| 277 |
+
fixes for this checkpoint:
|
| 278 |
+
|
| 279 |
+
- preinstall `pandas`, because the Qwen3.5 model path imports it in this image
|
| 280 |
+
- pass `--language-model-only`, because the merged text-serving checkpoint does
|
| 281 |
+
not include the visual encoder weights expected by the multimodal config
|
| 282 |
+
|
| 283 |
+
Guarded benchmark command:
|
| 284 |
+
|
| 285 |
+
```bash
|
| 286 |
+
KAIJU_VLLM_CONTEXT=16384 KAIJU_VLLM_READY_TIMEOUT=900 \
|
| 287 |
+
./scripts/run-gojira-b-vllm-serving-benchmark.sh
|
| 288 |
+
```
|
| 289 |
+
|
| 290 |
+
Run: `runs/benchmarks/20260603T151244Z-kaiju-coder-7-serving/summary.md`
|
| 291 |
+
|
| 292 |
+
| Stack | Context | Prompt | OK | Seconds | Chars | Chars/s |
| --- | ---: | --- | --- | ---: | ---: | ---: |
| vLLM nightly | 16384 | identity | True | 19.99 | 26 | 1.301 |
| vLLM nightly | 16384 | code_patch | True | 28.8 | 416 | 14.444 |
|
| 296 |
+
|
| 297 |
+
Interpretation: vLLM now runs Kaiju Coder 7 at 16k, but it is not clearly
|
| 298 |
+
faster than SGLang on the current smoke prompts. Keep SGLang as the recommended
|
| 299 |
+
runtime because it has stable OpenCode smoke evidence, a simpler launch path,
|
| 300 |
+
and historical 32k proof. Keep the live/default OpenCode profile at 16k until
|
| 301 |
+
32k is freshly re-confirmed. Keep the vLLM scripts for future nightly-image or
|
| 302 |
+
quantized-weight testing.
|
| 303 |
+
|
| 304 |
+
## vLLM bitsandbytes Runtime-Quantized Candidate
|
| 305 |
+
|
| 306 |
+
The first working quantized local variant is a runtime bitsandbytes vLLM path.
|
| 307 |
+
It does not create separate quantized weights yet; it loads the full merged
|
| 308 |
+
model through vLLM's bitsandbytes loader.
|
| 309 |
+
|
| 310 |
+
Command:
|
| 311 |
+
|
| 312 |
+
```bash
|
| 313 |
+
KAIJU_VLLM_CONTEXT=16384 \
|
| 314 |
+
KAIJU_VLLM_READY_TIMEOUT=1200 \
|
| 315 |
+
KAIJU_VLLM_QUANTIZATION=bitsandbytes \
|
| 316 |
+
KAIJU_VLLM_LOAD_FORMAT=bitsandbytes \
|
| 317 |
+
./scripts/run-gojira-b-vllm-serving-benchmark.sh
|
| 318 |
+
```
|
| 319 |
+
|
| 320 |
+
Runs:
|
| 321 |
+
|
| 322 |
+
- `runs/benchmarks/20260603T153257Z-kaiju-coder-7-serving/summary.md`
|
| 323 |
+
- `runs/benchmarks/20260603T154450Z-kaiju-coder-7-serving/summary.md`
|
| 324 |
+
- `runs/benchmarks/20260603T161316Z-kaiju-coder-7-serving/summary.md`
|
| 325 |
+
- `runs/benchmarks/20260603T165512Z-kaiju-coder-7-serving/summary.md`
|
| 326 |
+
|
| 327 |
+
| Stack | Context | Prompt | OK | Seconds | Chars | Chars/s |
| --- | ---: | --- | --- | ---: | ---: | ---: |
| vLLM bitsandbytes | 8192 | identity | True | 21.19 | 26 | 1.227 |
| vLLM bitsandbytes | 8192 | code_patch | True | 11.31 | 424 | 37.489 |
| vLLM bitsandbytes | 16384 | identity | True | 19.51 | 26 | 1.333 |
| vLLM bitsandbytes | 16384 | code_patch | True | 11.3 | 416 | 36.814 |
| vLLM bitsandbytes | 16384 | business_doc | True | 53.44 | 1610 | 30.127 |
| vLLM bitsandbytes | 16384 | identity | True | 19.65 | 26 | 1.323 |
|
| 335 |
+
|
| 336 |
+
Gojira-B vLLM logs reported about `17.8 GiB` model memory for the bitsandbytes
|
| 337 |
+
load at both 8k and 16k, compared with about `50.22 GiB` for the unquantized
|
| 338 |
+
vLLM load. Code-patch latency improved materially on this smoke prompt.
|
| 339 |
+
Business-document latency improved versus the restored 32k SGLang business-doc
|
| 340 |
+
smoke (`53.44s` at 16k vLLM bitsandbytes versus `94.28s` at 32k SGLang).
|
| 341 |
+
Identity latency remains slower than SGLang.
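The comparison above can be made concrete with simple arithmetic on the reported figures. The sketch below uses only numbers from this file; the helper names are illustrative. The business-doc pair compares different stacks at different context sizes, so that ratio is a rough indication rather than a controlled measurement.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return round((1 - after / before) * 100, 1)


def speedup(before_seconds: float, after_seconds: float) -> float:
    """How many times faster the `after` run completed."""
    return round(before_seconds / after_seconds, 2)


# Model memory in GiB: unquantized vLLM load vs bitsandbytes load.
print("memory reduction %:", percent_reduction(50.22, 17.8))

# 16k code_patch seconds: vLLM nightly vs vLLM bitsandbytes.
print("code_patch speedup:", speedup(28.8, 11.3))

# business_doc seconds: 32k SGLang vs 16k vLLM bitsandbytes (not like-for-like).
print("business_doc speedup:", speedup(94.28, 53.44))
```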
|
| 342 |
+
|
| 343 |
+
Quantized OpenCode one-file smoke passed after launching vLLM with
|
| 344 |
+
`--enable-auto-tool-choice` plus `--tool-call-parser qwen3_coder` and running:
|
| 345 |
+
|
| 346 |
+
```bash
|
| 347 |
+
bash scripts/run_kaiju_quantized_opencode_smoke.sh
|
| 348 |
+
```
|
| 349 |
+
|
| 350 |
+
Result: OpenCode wrote `/tmp/kaiju-opencode-quantized-smoke/hello.txt` with
|
| 351 |
+
exactly `Kaiju Coder 7 quantized runtime ok`.
|
| 352 |
+
|
| 353 |
+
Recommendation: keep SGLang as the default public/OpenCode runtime and keep the
|
| 354 |
+
currently installed OpenCode profile at 16k unless the 32k target has just been
|
| 355 |
+
restarted and re-confirmed. Treat vLLM bitsandbytes as the current working
|
| 356 |
+
quantized local candidate for advanced GPU users and future paid API speed
|
| 357 |
+
experiments. It now has direct identity/code/business-doc evidence plus an
|
| 358 |
+
OpenCode one-file smoke, but it is not a persisted quantized-weights repo.
|
SOURCE_INVENTORY.md
ADDED
|
@@ -0,0 +1,41 @@
|
| 1 |
+
# Kaiju Source Inventory
|
| 2 |
+
|
| 3 |
+
Generated from GitHub source-of-truth repositories plus the requested local RMDW wiki snapshot. This inventory defines what may become Kaiju training data, what is eval-only, and what must stay excluded.
|
| 4 |
+
|
| 5 |
+
## Global Training Rules
|
| 6 |
+
|
| 7 |
+
- Do not train on raw secrets, API keys, OAuth tokens, cookies, private keys, or credential files.
|
| 8 |
+
- Do not train on closed-model responses from OpenAI, Anthropic, Gemini, or similar providers unless the terms clearly allow it.
|
| 9 |
+
- Do not train on client-specific private data without explicit review and consent.
|
| 10 |
+
- Preserve repository name, commit SHA, source path, license, and reviewer status for every promoted dataset row.
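The last rule lists five provenance fields per promoted row. A minimal sketch of enforcing that rule is shown below; the field names and the `missing_provenance` helper are hypothetical and are not taken from `scripts/validate_training_data.py`.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RowProvenance:
    """Provenance fields the rule above asks to preserve (names are illustrative)."""
    repository: str
    commit_sha: str
    source_path: str
    license: str
    reviewer_status: str


def missing_provenance(row: dict) -> list[str]:
    """Return the provenance fields that are absent or blank in a dataset row."""
    return [
        field.name
        for field in fields(RowProvenance)
        if not str(row.get(field.name) or "").strip()
    ]


row = {
    "repository": "RichardEchols/kaiju-coder",
    "commit_sha": "3d57eae92ad5",
    "source_path": "docs/example.md",  # hypothetical path for illustration
    "license": "",
}
print(missing_provenance(row))
```

A row would be promoted only when this list is empty.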
|
| 11 |
+
|
| 12 |
+
## GitHub Repository Inventory
|
| 13 |
+
|
| 14 |
+
| Repo | SHA | Role | Training use | Required gates | Exclusions | Notes |
|---|---|---|---|---|---|---|
| [RichardEchols/kaiju-coder](https://github.com/RichardEchols/kaiju-coder) | `3d57eae92ad5` | model lab, harness, evals, training scripts | candidate-after-review | secret-scan, closed-model-output-check, license-review | runs, models, .secrets, private datasets, raw logs | Use repo-owned harnesses, evals, docs, scripts, and curated datasets. Exclude weights, generated runs, and local secrets. |
| [RichardEchols/Kiyomi-7.7.7](https://github.com/RichardEchols/Kiyomi-7.7.7) | `294b31008135` | business-owner AI-company module contracts | candidate-after-review | secret-scan, closed-model-output-check, private-data-review | credentials, tokens, private client state, closed-model transcripts | Use module contracts, templates, acceptance gates, and owner-facing task structure as high-signal business-owner curriculum. |
| [RichardEchols/kiyomi-agent](https://github.com/RichardEchols/kiyomi-agent) | `b192c910f3f7` | business OS wrapper and local-agent patterns | candidate-after-review | secret-scan, closed-model-output-check, private-data-review | credentials, tokens, local runtime state, private support logs | Use architecture, docs, scripts, and safe wrapper patterns. Do not train on runtime secrets or private logs. |
| [RichardEchols/rmdw-site](https://github.com/RichardEchols/rmdw-site) | `df089dc3b2d3` | public RMDW offer, site, and conversion surface | candidate-after-review | secret-scan, closed-model-output-check, public-copy-review | environment files, deployment secrets, analytics tokens | Use public offer copy, app structure, pricing/CTA patterns, and website implementation patterns. |
| [RichardEchols/makotoair](https://github.com/RichardEchols/makotoair) | `7568f07fea6e` | client website implementation pattern | eval-and-patterns-only | secret-scan, client-data-review, consent-review | client-specific, contact data, contracts, private business details | Use as eval/pattern inspiration for local service business sites. Do not bulk-train on client-specific text without explicit review. |
| [RichardEchols/Mezzal-Construction](https://github.com/RichardEchols/Mezzal-Construction) | `e8f2eede0405` | client website implementation pattern | eval-and-patterns-only | secret-scan, client-data-review, consent-review | client-specific, contact data, contracts, private business details | Use as eval/pattern inspiration for premium contractor site work. Do not bulk-train on client-specific text without explicit review. |
| [RichardEchols/rmdw-agent-wiki](https://github.com/RichardEchols/rmdw-agent-wiki) | `ae1b8e85d3fe` | RMDW/Kiyomi operational wiki | selective-reference-only | secret-scan, credentials-redaction, private-data-review, closed-model-output-check | credentials.md, customers.md, raw, contracts, private client notes, support logs | Use only redacted strategy/product notes and documented decisions. Never use raw credentials or private client data. |
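The "Required gates" column can be read as a per-repository checklist. The sketch below shows one way to compare a comma-separated gate cell against the gates that have actually passed; the function names and the idea of tracking passed gates as a set are assumptions for illustration, not a description of the project's tooling.

```python
def parse_gates(cell: str) -> list[str]:
    """Split a comma-separated 'Required gates' cell into gate names."""
    return [gate.strip() for gate in cell.split(",") if gate.strip()]


def unmet_gates(required: list[str], passed: set[str]) -> list[str]:
    """Return required gates that have not passed, in table order."""
    return [gate for gate in required if gate not in passed]


# Gate cell copied from the kaiju-coder row above.
required = parse_gates("secret-scan, closed-model-output-check, license-review")

# Hypothetical review state: only the secret scan has passed so far.
print(unmet_gates(required, {"secret-scan"}))
```

An empty result means every listed gate has passed; it does not by itself override the "Training use" column, which still limits what the source may be used for.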
|
| 23 |
+
|
| 24 |
+
## Local Source Inventory
|
| 25 |
+
|
| 26 |
+
Local files are context snapshots, not the source of truth. Promote local wiki material into training only after explicit review, redaction, and either sync/diff against the GitHub wiki or a documented reviewer exception.
|
| 27 |
+
|
| 28 |
+
| Source | Path | Git repo | Files | Training use | Required gates | Excluded paths present | Safe reference candidates | Notes |
|---|---|---:|---:|---|---|---|---|---|
| RMDW-Wiki-local | `/Users/richardecholsai7/Documents/RMDW-Wiki` | no | 93 | selective-reference-only | secret-scan, credentials-redaction, private-data-review, sync-or-diff-against-github | credentials.md, customers.md, customers/, raw/ | README.md, kaiju-coder-build-log.md, kaiju-coder-business-plan.md, kaiju-coder-soul.md, kiyomi-agent-build-log.md, pricing-history.md, product/kiyomi-private-ai-workstation.md, ops/product-ops-automation.md, client-acquisition-engine/README.md | Use as a local context snapshot only after explicit row-level review. Do not treat unsynced local files as the authoritative training source. |
|
| 31 |
+
|
| 32 |
+
## Training Eligibility Meaning
|
| 33 |
+
|
| 34 |
+
- `candidate-after-review`: source can produce training or eval examples only after secret scanning, closed-model-output review, and row-level provenance.
|
| 35 |
+
- `eval-and-patterns-only`: use for hard eval prompts, harness behavior, screenshots, or generalized patterns. Do not bulk-train on client-specific source text.
|
| 36 |
+
- `selective-reference-only`: use narrowly after redaction. Treat credentials, customer notes, and raw operational data as excluded by default.
|
| 37 |
+
- Local snapshots require review against the GitHub source of truth before promotion into dataset rows.
|
| 38 |
+
|
| 39 |
+
## Next Dataset Step
|
| 40 |
+
|
| 41 |
+
Generate candidate examples only from reviewed paths, attach this inventory SHA or local snapshot data to each row, then run `scripts/validate_training_data.py` before any training run.
|
UPSTREAM_LICENSE_CHECK.md
ADDED
|
@@ -0,0 +1,38 @@
|
| 1 |
+
# Upstream License Check
|
| 2 |
+
|
| 3 |
+
Date: 2026-06-03
|
| 4 |
+
|
| 5 |
+
This is an engineering release check, not legal advice.
|
| 6 |
+
|
| 7 |
+
## Base Model
|
| 8 |
+
|
| 9 |
+
- Upstream model: `Qwen/Qwen3.6-27B`
|
| 10 |
+
- Hugging Face URL: `https://huggingface.co/Qwen/Qwen3.6-27B`
|
| 11 |
+
- Checked revision from Hugging Face API: `6a9e13bd6fc8f0983b9b99948120bc37f49c13e9`
|
| 12 |
+
- Hugging Face license metadata: `apache-2.0`
|
| 13 |
+
- Local license copy: `release/upstream/qwen3.6-27b/LICENSE`
|
| 14 |
+
- Common upstream notice files checked: `NOTICE`, `NOTICE.txt`, `NOTICE.md`
|
| 15 |
+
- Notice result: no common notice file found at the checked upstream paths
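The notice check above can be repeated against a local snapshot of the upstream repository. The sketch below assumes the upstream files have been downloaded to a local directory and only looks for the three file names listed; it is not the check that produced the recorded result, and the function name is illustrative.

```python
from pathlib import Path

# File names taken from the "Common upstream notice files checked" line above.
NOTICE_NAMES = ("NOTICE", "NOTICE.txt", "NOTICE.md")


def find_notice_files(snapshot_dir: Path) -> list[str]:
    """Return which of the common notice file names exist in snapshot_dir."""
    return [name for name in NOTICE_NAMES if (snapshot_dir / name).is_file()]
```

An empty list matches the recorded "no common notice file found" result. A notice stored under a different name or in a subdirectory would not be detected, so the human release review is still needed.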
|
| 16 |
+
|
| 17 |
+
## Release Obligations To Preserve
|
| 18 |
+
|
| 19 |
+
- Include the upstream Apache 2.0 license with the adapter release package.
|
| 20 |
+
- Keep the upstream base model name and revision in the model card.
|
| 21 |
+
- State that Kaiju Coder is fine-tuned from Qwen; do not imply Qwen, Alibaba, or upstream-author endorsement.
|
| 22 |
+
- Include a modification note for the LoRA adapter and RMDW/Kiyomi training/eval package.
|
| 23 |
+
- Retain warranty and limitation language through the included Apache 2.0 license.
|
| 24 |
+
|
| 25 |
+
## Current Packaging Status
|
| 26 |
+
|
| 27 |
+
Passed for release review:
|
| 28 |
+
|
| 29 |
+
- Upstream license file copied locally.
|
| 30 |
+
- Upstream revision recorded.
|
| 31 |
+
- Upstream license metadata recorded.
|
| 32 |
+
- Notice check recorded.
|
| 33 |
+
|
| 34 |
+
Still requires human release review:
|
| 35 |
+
|
| 36 |
+
- Confirm no upstream files changed before upload.
|
| 37 |
+
- Confirm the final Hugging Face repository includes the copied license file and model card.
|
| 38 |
+
- Confirm public wording avoids endorsement or trademark confusion.
|
chat_template.jinja
ADDED
|
@@ -0,0 +1,154 @@
|
|
| 1 |
+
{%- set image_count = namespace(value=0) %}
|
| 2 |
+
{%- set video_count = namespace(value=0) %}
|
| 3 |
+
{%- macro render_content(content, do_vision_count, is_system_content=false) %}
|
| 4 |
+
{%- if content is string %}
|
| 5 |
+
{{- content }}
|
| 6 |
+
{%- elif content is iterable and content is not mapping %}
|
| 7 |
+
{%- for item in content %}
|
| 8 |
+
{%- if 'image' in item or 'image_url' in item or item.type == 'image' %}
|
| 9 |
+
{%- if is_system_content %}
|
| 10 |
+
{{- raise_exception('System message cannot contain images.') }}
|
| 11 |
+
{%- endif %}
|
| 12 |
+
{%- if do_vision_count %}
|
| 13 |
+
{%- set image_count.value = image_count.value + 1 %}
|
| 14 |
+
{%- endif %}
|
| 15 |
+
{%- if add_vision_id %}
|
| 16 |
+
{{- 'Picture ' ~ image_count.value ~ ': ' }}
|
| 17 |
+
{%- endif %}
|
| 18 |
+
{{- '<|vision_start|><|image_pad|><|vision_end|>' }}
|
| 19 |
+
{%- elif 'video' in item or item.type == 'video' %}
|
| 20 |
+
{%- if is_system_content %}
|
| 21 |
+
{{- raise_exception('System message cannot contain videos.') }}
|
| 22 |
+
{%- endif %}
|
| 23 |
+
{%- if do_vision_count %}
|
| 24 |
+
{%- set video_count.value = video_count.value + 1 %}
|
| 25 |
+
{%- endif %}
|
| 26 |
+
{%- if add_vision_id %}
|
| 27 |
+
{{- 'Video ' ~ video_count.value ~ ': ' }}
+{%- endif %}
+{{- '<|vision_start|><|video_pad|><|vision_end|>' }}
+{%- elif 'text' in item %}
+{{- item.text }}
+{%- else %}
+{{- raise_exception('Unexpected item type in content.') }}
+{%- endif %}
+{%- endfor %}
+{%- elif content is none or content is undefined %}
+{{- '' }}
+{%- else %}
+{{- raise_exception('Unexpected content type.') }}
+{%- endif %}
+{%- endmacro %}
+{%- if not messages %}
+{{- raise_exception('No messages provided.') }}
+{%- endif %}
+{%- if tools and tools is iterable and tools is not mapping %}
+{{- '<|im_start|>system\n' }}
+{{- "# Tools\n\nYou have access to the following functions:\n\n<tools>" }}
+{%- for tool in tools %}
+{{- "\n" }}
+{{- tool | tojson }}
+{%- endfor %}
+{{- "\n</tools>" }}
+{{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
+{%- if messages[0].role == 'system' %}
+{%- set content = render_content(messages[0].content, false, true)|trim %}
+{%- if content %}
+{{- '\n\n' + content }}
+{%- endif %}
+{%- endif %}
+{{- '<|im_end|>\n' }}
+{%- else %}
+{%- if messages[0].role == 'system' %}
+{%- set content = render_content(messages[0].content, false, true)|trim %}
+{{- '<|im_start|>system\n' + content + '<|im_end|>\n' }}
+{%- endif %}
+{%- endif %}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+{%- for message in messages[::-1] %}
+{%- set index = (messages|length - 1) - loop.index0 %}
+{%- if ns.multi_step_tool and message.role == "user" %}
+{%- set content = render_content(message.content, false)|trim %}
+{%- if not(content.startswith('<tool_response>') and content.endswith('</tool_response>')) %}
+{%- set ns.multi_step_tool = false %}
+{%- set ns.last_query_index = index %}
+{%- endif %}
+{%- endif %}
+{%- endfor %}
+{%- if ns.multi_step_tool %}
+{{- raise_exception('No user query found in messages.') }}
+{%- endif %}
+{%- for message in messages %}
+{%- set content = render_content(message.content, true)|trim %}
+{%- if message.role == "system" %}
+{%- if not loop.first %}
+{{- raise_exception('System message must be at the beginning.') }}
+{%- endif %}
+{%- elif message.role == "user" %}
+{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+{%- elif message.role == "assistant" %}
+{%- set reasoning_content = '' %}
+{%- if message.reasoning_content is string %}
+{%- set reasoning_content = message.reasoning_content %}
+{%- else %}
+{%- if '</think>' in content %}
+{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+{%- set content = content.split('</think>')[-1].lstrip('\n') %}
+{%- endif %}
+{%- endif %}
+{%- set reasoning_content = reasoning_content|trim %}
+{%- if (preserve_thinking is defined and preserve_thinking is true) or (loop.index0 > ns.last_query_index) %}
+{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n\n' + content }}
+{%- else %}
+{{- '<|im_start|>' + message.role + '\n' + content }}
+{%- endif %}
+{%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %}
+{%- for tool_call in message.tool_calls %}
+{%- if tool_call.function is defined %}
+{%- set tool_call = tool_call.function %}
+{%- endif %}
+{%- if loop.first %}
+{%- if content|trim %}
+{{- '\n\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
+{%- else %}
+{{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
+{%- endif %}
+{%- else %}
+{{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
+{%- endif %}
+{%- if tool_call.arguments is defined %}
+{%- for args_name, args_value in tool_call.arguments|items %}
+{{- '<parameter=' + args_name + '>\n' }}
+{%- set args_value = args_value | string if args_value is string else args_value | tojson | safe %}
+{{- args_value }}
+{{- '\n</parameter>\n' }}
+{%- endfor %}
+{%- endif %}
+{{- '</function>\n</tool_call>' }}
+{%- endfor %}
+{%- endif %}
+{{- '<|im_end|>\n' }}
+{%- elif message.role == "tool" %}
+{%- if loop.previtem and loop.previtem.role != "tool" %}
+{{- '<|im_start|>user' }}
+{%- endif %}
+{{- '\n<tool_response>\n' }}
+{{- content }}
+{{- '\n</tool_response>' }}
+{%- if not loop.last and loop.nextitem.role != "tool" %}
+{{- '<|im_end|>\n' }}
+{%- elif loop.last %}
+{{- '<|im_end|>\n' }}
+{%- endif %}
+{%- else %}
+{{- raise_exception('Unexpected message role.') }}
+{%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+{{- '<|im_start|>assistant\n' }}
+{%- if enable_thinking is defined and enable_thinking is false %}
+{{- '<think>\n\n</think>\n\n' }}
+{%- else %}
+{{- '<think>\n' }}
+{%- endif %}
+{%- endif %}
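The assistant branch of the template does two things that are easy to misread in Jinja: it splits inline `<think>...</think>` reasoning out of the message content, and it serializes each tool call as nested `<tool_call>` / `<function=...>` / `<parameter=...>` blocks. The Python below is a minimal sketch of both steps, not the template itself. The function names `split_reasoning` and `render_tool_call` and the example tool `get_invoice` are hypothetical, and `json.dumps` is assumed as a stand-in for Jinja's `tojson` filter, whose escaping may differ.

```python
import json


def split_reasoning(content: str) -> tuple[str, str]:
    """Mirror the template's '</think>' split on an assistant message."""
    if "</think>" not in content:
        return "", content
    head = content.split("</think>")[0].rstrip("\n")
    reasoning = head.split("<think>")[-1].lstrip("\n")
    answer = content.split("</think>")[-1].lstrip("\n")
    # The template trims the reasoning once more before emitting it.
    return reasoning.strip(), answer


def render_tool_call(name: str, arguments: dict) -> str:
    """Render one call in the <tool_call>/<function=...> layout."""
    parts = [f"<tool_call>\n<function={name}>\n"]
    for key, value in arguments.items():
        # Strings pass through unchanged; other values are JSON-encoded.
        # Assumption: json.dumps approximates the template's tojson filter.
        text = value if isinstance(value, str) else json.dumps(value)
        parts.append(f"<parameter={key}>\n{text}\n</parameter>\n")
    parts.append("</function>\n</tool_call>")
    return "".join(parts)


reasoning, answer = split_reasoning(
    "<think>\nCheck the invoice total.\n</think>\n\nThe total is 42."
)
call = render_tool_call("get_invoice", {"invoice_id": "A-17", "lines": [1, 2]})
print(reasoning)
print(answer)
print(call)
```

In the template, the recovered reasoning is re-emitted only for assistant turns after the last real user query, unless `preserve_thinking` is true. The first tool call is prefixed with two newlines when the message has visible content, and later calls with one.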
config.json
ADDED
|
@@ -0,0 +1,140 @@
+{
+  "architectures": [
+    "Qwen3_5ForConditionalGeneration"
+  ],
+  "image_token_id": 248056,
+  "language_model_only": false,
+  "model_type": "qwen3_5",
+  "text_config": {
+    "attention_bias": false,
+    "attention_dropout": 0.0,
+    "attn_output_gate": true,
+    "bos_token_id": 248044,
+    "dtype": "bfloat16",
+    "eos_token_id": 248044,
+    "full_attention_interval": 4,
+    "head_dim": 256,
+    "hidden_act": "silu",
+    "hidden_size": 5120,
+    "initializer_range": 0.02,
+    "intermediate_size": 17408,
+    "layer_types": [
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention",
+      "linear_attention",
+      "linear_attention",
+      "linear_attention",
+      "full_attention"
+    ],
+    "linear_conv_kernel_dim": 4,
+    "linear_key_head_dim": 128,
+    "linear_num_key_heads": 16,
+    "linear_num_value_heads": 48,
+    "linear_value_head_dim": 128,
+    "mamba_ssm_dtype": "float32",
+    "max_position_embeddings": 262144,
+    "model_type": "qwen3_5_text",
+    "mtp_num_hidden_layers": 1,
+    "mtp_use_dedicated_embeddings": false,
+    "num_attention_heads": 24,
+    "num_hidden_layers": 64,
+    "num_key_value_heads": 4,
+    "output_gate_type": "swish",
+    "pad_token_id": null,
+    "partial_rotary_factor": 0.25,
+    "rms_norm_eps": 1e-06,
+    "rope_parameters": {
+      "mrope_interleaved": true,
+      "mrope_section": [
+        11,
+        11,
+        10
+      ],
+      "partial_rotary_factor": 0.25,
+      "rope_theta": 10000000,
+      "rope_type": "default"
+    },
+    "tie_word_embeddings": false,
+    "use_cache": true,
+    "vocab_size": 248320
+  },
+  "tie_word_embeddings": false,
+  "transformers_version": "4.57.1",
+  "video_token_id": 248057,
+  "vision_config": {
+    "deepstack_visual_indexes": [],
+    "depth": 27,
+    "hidden_act": "gelu_pytorch_tanh",
+    "hidden_size": 1152,
+    "in_channels": 3,
+    "initializer_range": 0.02,
+    "intermediate_size": 4304,
+    "model_type": "qwen3_5",
+    "num_heads": 16,
+    "num_position_embeddings": 2304,
+    "out_hidden_size": 5120,
+    "patch_size": 16,
+    "spatial_merge_size": 2,
+    "temporal_patch_size": 2
+  },
+  "vision_end_token_id": 248054,
+  "vision_start_token_id": 248053
+}
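The 64-entry `layer_types` list in `text_config` follows directly from `num_hidden_layers` and `full_attention_interval`: every fourth layer is `full_attention` and the rest are `linear_attention`. The sketch below rebuilds that list and uses it for a rough key/value cache estimate. The estimate assumes that only full-attention layers keep a per-token K/V cache and that cache entries are stored in bfloat16 (2 bytes). Neither assumption is stated in the config, so treat the numbers as back-of-the-envelope.

```python
# Values copied from text_config in config.json.
NUM_HIDDEN_LAYERS = 64
FULL_ATTENTION_INTERVAL = 4
NUM_KEY_VALUE_HEADS = 4
HEAD_DIM = 256
MAX_POSITION_EMBEDDINGS = 262144
BYTES_PER_VALUE = 2  # assumption: bfloat16 cache entries

# Every FULL_ATTENTION_INTERVAL-th layer (1-based) uses full attention.
layer_types = [
    "full_attention" if (i + 1) % FULL_ATTENTION_INTERVAL == 0 else "linear_attention"
    for i in range(NUM_HIDDEN_LAYERS)
]
full_layers = layer_types.count("full_attention")

# Assumption: only full-attention layers hold per-token keys and values.
# The factor 2 covers one key tensor plus one value tensor per layer.
kv_bytes_per_token = full_layers * 2 * NUM_KEY_VALUE_HEADS * HEAD_DIM * BYTES_PER_VALUE
kv_gib_at_max_context = kv_bytes_per_token * MAX_POSITION_EMBEDDINGS / 2**30

print(full_layers, kv_bytes_per_token, kv_gib_at_max_context)
```

Under these assumptions the model has 16 full-attention layers, 65536 bytes of K/V cache per token, and about 16 GiB of cache for one sequence at the full 262144-token context.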
configuration.json
ADDED
|
@@ -0,0 +1 @@
+{"framework":"Pytorch","task":"image-text-to-text"}
generation_config.json
ADDED
|
@@ -0,0 +1,13 @@
+{
+  "bos_token_id": 248044,
+  "do_sample": true,
+  "eos_token_id": [
+    248046,
+    248044
+  ],
+  "pad_token_id": 248044,
+  "temperature": 1.0,
+  "top_k": 20,
+  "top_p": 0.95,
+  "transformers_version": "5.9.0"
+}
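The sampling defaults above (`temperature` 1.0, `top_k` 20, `top_p` 0.95) combine two filters: top-k keeps the 20 highest-scoring tokens, and top-p then keeps the smallest prefix of those whose cumulative probability reaches 0.95. The following is a generic, standard-library sketch of that filtering order, not the Transformers implementation. The function name `filter_top_k_top_p` and the toy logits are hypothetical.

```python
import math


def filter_top_k_top_p(logits, temperature=1.0, top_k=20, top_p=0.95):
    """Return {token_index: probability} for the tokens that survive filtering."""
    scaled = [x / temperature for x in logits]
    # Top-k: keep the indices of the k highest-scoring tokens.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = {i: math.exp(scaled[i] - peak) for i in ranked}
    total = sum(weights.values())
    probs = {i: w / total for i, w in weights.items()}
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}


dist = filter_top_k_top_p([2.0, 1.0, 0.0, -1.0], top_k=2)
print(dist)
```

A sampler would then draw the next token from `dist`. With `do_sample` set to true, generation is stochastic, so repeated runs on the same prompt can differ.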
kaiju-merge-manifest.json
ADDED
|
@@ -0,0 +1,17 @@
+{
+  "adapter": "/workspace/kaiju-coder/runs/qwen36-27b-lora-v1.8-business-owner/adapter",
+  "attn_implementation": "sdpa",
+  "base_model": "/workspace/kaiju-coder/models/Qwen3.6-27B",
+  "copied_base_sidecars": [
+    "preprocessor_config.json",
+    "video_preprocessor_config.json",
+    "vocab.json",
+    "merges.txt",
+    "configuration.json"
+  ],
+  "device_map": "auto",
+  "dtype": "torch.bfloat16",
+  "max_shard_size": "4GB",
+  "output_dir": "/workspace/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged",
+  "preserved_base_config": true
+}
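The manifest records that an adapter (its path contains `lora`) was merged into the base model and written to a `-merged` output directory. For readers unfamiliar with what such a merge does to a weight matrix, here is a tiny pure-Python sketch of the usual LoRA update, W' = W + (alpha / rank) * B A. The manifest does not give the adapter's rank, alpha, or target modules, so every number below is hypothetical, and the actual merge script may differ.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical 2x2 base weight with a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]    # shape (rank, in_features)
lora_b = [[0.5], [0.0]]  # shape (out_features, rank)
merged = merge_lora(weight, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is consistent with this commit shipping only full-size safetensors shards.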
merges.txt
ADDED
The diff for this file is too large to render.
model-00001-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:994d372da0812952c05a3fb1093db5b1245e314385290892f0dce3883f4c2643
+size 2542796928
model-00002-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4878748a25512bacb544e85e40836a4fcecb3743a3e692271b80034bc95c0349
+size 3897624936
model-00003-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:353ff0b588a1bcf7036b3dcd389f4e1dbeb331649cc4fb3a536ac1c530a07c7f
+size 3988983480
model-00004-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2dc161642c42bce8c857d30fe116a30fbec1e9759fc0ff99226ba4249d949ab9
+size 3957507000
model-00005-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4e0ba05394a91e792be56d53b9d617bae38ae4153d410b2adae38b9680e3cd6d
+size 3873619464
model-00006-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:db1fea86bcaca2f1eeaa393b3e9ed3612058acc07982605966ca11e4a48067f6
+size 3988962816
model-00007-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:991d211462416f84603bcad3709bb2cb39ca4e739d2cc85ea2b97e9e23bdbe91
+size 3945964456
model-00008-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e3483733b734e5b83aa1416f5d77511599fad67a47311be48c6cc521e5b4656b
+size 3831666800
model-00009-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4e6a20e6662125b644c0e8eb98f0ed0c0256909487a4000185658c96b7bb2ce3
+size 3990049344
model-00010-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:73de119646ab221c1ee39c4bbf6c3434253d8660f033536f0419801eadf3750b
+size 3987897024
model-00011-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:de6b00388b22b8a4a3ab10ca8784f7ff4c0bdf60e6614d6c008893c5d91d29cd
+size 3842163712
model-00012-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:980e5afe6cf660e15c3ced9f79c429dd778d5cc180cf4712d33475162c1f827e
+size 3988962816
model-00013-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9befeac4d999cd15fa085d4a1a70fc2292446ceb839bbcb0a3953057ce08f86e
+size 3988962816
model-00014-of-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3ddb6acc0578997b79761e66b223a4024db3e6341718a2bfb8ec56311d8d3801
+size 3966947216
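The fourteen Git LFS pointers above can be cross-checked against the `metadata` block of `model.safetensors.index.json`, which reports `total_parameters` 26895998464 and `total_size` 53791996928. The short sketch below sums the pointer sizes and compares them with those two figures. The reading that the small surplus on disk comes from per-file safetensors headers is an assumption, not something the commit states.

```python
# "size" values copied from the 14 LFS pointer files, in shard order.
shard_sizes = [
    2542796928, 3897624936, 3988983480, 3957507000, 3873619464,
    3988962816, 3945964456, 3831666800, 3990049344, 3987897024,
    3842163712, 3988962816, 3988962816, 3966947216,
]

# Values copied from the "metadata" block of model.safetensors.index.json.
TOTAL_PARAMETERS = 26895998464
TOTAL_SIZE = 53791996928

on_disk = sum(shard_sizes)
# Assumption: the surplus over total_size is per-file header overhead.
overhead = on_disk - TOTAL_SIZE
bytes_per_parameter = TOTAL_SIZE / TOTAL_PARAMETERS

print(len(shard_sizes), on_disk, overhead, bytes_per_parameter)
```

`total_size` is exactly two bytes per parameter, which matches the `bfloat16` dtype recorded in `config.json` and the merge manifest, and every shard stays under 4 × 10^9 bytes, in line with the manifest's `max_shard_size` of `4GB`.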
model.safetensors.index.json
ADDED
|
@@ -0,0 +1,859 @@
+{
+  "metadata": {
+    "total_parameters": 26895998464,
+    "total_size": 53791996928
+  },
+  "weight_map": {
+    "lm_head.weight": "model-00001-of-00014.safetensors",
+    "model.language_model.embed_tokens.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.input_layernorm.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.A_log": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.conv1d.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.dt_bias": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.in_proj_a.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.in_proj_b.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.in_proj_qkv.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.in_proj_z.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.norm.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.linear_attn.out_proj.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.0.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.input_layernorm.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.A_log": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.conv1d.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.dt_bias": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.in_proj_a.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.in_proj_b.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.in_proj_qkv.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.in_proj_z.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.norm.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.linear_attn.out_proj.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
+    "model.language_model.layers.1.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
+    "model.language_model.layers.1.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
+    "model.language_model.layers.10.input_layernorm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.A_log": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.conv1d.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.dt_bias": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.in_proj_a.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.in_proj_b.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.in_proj_qkv.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.in_proj_z.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.norm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.linear_attn.out_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.10.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.input_layernorm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.self_attn.k_norm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.self_attn.q_norm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.11.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.input_layernorm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.A_log": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.conv1d.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.dt_bias": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.in_proj_a.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.in_proj_b.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.in_proj_qkv.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.in_proj_z.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.norm.weight": "model-00004-of-00014.safetensors",
+    "model.language_model.layers.12.linear_attn.out_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.12.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.12.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.12.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.12.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.input_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.A_log": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.conv1d.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.dt_bias": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.in_proj_a.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.in_proj_b.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.in_proj_qkv.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.in_proj_z.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.norm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.linear_attn.out_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.13.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.input_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.A_log": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.conv1d.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.dt_bias": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.in_proj_a.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.in_proj_b.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.in_proj_qkv.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.in_proj_z.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.norm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.linear_attn.out_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.14.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.input_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.self_attn.k_norm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.self_attn.q_norm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.15.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.input_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.A_log": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.conv1d.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.dt_bias": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.in_proj_a.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.in_proj_b.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.in_proj_qkv.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.in_proj_z.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.norm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.linear_attn.out_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.16.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.17.input_layernorm.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.17.linear_attn.A_log": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.17.linear_attn.conv1d.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.17.linear_attn.dt_bias": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.17.linear_attn.in_proj_a.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.17.linear_attn.in_proj_b.weight": "model-00005-of-00014.safetensors",
+    "model.language_model.layers.17.linear_attn.in_proj_qkv.weight": "model-00005-of-00014.safetensors",
|
| 136 |
+
"model.language_model.layers.17.linear_attn.in_proj_z.weight": "model-00005-of-00014.safetensors",
|
| 137 |
+
"model.language_model.layers.17.linear_attn.norm.weight": "model-00005-of-00014.safetensors",
|
| 138 |
+
"model.language_model.layers.17.linear_attn.out_proj.weight": "model-00005-of-00014.safetensors",
|
| 139 |
+
"model.language_model.layers.17.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
|
| 140 |
+
"model.language_model.layers.17.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
|
| 141 |
+
"model.language_model.layers.17.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
|
| 142 |
+
"model.language_model.layers.17.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 143 |
+
"model.language_model.layers.18.input_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 144 |
+
"model.language_model.layers.18.linear_attn.A_log": "model-00006-of-00014.safetensors",
|
| 145 |
+
"model.language_model.layers.18.linear_attn.conv1d.weight": "model-00006-of-00014.safetensors",
|
| 146 |
+
"model.language_model.layers.18.linear_attn.dt_bias": "model-00006-of-00014.safetensors",
|
| 147 |
+
"model.language_model.layers.18.linear_attn.in_proj_a.weight": "model-00006-of-00014.safetensors",
|
| 148 |
+
"model.language_model.layers.18.linear_attn.in_proj_b.weight": "model-00006-of-00014.safetensors",
|
| 149 |
+
"model.language_model.layers.18.linear_attn.in_proj_qkv.weight": "model-00006-of-00014.safetensors",
|
| 150 |
+
"model.language_model.layers.18.linear_attn.in_proj_z.weight": "model-00006-of-00014.safetensors",
|
| 151 |
+
"model.language_model.layers.18.linear_attn.norm.weight": "model-00006-of-00014.safetensors",
|
| 152 |
+
"model.language_model.layers.18.linear_attn.out_proj.weight": "model-00006-of-00014.safetensors",
|
| 153 |
+
"model.language_model.layers.18.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
|
| 154 |
+
"model.language_model.layers.18.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
|
| 155 |
+
"model.language_model.layers.18.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
|
| 156 |
+
"model.language_model.layers.18.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 157 |
+
"model.language_model.layers.19.input_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 158 |
+
"model.language_model.layers.19.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
|
| 159 |
+
"model.language_model.layers.19.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
|
| 160 |
+
"model.language_model.layers.19.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
|
| 161 |
+
"model.language_model.layers.19.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 162 |
+
"model.language_model.layers.19.self_attn.k_norm.weight": "model-00006-of-00014.safetensors",
|
| 163 |
+
"model.language_model.layers.19.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
|
| 164 |
+
"model.language_model.layers.19.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
|
| 165 |
+
"model.language_model.layers.19.self_attn.q_norm.weight": "model-00006-of-00014.safetensors",
|
| 166 |
+
"model.language_model.layers.19.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
|
| 167 |
+
"model.language_model.layers.19.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
|
| 168 |
+
"model.language_model.layers.2.input_layernorm.weight": "model-00003-of-00014.safetensors",
|
| 169 |
+
"model.language_model.layers.2.linear_attn.A_log": "model-00003-of-00014.safetensors",
|
| 170 |
+
"model.language_model.layers.2.linear_attn.conv1d.weight": "model-00003-of-00014.safetensors",
|
| 171 |
+
"model.language_model.layers.2.linear_attn.dt_bias": "model-00003-of-00014.safetensors",
|
| 172 |
+
"model.language_model.layers.2.linear_attn.in_proj_a.weight": "model-00003-of-00014.safetensors",
|
| 173 |
+
"model.language_model.layers.2.linear_attn.in_proj_b.weight": "model-00003-of-00014.safetensors",
|
| 174 |
+
"model.language_model.layers.2.linear_attn.in_proj_qkv.weight": "model-00003-of-00014.safetensors",
|
| 175 |
+
"model.language_model.layers.2.linear_attn.in_proj_z.weight": "model-00003-of-00014.safetensors",
|
| 176 |
+
"model.language_model.layers.2.linear_attn.norm.weight": "model-00003-of-00014.safetensors",
|
| 177 |
+
"model.language_model.layers.2.linear_attn.out_proj.weight": "model-00003-of-00014.safetensors",
|
| 178 |
+
"model.language_model.layers.2.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
|
| 179 |
+
"model.language_model.layers.2.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
|
| 180 |
+
"model.language_model.layers.2.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
|
| 181 |
+
"model.language_model.layers.2.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
|
| 182 |
+
"model.language_model.layers.20.input_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 183 |
+
"model.language_model.layers.20.linear_attn.A_log": "model-00006-of-00014.safetensors",
|
| 184 |
+
"model.language_model.layers.20.linear_attn.conv1d.weight": "model-00006-of-00014.safetensors",
|
| 185 |
+
"model.language_model.layers.20.linear_attn.dt_bias": "model-00006-of-00014.safetensors",
|
| 186 |
+
"model.language_model.layers.20.linear_attn.in_proj_a.weight": "model-00006-of-00014.safetensors",
|
| 187 |
+
"model.language_model.layers.20.linear_attn.in_proj_b.weight": "model-00006-of-00014.safetensors",
|
| 188 |
+
"model.language_model.layers.20.linear_attn.in_proj_qkv.weight": "model-00006-of-00014.safetensors",
|
| 189 |
+
"model.language_model.layers.20.linear_attn.in_proj_z.weight": "model-00006-of-00014.safetensors",
|
| 190 |
+
"model.language_model.layers.20.linear_attn.norm.weight": "model-00006-of-00014.safetensors",
|
| 191 |
+
"model.language_model.layers.20.linear_attn.out_proj.weight": "model-00006-of-00014.safetensors",
|
| 192 |
+
"model.language_model.layers.20.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
|
| 193 |
+
"model.language_model.layers.20.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
|
| 194 |
+
"model.language_model.layers.20.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
|
| 195 |
+
"model.language_model.layers.20.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 196 |
+
"model.language_model.layers.21.input_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 197 |
+
"model.language_model.layers.21.linear_attn.A_log": "model-00006-of-00014.safetensors",
|
| 198 |
+
"model.language_model.layers.21.linear_attn.conv1d.weight": "model-00006-of-00014.safetensors",
|
| 199 |
+
"model.language_model.layers.21.linear_attn.dt_bias": "model-00006-of-00014.safetensors",
|
| 200 |
+
"model.language_model.layers.21.linear_attn.in_proj_a.weight": "model-00006-of-00014.safetensors",
|
| 201 |
+
"model.language_model.layers.21.linear_attn.in_proj_b.weight": "model-00006-of-00014.safetensors",
|
| 202 |
+
"model.language_model.layers.21.linear_attn.in_proj_qkv.weight": "model-00006-of-00014.safetensors",
|
| 203 |
+
"model.language_model.layers.21.linear_attn.in_proj_z.weight": "model-00006-of-00014.safetensors",
|
| 204 |
+
"model.language_model.layers.21.linear_attn.norm.weight": "model-00006-of-00014.safetensors",
|
| 205 |
+
"model.language_model.layers.21.linear_attn.out_proj.weight": "model-00006-of-00014.safetensors",
|
| 206 |
+
"model.language_model.layers.21.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
|
| 207 |
+
"model.language_model.layers.21.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
|
| 208 |
+
"model.language_model.layers.21.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
|
| 209 |
+
"model.language_model.layers.21.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 210 |
+
"model.language_model.layers.22.input_layernorm.weight": "model-00006-of-00014.safetensors",
|
| 211 |
+
"model.language_model.layers.22.linear_attn.A_log": "model-00006-of-00014.safetensors",
|
| 212 |
+
"model.language_model.layers.22.linear_attn.conv1d.weight": "model-00006-of-00014.safetensors",
|
| 213 |
+
"model.language_model.layers.22.linear_attn.dt_bias": "model-00006-of-00014.safetensors",
|
| 214 |
+
"model.language_model.layers.22.linear_attn.in_proj_a.weight": "model-00006-of-00014.safetensors",
|
| 215 |
+
"model.language_model.layers.22.linear_attn.in_proj_b.weight": "model-00006-of-00014.safetensors",
|
| 216 |
+
"model.language_model.layers.22.linear_attn.in_proj_qkv.weight": "model-00006-of-00014.safetensors",
|
| 217 |
+
"model.language_model.layers.22.linear_attn.in_proj_z.weight": "model-00006-of-00014.safetensors",
|
| 218 |
+
"model.language_model.layers.22.linear_attn.norm.weight": "model-00006-of-00014.safetensors",
|
| 219 |
+
"model.language_model.layers.22.linear_attn.out_proj.weight": "model-00006-of-00014.safetensors",
|
| 220 |
+
"model.language_model.layers.22.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
|
| 221 |
+
"model.language_model.layers.22.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
|
| 222 |
+
"model.language_model.layers.22.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
|
| 223 |
+
"model.language_model.layers.22.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 224 |
+
"model.language_model.layers.23.input_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 225 |
+
"model.language_model.layers.23.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
|
| 226 |
+
"model.language_model.layers.23.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
|
| 227 |
+
"model.language_model.layers.23.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
|
| 228 |
+
"model.language_model.layers.23.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 229 |
+
"model.language_model.layers.23.self_attn.k_norm.weight": "model-00007-of-00014.safetensors",
|
| 230 |
+
"model.language_model.layers.23.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
|
| 231 |
+
"model.language_model.layers.23.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
|
| 232 |
+
"model.language_model.layers.23.self_attn.q_norm.weight": "model-00007-of-00014.safetensors",
|
| 233 |
+
"model.language_model.layers.23.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
|
| 234 |
+
"model.language_model.layers.23.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
|
| 235 |
+
"model.language_model.layers.24.input_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 236 |
+
"model.language_model.layers.24.linear_attn.A_log": "model-00007-of-00014.safetensors",
|
| 237 |
+
"model.language_model.layers.24.linear_attn.conv1d.weight": "model-00007-of-00014.safetensors",
|
| 238 |
+
"model.language_model.layers.24.linear_attn.dt_bias": "model-00007-of-00014.safetensors",
|
| 239 |
+
"model.language_model.layers.24.linear_attn.in_proj_a.weight": "model-00007-of-00014.safetensors",
|
| 240 |
+
"model.language_model.layers.24.linear_attn.in_proj_b.weight": "model-00007-of-00014.safetensors",
|
| 241 |
+
"model.language_model.layers.24.linear_attn.in_proj_qkv.weight": "model-00007-of-00014.safetensors",
|
| 242 |
+
"model.language_model.layers.24.linear_attn.in_proj_z.weight": "model-00007-of-00014.safetensors",
|
| 243 |
+
"model.language_model.layers.24.linear_attn.norm.weight": "model-00007-of-00014.safetensors",
|
| 244 |
+
"model.language_model.layers.24.linear_attn.out_proj.weight": "model-00007-of-00014.safetensors",
|
| 245 |
+
"model.language_model.layers.24.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
|
| 246 |
+
"model.language_model.layers.24.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
|
| 247 |
+
"model.language_model.layers.24.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
|
| 248 |
+
"model.language_model.layers.24.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 249 |
+
"model.language_model.layers.25.input_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 250 |
+
"model.language_model.layers.25.linear_attn.A_log": "model-00007-of-00014.safetensors",
|
| 251 |
+
"model.language_model.layers.25.linear_attn.conv1d.weight": "model-00007-of-00014.safetensors",
|
| 252 |
+
"model.language_model.layers.25.linear_attn.dt_bias": "model-00007-of-00014.safetensors",
|
| 253 |
+
"model.language_model.layers.25.linear_attn.in_proj_a.weight": "model-00007-of-00014.safetensors",
|
| 254 |
+
"model.language_model.layers.25.linear_attn.in_proj_b.weight": "model-00007-of-00014.safetensors",
|
| 255 |
+
"model.language_model.layers.25.linear_attn.in_proj_qkv.weight": "model-00007-of-00014.safetensors",
|
| 256 |
+
"model.language_model.layers.25.linear_attn.in_proj_z.weight": "model-00007-of-00014.safetensors",
|
| 257 |
+
"model.language_model.layers.25.linear_attn.norm.weight": "model-00007-of-00014.safetensors",
|
| 258 |
+
"model.language_model.layers.25.linear_attn.out_proj.weight": "model-00007-of-00014.safetensors",
|
| 259 |
+
"model.language_model.layers.25.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
|
| 260 |
+
"model.language_model.layers.25.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
|
| 261 |
+
"model.language_model.layers.25.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
|
| 262 |
+
"model.language_model.layers.25.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 263 |
+
"model.language_model.layers.26.input_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 264 |
+
"model.language_model.layers.26.linear_attn.A_log": "model-00007-of-00014.safetensors",
|
| 265 |
+
"model.language_model.layers.26.linear_attn.conv1d.weight": "model-00007-of-00014.safetensors",
|
| 266 |
+
"model.language_model.layers.26.linear_attn.dt_bias": "model-00007-of-00014.safetensors",
|
| 267 |
+
"model.language_model.layers.26.linear_attn.in_proj_a.weight": "model-00007-of-00014.safetensors",
|
| 268 |
+
"model.language_model.layers.26.linear_attn.in_proj_b.weight": "model-00007-of-00014.safetensors",
|
| 269 |
+
"model.language_model.layers.26.linear_attn.in_proj_qkv.weight": "model-00007-of-00014.safetensors",
|
| 270 |
+
"model.language_model.layers.26.linear_attn.in_proj_z.weight": "model-00007-of-00014.safetensors",
|
| 271 |
+
"model.language_model.layers.26.linear_attn.norm.weight": "model-00007-of-00014.safetensors",
|
| 272 |
+
"model.language_model.layers.26.linear_attn.out_proj.weight": "model-00007-of-00014.safetensors",
|
| 273 |
+
"model.language_model.layers.26.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
|
| 274 |
+
"model.language_model.layers.26.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
|
| 275 |
+
"model.language_model.layers.26.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
|
| 276 |
+
"model.language_model.layers.26.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 277 |
+
"model.language_model.layers.27.input_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 278 |
+
"model.language_model.layers.27.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
|
| 279 |
+
"model.language_model.layers.27.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
|
| 280 |
+
"model.language_model.layers.27.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
|
| 281 |
+
"model.language_model.layers.27.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
|
| 282 |
+
"model.language_model.layers.27.self_attn.k_norm.weight": "model-00007-of-00014.safetensors",
|
| 283 |
+
"model.language_model.layers.27.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
|
| 284 |
+
"model.language_model.layers.27.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
|
| 285 |
+
"model.language_model.layers.27.self_attn.q_norm.weight": "model-00008-of-00014.safetensors",
|
| 286 |
+
"model.language_model.layers.27.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
|
| 287 |
+
"model.language_model.layers.27.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
|
| 288 |
+
"model.language_model.layers.28.input_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 289 |
+
"model.language_model.layers.28.linear_attn.A_log": "model-00008-of-00014.safetensors",
|
| 290 |
+
"model.language_model.layers.28.linear_attn.conv1d.weight": "model-00008-of-00014.safetensors",
|
| 291 |
+
"model.language_model.layers.28.linear_attn.dt_bias": "model-00008-of-00014.safetensors",
|
| 292 |
+
"model.language_model.layers.28.linear_attn.in_proj_a.weight": "model-00008-of-00014.safetensors",
|
| 293 |
+
"model.language_model.layers.28.linear_attn.in_proj_b.weight": "model-00008-of-00014.safetensors",
|
| 294 |
+
"model.language_model.layers.28.linear_attn.in_proj_qkv.weight": "model-00008-of-00014.safetensors",
|
| 295 |
+
"model.language_model.layers.28.linear_attn.in_proj_z.weight": "model-00008-of-00014.safetensors",
|
| 296 |
+
"model.language_model.layers.28.linear_attn.norm.weight": "model-00008-of-00014.safetensors",
|
| 297 |
+
"model.language_model.layers.28.linear_attn.out_proj.weight": "model-00008-of-00014.safetensors",
|
| 298 |
+
"model.language_model.layers.28.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
|
| 299 |
+
"model.language_model.layers.28.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
|
| 300 |
+
"model.language_model.layers.28.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
|
| 301 |
+
"model.language_model.layers.28.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 302 |
+
"model.language_model.layers.29.input_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 303 |
+
"model.language_model.layers.29.linear_attn.A_log": "model-00008-of-00014.safetensors",
|
| 304 |
+
"model.language_model.layers.29.linear_attn.conv1d.weight": "model-00008-of-00014.safetensors",
|
| 305 |
+
"model.language_model.layers.29.linear_attn.dt_bias": "model-00008-of-00014.safetensors",
|
| 306 |
+
"model.language_model.layers.29.linear_attn.in_proj_a.weight": "model-00008-of-00014.safetensors",
|
| 307 |
+
"model.language_model.layers.29.linear_attn.in_proj_b.weight": "model-00008-of-00014.safetensors",
|
| 308 |
+
"model.language_model.layers.29.linear_attn.in_proj_qkv.weight": "model-00008-of-00014.safetensors",
|
| 309 |
+
"model.language_model.layers.29.linear_attn.in_proj_z.weight": "model-00008-of-00014.safetensors",
|
| 310 |
+
"model.language_model.layers.29.linear_attn.norm.weight": "model-00008-of-00014.safetensors",
|
| 311 |
+
"model.language_model.layers.29.linear_attn.out_proj.weight": "model-00008-of-00014.safetensors",
|
| 312 |
+
"model.language_model.layers.29.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
|
| 313 |
+
"model.language_model.layers.29.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
|
| 314 |
+
"model.language_model.layers.29.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
|
| 315 |
+
"model.language_model.layers.29.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 316 |
+
"model.language_model.layers.3.input_layernorm.weight": "model-00003-of-00014.safetensors",
|
| 317 |
+
"model.language_model.layers.3.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
|
| 318 |
+
"model.language_model.layers.3.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
|
| 319 |
+
"model.language_model.layers.3.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
|
| 320 |
+
"model.language_model.layers.3.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
|
| 321 |
+
"model.language_model.layers.3.self_attn.k_norm.weight": "model-00003-of-00014.safetensors",
|
| 322 |
+
"model.language_model.layers.3.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
|
| 323 |
+
"model.language_model.layers.3.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
|
| 324 |
+
"model.language_model.layers.3.self_attn.q_norm.weight": "model-00003-of-00014.safetensors",
|
| 325 |
+
"model.language_model.layers.3.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
|
| 326 |
+
"model.language_model.layers.3.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
|
| 327 |
+
"model.language_model.layers.30.input_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 328 |
+
"model.language_model.layers.30.linear_attn.A_log": "model-00008-of-00014.safetensors",
|
| 329 |
+
"model.language_model.layers.30.linear_attn.conv1d.weight": "model-00008-of-00014.safetensors",
|
| 330 |
+
"model.language_model.layers.30.linear_attn.dt_bias": "model-00008-of-00014.safetensors",
|
| 331 |
+
"model.language_model.layers.30.linear_attn.in_proj_a.weight": "model-00008-of-00014.safetensors",
|
| 332 |
+
"model.language_model.layers.30.linear_attn.in_proj_b.weight": "model-00008-of-00014.safetensors",
|
| 333 |
+
"model.language_model.layers.30.linear_attn.in_proj_qkv.weight": "model-00008-of-00014.safetensors",
|
| 334 |
+
"model.language_model.layers.30.linear_attn.in_proj_z.weight": "model-00008-of-00014.safetensors",
|
| 335 |
+
"model.language_model.layers.30.linear_attn.norm.weight": "model-00008-of-00014.safetensors",
|
| 336 |
+
"model.language_model.layers.30.linear_attn.out_proj.weight": "model-00008-of-00014.safetensors",
|
| 337 |
+
"model.language_model.layers.30.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
|
| 338 |
+
"model.language_model.layers.30.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
|
| 339 |
+
"model.language_model.layers.30.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
|
| 340 |
+
"model.language_model.layers.30.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 341 |
+
"model.language_model.layers.31.input_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 342 |
+
"model.language_model.layers.31.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
|
| 343 |
+
"model.language_model.layers.31.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
|
| 344 |
+
"model.language_model.layers.31.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
|
| 345 |
+
"model.language_model.layers.31.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 346 |
+
"model.language_model.layers.31.self_attn.k_norm.weight": "model-00008-of-00014.safetensors",
|
| 347 |
+
"model.language_model.layers.31.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
|
| 348 |
+
"model.language_model.layers.31.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
|
| 349 |
+
"model.language_model.layers.31.self_attn.q_norm.weight": "model-00008-of-00014.safetensors",
|
| 350 |
+
"model.language_model.layers.31.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
|
| 351 |
+
"model.language_model.layers.31.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
|
| 352 |
+
"model.language_model.layers.32.input_layernorm.weight": "model-00008-of-00014.safetensors",
|
| 353 |
+
"model.language_model.layers.32.linear_attn.A_log": "model-00008-of-00014.safetensors",
|
| 354 |
+
"model.language_model.layers.32.linear_attn.conv1d.weight": "model-00008-of-00014.safetensors",
|
| 355 |
+
"model.language_model.layers.32.linear_attn.dt_bias": "model-00008-of-00014.safetensors",
|
| 356 |
+
"model.language_model.layers.32.linear_attn.in_proj_a.weight": "model-00008-of-00014.safetensors",
|
| 357 |
+
"model.language_model.layers.32.linear_attn.in_proj_b.weight": "model-00008-of-00014.safetensors",
|
| 358 |
+
"model.language_model.layers.32.linear_attn.in_proj_qkv.weight": "model-00008-of-00014.safetensors",
|
| 359 |
+
"model.language_model.layers.32.linear_attn.in_proj_z.weight": "model-00008-of-00014.safetensors",
|
| 360 |
+
"model.language_model.layers.32.linear_attn.norm.weight": "model-00008-of-00014.safetensors",
|
| 361 |
+
"model.language_model.layers.32.linear_attn.out_proj.weight": "model-00008-of-00014.safetensors",
|
| 362 |
+
"model.language_model.layers.32.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
|
| 363 |
+
"model.language_model.layers.32.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
|
| 364 |
+
"model.language_model.layers.32.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
|
| 365 |
+
"model.language_model.layers.32.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
|
| 366 |
+
"model.language_model.layers.33.input_layernorm.weight": "model-00009-of-00014.safetensors",
|
| 367 |
+
"model.language_model.layers.33.linear_attn.A_log": "model-00009-of-00014.safetensors",
|
| 368 |
+
"model.language_model.layers.33.linear_attn.conv1d.weight": "model-00009-of-00014.safetensors",
|
| 369 |
+
"model.language_model.layers.33.linear_attn.dt_bias": "model-00009-of-00014.safetensors",
|
| 370 |
+
"model.language_model.layers.33.linear_attn.in_proj_a.weight": "model-00009-of-00014.safetensors",
|
| 371 |
+
"model.language_model.layers.33.linear_attn.in_proj_b.weight": "model-00009-of-00014.safetensors",
|
| 372 |
+
"model.language_model.layers.33.linear_attn.in_proj_qkv.weight": "model-00009-of-00014.safetensors",
|
| 373 |
+
"model.language_model.layers.33.linear_attn.in_proj_z.weight": "model-00009-of-00014.safetensors",
|
| 374 |
+
"model.language_model.layers.33.linear_attn.norm.weight": "model-00009-of-00014.safetensors",
|
| 375 |
+
"model.language_model.layers.33.linear_attn.out_proj.weight": "model-00009-of-00014.safetensors",
|
| 376 |
+
"model.language_model.layers.33.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
|
| 377 |
+
"model.language_model.layers.33.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
|
| 378 |
+
"model.language_model.layers.33.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
|
| 379 |
+
"model.language_model.layers.33.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
|
| 380 |
+
"model.language_model.layers.34.input_layernorm.weight": "model-00009-of-00014.safetensors",
|
| 381 |
+
"model.language_model.layers.34.linear_attn.A_log": "model-00009-of-00014.safetensors",
|
| 382 |
+
"model.language_model.layers.34.linear_attn.conv1d.weight": "model-00009-of-00014.safetensors",
|
| 383 |
+
"model.language_model.layers.34.linear_attn.dt_bias": "model-00009-of-00014.safetensors",
|
| 384 |
+
"model.language_model.layers.34.linear_attn.in_proj_a.weight": "model-00009-of-00014.safetensors",
|
| 385 |
+
"model.language_model.layers.34.linear_attn.in_proj_b.weight": "model-00009-of-00014.safetensors",
|
| 386 |
+
"model.language_model.layers.34.linear_attn.in_proj_qkv.weight": "model-00009-of-00014.safetensors",
|
| 387 |
+
"model.language_model.layers.34.linear_attn.in_proj_z.weight": "model-00009-of-00014.safetensors",
|
| 388 |
+
"model.language_model.layers.34.linear_attn.norm.weight": "model-00009-of-00014.safetensors",
|
| 389 |
+
"model.language_model.layers.34.linear_attn.out_proj.weight": "model-00009-of-00014.safetensors",
|
| 390 |
+
"model.language_model.layers.34.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
|
| 391 |
+
"model.language_model.layers.34.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
|
| 392 |
+
"model.language_model.layers.34.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
|
| 393 |
+
"model.language_model.layers.34.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
|
| 394 |
+
"model.language_model.layers.35.input_layernorm.weight": "model-00009-of-00014.safetensors",
|
| 395 |
+
"model.language_model.layers.35.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
|
| 396 |
+
"model.language_model.layers.35.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
|
| 397 |
+
"model.language_model.layers.35.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
|
| 398 |
+
"model.language_model.layers.35.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
|
| 399 |
+
"model.language_model.layers.35.self_attn.k_norm.weight": "model-00009-of-00014.safetensors",
|
| 400 |
+
"model.language_model.layers.35.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
|
| 401 |
+
"model.language_model.layers.35.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
|
| 402 |
+
"model.language_model.layers.35.self_attn.q_norm.weight": "model-00009-of-00014.safetensors",
|
| 403 |
+
"model.language_model.layers.35.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
|
| 404 |
+
"model.language_model.layers.35.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
|
| 405 |
+
"model.language_model.layers.36.input_layernorm.weight": "model-00009-of-00014.safetensors",
|
| 406 |
+
"model.language_model.layers.36.linear_attn.A_log": "model-00009-of-00014.safetensors",
|
| 407 |
+
"model.language_model.layers.36.linear_attn.conv1d.weight": "model-00009-of-00014.safetensors",
|
| 408 |
+
"model.language_model.layers.36.linear_attn.dt_bias": "model-00009-of-00014.safetensors",
|
| 409 |
+
"model.language_model.layers.36.linear_attn.in_proj_a.weight": "model-00009-of-00014.safetensors",
|
| 410 |
+
"model.language_model.layers.36.linear_attn.in_proj_b.weight": "model-00009-of-00014.safetensors",
|
| 411 |
+
"model.language_model.layers.36.linear_attn.in_proj_qkv.weight": "model-00009-of-00014.safetensors",
|
| 412 |
+
"model.language_model.layers.36.linear_attn.in_proj_z.weight": "model-00009-of-00014.safetensors",
|
| 413 |
+
"model.language_model.layers.36.linear_attn.norm.weight": "model-00009-of-00014.safetensors",
|
| 414 |
+
"model.language_model.layers.36.linear_attn.out_proj.weight": "model-00009-of-00014.safetensors",
|
| 415 |
+
"model.language_model.layers.36.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
|
| 416 |
+
"model.language_model.layers.36.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.36.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.36.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.A_log": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.conv1d.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.dt_bias": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.in_proj_a.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.in_proj_b.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.in_proj_qkv.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.in_proj_z.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.norm.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.linear_attn.out_proj.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.37.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.38.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.A_log": "model-00009-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.conv1d.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.dt_bias": "model-00009-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.in_proj_a.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.in_proj_b.weight": "model-00009-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.in_proj_qkv.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.in_proj_z.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.norm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.38.linear_attn.out_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.38.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.38.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.38.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.38.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.self_attn.k_norm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.self_attn.q_norm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.39.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.4.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.A_log": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.conv1d.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.dt_bias": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.in_proj_a.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.in_proj_b.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.in_proj_qkv.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.in_proj_z.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.norm.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.linear_attn.out_proj.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.4.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.40.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.A_log": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.conv1d.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.dt_bias": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.in_proj_a.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.in_proj_b.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.in_proj_qkv.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.in_proj_z.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.norm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.linear_attn.out_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.40.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.A_log": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.conv1d.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.dt_bias": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.in_proj_a.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.in_proj_b.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.in_proj_qkv.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.in_proj_z.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.norm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.linear_attn.out_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.41.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.A_log": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.conv1d.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.dt_bias": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.in_proj_a.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.in_proj_b.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.in_proj_qkv.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.in_proj_z.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.norm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.linear_attn.out_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.42.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.43.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.43.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.language_model.layers.43.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.43.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.43.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.43.self_attn.k_norm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.43.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.43.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.43.self_attn.q_norm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.43.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.43.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.A_log": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.conv1d.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.dt_bias": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.in_proj_a.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.in_proj_b.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.in_proj_qkv.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.in_proj_z.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.norm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.linear_attn.out_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.44.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.A_log": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.conv1d.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.dt_bias": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.in_proj_a.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.in_proj_b.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.in_proj_qkv.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.in_proj_z.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.norm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.linear_attn.out_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.45.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.A_log": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.conv1d.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.dt_bias": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.in_proj_a.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.in_proj_b.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.in_proj_qkv.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.in_proj_z.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.norm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.linear_attn.out_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.46.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.self_attn.k_norm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.self_attn.q_norm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.47.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.A_log": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.conv1d.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.dt_bias": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.in_proj_a.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.in_proj_b.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.in_proj_qkv.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.in_proj_z.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.norm.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.linear_attn.out_proj.weight": "model-00011-of-00014.safetensors",
"model.language_model.layers.48.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.48.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.48.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.48.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.A_log": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.conv1d.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.dt_bias": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.in_proj_a.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.in_proj_b.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.in_proj_qkv.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.in_proj_z.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.norm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.linear_attn.out_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.49.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.5.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.A_log": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.conv1d.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.dt_bias": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.in_proj_a.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.in_proj_b.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.in_proj_qkv.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.in_proj_z.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.norm.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.linear_attn.out_proj.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.5.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.language_model.layers.50.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.A_log": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.conv1d.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.dt_bias": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.in_proj_a.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.in_proj_b.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.in_proj_qkv.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.in_proj_z.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.norm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.linear_attn.out_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.50.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.self_attn.k_norm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.self_attn.q_norm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.51.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.A_log": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.conv1d.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.dt_bias": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.in_proj_a.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.in_proj_b.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.in_proj_qkv.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.in_proj_z.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.norm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.linear_attn.out_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.52.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.A_log": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.conv1d.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.dt_bias": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.in_proj_a.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.in_proj_b.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.in_proj_qkv.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.in_proj_z.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.norm.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.linear_attn.out_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.language_model.layers.53.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.53.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.53.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.A_log": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.conv1d.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.dt_bias": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.in_proj_a.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.in_proj_b.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.in_proj_qkv.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.in_proj_z.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.norm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.linear_attn.out_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.54.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.self_attn.k_norm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.self_attn.q_norm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.55.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.A_log": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.conv1d.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.dt_bias": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.in_proj_a.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.in_proj_b.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.in_proj_qkv.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.in_proj_z.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.norm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.linear_attn.out_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.56.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.57.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.57.linear_attn.A_log": "model-00013-of-00014.safetensors",
"model.language_model.layers.57.linear_attn.conv1d.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.57.linear_attn.dt_bias": "model-00013-of-00014.safetensors",
"model.language_model.layers.57.linear_attn.in_proj_a.weight": "model-00013-of-00014.safetensors",
"model.language_model.layers.57.linear_attn.in_proj_b.weight": "model-00013-of-00014.safetensors",
|
| 718 |
+
"model.language_model.layers.57.linear_attn.in_proj_qkv.weight": "model-00013-of-00014.safetensors",
|
| 719 |
+
"model.language_model.layers.57.linear_attn.in_proj_z.weight": "model-00013-of-00014.safetensors",
|
| 720 |
+
"model.language_model.layers.57.linear_attn.norm.weight": "model-00013-of-00014.safetensors",
|
| 721 |
+
"model.language_model.layers.57.linear_attn.out_proj.weight": "model-00013-of-00014.safetensors",
|
| 722 |
+
"model.language_model.layers.57.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
|
| 723 |
+
"model.language_model.layers.57.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
|
| 724 |
+
"model.language_model.layers.57.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
|
| 725 |
+
"model.language_model.layers.57.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
|
| 726 |
+
"model.language_model.layers.58.input_layernorm.weight": "model-00013-of-00014.safetensors",
|
| 727 |
+
"model.language_model.layers.58.linear_attn.A_log": "model-00013-of-00014.safetensors",
|
| 728 |
+
"model.language_model.layers.58.linear_attn.conv1d.weight": "model-00013-of-00014.safetensors",
|
| 729 |
+
"model.language_model.layers.58.linear_attn.dt_bias": "model-00013-of-00014.safetensors",
|
| 730 |
+
"model.language_model.layers.58.linear_attn.in_proj_a.weight": "model-00013-of-00014.safetensors",
|
| 731 |
+
"model.language_model.layers.58.linear_attn.in_proj_b.weight": "model-00013-of-00014.safetensors",
|
| 732 |
+
"model.language_model.layers.58.linear_attn.in_proj_qkv.weight": "model-00013-of-00014.safetensors",
|
| 733 |
+
"model.language_model.layers.58.linear_attn.in_proj_z.weight": "model-00013-of-00014.safetensors",
|
| 734 |
+
"model.language_model.layers.58.linear_attn.norm.weight": "model-00013-of-00014.safetensors",
|
| 735 |
+
"model.language_model.layers.58.linear_attn.out_proj.weight": "model-00013-of-00014.safetensors",
|
| 736 |
+
"model.language_model.layers.58.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
|
| 737 |
+
"model.language_model.layers.58.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
|
| 738 |
+
"model.language_model.layers.58.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
|
| 739 |
+
"model.language_model.layers.58.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 740 |
+
"model.language_model.layers.59.input_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 741 |
+
"model.language_model.layers.59.mlp.down_proj.weight": "model-00014-of-00014.safetensors",
|
| 742 |
+
"model.language_model.layers.59.mlp.gate_proj.weight": "model-00014-of-00014.safetensors",
|
| 743 |
+
"model.language_model.layers.59.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
|
| 744 |
+
"model.language_model.layers.59.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 745 |
+
"model.language_model.layers.59.self_attn.k_norm.weight": "model-00014-of-00014.safetensors",
|
| 746 |
+
"model.language_model.layers.59.self_attn.k_proj.weight": "model-00014-of-00014.safetensors",
|
| 747 |
+
"model.language_model.layers.59.self_attn.o_proj.weight": "model-00014-of-00014.safetensors",
|
| 748 |
+
"model.language_model.layers.59.self_attn.q_norm.weight": "model-00014-of-00014.safetensors",
|
| 749 |
+
"model.language_model.layers.59.self_attn.q_proj.weight": "model-00014-of-00014.safetensors",
|
| 750 |
+
"model.language_model.layers.59.self_attn.v_proj.weight": "model-00014-of-00014.safetensors",
|
| 751 |
+
"model.language_model.layers.6.input_layernorm.weight": "model-00003-of-00014.safetensors",
|
| 752 |
+
"model.language_model.layers.6.linear_attn.A_log": "model-00003-of-00014.safetensors",
|
| 753 |
+
"model.language_model.layers.6.linear_attn.conv1d.weight": "model-00003-of-00014.safetensors",
|
| 754 |
+
"model.language_model.layers.6.linear_attn.dt_bias": "model-00003-of-00014.safetensors",
|
| 755 |
+
"model.language_model.layers.6.linear_attn.in_proj_a.weight": "model-00003-of-00014.safetensors",
|
| 756 |
+
"model.language_model.layers.6.linear_attn.in_proj_b.weight": "model-00003-of-00014.safetensors",
|
| 757 |
+
"model.language_model.layers.6.linear_attn.in_proj_qkv.weight": "model-00003-of-00014.safetensors",
|
| 758 |
+
"model.language_model.layers.6.linear_attn.in_proj_z.weight": "model-00003-of-00014.safetensors",
|
| 759 |
+
"model.language_model.layers.6.linear_attn.norm.weight": "model-00003-of-00014.safetensors",
|
| 760 |
+
"model.language_model.layers.6.linear_attn.out_proj.weight": "model-00003-of-00014.safetensors",
|
| 761 |
+
"model.language_model.layers.6.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
|
| 762 |
+
"model.language_model.layers.6.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
|
| 763 |
+
"model.language_model.layers.6.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
|
| 764 |
+
"model.language_model.layers.6.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
|
| 765 |
+
"model.language_model.layers.60.input_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 766 |
+
"model.language_model.layers.60.linear_attn.A_log": "model-00014-of-00014.safetensors",
|
| 767 |
+
"model.language_model.layers.60.linear_attn.conv1d.weight": "model-00014-of-00014.safetensors",
|
| 768 |
+
"model.language_model.layers.60.linear_attn.dt_bias": "model-00014-of-00014.safetensors",
|
| 769 |
+
"model.language_model.layers.60.linear_attn.in_proj_a.weight": "model-00014-of-00014.safetensors",
|
| 770 |
+
"model.language_model.layers.60.linear_attn.in_proj_b.weight": "model-00014-of-00014.safetensors",
|
| 771 |
+
"model.language_model.layers.60.linear_attn.in_proj_qkv.weight": "model-00014-of-00014.safetensors",
|
| 772 |
+
"model.language_model.layers.60.linear_attn.in_proj_z.weight": "model-00014-of-00014.safetensors",
|
| 773 |
+
"model.language_model.layers.60.linear_attn.norm.weight": "model-00014-of-00014.safetensors",
|
| 774 |
+
"model.language_model.layers.60.linear_attn.out_proj.weight": "model-00014-of-00014.safetensors",
|
| 775 |
+
"model.language_model.layers.60.mlp.down_proj.weight": "model-00014-of-00014.safetensors",
|
| 776 |
+
"model.language_model.layers.60.mlp.gate_proj.weight": "model-00014-of-00014.safetensors",
|
| 777 |
+
"model.language_model.layers.60.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
|
| 778 |
+
"model.language_model.layers.60.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 779 |
+
"model.language_model.layers.61.input_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 780 |
+
"model.language_model.layers.61.linear_attn.A_log": "model-00014-of-00014.safetensors",
|
| 781 |
+
"model.language_model.layers.61.linear_attn.conv1d.weight": "model-00014-of-00014.safetensors",
|
| 782 |
+
"model.language_model.layers.61.linear_attn.dt_bias": "model-00014-of-00014.safetensors",
|
| 783 |
+
"model.language_model.layers.61.linear_attn.in_proj_a.weight": "model-00014-of-00014.safetensors",
|
| 784 |
+
"model.language_model.layers.61.linear_attn.in_proj_b.weight": "model-00014-of-00014.safetensors",
|
| 785 |
+
"model.language_model.layers.61.linear_attn.in_proj_qkv.weight": "model-00014-of-00014.safetensors",
|
| 786 |
+
"model.language_model.layers.61.linear_attn.in_proj_z.weight": "model-00014-of-00014.safetensors",
|
| 787 |
+
"model.language_model.layers.61.linear_attn.norm.weight": "model-00014-of-00014.safetensors",
|
| 788 |
+
"model.language_model.layers.61.linear_attn.out_proj.weight": "model-00014-of-00014.safetensors",
|
| 789 |
+
"model.language_model.layers.61.mlp.down_proj.weight": "model-00014-of-00014.safetensors",
|
| 790 |
+
"model.language_model.layers.61.mlp.gate_proj.weight": "model-00014-of-00014.safetensors",
|
| 791 |
+
"model.language_model.layers.61.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
|
| 792 |
+
"model.language_model.layers.61.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 793 |
+
"model.language_model.layers.62.input_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 794 |
+
"model.language_model.layers.62.linear_attn.A_log": "model-00014-of-00014.safetensors",
|
| 795 |
+
"model.language_model.layers.62.linear_attn.conv1d.weight": "model-00014-of-00014.safetensors",
|
| 796 |
+
"model.language_model.layers.62.linear_attn.dt_bias": "model-00014-of-00014.safetensors",
|
| 797 |
+
"model.language_model.layers.62.linear_attn.in_proj_a.weight": "model-00014-of-00014.safetensors",
|
| 798 |
+
"model.language_model.layers.62.linear_attn.in_proj_b.weight": "model-00014-of-00014.safetensors",
|
| 799 |
+
"model.language_model.layers.62.linear_attn.in_proj_qkv.weight": "model-00014-of-00014.safetensors",
|
| 800 |
+
"model.language_model.layers.62.linear_attn.in_proj_z.weight": "model-00014-of-00014.safetensors",
|
| 801 |
+
"model.language_model.layers.62.linear_attn.norm.weight": "model-00014-of-00014.safetensors",
|
| 802 |
+
"model.language_model.layers.62.linear_attn.out_proj.weight": "model-00014-of-00014.safetensors",
|
| 803 |
+
"model.language_model.layers.62.mlp.down_proj.weight": "model-00014-of-00014.safetensors",
|
| 804 |
+
"model.language_model.layers.62.mlp.gate_proj.weight": "model-00014-of-00014.safetensors",
|
| 805 |
+
"model.language_model.layers.62.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
|
| 806 |
+
"model.language_model.layers.62.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 807 |
+
"model.language_model.layers.63.input_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 808 |
+
"model.language_model.layers.63.mlp.down_proj.weight": "model-00014-of-00014.safetensors",
|
| 809 |
+
"model.language_model.layers.63.mlp.gate_proj.weight": "model-00014-of-00014.safetensors",
|
| 810 |
+
"model.language_model.layers.63.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
|
| 811 |
+
"model.language_model.layers.63.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
|
| 812 |
+
"model.language_model.layers.63.self_attn.k_norm.weight": "model-00014-of-00014.safetensors",
|
| 813 |
+
"model.language_model.layers.63.self_attn.k_proj.weight": "model-00014-of-00014.safetensors",
|
| 814 |
+
"model.language_model.layers.63.self_attn.o_proj.weight": "model-00014-of-00014.safetensors",
|
| 815 |
+
"model.language_model.layers.63.self_attn.q_norm.weight": "model-00014-of-00014.safetensors",
|
| 816 |
+
"model.language_model.layers.63.self_attn.q_proj.weight": "model-00014-of-00014.safetensors",
|
| 817 |
+
"model.language_model.layers.63.self_attn.v_proj.weight": "model-00014-of-00014.safetensors",
|
| 818 |
+
"model.language_model.layers.7.input_layernorm.weight": "model-00003-of-00014.safetensors",
|
| 819 |
+
"model.language_model.layers.7.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
|
| 820 |
+
"model.language_model.layers.7.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
|
| 821 |
+
"model.language_model.layers.7.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
|
| 822 |
+
"model.language_model.layers.7.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
|
| 823 |
+
"model.language_model.layers.7.self_attn.k_norm.weight": "model-00004-of-00014.safetensors",
|
| 824 |
+
"model.language_model.layers.7.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
|
| 825 |
+
"model.language_model.layers.7.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
|
| 826 |
+
"model.language_model.layers.7.self_attn.q_norm.weight": "model-00004-of-00014.safetensors",
|
| 827 |
+
"model.language_model.layers.7.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
|
| 828 |
+
"model.language_model.layers.7.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
|
| 829 |
+
"model.language_model.layers.8.input_layernorm.weight": "model-00004-of-00014.safetensors",
|
| 830 |
+
"model.language_model.layers.8.linear_attn.A_log": "model-00004-of-00014.safetensors",
|
| 831 |
+
"model.language_model.layers.8.linear_attn.conv1d.weight": "model-00004-of-00014.safetensors",
|
| 832 |
+
"model.language_model.layers.8.linear_attn.dt_bias": "model-00004-of-00014.safetensors",
|
| 833 |
+
"model.language_model.layers.8.linear_attn.in_proj_a.weight": "model-00004-of-00014.safetensors",
|
| 834 |
+
"model.language_model.layers.8.linear_attn.in_proj_b.weight": "model-00004-of-00014.safetensors",
|
| 835 |
+
"model.language_model.layers.8.linear_attn.in_proj_qkv.weight": "model-00004-of-00014.safetensors",
|
| 836 |
+
"model.language_model.layers.8.linear_attn.in_proj_z.weight": "model-00004-of-00014.safetensors",
|
| 837 |
+
"model.language_model.layers.8.linear_attn.norm.weight": "model-00004-of-00014.safetensors",
|
| 838 |
+
"model.language_model.layers.8.linear_attn.out_proj.weight": "model-00004-of-00014.safetensors",
|
| 839 |
+
"model.language_model.layers.8.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
|
| 840 |
+
"model.language_model.layers.8.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
|
| 841 |
+
"model.language_model.layers.8.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
|
| 842 |
+
"model.language_model.layers.8.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
|
| 843 |
+
"model.language_model.layers.9.input_layernorm.weight": "model-00004-of-00014.safetensors",
|
| 844 |
+
"model.language_model.layers.9.linear_attn.A_log": "model-00004-of-00014.safetensors",
|
| 845 |
+
"model.language_model.layers.9.linear_attn.conv1d.weight": "model-00004-of-00014.safetensors",
|
| 846 |
+
"model.language_model.layers.9.linear_attn.dt_bias": "model-00004-of-00014.safetensors",
|
| 847 |
+
"model.language_model.layers.9.linear_attn.in_proj_a.weight": "model-00004-of-00014.safetensors",
|
| 848 |
+
"model.language_model.layers.9.linear_attn.in_proj_b.weight": "model-00004-of-00014.safetensors",
|
| 849 |
+
"model.language_model.layers.9.linear_attn.in_proj_qkv.weight": "model-00004-of-00014.safetensors",
|
| 850 |
+
"model.language_model.layers.9.linear_attn.in_proj_z.weight": "model-00004-of-00014.safetensors",
|
| 851 |
+
"model.language_model.layers.9.linear_attn.norm.weight": "model-00004-of-00014.safetensors",
|
| 852 |
+
"model.language_model.layers.9.linear_attn.out_proj.weight": "model-00004-of-00014.safetensors",
|
| 853 |
+
"model.language_model.layers.9.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
|
| 854 |
+
"model.language_model.layers.9.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
|
| 855 |
+
"model.language_model.layers.9.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
|
| 856 |
+
"model.language_model.layers.9.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
|
| 857 |
+
"model.language_model.norm.weight": "model-00014-of-00014.safetensors"
|
| 858 |
+
}
|
| 859 |
+
}
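A sharded index like the one above can be inspected without loading any weights. The following is a minimal sketch, assuming the mapping sits under the standard `weight_map` key of a safetensors index (that key is outside this excerpt) and using only a handful of entries copied from the diff. The helper names `shards_per_layer` and `split_layers` are hypothetical.

```python
import json
import re
from collections import defaultdict

# A few entries copied from the weight map above; a real index lists every tensor.
# The "weight_map" wrapper key is assumed, since it is not visible in this excerpt.
SAMPLE_INDEX = json.loads("""
{
  "weight_map": {
    "model.language_model.layers.58.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
    "model.language_model.layers.58.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
    "model.language_model.layers.59.self_attn.q_proj.weight": "model-00014-of-00014.safetensors",
    "model.language_model.norm.weight": "model-00014-of-00014.safetensors"
  }
}
""")

LAYER_RE = re.compile(r"\.layers\.(\d+)\.")


def shards_per_layer(weight_map):
    """Map each layer index to the sorted list of shard files holding its tensors."""
    shards = defaultdict(set)
    for tensor_name, shard_file in weight_map.items():
        match = LAYER_RE.search(tensor_name)
        if match:  # tensors outside a layer (e.g. the final norm) are skipped
            shards[int(match.group(1))].add(shard_file)
    return {layer: sorted(files) for layer, files in shards.items()}


def split_layers(weight_map):
    """Return the layer indices whose tensors are spread over more than one shard."""
    per_layer = shards_per_layer(weight_map)
    return sorted(layer for layer, files in per_layer.items() if len(files) > 1)


layout = shards_per_layer(SAMPLE_INDEX["weight_map"])
print(split_layers(SAMPLE_INDEX["weight_map"]))  # prints [58]
```

In this sample, layer 58 is the one that straddles the boundary between shards 13 and 14, which matches the entries shown in the diff.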
|
preprocessor_config.json
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
{
|
| 2 |
+
"size": {
|
| 3 |
+
"longest_edge": 16777216,
|
| 4 |
+
"shortest_edge": 65536
|
| 5 |
+
},
|
| 6 |
+
"patch_size": 16,
|
| 7 |
+
"temporal_patch_size": 2,
|
| 8 |
+
"merge_size": 2,
|
| 9 |
+
"image_mean": [
|
| 10 |
+
0.5,
|
| 11 |
+
0.5,
|
| 12 |
+
0.5
|
| 13 |
+
],
|
| 14 |
+
"image_std": [
|
| 15 |
+
0.5,
|
| 16 |
+
0.5,
|
| 17 |
+
0.5
|
| 18 |
+
],
|
| 19 |
+
"processor_class": "Qwen3VLProcessor",
|
| 20 |
+
"image_processor_type": "Qwen2VLImageProcessorFast"
|
| 21 |
+
}
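The arithmetic implied by `patch_size`, `merge_size`, `image_mean`, and `image_std` can be sketched as follows. This is an illustration under stated assumptions, not the processor's actual code: it assumes the image has already been resized so both sides are multiples of `patch_size * merge_size`, and that each `merge_size` x `merge_size` group of patches becomes one token. The function names are hypothetical.

```python
PATCH_SIZE = 16   # "patch_size" from the config above
MERGE_SIZE = 2    # "merge_size" from the config above
IMAGE_MEAN = 0.5  # per-channel "image_mean"
IMAGE_STD = 0.5   # per-channel "image_std"


def merged_token_count(height, width, patch_size=PATCH_SIZE, merge_size=MERGE_SIZE):
    """Count tokens after cutting an image into patches and merging them in blocks.

    Assumes height and width are already multiples of patch_size * merge_size.
    """
    unit = patch_size * merge_size
    if height % unit or width % unit:
        raise ValueError(f"height and width must be multiples of {unit}")
    patches = (height // patch_size) * (width // patch_size)
    return patches // (merge_size ** 2)


def normalize(value, mean=IMAGE_MEAN, std=IMAGE_STD):
    """Normalize a channel value that has already been rescaled to [0, 1]."""
    return (value - mean) / std


print(merged_token_count(256, 256))    # prints 64
print(merged_token_count(4096, 4096))  # prints 16384
print(normalize(0.0), normalize(1.0))  # prints -1.0 1.0
```

If the `size` values are read as pixel-count budgets (an assumption based on how Qwen2-VL-style processors use `shortest_edge` and `longest_edge`), then 65536 pixels corresponds to a 256 x 256 image and 16777216 pixels to a 4096 x 4096 image, the two cases printed above.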
|
tokenizer.json
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f399b3cd12fa270d51457bb749fb30863521e8359b8a27059c71b6c2f7d6dd6c
|
| 3 |
+
size 19989424
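The three lines above are a Git LFS pointer rather than the tokenizer itself: the repository stores the hash and byte size, and the real file is fetched separately. A minimal sketch of reading such a pointer, with the hypothetical helper name `parse_lfs_pointer`:

```python
# Pointer text copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f399b3cd12fa270d51457bb749fb30863521e8359b8a27059c71b6c2f7d6dd6c
size 19989424
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], pointer["size_bytes"])  # prints sha256 19989424
```

After downloading the real `tokenizer.json`, its SHA-256 digest and byte length could be compared against these values to confirm the download is intact.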
|
tokenizer_config.json
ADDED
|
@@ -0,0 +1,36 @@
| 1 |
+
{
|
| 2 |
+
"add_prefix_space": false,
|
| 3 |
+
"audio_bos_token": "<|audio_start|>",
|
| 4 |
+
"audio_eos_token": "<|audio_end|>",
|
| 5 |
+
"audio_token": "<|audio_pad|>",
|
| 6 |
+
"backend": "tokenizers",
|
| 7 |
+
"bos_token": null,
|
| 8 |
+
"clean_up_tokenization_spaces": false,
|
| 9 |
+
"eos_token": "<|im_end|>",
|
| 10 |
+
"errors": "replace",
|
| 11 |
+
"image_token": "<|image_pad|>",
|
| 12 |
+
"is_local": true,
|
| 13 |
+
"local_files_only": false,
|
| 14 |
+
"max_length": 2048,
|
| 15 |
+
"model_max_length": 262144,
|
| 16 |
+
"model_specific_special_tokens": {
|
| 17 |
+
"audio_bos_token": "<|audio_start|>",
|
| 18 |
+
"audio_eos_token": "<|audio_end|>",
|
| 19 |
+
"audio_token": "<|audio_pad|>",
|
| 20 |
+
"image_token": "<|image_pad|>",
|
| 21 |
+
"video_token": "<|video_pad|>",
|
| 22 |
+
"vision_bos_token": "<|vision_start|>",
|
| 23 |
+
"vision_eos_token": "<|vision_end|>"
|
| 24 |
+
},
|
| 25 |
+
"pad_token": "<|endoftext|>",
|
| 26 |
+
"pretokenize_regex": "(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?[\\p{L}\\p{M}]+|\\p{N}| ?[^\\s\\p{L}\\p{M}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+",
|
| 27 |
+
"split_special_tokens": false,
|
| 28 |
+
"stride": 0,
|
| 29 |
+
"tokenizer_class": "Qwen2Tokenizer",
|
| 30 |
+
"truncation_side": "right",
|
| 31 |
+
"truncation_strategy": "longest_first",
|
| 32 |
+
"unk_token": null,
|
| 33 |
+
"video_token": "<|video_pad|>",
|
| 34 |
+
"vision_bos_token": "<|vision_start|>",
|
| 35 |
+
"vision_eos_token": "<|vision_end|>"
|
| 36 |
+
}
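The config above lists each multimodal special token twice: once at the top level and once under `model_specific_special_tokens`. A small consistency check along these lines can catch the two copies drifting apart. This is a sketch over the token fields copied from the diff, and `mismatched_special_tokens` is a hypothetical helper, not part of any library.

```python
import json

# Only the special-token fields from the config above are reproduced here.
CONFIG = json.loads("""
{
  "audio_bos_token": "<|audio_start|>",
  "audio_eos_token": "<|audio_end|>",
  "audio_token": "<|audio_pad|>",
  "image_token": "<|image_pad|>",
  "video_token": "<|video_pad|>",
  "vision_bos_token": "<|vision_start|>",
  "vision_eos_token": "<|vision_end|>",
  "model_specific_special_tokens": {
    "audio_bos_token": "<|audio_start|>",
    "audio_eos_token": "<|audio_end|>",
    "audio_token": "<|audio_pad|>",
    "image_token": "<|image_pad|>",
    "video_token": "<|video_pad|>",
    "vision_bos_token": "<|vision_start|>",
    "vision_eos_token": "<|vision_end|>"
  }
}
""")


def mismatched_special_tokens(config):
    """Return {name: (top_level, nested)} for tokens whose two copies disagree."""
    nested = config.get("model_specific_special_tokens", {})
    return {
        name: (config.get(name), value)
        for name, value in nested.items()
        if config.get(name) != value
    }


print(mismatched_special_tokens(CONFIG))  # prints {}
```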
|
upstream/qwen3.6-27b/LICENSE
ADDED
|
@@ -0,0 +1,202 @@
| 1 |
+
|
| 2 |
+
Apache License
|
| 3 |
+
Version 2.0, January 2004
|
| 4 |
+
http://www.apache.org/licenses/
|
| 5 |
+
|
| 6 |
+
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
|
| 7 |
+
|
| 8 |
+
1. Definitions.
|
| 9 |
+
|
| 10 |
+
"License" shall mean the terms and conditions for use, reproduction,
|
| 11 |
+
and distribution as defined by Sections 1 through 9 of this document.
|
| 12 |
+
|
| 13 |
+
"Licensor" shall mean the copyright owner or entity authorized by
|
| 14 |
+
the copyright owner that is granting the License.
|
| 15 |
+
|
| 16 |
+
"Legal Entity" shall mean the union of the acting entity and all
|
| 17 |
+
other entities that control, are controlled by, or are under common
|
| 18 |
+
control with that entity. For the purposes of this definition,
|
| 19 |
+
"control" means (i) the power, direct or indirect, to cause the
|
| 20 |
+
direction or management of such entity, whether by contract or
|
| 21 |
+
otherwise, or (ii) ownership of fifty percent (50%) or more of the
|
| 22 |
+
outstanding shares, or (iii) beneficial ownership of such entity.
|
| 23 |
+
|
| 24 |
+
"You" (or "Your") shall mean an individual or Legal Entity
|
| 25 |
+
exercising permissions granted by this License.
|
| 26 |
+
|
| 27 |
+
"Source" form shall mean the preferred form for making modifications,
|
| 28 |
+
including but not limited to software source code, documentation
|
| 29 |
+
source, and configuration files.
|
| 30 |
+
|
| 31 |
+
"Object" form shall mean any form resulting from mechanical
|
| 32 |
+
transformation or translation of a Source form, including but
|
| 33 |
+
not limited to compiled object code, generated documentation,
|
| 34 |
+
and conversions to other media types.
|
| 35 |
+
|
| 36 |
+
"Work" shall mean the work of authorship, whether in Source or
|
| 37 |
+
Object form, made available under the License, as indicated by a
|
| 38 |
+
copyright notice that is included in or attached to the work
|
| 39 |
+
(an example is provided in the Appendix below).
|
| 40 |
+
|
| 41 |
+
"Derivative Works" shall mean any work, whether in Source or Object
|
| 42 |
+
form, that is based on (or derived from) the Work and for which the
|
| 43 |
+
editorial revisions, annotations, elaborations, or other modifications
|
| 44 |
+
represent, as a whole, an original work of authorship. For the purposes
|
| 45 |
+
of this License, Derivative Works shall not include works that remain
|
| 46 |
+
separable from, or merely link (or bind by name) to the interfaces of,
|
| 47 |
+
the Work and Derivative Works thereof.
|
| 48 |
+
|
| 49 |
+
"Contribution" shall mean any work of authorship, including
|
| 50 |
+
the original version of the Work and any modifications or additions
|
| 51 |
+
to that Work or Derivative Works thereof, that is intentionally
|
| 52 |
+
submitted to Licensor for inclusion in the Work by the copyright owner
|
| 53 |
+
or by an individual or Legal Entity authorized to submit on behalf of
|
| 54 |
+
the copyright owner. For the purposes of this definition, "submitted"
|
| 55 |
+
means any form of electronic, verbal, or written communication sent
|
| 56 |
+
to the Licensor or its representatives, including but not limited to
|
| 57 |
+
communication on electronic mailing lists, source code control systems,
|
| 58 |
+
and issue tracking systems that are managed by, or on behalf of, the
|
| 59 |
+
Licensor for the purpose of discussing and improving the Work, but
|
| 60 |
+
excluding communication that is conspicuously marked or otherwise
|
| 61 |
+
designated in writing by the copyright owner as "Not a Contribution."
|
| 62 |
+
|
| 63 |
+
"Contributor" shall mean Licensor and any individual or Legal Entity
|
| 64 |
+
on behalf of whom a Contribution has been received by Licensor and
|
| 65 |
+
subsequently incorporated within the Work.
|
| 66 |
+
|
| 67 |
+
2. Grant of Copyright License. Subject to the terms and conditions of
|
| 68 |
+
this License, each Contributor hereby grants to You a perpetual,
|
| 69 |
+
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
| 70 |
+
copyright license to reproduce, prepare Derivative Works of,
|
| 71 |
+
publicly display, publicly perform, sublicense, and distribute the
|
| 72 |
+
Work and such Derivative Works in Source or Object form.
|
| 73 |
+
|
| 74 |
+
3. Grant of Patent License. Subject to the terms and conditions of
|
| 75 |
+
this License, each Contributor hereby grants to You a perpetual,
|
| 76 |
+
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
| 77 |
+
(except as stated in this section) patent license to make, have made,
|
| 78 |
+
use, offer to sell, sell, import, and otherwise transfer the Work,
|
| 79 |
+
where such license applies only to those patent claims licensable
|
| 80 |
+
by such Contributor that are necessarily infringed by their
|
| 81 |
+
Contribution(s) alone or by combination of their Contribution(s)
|
| 82 |
+
with the Work to which such Contribution(s) was submitted. If You
|
| 83 |
+
institute patent litigation against any entity (including a
|
| 84 |
+
cross-claim or counterclaim in a lawsuit) alleging that the Work
|
| 85 |
+
or a Contribution incorporated within the Work constitutes direct
|
| 86 |
+
or contributory patent infringement, then any patent licenses
|
| 87 |
+
granted to You under this License for that Work shall terminate
|
| 88 |
+
as of the date such litigation is filed.
|
| 89 |
+
|
| 90 |
+
4. Redistribution. You may reproduce and distribute copies of the
|
| 91 |
+
Work or Derivative Works thereof in any medium, with or without
|
| 92 |
+
modifications, and in Source or Object form, provided that You
|
| 93 |
+
meet the following conditions:
|
| 94 |
+
|
| 95 |
+
(a) You must give any other recipients of the Work or
|
| 96 |
+
Derivative Works a copy of this License; and
|
| 97 |
+
|
| 98 |
+
(b) You must cause any modified files to carry prominent notices
|
| 99 |
+
stating that You changed the files; and
|
| 100 |
+
|
| 101 |
+
(c) You must retain, in the Source form of any Derivative Works
|
| 102 |
+
that You distribute, all copyright, patent, trademark, and
|
| 103 |
+
attribution notices from the Source form of the Work,
|
| 104 |
+
excluding those notices that do not pertain to any part of
|
| 105 |
+
the Derivative Works; and
|
| 106 |
+
|
| 107 |
+
(d) If the Work includes a "NOTICE" text file as part of its
|
| 108 |
+
distribution, then any Derivative Works that You distribute must
|
| 109 |
+
include a readable copy of the attribution notices contained
|
| 110 |
+
within such NOTICE file, excluding those notices that do not
|
| 111 |
+
pertain to any part of the Derivative Works, in at least one
|
| 112 |
+
of the following places: within a NOTICE text file distributed
|
| 113 |
+
as part of the Derivative Works; within the Source form or
|
| 114 |
+
documentation, if provided along with the Derivative Works; or,
|
| 115 |
+
within a display generated by the Derivative Works, if and
|
| 116 |
+
wherever such third-party notices normally appear. The contents
|
| 117 |
+
of the NOTICE file are for informational purposes only and
|
| 118 |
+
do not modify the License. You may add Your own attribution
|
| 119 |
+
notices within Derivative Works that You distribute, alongside
|
| 120 |
+
or as an addendum to the NOTICE text from the Work, provided
|
| 121 |
+
that such additional attribution notices cannot be construed
|
| 122 |
+
as modifying the License.
|
| 123 |
+
|
| 124 |
+
You may add Your own copyright statement to Your modifications and
|
| 125 |
+
may provide additional or different license terms and conditions
|
| 126 |
+
for use, reproduction, or distribution of Your modifications, or
|
| 127 |
+
for any such Derivative Works as a whole, provided Your use,
|
| 128 |
+
reproduction, and distribution of the Work otherwise complies with
|
| 129 |
+
the conditions stated in this License.
|
| 130 |
+
|
| 131 |
+
5. Submission of Contributions. Unless You explicitly state otherwise,
|
| 132 |
+
any Contribution intentionally submitted for inclusion in the Work
|
| 133 |
+
by You to the Licensor shall be under the terms and conditions of
|
| 134 |
+
this License, without any additional terms or conditions.
|
| 135 |
+
Notwithstanding the above, nothing herein shall supersede or modify
|
| 136 |
+
the terms of any separate license agreement you may have executed
|
| 137 |
+
with Licensor regarding such Contributions.
|
| 138 |
+
|
| 139 |
+
6. Trademarks. This License does not grant permission to use the trade
|
| 140 |
+
names, trademarks, service marks, or product names of the Licensor,
|
| 141 |
+
except as required for reasonable and customary use in describing the
|
| 142 |
+
origin of the Work and reproducing the content of the NOTICE file.
|
| 143 |
+
|
| 144 |
+
7. Disclaimer of Warranty. Unless required by applicable law or
|
| 145 |
+
agreed to in writing, Licensor provides the Work (and each
|
| 146 |
+
Contributor provides its Contributions) on an "AS IS" BASIS,
|
| 147 |
+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
| 148 |
+
implied, including, without limitation, any warranties or conditions
|
| 149 |
+
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
| 150 |
+
PARTICULAR PURPOSE. You are solely responsible for determining the
|
| 151 |
+
appropriateness of using or redistributing the Work and assume any
|
| 152 |
+
risks associated with Your exercise of permissions under this License.
|
| 153 |
+
|
| 154 |
+
8. Limitation of Liability. In no event and under no legal theory,
|
| 155 |
+
whether in tort (including negligence), contract, or otherwise,
|
| 156 |
+
unless required by applicable law (such as deliberate and grossly
|
| 157 |
+
negligent acts) or agreed to in writing, shall any Contributor be
|
| 158 |
+
liable to You for damages, including any direct, indirect, special,
|
| 159 |
+
incidental, or consequential damages of any character arising as a
|
| 160 |
+
result of this License or out of the use or inability to use the
|
| 161 |
+
Work (including but not limited to damages for loss of goodwill,
|
| 162 |
+
work stoppage, computer failure or malfunction, or any and all
|
| 163 |
+
other commercial damages or losses), even if such Contributor
|
| 164 |
+
has been advised of the possibility of such damages.
|
| 165 |
+
|
| 166 |
+
9. Accepting Warranty or Additional Liability. While redistributing
|
| 167 |
+
the Work or Derivative Works thereof, You may choose to offer,
|
| 168 |
+
and charge a fee for, acceptance of support, warranty, indemnity,
|
| 169 |
+
or other liability obligations and/or rights consistent with this
|
| 170 |
+
License. However, in accepting such obligations, You may act only
|
| 171 |
+
on Your own behalf and on Your sole responsibility, not on behalf
|
| 172 |
+
of any other Contributor, and only if You agree to indemnify,
|
| 173 |
+
defend, and hold each Contributor harmless for any liability
|
| 174 |
+
incurred by, or claims asserted against, such Contributor by reason
|
| 175 |
+
of your accepting any such warranty or additional liability.
|
| 176 |
+
|
| 177 |
+
END OF TERMS AND CONDITIONS
|
| 178 |
+
|
| 179 |
+
APPENDIX: How to apply the Apache License to your work.
|
| 180 |
+
|
| 181 |
+
To apply the Apache License to your work, attach the following
|
| 182 |
+
boilerplate notice, with the fields enclosed by brackets "[]"
|
| 183 |
+
replaced with your own identifying information. (Don't include
|
| 184 |
+
the brackets!) The text should be enclosed in the appropriate
|
| 185 |
+
comment syntax for the file format. We also recommend that a
|
| 186 |
+
file or class name and description of purpose be included on the
|
| 187 |
+
same "printed page" as the copyright notice for easier
|
| 188 |
+
identification within third-party archives.
|
| 189 |
+
|
| 190 |
+
Copyright 2026 Alibaba Cloud
|
| 191 |
+
|
| 192 |
+
Licensed under the Apache License, Version 2.0 (the "License");
|
| 193 |
+
you may not use this file except in compliance with the License.
|
| 194 |
+
You may obtain a copy of the License at
|
| 195 |
+
|
| 196 |
+
http://www.apache.org/licenses/LICENSE-2.0
|
| 197 |
+
|
| 198 |
+
Unless required by applicable law or agreed to in writing, software
|
| 199 |
+
distributed under the License is distributed on an "AS IS" BASIS,
|
| 200 |
+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
| 201 |
+
See the License for the specific language governing permissions and
|
| 202 |
+
limitations under the License.
|
video_preprocessor_config.json
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
{
|
| 2 |
+
"size": {
|
| 3 |
+
"longest_edge": 25165824,
|
| 4 |
+
"shortest_edge": 4096
|
| 5 |
+
},
|
| 6 |
+
"patch_size": 16,
|
| 7 |
+
"temporal_patch_size": 2,
|
| 8 |
+
"merge_size": 2,
|
| 9 |
+
"image_mean": [
|
| 10 |
+
0.5,
|
| 11 |
+
0.5,
|
| 12 |
+
0.5
|
| 13 |
+
],
|
| 14 |
+
"image_std": [
|
| 15 |
+
0.5,
|
| 16 |
+
0.5,
|
| 17 |
+
0.5
|
| 18 |
+
],
|
| 19 |
+
"processor_class": "Qwen3VLProcessor",
|
| 20 |
+
"video_processor_type": "Qwen3VLVideoProcessor"
|
| 21 |
+
}
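For video, `temporal_patch_size` adds a time dimension to the patch grid. The sketch below assumes frames are grouped `temporal_patch_size` at a time and that spatial patching and merging work as they do for still images. The actual `Qwen3VLVideoProcessor` may resample or pad frames first, so treat this as a rough estimate. `video_token_count` is a hypothetical name.

```python
PATCH_SIZE = 16          # "patch_size" from the config above
TEMPORAL_PATCH_SIZE = 2  # "temporal_patch_size" from the config above
MERGE_SIZE = 2           # "merge_size" from the config above


def video_token_count(frames, height, width):
    """Estimate tokens for a clip whose dimensions already fit the patch grid.

    Assumes frames is a multiple of TEMPORAL_PATCH_SIZE and that height and
    width are multiples of PATCH_SIZE * MERGE_SIZE.
    """
    unit = PATCH_SIZE * MERGE_SIZE
    if frames % TEMPORAL_PATCH_SIZE:
        raise ValueError(f"frames must be a multiple of {TEMPORAL_PATCH_SIZE}")
    if height % unit or width % unit:
        raise ValueError(f"height and width must be multiples of {unit}")
    temporal_steps = frames // TEMPORAL_PATCH_SIZE
    spatial_patches = (height // PATCH_SIZE) * (width // PATCH_SIZE)
    return temporal_steps * spatial_patches // (MERGE_SIZE ** 2)


print(video_token_count(8, 64, 64))  # prints 16
```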
|
vocab.json
ADDED
|
The diff for this file is too large to render.