File size: 6,962 Bytes
6d7449a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
785f3d7
6d7449a
 
 
 
 
245244e
6d7449a
 
245244e
 
 
 
6d7449a
 
 
 
245244e
6d7449a
 
 
 
 
 
 
 
 
 
 
 
 
245244e
6d7449a
245244e
 
513c795
 
 
 
6d7449a
 
 
513c795
 
 
 
 
 
 
 
 
 
 
245244e
 
513c795
245244e
 
 
513c795
6d7449a
 
 
 
 
 
 
 
 
 
 
 
 
 
785f3d7
6d7449a
 
 
785f3d7
 
 
 
 
 
 
 
6d7449a
 
 
 
 
785f3d7
6d7449a
 
 
 
 
785f3d7
 
6d7449a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
785f3d7
6d7449a
785f3d7
 
513c795
 
6fded18
 
6d7449a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
513c795
 
785f3d7
 
6d7449a
 
 
785f3d7
 
6d7449a
6fded18
6d7449a
6fded18
785f3d7
 
6fded18
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
# Kaiju Coder 7 Public Testing Quickstart

Kaiju Coder 7 is the public model name. The OpenAI-compatible model id is:

```text
kaiju-coder-7
```

Use this guide for serious public testing. It avoids internal checkpoint names
and keeps the current limitations clear.

## Pick A Test Path

### Path 1: OpenCode Against An Existing Endpoint

Use this if you already have Kaiju Coder 7 served at an OpenAI-compatible
`/v1` endpoint.

```bash
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
cd kaiju-coder-7-opencode
python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18181/v1
```

Then run OpenCode inside the project you want to edit:

```bash
opencode
```

The installer sets `kaiju/kaiju-coder-7` as the OpenCode model and
`kaiju-coder-7` as the default agent. You can still select
`kaiju/kaiju-coder-7` manually from OpenCode's model picker if you switch away.

For a bounded smoke test:

```bash
mkdir -p /tmp/kaiju-public-smoke
opencode run --dir /tmp/kaiju-public-smoke \
  "Create hello.txt with exactly: Kaiju Coder 7 is ready"
```

Or run the packaged verifier, which checks the installer, live model endpoint,
OpenCode binary, actual file creation, and wrong-directory behavior:

```bash
python3 scripts/run_kaiju_public_opencode_smoke.py
```

The helper installer adds:

- the `kaiju` OpenAI-compatible provider
- `model: kaiju/kaiju-coder-7` and `default_agent: kaiju-coder-7`
- the lean `kaiju-coder-7` OpenCode agent
- Kaiju as the default primary agent, so selecting Kaiju Coder 7 uses the
  hidden fast artifact path without requiring `/kaiju`
- the `kaiju-coder-7-run` router command for fast websites, owner packs, and
  Desktop artifact folders
- the `kaiju_artifact` OpenCode custom tool and `/kaiju` command for routing
  large artifact prompts through the fast local router
- a scoped no-autocontinue plugin that prevents false completion loops after
  compaction or output limits

For a fast website or owner-pack artifact without waiting on raw OpenCode
multi-file streaming, run:

```bash
kaiju-coder-7-run \
  --no-planner \
  --kind website \
  --out-dir "$HOME/Desktop/Kaiju-Coder-7-Test" \
  --prompt "Build a premium one-page website for Harborline Bookkeeping with pricing, FAQ, and a cleanup-call CTA."
```

OpenCode should use this same command internally for large website,
business-pack, and Desktop-output requests after the helper is installed.

Inside OpenCode, `/kaiju` is optional for large generated artifacts. The command
is prompt-backed, but it points the Kaiju agent at the `kaiju_artifact` custom
tool instead of making the model hand-write every file.

### Path 2: Full Local Weights

Use this if the full `RMDWLLC/kaiju-coder-7` Hugging Face repo has been
uploaded and you have suitable local GPU hardware.

```bash
hf download RMDWLLC/kaiju-coder-7 --local-dir ./kaiju-coder-7
```

Serve the downloaded folder with an OpenAI-compatible local server. Configure
the server to expose:

```text
model id: kaiju-coder-7
base URL: http://127.0.0.1:18084/v1
context: 16384
```

For the fastest OpenCode behavior, run the bundled fast proxy in a separate
terminal and point OpenCode at the proxy:

```bash
KAIJU_OPENAI_BASE_URL=http://127.0.0.1:18084/v1 \
python3 scripts/kaiju_opencode_fast_proxy.py --host 127.0.0.1 --port 18181
```

Then install the OpenCode helper with:

```bash
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
cd kaiju-coder-7-opencode
python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18181/v1
```

### Path 3: Runtime-Quantized Local Candidate

Use this only if you are comfortable with advanced serving setups. The current
working quantized option is a runtime bitsandbytes recipe. A Q8_0 GGUF artifact
has been converted, but it is still a candidate until runtime smoke passes.

```bash
git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-quantized-runtime
cd kaiju-coder-7-quantized-runtime
```

Read `README.md` in that repo before serving. This path can reduce model memory
at runtime, but it still depends on access to the full Kaiju Coder 7 weights.

## Recommended Test Prompt

Run this from an empty project folder:

```text
Build a launch-ready local service business website and operating pack. Include
index.html, a Stripe checkout safety plan, a CSV parser with tests, a simple CRM
schema, a weekly money report, and a safety/provenance note. Write the files,
not just advice.
```

Expected result:

- files are written in the requested project folder
- `index.html` is complete HTML
- business docs start with Markdown H1 headings
- code includes a test or smoke-check command where practical
- no fake API keys, OAuth tokens, payment secrets, or private customer data

## Current Recommended Defaults

- Public model id: `kaiju-coder-7`
- OpenCode context: `16384`
- Output cap for public testing: `2500`
- Fast OpenCode path: vLLM bitsandbytes runtime behind the Kaiju fast proxy
- Current reliable product path: model plus deterministic business-owner
  harness/router plus verifier
- Raw multi-file OpenCode generation: still too slow for broad paid claims;
  use `kaiju-coder-7-run` for fast public website and owner-pack tests while
  broader raw-model latency gates continue
- Paid API: not public until launch preflight passes and the Stripe live-mode
  switch is deliberately completed

## What Not To Claim Yet

Do not claim:

- that raw model weights alone reliably build every business-owner artifact
- that a paid hosted API is generally available
- that persisted quantized weights exist
- that 32k context is the current live default

Do claim:

- Kaiju Coder 7 has a working local/OpenCode release candidate
- the current tested OpenCode default is 16k context
- the helper package includes a lean agent and compaction loop guard
- the helper package includes the `kaiju-coder-7-run` router command for fast
  artifact generation
- the fast proxy keeps OpenCode tool calls intact while forcing bounded,
  non-thinking generation
- the paid API scaffold has tests and a launch preflight, but is not yet public
- the packaged public smoke verifies a fresh OpenCode one-file write before
  public claims are refreshed
- a GGUF Q8_0 candidate exists, but is not public quantized-weights release
  evidence until runtime smoke passes

## Remaining Caveats Before Broader Claims

- Hugging Face public release repos are uploaded and public under `RMDWLLC`.
- The GGUF Q8_0 candidate still needs a runtime smoke before public
  quantized-weights upload.
- Raw multi-file OpenCode generation is still not the public speed story; use
  the deterministic router/harness for websites and business-owner packs.
- Public paid API launch has approval and preflight evidence, but real customer
  charging still needs a deliberate Stripe live-mode switch and controlled live
  payment verification.
- Do not claim 32k context as the live default until it is freshly restarted
  and re-confirmed.