restokes92 commited on
Commit
1fe486d
·
verified ·
1 Parent(s): 3c9c754

Upload Kaiju Coder 7 OpenCode helper package

Browse files
.opencode/agents/kaiju-coder-7.md CHANGED
@@ -39,15 +39,27 @@ permission:
39
  You are Kaiju Coder 7, a local coding model for business owners and practical product builders.
40
 
41
  Keep responses short while working. Prefer creating complete files over describing what should be created.
 
 
 
 
 
 
 
 
 
 
42
 
43
  Rules:
44
 
45
  - Confirm the current working directory with `pwd` before writing files.
46
  - Write artifacts into the requested project folder only.
47
  - Use relative paths for write/edit tool calls. Do not use absolute paths unless the user explicitly asks for an absolute destination.
 
48
  - For multi-file tasks, create every requested file before summarizing.
49
  - After `pwd`, write the first requested file immediately. Do not announce "parallel" work, batching, or planning before the first write.
50
  - Create files sequentially with write/edit tool calls; do not wait to draft all files in the chat response.
 
51
  - Do not say a file exists unless you wrote it or read it from disk.
52
  - Do not ask the user to finish setup that you can do locally.
53
  - Do not invent secrets, API keys, private client data, payments, or live integrations.
 
39
  You are Kaiju Coder 7, a local coding model for business owners and practical product builders.
40
 
41
  Keep responses short while working. Prefer creating complete files over describing what should be created.
42
+ For general chat, identity, capability, or "what can you do" questions, answer in 45 words or less unless the user asks for detail.
43
+
44
+ Public identity:
45
+
46
+ - Present yourself as "Kaiju Coder 7" with spaces, not "Kaiju-Coder-7" unless referring to the OpenCode agent id.
47
+ - Say you are built for local-first business-owner build work: websites, booking/payment flows, intake/CRM systems, proposals, SOPs, dashboards, automations, and practical repo fixes.
48
+ - When asked why someone should use you, answer like a product that is ready for serious testing: concrete, confident, and specific. Avoid generic phrases such as "no-fluff coding assistant" unless the user uses that wording first.
49
+ - Do not claim frontier-model superiority. Say the strength is practical execution inside a project folder with OpenCode tools, privacy-friendly local or private-network serving, and business artifacts an owner can inspect.
50
+ - Do not say customer data "never leaves your computer" unless the model is actually running on the same machine. For the default Gojira/Tailscale setup, say data stays inside the owner's controlled local/private runtime.
51
+ - Do not imply you can browse, access accounts, send emails, process payments, or use live integrations unless the available tools and user approval make that true.
52
 
53
  Rules:
54
 
55
  - Confirm the current working directory with `pwd` before writing files.
56
  - Write artifacts into the requested project folder only.
57
  - Use relative paths for write/edit tool calls. Do not use absolute paths unless the user explicitly asks for an absolute destination.
58
+ - For complete website, landing-page, or owner business-pack tasks, use the Kaiju router/harness if `scripts/run_kaiju_router.py` is available in the current repo. Run it first, then report the generated artifact path and checks instead of hand-streaming a large HTML file.
59
  - For multi-file tasks, create every requested file before summarizing.
60
  - After `pwd`, write the first requested file immediately. Do not announce "parallel" work, batching, or planning before the first write.
61
  - Create files sequentially with write/edit tool calls; do not wait to draft all files in the chat response.
62
+ - When a file must contain exact text, write only the requested bytes. Do not include XML/tool wrapper markers such as `<content>`, `</content>`, `<file>`, or markdown fences in the file.
63
  - Do not say a file exists unless you wrote it or read it from disk.
64
  - Do not ask the user to finish setup that you can do locally.
65
  - Do not invent secrets, API keys, private client data, payments, or live integrations.
PUBLIC_TESTING_QUICKSTART.md CHANGED
@@ -19,7 +19,7 @@ Use this if you already have Kaiju Coder 7 served at an OpenAI-compatible
19
  ```bash
20
  git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
21
  cd kaiju-coder-7-opencode
22
- python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18083/v1
23
  ```
24
 
25
  Then run OpenCode inside the project you want to edit:
@@ -65,23 +65,31 @@ the server to expose:
65
 
66
  ```text
67
  model id: kaiju-coder-7
68
- base URL: http://127.0.0.1:18083/v1
69
  context: 16384
70
  ```
71
 
 
 
 
 
 
 
 
 
72
  Then install the OpenCode helper with:
73
 
74
  ```bash
75
  git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
76
  cd kaiju-coder-7-opencode
77
- python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18083/v1
78
  ```
79
 
80
  ### Path 3: Runtime-Quantized Local Candidate
81
 
82
  Use this only if you are comfortable with advanced serving setups. The current
83
- working quantized option is a runtime bitsandbytes recipe, not a separate
84
- persisted quantized weights repo.
85
 
86
  ```bash
87
  git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-quantized-runtime
@@ -115,9 +123,12 @@ Expected result:
115
  - Public model id: `kaiju-coder-7`
116
  - OpenCode context: `16384`
117
  - Output cap for public testing: `2500`
 
118
  - Current reliable product path: model plus deterministic business-owner
119
- harness plus verifier
120
- - Raw multi-file OpenCode generation: still too slow for broad paid API claims
 
 
121
  - Paid API: not public until launch preflight passes
122
 
123
  ## What Not To Claim Yet
@@ -134,15 +145,21 @@ Do claim:
134
  - Kaiju Coder 7 has a working local/OpenCode release candidate
135
  - the current tested OpenCode default is 16k context
136
  - the helper package includes a lean agent and compaction loop guard
 
 
137
  - the paid API scaffold has tests and a launch preflight, but is not yet public
138
  - the packaged public smoke verifies a fresh OpenCode one-file write before
139
  public claims are refreshed
 
 
140
 
141
  ## Current Blockers Before Public Release
142
 
143
  - Hugging Face repo creation still requires a write-capable token or namespace.
144
  - Full merged model upload has not completed; the merged folder must first have
145
  the metadata packet synced by `prepare_hf_merged_model_metadata.sh`.
 
 
146
  - Public paid API launch needs real Cloudflare D1/KV/R2 bindings, Wrangler
147
  secret verification, Stripe webhook staging evidence, staging traffic, latency
148
  evidence, and rollback proof.
 
19
  ```bash
20
  git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
21
  cd kaiju-coder-7-opencode
22
+ python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18181/v1
23
  ```
24
 
25
  Then run OpenCode inside the project you want to edit:
 
65
 
66
  ```text
67
  model id: kaiju-coder-7
68
+ base URL: http://127.0.0.1:18084/v1
69
  context: 16384
70
  ```
71
 
72
+ For the fastest OpenCode behavior, run the bundled fast proxy in a separate
73
+ terminal and point OpenCode at the proxy:
74
+
75
+ ```bash
76
+ KAIJU_OPENAI_BASE_URL=http://127.0.0.1:18084/v1 \
77
+ python3 scripts/kaiju_opencode_fast_proxy.py --host 127.0.0.1 --port 18181
78
+ ```
79
+
80
  Then install the OpenCode helper with:
81
 
82
  ```bash
83
  git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-opencode
84
  cd kaiju-coder-7-opencode
85
+ python3 scripts/install_kaiju_opencode_profile.py --base-url http://127.0.0.1:18181/v1
86
  ```
87
 
88
  ### Path 3: Runtime-Quantized Local Candidate
89
 
90
  Use this only if you are comfortable with advanced serving setups. The current
91
+ working quantized option is a runtime bitsandbytes recipe. A Q8_0 GGUF artifact
92
+ has been converted, but it is still a candidate until runtime smoke passes.
93
 
94
  ```bash
95
  git clone https://huggingface.co/RMDWLLC/kaiju-coder-7-quantized-runtime
 
123
  - Public model id: `kaiju-coder-7`
124
  - OpenCode context: `16384`
125
  - Output cap for public testing: `2500`
126
+ - Fast OpenCode path: vLLM bitsandbytes runtime behind the Kaiju fast proxy
127
  - Current reliable product path: model plus deterministic business-owner
128
+ harness/router plus verifier
129
+ - Raw multi-file OpenCode generation: still too slow for broad paid claims;
130
+ useful for testing, but paid API claims should favor harnessed product
131
+ workflows until broader latency gates pass
132
  - Paid API: not public until launch preflight passes
133
 
134
  ## What Not To Claim Yet
 
145
  - Kaiju Coder 7 has a working local/OpenCode release candidate
146
  - the current tested OpenCode default is 16k context
147
  - the helper package includes a lean agent and compaction loop guard
148
+ - the fast proxy keeps OpenCode tool calls intact while forcing bounded,
149
+ non-thinking generation
150
  - the paid API scaffold has tests and a launch preflight, but is not yet public
151
  - the packaged public smoke verifies a fresh OpenCode one-file write before
152
  public claims are refreshed
153
+ - a GGUF Q8_0 candidate exists, but is not public quantized-weights release
154
+ evidence until runtime smoke passes
155
 
156
  ## Current Blockers Before Public Release
157
 
158
  - Hugging Face repo creation still requires a write-capable token or namespace.
159
  - Full merged model upload has not completed; the merged folder must first have
160
  the metadata packet synced by `prepare_hf_merged_model_metadata.sh`.
161
+ - The GGUF Q8_0 candidate still needs a runtime smoke before public
162
+ quantized-weights upload.
163
  - Public paid API launch needs real Cloudflare D1/KV/R2 bindings, Wrangler
164
  secret verification, Stripe webhook staging evidence, staging traffic, latency
165
  evidence, and rollback proof.
README.md CHANGED
@@ -25,7 +25,7 @@ the absolute path where you copied `kaiju-no-autocontinue.mjs`:
25
  "npm": "@ai-sdk/openai-compatible",
26
  "name": "Kaiju Coder",
27
  "options": {
28
- "baseURL": "http://100.109.109.14:18083/v1",
29
  "apiKey": "not-needed",
30
  "timeout": 900000,
31
  "chunkTimeout": 120000
@@ -95,9 +95,27 @@ file/output facts into the summary.
95
 
96
  - Model id: `kaiju-coder-7`
97
  - Endpoint shape: OpenAI-compatible `/v1/chat/completions`
98
- - Current Gojira-B restored default: 16,384 context
99
- - Tested high-context target: 32,768 context
100
- - Serving path: merged full model through SGLang
 
 
101
  - OpenCode guard: lean agent plus scoped no-autocontinue plugin
102
  - Product caveat: raw generation is useful but slow; paid workflows should use
103
  deterministic harnesses and verifiers until broader raw-model gates pass.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
25
  "npm": "@ai-sdk/openai-compatible",
26
  "name": "Kaiju Coder",
27
  "options": {
28
+ "baseURL": "http://127.0.0.1:18181/v1",
29
  "apiKey": "not-needed",
30
  "timeout": 900000,
31
  "chunkTimeout": 120000
 
95
 
96
  - Model id: `kaiju-coder-7`
97
  - Endpoint shape: OpenAI-compatible `/v1/chat/completions`
98
+ - Fast OpenCode base URL: `http://127.0.0.1:18181/v1`
99
+ - Fast proxy upstream for Richard's current setup: vLLM bitsandbytes on Gojira-B port `18084`
100
+ - Current tested context: 16,384
101
+ - Tested high-context target: 32,768, but not the current fast default
102
+ - Serving path for speed testing: merged full model through vLLM runtime bitsandbytes
103
  - OpenCode guard: lean agent plus scoped no-autocontinue plugin
104
  - Product caveat: raw generation is useful but slow; paid workflows should use
105
  deterministic harnesses and verifiers until broader raw-model gates pass.
106
+
107
+ ## Fast Proxy
108
+
109
+ The helper bundle includes `scripts/kaiju_opencode_fast_proxy.py`. It preserves
110
+ OpenCode tool-call streaming while forcing the fast model settings Kaiju needs:
111
+ `thinking=false`, model id `kaiju-coder-7`, and bounded output budgets.
112
+
113
+ Run it in one terminal, then point OpenCode at `http://127.0.0.1:18181/v1`:
114
+
115
+ ```bash
116
+ KAIJU_OPENAI_BASE_URL=http://127.0.0.1:18084/v1 \
117
+ python3 scripts/kaiju_opencode_fast_proxy.py --host 127.0.0.1 --port 18181
118
+ ```
119
+
120
+ If your vLLM server is remote, set `KAIJU_OPENAI_BASE_URL` to that remote
121
+ OpenAI-compatible `/v1` endpoint instead.
opencode.kaiju-coder-7.jsonc CHANGED
@@ -5,7 +5,7 @@
5
  "npm": "@ai-sdk/openai-compatible",
6
  "name": "Kaiju Coder",
7
  "options": {
8
- "baseURL": "http://100.109.109.14:18083/v1",
9
  "apiKey": "not-needed",
10
  "timeout": 900000,
11
  "chunkTimeout": 120000
 
5
  "npm": "@ai-sdk/openai-compatible",
6
  "name": "Kaiju Coder",
7
  "options": {
8
+ "baseURL": "http://127.0.0.1:18181/v1",
9
  "apiKey": "not-needed",
10
  "timeout": 900000,
11
  "chunkTimeout": 120000
scripts/check_hf_uploaded_release.py CHANGED
@@ -24,7 +24,7 @@ from typing import Any
24
 
25
  MODEL_ID = "kaiju-coder-7"
26
  DEFAULT_NAMESPACE = "RMDWLLC"
27
- DEFAULT_BASE_URL = "http://100.109.109.14:18083/v1"
28
 
29
 
30
  @dataclass(frozen=True)
 
24
 
25
  MODEL_ID = "kaiju-coder-7"
26
  DEFAULT_NAMESPACE = "RMDWLLC"
27
+ DEFAULT_BASE_URL = "http://127.0.0.1:18181/v1"
28
 
29
 
30
  @dataclass(frozen=True)
scripts/kaiju_opencode_fast_proxy.py ADDED
@@ -0,0 +1,234 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """Tool-safe OpenAI-compatible fast proxy for Kaiju Coder 7 OpenCode.
3
+
4
+ The normal Gojira gateway is product/API oriented and aggregates content. OpenCode
5
+ needs raw tool-call chunks preserved, so this proxy only patches serving knobs
6
+ and then passes upstream responses through unchanged.
7
+ """
8
+
9
+ from __future__ import annotations
10
+
11
+ import argparse
12
+ import json
13
+ import os
14
+ import time
15
+ import urllib.error
16
+ import urllib.request
17
+ from http import HTTPStatus
18
+ from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
19
+ from typing import Any
20
+
21
+
22
+ DEFAULT_HOST = "127.0.0.1"
23
+ DEFAULT_PORT = int(os.environ.get("KAIJU_OPENCODE_FAST_PROXY_PORT", "18181"))
24
+ UPSTREAM_BASE_URL = os.environ.get("KAIJU_OPENAI_BASE_URL", "http://100.109.109.14:18084/v1")
25
+ DEFAULT_MODEL = os.environ.get("KAIJU_DEFAULT_MODEL", "kaiju-coder-7")
26
+ API_KEY = os.environ.get("KAIJU_OPENAI_API_KEY", "")
27
+ NORMAL_MAX_TOKENS = int(os.environ.get("KAIJU_NORMAL_MAX_TOKENS", "384"))
28
+ WORK_MAX_TOKENS = int(os.environ.get("KAIJU_WORK_MAX_TOKENS", "1536"))
29
+ ARTIFACT_MAX_TOKENS = int(os.environ.get("KAIJU_ARTIFACT_MAX_TOKENS", "4096"))
30
+ MAX_REQUEST_BYTES = int(os.environ.get("KAIJU_MAX_REQUEST_BYTES", "2097152"))
31
+
32
+
33
+ def normalize_messages(messages: Any) -> list[dict[str, Any]]:
34
+ if not isinstance(messages, list):
35
+ return []
36
+ return [message for message in messages if isinstance(message, dict)]
37
+
38
+
39
+ def message_text(messages: list[dict[str, Any]]) -> str:
40
+ parts: list[str] = []
41
+ for message in messages:
42
+ content = message.get("content", "")
43
+ if isinstance(content, str):
44
+ parts.append(content)
45
+ else:
46
+ parts.append(json.dumps(content, ensure_ascii=False))
47
+ return "\n".join(parts).lower()
48
+
49
+
50
+ def classify_job(messages: list[dict[str, Any]]) -> str:
51
+ text = message_text(messages)
52
+ artifact_terms = (
53
+ "complete html",
54
+ "html file",
55
+ "one-file website",
56
+ "landing page",
57
+ "build a website",
58
+ "make a website",
59
+ "full file",
60
+ )
61
+ work_terms = (
62
+ "create ",
63
+ "write ",
64
+ "edit ",
65
+ "implement",
66
+ "debug",
67
+ "fix",
68
+ "refactor",
69
+ "test",
70
+ "repo",
71
+ "file",
72
+ )
73
+ if any(term in text for term in artifact_terms):
74
+ return "artifact"
75
+ if any(term in text for term in work_terms):
76
+ return "work"
77
+ return "normal"
78
+
79
+
80
+ def target_tokens(job_class: str) -> int:
81
+ if job_class == "artifact":
82
+ return ARTIFACT_MAX_TOKENS
83
+ if job_class == "work":
84
+ return WORK_MAX_TOKENS
85
+ return NORMAL_MAX_TOKENS
86
+
87
+
88
+ def patch_chat_payload(body: dict[str, Any]) -> dict[str, Any]:
89
+ patched = dict(body)
90
+ patched["model"] = DEFAULT_MODEL
91
+ messages = normalize_messages(patched.get("messages"))
92
+ job_class = classify_job(messages)
93
+ patched["max_tokens"] = target_tokens(job_class)
94
+ patched["chat_template_kwargs"] = {
95
+ **(patched.get("chat_template_kwargs") if isinstance(patched.get("chat_template_kwargs"), dict) else {}),
96
+ "enable_thinking": False,
97
+ "thinking": False,
98
+ }
99
+ return patched
100
+
101
+
102
+ class Handler(BaseHTTPRequestHandler):
103
+ server_version = "KaijuOpenCodeFastProxy/0.1"
104
+
105
+ def log_message(self, fmt: str, *args: Any) -> None:
106
+ print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {self.address_string()} - {fmt % args}", flush=True)
107
+
108
+ def _json(self, status: int, payload: dict[str, Any]) -> None:
109
+ data = json.dumps(payload).encode("utf-8")
110
+ self.send_response(status)
111
+ self.send_header("content-type", "application/json; charset=utf-8")
112
+ self.send_header("cache-control", "no-store")
113
+ self.send_header("content-length", str(len(data)))
114
+ self.end_headers()
115
+ self.wfile.write(data)
116
+
117
+ def _read_json(self) -> dict[str, Any]:
118
+ length = int(self.headers.get("content-length", "0"))
119
+ if length > MAX_REQUEST_BYTES:
120
+ raise ValueError("request body too large")
121
+ raw = self.rfile.read(length)
122
+ if not raw:
123
+ return {}
124
+ value = json.loads(raw.decode("utf-8"))
125
+ if not isinstance(value, dict):
126
+ raise ValueError("request body must be a JSON object")
127
+ return value
128
+
129
+ def do_GET(self) -> None: # noqa: N802 - BaseHTTPRequestHandler API.
130
+ if self.path == "/health":
131
+ self._json(
132
+ HTTPStatus.OK,
133
+ {
134
+ "ok": True,
135
+ "model": DEFAULT_MODEL,
136
+ "upstream": UPSTREAM_BASE_URL,
137
+ "normal_max_tokens": NORMAL_MAX_TOKENS,
138
+ "work_max_tokens": WORK_MAX_TOKENS,
139
+ "artifact_max_tokens": ARTIFACT_MAX_TOKENS,
140
+ },
141
+ )
142
+ return
143
+ if self.path == "/v1/models":
144
+ self._forward_get("/models")
145
+ return
146
+ self._json(HTTPStatus.NOT_FOUND, {"error": {"message": "Not found", "type": "not_found"}})
147
+
148
+ def do_POST(self) -> None: # noqa: N802 - BaseHTTPRequestHandler API.
149
+ if self.path != "/v1/chat/completions":
150
+ self._json(HTTPStatus.NOT_FOUND, {"error": {"message": "Not found", "type": "not_found"}})
151
+ return
152
+ try:
153
+ body = patch_chat_payload(self._read_json())
154
+ except Exception as error: # noqa: BLE001 - return request parse failures.
155
+ self._json(HTTPStatus.BAD_REQUEST, {"error": {"message": str(error), "type": "bad_request"}})
156
+ return
157
+ self._forward_post("/chat/completions", body)
158
+
159
+ def _headers(self) -> dict[str, str]:
160
+ headers = {"content-type": "application/json"}
161
+ if API_KEY:
162
+ headers["authorization"] = f"Bearer {API_KEY}"
163
+ return headers
164
+
165
+ def _forward_get(self, suffix: str) -> None:
166
+ request = urllib.request.Request(
167
+ f"{UPSTREAM_BASE_URL.rstrip('/')}{suffix}",
168
+ headers=self._headers(),
169
+ method="GET",
170
+ )
171
+ try:
172
+ with urllib.request.urlopen(request, timeout=30) as upstream:
173
+ data = upstream.read()
174
+ self.send_response(upstream.status)
175
+ self.send_header("content-type", upstream.headers.get("content-type", "application/json"))
176
+ self.send_header("cache-control", "no-store")
177
+ self.send_header("content-length", str(len(data)))
178
+ self.end_headers()
179
+ self.wfile.write(data)
180
+ except urllib.error.HTTPError as error:
181
+ self._json(error.code, {"error": {"message": error.read().decode("utf-8", errors="replace")[:500]}})
182
+ except Exception as error: # noqa: BLE001 - proxy health should surface upstream failures.
183
+ self._json(HTTPStatus.BAD_GATEWAY, {"error": {"message": str(error), "type": "upstream_error"}})
184
+
185
+ def _forward_post(self, suffix: str, body: dict[str, Any]) -> None:
186
+ data = json.dumps(body).encode("utf-8")
187
+ request = urllib.request.Request(
188
+ f"{UPSTREAM_BASE_URL.rstrip('/')}{suffix}",
189
+ data=data,
190
+ headers=self._headers(),
191
+ method="POST",
192
+ )
193
+ try:
194
+ timeout = 1200 if classify_job(normalize_messages(body.get("messages"))) == "artifact" else 600
195
+ with urllib.request.urlopen(request, timeout=timeout) as upstream:
196
+ content_type = upstream.headers.get("content-type", "application/json")
197
+ if body.get("stream") is True:
198
+ self.send_response(upstream.status)
199
+ self.send_header("content-type", content_type)
200
+ self.send_header("cache-control", "no-store, no-transform")
201
+ self.send_header("connection", "close")
202
+ self.end_headers()
203
+ for chunk in upstream:
204
+ self.wfile.write(chunk)
205
+ self.wfile.flush()
206
+ return
207
+ response = upstream.read()
208
+ self.send_response(upstream.status)
209
+ self.send_header("content-type", content_type)
210
+ self.send_header("cache-control", "no-store")
211
+ self.send_header("content-length", str(len(response)))
212
+ self.end_headers()
213
+ self.wfile.write(response)
214
+ except urllib.error.HTTPError as error:
215
+ detail = error.read().decode("utf-8", errors="replace")[:500]
216
+ self._json(error.code, {"error": {"message": detail, "type": "upstream_error"}})
217
+ except Exception as error: # noqa: BLE001 - proxy should report upstream failures.
218
+ self._json(HTTPStatus.BAD_GATEWAY, {"error": {"message": str(error), "type": "upstream_error"}})
219
+
220
+
221
+ def main() -> int:
222
+ parser = argparse.ArgumentParser(description=__doc__)
223
+ parser.add_argument("--host", default=DEFAULT_HOST)
224
+ parser.add_argument("--port", type=int, default=DEFAULT_PORT)
225
+ args = parser.parse_args()
226
+ server = ThreadingHTTPServer((args.host, args.port), Handler)
227
+ print(f"Kaiju OpenCode fast proxy listening on http://{args.host}:{args.port}", flush=True)
228
+ print(f"Upstream: {UPSTREAM_BASE_URL}", flush=True)
229
+ server.serve_forever()
230
+ return 0
231
+
232
+
233
+ if __name__ == "__main__":
234
+ raise SystemExit(main())
scripts/run_kaiju_public_demo_pack.py ADDED
@@ -0,0 +1,193 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """Run the public Kaiju Coder 7 business-owner demo pack.
3
+
4
+ The demo pack exercises the release path customers will actually use: a compact
5
+ model-planned prompt where useful, deterministic harness rendering, and static
6
+ verification before any public claim is refreshed.
7
+ """
8
+
9
+ from __future__ import annotations
10
+
11
+ import argparse
12
+ import datetime as dt
13
+ import json
14
+ import sys
15
+ import time
16
+ from dataclasses import asdict, dataclass
17
+ from pathlib import Path
18
+
19
+ ROOT = Path(__file__).resolve().parents[1]
20
+ sys.path.insert(0, str(ROOT))
21
+
22
+ from kaiju_harness.router import result_to_json, run_task
23
+
24
+
25
+ @dataclass
26
+ class DemoTask:
27
+ task_id: str
28
+ kind: str
29
+ prompt: str
30
+
31
+
32
+ @dataclass
33
+ class DemoResult:
34
+ task_id: str
35
+ kind: str
36
+ seconds: float
37
+ task_type: str
38
+ artifact_type: str
39
+ artifact_path: str | None
40
+ project_dir: str | None
41
+ changed_files: int
42
+ errors: list[str]
43
+
44
+
45
+ DEMO_TASKS = [
46
+ DemoTask(
47
+ task_id="service-website",
48
+ kind="website",
49
+ prompt=(
50
+ "Build a premium one-page website for Harborline Bookkeeping in "
51
+ "Savannah. Include trust-focused copy, clear services, pricing "
52
+ "signals, FAQ, and the CTA Book a Cleanup Call."
53
+ ),
54
+ ),
55
+ DemoTask(
56
+ task_id="owner-ai-company-pack",
57
+ kind="business_suite",
58
+ prompt=(
59
+ "Build the owner-ready AI company operating pack for Harborline "
60
+ "Bookkeeping with launch kit, connector pack, intake CRM, "
61
+ "reporting agent, lead generator, sales closer, ROI dashboard, "
62
+ "operator training, and teach-once Workshop handoff."
63
+ ),
64
+ ),
65
+ DemoTask(
66
+ task_id="stripe-safety-plan",
67
+ kind="business_document",
68
+ prompt=(
69
+ "Write a practical Stripe checkout and webhook safety plan for a "
70
+ "local service business selling paid AI setup calls. Include key "
71
+ "states, failure handling, refund/debit rules, and launch checks."
72
+ ),
73
+ ),
74
+ DemoTask(
75
+ task_id="csv-parser",
76
+ kind="coding",
77
+ prompt=(
78
+ "Write a safe Node.js CSV parser utility for business-owner lead "
79
+ "imports. Include validation rules, typed output shape, example "
80
+ "usage, and a small test plan."
81
+ ),
82
+ ),
83
+ ]
84
+
85
+
86
+ def utc_stamp() -> str:
87
+ return dt.datetime.now(dt.UTC).strftime("%Y%m%dT%H%M%SZ")
88
+
89
+
90
+ def write_summary(run_dir: Path, results: list[DemoResult], manifests: list[dict]) -> None:
91
+ payload = {
92
+ "product": "Kaiju Coder 7",
93
+ "model_id": "kaiju-coder-7",
94
+ "created_at": utc_stamp(),
95
+ "summary": {
96
+ "tasks": len(results),
97
+ "passed": sum(1 for result in results if not result.errors),
98
+ "failed": sum(1 for result in results if result.errors),
99
+ "total_seconds": round(sum(result.seconds for result in results), 3),
100
+ },
101
+ "results": [asdict(result) for result in results],
102
+ "manifests": manifests,
103
+ }
104
+ (run_dir / "results.json").write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
105
+
106
+ lines = [
107
+ "# Kaiju Coder 7 Public Demo Pack",
108
+ "",
109
+ f"- Run dir: `{run_dir}`",
110
+ f"- Tasks: `{payload['summary']['tasks']}`",
111
+ f"- Passed: `{payload['summary']['passed']}`",
112
+ f"- Failed: `{payload['summary']['failed']}`",
113
+ f"- Total seconds: `{payload['summary']['total_seconds']}`",
114
+ "",
115
+ "| Task | Kind | Result | Seconds | Changed files | Artifact |",
116
+ "|---|---|---:|---:|---:|---|",
117
+ ]
118
+ for result in results:
119
+ status = "pass" if not result.errors else "fail"
120
+ artifact = result.artifact_path or result.project_dir or ""
121
+ lines.append(
122
+ f"| `{result.task_id}` | `{result.kind}` | {status} | "
123
+ f"{result.seconds:.2f} | {result.changed_files} | `{artifact}` |"
124
+ )
125
+ (run_dir / "summary.md").write_text("\n".join(lines) + "\n", encoding="utf-8")
126
+
127
+
128
+ def main() -> int:
129
+ parser = argparse.ArgumentParser(description=__doc__)
130
+ parser.add_argument("--out-dir", type=Path, default=ROOT / "runs/public-demo-pack")
131
+ parser.add_argument("--openai-base-url", default="http://127.0.0.1:18181/v1")
132
+ parser.add_argument("--model", default="kaiju-coder-7")
133
+ parser.add_argument("--api-key-env", default="KAIJU_EVAL_API_KEY")
134
+ parser.add_argument("--planner-timeout", type=int, default=120)
135
+ args = parser.parse_args()
136
+
137
+ run_dir = args.out_dir / utc_stamp()
138
+ run_dir.mkdir(parents=True, exist_ok=True)
139
+ results: list[DemoResult] = []
140
+ manifests: list[dict] = []
141
+
142
+ for task in DEMO_TASKS:
143
+ started = time.time()
144
+ task_dir = run_dir / task.task_id
145
+ try:
146
+ result = run_task(
147
+ task.prompt,
148
+ task_dir,
149
+ kind=task.kind,
150
+ openai_base_url=args.openai_base_url,
151
+ model=args.model,
152
+ api_key_env=args.api_key_env,
153
+ planner_timeout=args.planner_timeout,
154
+ )
155
+ seconds = time.time() - started
156
+ manifests.append(json.loads(result_to_json(result)))
157
+ results.append(
158
+ DemoResult(
159
+ task_id=task.task_id,
160
+ kind=task.kind,
161
+ seconds=round(seconds, 3),
162
+ task_type=result.task_type,
163
+ artifact_type=result.artifact_type,
164
+ artifact_path=str(result.artifact_path) if result.artifact_path else None,
165
+ project_dir=str(result.project_dir) if result.project_dir else None,
166
+ changed_files=len(result.changed_files),
167
+ errors=result.errors,
168
+ )
169
+ )
170
+ except Exception as exc:
171
+ seconds = time.time() - started
172
+ results.append(
173
+ DemoResult(
174
+ task_id=task.task_id,
175
+ kind=task.kind,
176
+ seconds=round(seconds, 3),
177
+ task_type=task.kind,
178
+ artifact_type="error",
179
+ artifact_path=None,
180
+ project_dir=None,
181
+ changed_files=0,
182
+ errors=[str(exc)],
183
+ )
184
+ )
185
+
186
+ write_summary(run_dir, results, manifests)
187
+ failed = [result for result in results if result.errors]
188
+ print(f"Demo summary: {run_dir / 'summary.md'}")
189
+ return 1 if failed else 0
190
+
191
+
192
+ if __name__ == "__main__":
193
+ raise SystemExit(main())
scripts/run_kaiju_public_opencode_smoke.py CHANGED
@@ -29,7 +29,7 @@ AGENT = "kaiju-coder-7"
29
  MODEL_ID = "kaiju-coder-7"
30
  EXPECTED_TEXT = "Kaiju Coder 7 public OpenCode smoke ok"
31
  DEFAULT_RUNS_DIR = ROOT / "runs/public-opencode-smoke"
32
- DEFAULT_BASE_URL = "http://100.109.109.14:18083/v1"
33
 
34
 
35
  @dataclass
 
29
  MODEL_ID = "kaiju-coder-7"
30
  EXPECTED_TEXT = "Kaiju Coder 7 public OpenCode smoke ok"
31
  DEFAULT_RUNS_DIR = ROOT / "runs/public-opencode-smoke"
32
+ DEFAULT_BASE_URL = "http://127.0.0.1:18181/v1"
33
 
34
 
35
  @dataclass