smysle committed
Commit 3ecfe58 · 0 Parent(s)

initial commit
Files changed (5)
  1. .dockerignore +7 -0
  2. Dockerfile +17 -0
  3. README.md +213 -0
  4. go.mod +3 -0
  5. main.go +1361 -0
.dockerignore ADDED
@@ -0,0 +1,7 @@
+ .git
+ .gitignore
+ *.md
+ deploy.sh
+ onyx2api.service
+ onyx2api
+ onyx2api-linux-amd64
Dockerfile ADDED
@@ -0,0 +1,17 @@
+ FROM golang:1.23-alpine AS builder
+ WORKDIR /src
+
+ COPY go.mod ./
+ COPY main.go ./
+
+ RUN CGO_ENABLED=0 go build -trimpath -ldflags="-s -w" -o /out/onyx2api .
+
+ FROM gcr.io/distroless/static-debian12:nonroot
+ WORKDIR /
+
+ COPY --from=builder /out/onyx2api /onyx2api
+
+ ENV PORT=7860
+ EXPOSE 7860
+
+ ENTRYPOINT ["/onyx2api"]
README.md ADDED
@@ -0,0 +1,213 @@
+ ---
+ title: ONYX
+ emoji: "🚀"
+ colorFrom: blue
+ colorTo: green
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ ---
+
+ # onyx2api
+
+ Wraps [Onyx Cloud](https://cloud.onyx.app) in standard OpenAI / Anthropic APIs. Written in Go: a single file, zero dependencies, and the build produces one binary.
+
+ In short: you have an Onyx key, and this turns it into an endpoint that any client speaking the OpenAI or Anthropic protocol can use directly. ChatGPT-Next-Web, LobeChat, Cursor, or any other client: enter the address and connect.
+
+ > [!NOTE]
+ > For learning and technical research only. Please comply with the Onyx terms of use and your local laws and regulations.
+
+ ## Getting it running
+
+ Three options; pick the one that suits you.
+
+ ### Build and run directly
+
+ ```bash
+ git clone https://github.com/smysle/onyx2api-go.git
+ cd onyx2api-go
+ go build -o onyx2api .
+
+ ONYX_KEYS="on_tenant_xxx" CLIENT_API_KEY="sk-456789" ./onyx2api
+ ```
+
+ Listens on `0.0.0.0:9898` by default; add `PORT=7860` to change the port.
+
+ ### Docker
+
+ ```bash
+ docker build -t onyx2api:latest .
+ docker run --rm -p 7860:7860 \
+   -e PORT=7860 \
+   -e ONYX_KEYS=on_tenant_xxx \
+   -e CLIENT_API_KEY=sk-456789 \
+   onyx2api:latest
+ ```
+
+ The image is distroless-based; the compiled binary is only a few MB and starts instantly.
+
+ ### Deploy to Hugging Face Spaces
+
+ The lowest-effort deployment: use HF's free containers and skip running your own server.
+
+ Reference instance: `https://huggingface.co/spaces/smyslenny/ONYX` (you can Duplicate it directly)
+
+ **Steps:**
+
+ 1. Push the code to your own Space:
+
+ ```bash
+ git clone https://github.com/smysle/onyx2api-go.git
+ cd onyx2api-go
+ git remote add hf https://huggingface.co/spaces/<your-username>/ONYX
+ git push hf main
+ ```
+
+ If you are asked for a password, use your HF username as Username and an Access Token (with write permission) as Password.
+
+ 2. In the Space's **Settings → Variables and secrets**, add two Secrets:
+
+ - `ONYX_KEYS` = your Onyx key(s), comma-separated if you have several
+ - `CLIENT_API_KEY` = a key you choose yourself, used by clients to authenticate
+
+ 3. Once the build finishes, the API base is `https://<your-username>-onyx.hf.space/v1`
+
+ > [!TIP]
+ > Anyone who Duplicates your Space can use it too, but Secrets are not copied; they must configure their own.
+
+ ## What it does
+
+ **Dual-protocol compatibility:** exposes both the OpenAI and the Anthropic interface at once, so one service does the work of two:
+
+ - `POST /v1/chat/completions`: OpenAI format, authenticated with `Authorization: Bearer xxx`
+ - `POST /v1/messages`: Anthropic format, authenticated with `x-api-key: xxx`
+
+ Both endpoints support streaming (SSE) and non-streaming responses.
+
+ **Multi-key rotation:** list several keys in `ONYX_KEYS` (comma-separated) and requests rotate through them automatically; the rotation uses an atomic counter, so it is lock-free and safe under concurrency.
+
+ **Automatic retries:** 502 / 503 / 504 / 429 responses are retried automatically, up to 3 times with increasing 2s → 5s → 10s delays. Retries also switch to the next key (when more than one is configured).
+
+ **Reasoning passthrough:** the reasoning (thinking) stream Onyx returns is translated correctly. The OpenAI format uses the `reasoning_content` field and the Anthropic format uses a `thinking` content block, so downstream clients can display the chain of thought normally.
+
+ **Tool-call visibility:** Onyx tool events (search, web fetching, code execution, image generation, deep research, and so on) are converted into readable text blocks inserted into the reply, so no context is lost.
+
+ ## Endpoints
+
+ | Method | Path | Purpose |
+ |------|------|------|
+ | `GET` | `/` | Service info: version number and endpoint list |
+ | `GET` | `/health` | Health check; also reports how many keys are loaded |
+ | `GET` | `/v1/models` | Model list (OpenAI format) |
+ | `POST` | `/v1/chat/completions` | OpenAI-compatible endpoint |
+ | `POST` | `/v1/messages` | Anthropic-compatible endpoint |
+
+ ### Try it with curl
+
+ Health check:
+
+ ```bash
+ curl http://127.0.0.1:9898/health
+ # {"status":"ok","version":"0.5.0","keys":1}
+ ```
+
+ OpenAI format:
+
+ ```bash
+ curl http://127.0.0.1:9898/v1/chat/completions \
+   -H "Authorization: Bearer sk-456789" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "gpt-4o",
+     "messages": [{"role":"user","content":"Hello"}],
+     "stream": false
+   }'
+ ```
+
+ Anthropic format:
+
+ ```bash
+ curl http://127.0.0.1:9898/v1/messages \
+   -H "x-api-key: sk-456789" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "claude-sonnet-4-6",
+     "messages": [{"role":"user","content":"Hello"}],
+     "stream": false
+   }'
+ ```
+
+ ## Client integration
+
+ Most clients that support a custom API base URL will work. Configuration:
+
+ | Setting | Local deployment | HF Spaces |
+ |--------|---------|-----------|
+ | Base URL | `http://localhost:9898/v1` | `https://<username>-onyx.hf.space/v1` |
+ | API Key | the `CLIENT_API_KEY` value you set | same |
+
+ With the `openai` library in Python:
+
+ ```python
+ from openai import OpenAI
+
+ client = OpenAI(
+     base_url="http://localhost:9898/v1",
+     api_key="sk-456789",
+ )
+
+ resp = client.chat.completions.create(
+     model="claude-sonnet-4-6",
+     messages=[{"role": "user", "content": "Hello"}],
+ )
+ print(resp.choices[0].message.content)
+ ```
+
+ ## Supported models
+
+ Mappings for the mainstream models are built in; just use the model name:
+
+ **Anthropic**
+ `claude-opus-4-6` · `claude-opus-4-5` · `claude-sonnet-4-6` · `claude-sonnet-4-5` · `claude-sonnet-4` · `claude-3-5-sonnet` · `claude-3-5-haiku` · `claude-3-opus` · `claude-haiku-4-5`
+
+ **OpenAI**
+ `gpt-4o` · `gpt-4o-mini` · `gpt-4-turbo` · `gpt-4` · `o1` · `o1-mini` · `o3-mini`
+
+ **Google**
+ `gemini-2.0-flash` · `gemini-2.5-pro`
+
+ Models outside this list also work: pass Onyx's raw `Provider__api__model_version` form, e.g. `OpenAI__api__gpt-4o`. A model name with no mapping is treated as Anthropic by default.
+
+ ## Environment variables
+
+ | Variable | Description | Default |
+ |------|------|------|
+ | `ONYX_KEYS` | Onyx API keys, comma-separated | **required** |
+ | `CLIENT_API_KEY` | Client authentication key | empty (no auth) |
+ | `CLIENT_API_KEYS` | Multiple client keys, comma-separated; if set, `CLIENT_API_KEY` is ignored | empty |
+ | `PORT` | Listen port | `9898` |
+ | `LISTEN_ADDR` | Full listen address, e.g. `0.0.0.0:8080`; if set, `PORT` is ignored | empty |
+
+ The request body may also carry a `persona_id` (integer) that maps to an Onyx Persona; the default is `1`.
+
+ ## FAQ
+
+ **Q: I Duplicated / forked someone's project and nothing works?**
+
+ Secrets do not travel with the code. Go to the Space Settings and set `ONYX_KEYS` and `CLIENT_API_KEY` yourself.
+
+ **Q: `/health` reports `keys` as 0?**
+
+ `ONYX_KEYS` was not read. Locally, check that the environment variable is passed correctly; on HF, check that the Space Secrets are configured.
+
+ **Q: Can I run without `CLIENT_API_KEY`?**
+
+ It runs, but do not do that on a public deployment. Without authentication anyone can call the service and drain your Onyx quota.
+
+ **Q: Streaming responses get cut off or time out?**
+
+ The default request timeout is 5 minutes. Particularly long responses (deep research, for example) can exceed it. Also check whether a reverse proxy in between limits connection duration.
+
+ **Q: What happens if I misspell the model name?**
+
+ No error: it is passed through to Onyx as an Anthropic model. Whether Onyx accepts it is up to Onyx.
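The model-mapping rules described above (built-in aliases, raw `Provider__api__model_version` passthrough, Anthropic fallback) can be sketched as a small Go program. The alias table here is a hypothetical two-entry excerpt for illustration, not the full built-in map:

```go
package main

import (
	"fmt"
	"strings"
)

// resolveModel mirrors the documented lookup order: known aliases map to a
// provider/version pair; unknown names in Onyx's raw
// "Provider__api__model_version" form are split apart; anything else is
// treated as an Anthropic model.
func resolveModel(model string) (provider, version string) {
	aliases := map[string][2]string{ // excerpt only
		"gpt-4o":            {"OpenAI", "gpt-4o"},
		"claude-sonnet-4-6": {"Anthropic", "claude-sonnet-4-6"},
	}
	if m, ok := aliases[model]; ok {
		return m[0], m[1]
	}
	if parts := strings.SplitN(model, "__", 3); len(parts) == 3 {
		return parts[0], parts[2]
	}
	return "Anthropic", model
}

func main() {
	fmt.Println(resolveModel("OpenAI__api__gpt-4o")) // OpenAI gpt-4o
	fmt.Println(resolveModel("unknown-model"))       // Anthropic unknown-model
}
```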
go.mod ADDED
@@ -0,0 +1,3 @@
+ module onyx2api
+
+ go 1.23.6
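The README's client-integration section shows a Python snippet; the same request can be sketched in stdlib-only Go. This sketch only assembles the request (base URL, key, and model are the same assumed example values used above) and leaves sending it to `http.DefaultClient`:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// buildChatRequest assembles the same POST the curl example sends to the
// OpenAI-compatible endpoint; baseURL and apiKey are assumed example values.
func buildChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	payload := map[string]any{
		"model":    model,
		"messages": []map[string]string{{"role": "user", "content": prompt}},
		"stream":   false,
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := buildChatRequest("http://localhost:9898/v1", "sk-456789", "gpt-4o", "Hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// Send with http.DefaultClient.Do(req) once the proxy is running.
}
```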
main.go ADDED
@@ -0,0 +1,1361 @@
+ package main
+
+ import (
+     "bufio"
+     "bytes"
+     "context"
+     "crypto/rand"
+     "encoding/hex"
+     "encoding/json"
+     "errors"
+     "fmt"
+     "io"
+     "log"
+     "net"
+     "net/http"
+     "os"
+     "strings"
+     "sync/atomic"
+     "time"
+ )
+
+ const (
+     onyxBase       = "https://cloud.onyx.app"
+     defaultPersona = 1
+     maxRetries     = 3
+     ver            = "0.5.0"
+ )
+
+ var (
+     retryBackoff = [3]time.Duration{2 * time.Second, 5 * time.Second, 10 * time.Second}
+     retryStatus  = map[int]bool{502: true, 503: true, 504: true, 429: true}
+     httpClient   = &http.Client{
+         Timeout: 5 * time.Minute,
+         Transport: &http.Transport{
+             Proxy: http.ProxyFromEnvironment,
+             DialContext: (&net.Dialer{
+                 Timeout:   30 * time.Second,
+                 KeepAlive: 30 * time.Second,
+             }).DialContext,
+             ForceAttemptHTTP2:     true,
+             MaxIdleConns:          200,
+             MaxIdleConnsPerHost:   100,
+             IdleConnTimeout:       90 * time.Second,
+             TLSHandshakeTimeout:   10 * time.Second,
+             ExpectContinueTimeout: 1 * time.Second,
+         },
+     }
+     keys       []string
+     clientKeys map[string]struct{}
+     keyIdx     uint64
+ )
+
+ // ── Key rotation ────────────────────────────────────────────────────────
+
+ func nextKey() string {
+     n := len(keys)
+     if n == 0 {
+         return ""
+     }
+     return keys[atomic.AddUint64(&keyIdx, 1)%uint64(n)]
+ }
+
+ func extractToken(s string) string {
+     s = strings.TrimSpace(s)
+     if s == "" {
+         return ""
+     }
+     if strings.HasPrefix(strings.ToLower(s), "bearer ") {
+         return strings.TrimSpace(s[7:])
+     }
+     return s
+ }
+
+ func resolveAuth(headers ...string) string {
+     if len(keys) > 0 {
+         return nextKey()
+     }
+     for _, h := range headers {
+         if h != "" {
+             return extractToken(h)
+         }
+     }
+     return ""
+ }
+
+ func checkClientAuth(r *http.Request) bool {
+     if len(clientKeys) == 0 {
+         return true
+     }
+     token := strings.TrimSpace(r.Header.Get("x-api-key"))
+     if token == "" {
+         token = extractToken(r.Header.Get("Authorization"))
+     }
+     if token == "" {
+         return false
+     }
+     _, ok := clientKeys[token]
+     return ok
+ }
+
102
+
103
+ type pm struct{ provider, version string }
104
+
105
+ var modelMap = map[string]pm{
106
+ "claude-opus-4-6": {"Anthropic", "claude-opus-4-6"},
107
+ "claude-opus-4-5": {"Anthropic", "claude-opus-4-5"},
108
+ "claude-sonnet-4-6": {"Anthropic", "claude-sonnet-4-6"},
109
+ "claude-sonnet-4-5": {"Anthropic", "claude-sonnet-4-5"},
110
+ "claude-sonnet-4": {"Anthropic", "claude-sonnet-4-20250514"},
111
+ "claude-3-5-sonnet": {"Anthropic", "claude-3-5-sonnet-20241022"},
112
+ "claude-3-5-haiku": {"Anthropic", "claude-3-5-haiku-20241022"},
113
+ "claude-3-opus": {"Anthropic", "claude-3-opus-20240229"},
114
+ "claude-haiku-4-5": {"Anthropic", "claude-haiku-4-5"},
115
+ "gpt-4o": {"OpenAI", "gpt-4o"},
116
+ "gpt-4o-mini": {"OpenAI", "gpt-4o-mini"},
117
+ "gpt-4-turbo": {"OpenAI", "gpt-4-turbo"},
118
+ "gpt-4": {"OpenAI", "gpt-4"},
119
+ "o1": {"OpenAI", "o1"},
120
+ "o1-mini": {"OpenAI", "o1-mini"},
121
+ "o3-mini": {"OpenAI", "o3-mini"},
122
+ "gemini-2.0-flash": {"Google", "gemini-2.0-flash"},
123
+ "gemini-2.5-pro": {"Google", "gemini-2.5-pro-preview-05-06"},
124
+ }
125
+
126
+ func buildLLMOverride(model string, temp float64) map[string]any {
127
+ m, ok := modelMap[model]
128
+ if !ok {
129
+ parts := strings.SplitN(model, "__", 3)
130
+ if len(parts) == 3 {
131
+ m = pm{parts[0], parts[2]}
132
+ } else {
133
+ m = pm{"Anthropic", model}
134
+ }
135
+ }
136
+ return map[string]any{"model_provider": m.provider, "model_version": m.version, "temperature": temp}
137
+ }
138
+
139
+ // ── Message conversion ──────────────────────────────────────────────────
140
+
141
+ func textContent(raw any) string {
142
+ switch v := raw.(type) {
143
+ case string:
144
+ return v
145
+ case []any:
146
+ var parts []string
147
+ for _, item := range v {
148
+ if m, ok := item.(map[string]any); ok && m["type"] == "text" {
149
+ if s, ok := m["text"].(string); ok {
150
+ parts = append(parts, s)
151
+ }
152
+ }
153
+ }
154
+ return strings.Join(parts, "\n")
155
+ default:
156
+ return fmt.Sprint(v)
157
+ }
158
+ }
159
+
160
+ type chatMsg struct {
161
+ Role string `json:"role"`
162
+ Content any `json:"content"`
163
+ ToolCallID string `json:"tool_call_id,omitempty"`
164
+ }
165
+
166
+ func messagesToOnyx(system string, msgs []chatMsg) string {
167
+ var conv []string
168
+ var lastUser string
169
+
170
+ for _, m := range msgs {
171
+ c := textContent(m.Content)
172
+ switch m.Role {
173
+ case "system":
174
+ if system == "" {
175
+ system = c
176
+ } else {
177
+ system += "\n" + c
178
+ }
179
+ case "user":
180
+ lastUser = c
181
+ conv = append(conv, "User: "+c)
182
+ case "assistant":
183
+ conv = append(conv, "Assistant: "+c)
184
+ case "tool":
185
+ tid := m.ToolCallID
186
+ if tid == "" {
187
+ tid = "unknown"
188
+ }
189
+ conv = append(conv, fmt.Sprintf("Tool result (%s): %s", tid, c))
190
+ }
191
+ }
192
+
193
+ if system == "" && len(conv) == 1 && len(msgs) == 1 && msgs[0].Role == "user" {
194
+ return lastUser
195
+ }
196
+ if system != "" && len(conv) == 1 && len(msgs) >= 1 {
197
+ return "[System: " + system + "]\n\n" + lastUser
198
+ }
199
+
200
+ var b strings.Builder
201
+ if system != "" {
202
+ b.WriteString("[System: " + system + "]\n\n")
203
+ }
204
+ b.WriteString(strings.Join(conv, "\n"))
205
+ return b.String()
206
+ }
207
+
208
+ // ── Helpers ─────────────────────────────────────────────────────────────
209
+
210
+ func genID(prefix string) string {
211
+ b := make([]byte, 15)
212
+ if _, err := rand.Read(b); err != nil {
213
+ return fmt.Sprintf("%s%d", prefix, time.Now().UnixNano())
214
+ }
215
+ return prefix + hex.EncodeToString(b)[:29]
216
+ }
217
+
218
+ func toStrSlice(v any) []string {
219
+ arr, ok := v.([]any)
220
+ if !ok {
221
+ return nil
222
+ }
223
+ var out []string
224
+ for _, item := range arr {
225
+ if s, ok := item.(string); ok {
226
+ out = append(out, s)
227
+ }
228
+ }
229
+ return out
230
+ }
231
+
232
+ func writeJSON(w http.ResponseWriter, status int, v any) {
233
+ w.Header().Set("Content-Type", "application/json")
234
+ w.WriteHeader(status)
235
+ json.NewEncoder(w).Encode(v)
236
+ }
237
+
238
+ func sleepOrDone(ctx context.Context, d time.Duration) bool {
239
+ t := time.NewTimer(d)
240
+ defer t.Stop()
241
+ select {
242
+ case <-ctx.Done():
243
+ return false
244
+ case <-t.C:
245
+ return true
246
+ }
247
+ }
248
+
249
+ // ── Core: send to Onyx with retry ──────────────────────────────────────
250
+
251
+ type onyxError struct {
252
+ Status int
253
+ Body string
254
+ }
255
+
256
+ func (e *onyxError) Error() string {
257
+ return fmt.Sprintf("Onyx HTTP %d: %s", e.Status, e.Body)
258
+ }
259
+
260
+ func doOnyxRequest(ctx context.Context, token, model string, msgs []chatMsg, system string, temp float64, persona int) (*http.Response, error) {
261
+ bodyBytes, err := json.Marshal(map[string]any{
262
+ "message": messagesToOnyx(system, msgs),
263
+ "chat_session_info": map[string]any{"persona_id": persona},
264
+ "llm_override": buildLLMOverride(model, temp),
265
+ "stream": true,
266
+ "file_descriptors": []any{},
267
+ "deep_research": false,
268
+ "origin": "api",
269
+ })
270
+ if err != nil {
271
+ return nil, fmt.Errorf("marshal onyx request: %w", err)
272
+ }
273
+
274
+ var lastErr error
275
+ for attempt := range maxRetries + 1 {
276
+ if err := ctx.Err(); err != nil {
277
+ return nil, err
278
+ }
279
+ cur := token
280
+ if attempt > 0 && len(keys) > 0 {
281
+ cur = nextKey()
282
+ log.Printf("Retry %d/%d, switched key", attempt, maxRetries)
283
+ }
284
+
285
+ req, err := http.NewRequestWithContext(
286
+ ctx,
287
+ http.MethodPost,
288
+ onyxBase+"/api/chat/send-chat-message",
289
+ bytes.NewReader(bodyBytes),
290
+ )
291
+ if err != nil {
292
+ return nil, fmt.Errorf("create onyx request: %w", err)
293
+ }
294
+ req.Header.Set("Content-Type", "application/json")
295
+ req.Header.Set("Authorization", "Bearer "+cur)
296
+
297
+ resp, err := httpClient.Do(req)
298
+ if err != nil {
299
+ if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
300
+ return nil, err
301
+ }
302
+ lastErr = err
303
+ if attempt < maxRetries {
304
+ wait := retryBackoff[min(attempt, len(retryBackoff)-1)]
305
+ log.Printf("Attempt %d failed (%v), retry in %v", attempt+1, err, wait)
306
+ if !sleepOrDone(ctx, wait) {
307
+ return nil, ctx.Err()
308
+ }
309
+ continue
310
+ }
311
+ break
312
+ }
313
+
314
+ if resp.StatusCode != 200 {
315
+ body, _ := io.ReadAll(io.LimitReader(resp.Body, 1024))
316
+ resp.Body.Close()
317
+ log.Printf("Onyx HTTP %d for model=%s: %s", resp.StatusCode, model, string(body))
318
+ if retryStatus[resp.StatusCode] && attempt < maxRetries {
319
+ wait := retryBackoff[min(attempt, len(retryBackoff)-1)]
320
+ log.Printf("Attempt %d failed (HTTP %d), retry in %v", attempt+1, resp.StatusCode, wait)
321
+ if !sleepOrDone(ctx, wait) {
322
+ return nil, ctx.Err()
323
+ }
324
+ continue
325
+ }
326
+ return nil, &onyxError{Status: resp.StatusCode, Body: string(body)}
327
+ }
328
+ return resp, nil
329
+ }
330
+ return nil, fmt.Errorf("all retries failed: %v", lastErr)
331
+ }
332
+
+ // ── Onyx NDJSON scanner ────────────────────────────────────────────────
+
+ type onyxScanner struct {
+     sc *bufio.Scanner
+ }
+
+ type onyxEvent struct {
+     Type string
+     Obj  map[string]any
+     Err  string
+ }
+
+ func newOnyxScanner(r io.Reader) *onyxScanner {
+     sc := bufio.NewScanner(r)
+     sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
+     return &onyxScanner{sc: sc}
+ }
+
+ func (s *onyxScanner) Next() (onyxEvent, bool) {
+     for s.sc.Scan() {
+         line := strings.TrimSpace(s.sc.Text())
+         if line == "" {
+             continue
+         }
+         var raw map[string]any
+         if json.Unmarshal([]byte(line), &raw) != nil {
+             continue
+         }
+         if raw["user_message_id"] != nil {
+             continue
+         }
+         if raw["error"] != nil && raw["obj"] == nil {
+             return onyxEvent{Type: "error", Err: fmt.Sprint(raw["error"])}, true
+         }
+         obj, _ := raw["obj"].(map[string]any)
+         if obj == nil {
+             continue
+         }
+         t, _ := obj["type"].(string)
+         return onyxEvent{Type: t, Obj: obj}, true
+     }
+     return onyxEvent{}, false
+ }
+
+ // ══════════════════════════════════════════════════════════════════════════
+ // OpenAI FORMAT
+ // ══════════════════════════════════════════════════════════════════════════
+
+ func sse(v any) []byte {
+     b, _ := json.Marshal(v)
+     return append(append([]byte("data: "), b...), '\n', '\n')
+ }
+
+ var sseDone = []byte("data: [DONE]\n\n")
+
+ func makeChunk(id string, created int64, model string, delta map[string]any, finish *string) map[string]any {
+     return map[string]any{
+         "id": id, "object": "chat.completion.chunk", "created": created, "model": model,
+         "choices": []any{map[string]any{"index": 0, "delta": delta, "finish_reason": finish}},
+     }
+ }
+
+ func streamOpenAI(w io.Writer, flush func(), body io.Reader, model, rid string) {
+     created := time.Now().Unix()
+     sentRole := false
+     toolActive := false
+     stop := "stop"
+     scan := newOnyxScanner(body)
+
+     emit := func(delta map[string]any, fin *string) {
+         w.Write(sse(makeChunk(rid, created, model, delta, fin)))
+         flush()
+     }
+     ensureRole := func(d map[string]any) {
+         if !sentRole {
+             d["role"] = "assistant"
+             sentRole = true
+         }
+     }
+
+     for {
+         ev, ok := scan.Next()
+         if !ok {
+             break
+         }
+         if ev.Err != "" {
+             emit(map[string]any{"content": "\n\n[Error: " + ev.Err + "]"}, &stop)
+             w.Write(sseDone)
+             flush()
+             return
+         }
+         obj := ev.Obj
+         switch ev.Type {
+         case "reasoning_start", "reasoning_done":
+         case "reasoning_delta":
+             if s, _ := obj["reasoning"].(string); s != "" {
+                 emit(map[string]any{"reasoning_content": s}, nil)
+             }
+         case "message_start":
+             if !sentRole {
+                 emit(map[string]any{"role": "assistant", "content": ""}, nil)
+                 sentRole = true
+             }
+         case "message_delta":
+             if c, _ := obj["content"].(string); c != "" {
+                 d := map[string]any{"content": c}
+                 ensureRole(d)
+                 emit(d, nil)
+             }
+         case "search_tool_start":
+             toolActive = true
+             label := "Internal Search"
+             if v, _ := obj["is_internet_search"].(bool); v {
+                 label = "Web Search"
+             }
+             d := map[string]any{"content": "\n[" + label + "] "}
+             ensureRole(d)
+             emit(d, nil)
+         case "search_tool_queries_delta":
+             if qs := toStrSlice(obj["queries"]); len(qs) > 0 {
+                 var q []string
+                 for _, s := range qs {
+                     q = append(q, "\u201c"+s+"\u201d")
+                 }
+                 emit(map[string]any{"content": "Searching: " + strings.Join(q, ", ") + "\n"}, nil)
+             }
+         case "search_tool_documents_delta":
+             if docs, ok := obj["documents"].([]any); ok && len(docs) > 0 {
+                 emit(map[string]any{"content": fmt.Sprintf("Found %d results.\n", len(docs))}, nil)
+             }
+         case "open_url_start":
+             toolActive = true
+             d := map[string]any{"content": "\n[Opening URL] "}
+             ensureRole(d)
+             emit(d, nil)
+         case "open_url_urls":
+             if urls := toStrSlice(obj["urls"]); len(urls) > 0 {
+                 emit(map[string]any{"content": strings.Join(urls, ", ") + "\n"}, nil)
+             }
+         case "open_url_documents":
+             if docs, ok := obj["documents"].([]any); ok && len(docs) > 0 {
+                 emit(map[string]any{"content": fmt.Sprintf("Loaded %d pages.\n", len(docs))}, nil)
+             }
+         case "image_generation_start":
+             toolActive = true
+             d := map[string]any{"content": "\n[Generating Image...]\n"}
+             ensureRole(d)
+             emit(d, nil)
+         case "image_generation_heartbeat":
+         case "image_generation_final":
+             if images, ok := obj["images"].([]any); ok {
+                 for _, img := range images {
+                     if m, ok := img.(map[string]any); ok {
+                         u, _ := m["url"].(string)
+                         p, _ := m["revised_prompt"].(string)
+                         if u != "" {
+                             emit(map[string]any{"content": fmt.Sprintf("![%s](%s)\n", p, u)}, nil)
+                         }
+                     }
+                 }
+             }
+         case "python_tool_start":
+             toolActive = true
+             code, _ := obj["code"].(string)
+             text := "\n[Code Interpreter]\n"
+             if code != "" {
+                 text += "```python\n" + code + "\n```\n"
+             }
+             d := map[string]any{"content": text}
+             ensureRole(d)
+             emit(d, nil)
+         case "python_tool_delta":
+             var parts []string
+             if s, _ := obj["stdout"].(string); s != "" {
+                 parts = append(parts, "Output: "+s)
+             }
+             if s, _ := obj["stderr"].(string); s != "" {
+                 parts = append(parts, "Error: "+s)
+             }
+             if len(parts) > 0 {
+                 emit(map[string]any{"content": strings.Join(parts, "\n") + "\n"}, nil)
+             }
+         case "custom_tool_start":
+             toolActive = true
+             tn, _ := obj["tool_name"].(string)
+             if tn == "" {
+                 tn = "custom_tool"
+             }
+             d := map[string]any{"content": "\n[Tool: " + tn + "]\n"}
+             ensureRole(d)
+             emit(d, nil)
+         case "custom_tool_delta":
+             if td := obj["data"]; td != nil {
+                 var text string
+                 if m, ok := td.(map[string]any); ok {
+                     b, _ := json.Marshal(m)
+                     text = string(b)
+                 } else {
+                     text = fmt.Sprint(td)
+                 }
+                 if text != "" {
+                     emit(map[string]any{"content": text + "\n"}, nil)
+                 }
+             }
+         case "file_reader_start":
+             toolActive = true
+             d := map[string]any{"content": "\n[Reading File...]\n"}
+             ensureRole(d)
+             emit(d, nil)
+         case "file_reader_result":
+             if fn, _ := obj["file_name"].(string); fn != "" {
+                 emit(map[string]any{"content": "Read: " + fn + "\n"}, nil)
+             }
+         case "deep_research_plan_start":
+             d := map[string]any{"content": "\n[Research Plan]\n"}
+             ensureRole(d)
+             emit(d, nil)
+         case "deep_research_plan_delta", "intermediate_report_start", "intermediate_report_delta":
+             if c, _ := obj["content"].(string); c != "" {
+                 emit(map[string]any{"content": c}, nil)
+             }
+         case "research_agent_start":
+             task, _ := obj["research_task"].(string)
+             emit(map[string]any{"content": "\n[Researching: " + task + "]\n"}, nil)
+         case "memory_tool_start", "memory_tool_delta", "memory_tool_no_access",
+             "citation_info", "top_level_branching", "intermediate_report_cited_docs", "tool_call_debug":
+         case "section_end":
+             if toolActive {
+                 emit(map[string]any{"content": "\n"}, nil)
+                 toolActive = false
+             }
+         case "stop":
+             emit(map[string]any{}, &stop)
+             w.Write(sseDone)
+             flush()
+             return
+         case "error":
+             msg, _ := obj["error"].(string)
+             if msg == "" {
+                 msg = "Unknown error"
+             }
+             emit(map[string]any{"content": "\n[Error: " + msg + "]"}, &stop)
+             w.Write(sseDone)
+             flush()
+             return
+         default:
+             if ev.Type != "" {
+                 log.Printf("Unknown SSE event: %s", ev.Type)
+             }
+         }
+     }
+     w.Write(sseDone)
+     flush()
+ }
+
+ func collectOpenAI(body io.Reader) string {
+     var parts, toolCtx []string
+     scan := newOnyxScanner(body)
+     for {
+         ev, ok := scan.Next()
+         if !ok {
+             break
+         }
+         if ev.Err != "" {
+             parts = append(parts, "\n[Error: "+ev.Err+"]")
+             break
+         }
+         obj := ev.Obj
+         switch ev.Type {
+         case "message_delta":
+             if c, _ := obj["content"].(string); c != "" {
+                 parts = append(parts, c)
+             }
+         case "search_tool_start":
+             label := "Internal Search"
+             if v, _ := obj["is_internet_search"].(bool); v {
+                 label = "Web Search"
+             }
+             toolCtx = append(toolCtx, "["+label+"]")
+         case "search_tool_queries_delta":
+             if qs := toStrSlice(obj["queries"]); len(qs) > 0 {
+                 toolCtx = append(toolCtx, "Searching: "+strings.Join(qs, ", "))
+             }
+         case "search_tool_documents_delta":
+             if docs, ok := obj["documents"].([]any); ok && len(docs) > 0 {
+                 toolCtx = append(toolCtx, fmt.Sprintf("Found %d results.", len(docs)))
+             }
+         case "open_url_start":
+             toolCtx = append(toolCtx, "[Opening URL]")
+         case "open_url_urls":
+             if urls := toStrSlice(obj["urls"]); len(urls) > 0 {
+                 toolCtx = append(toolCtx, strings.Join(urls, ", "))
+             }
+         case "python_tool_start":
+             code, _ := obj["code"].(string)
+             toolCtx = append(toolCtx, "[Code Interpreter]\n```python\n"+code+"\n```")
+         case "python_tool_delta":
+             if s, _ := obj["stdout"].(string); s != "" {
+                 toolCtx = append(toolCtx, "Output: "+s)
+             }
+             if s, _ := obj["stderr"].(string); s != "" {
+                 toolCtx = append(toolCtx, "Error: "+s)
+             }
+         case "image_generation_final":
+             if images, ok := obj["images"].([]any); ok {
+                 for _, img := range images {
+                     if m, ok := img.(map[string]any); ok {
+                         u, _ := m["url"].(string)
+                         p, _ := m["revised_prompt"].(string)
+                         if u != "" {
+                             toolCtx = append(toolCtx, fmt.Sprintf("![%s](%s)", p, u))
+                         }
+                     }
+                 }
+             }
+         case "custom_tool_start":
+             tn, _ := obj["tool_name"].(string)
+             toolCtx = append(toolCtx, "[Tool: "+tn+"]")
+         case "custom_tool_delta":
+             if td := obj["data"]; td != nil {
+                 toolCtx = append(toolCtx, fmt.Sprint(td))
+             }
+         case "error":
+             msg, _ := obj["error"].(string)
+             parts = append(parts, "\n[Error: "+msg+"]")
+         case "stop":
+             goto done
+         }
+     }
+ done:
+     var b strings.Builder
+     if len(toolCtx) > 0 {
+         b.WriteString(strings.Join(toolCtx, "\n") + "\n\n")
+     }
+     for _, p := range parts {
+         b.WriteString(p)
+     }
+     return b.String()
+ }
+
+ // ══════════════════════════════════════════════════════════════════════════
674
+ // Anthropic FORMAT
675
+ // ══════════════════════════════════════════════════════════════════════════
676
+
677
+ func anthropicSSE(event string, data any) []byte {
678
+ b, _ := json.Marshal(data)
679
+ return []byte("event: " + event + "\ndata: " + string(b) + "\n\n")
680
+ }
681
+
682
+ func streamAnthropic(w io.Writer, flush func(), body io.Reader, model, rid string) {
683
+ scan := newOnyxScanner(body)
684
+ blockIdx := 0
685
+ inThinking := false
686
+ inText := false
687
+
688
+ // message_start
689
+ w.Write(anthropicSSE("message_start", map[string]any{
690
+ "type": "message_start",
691
+ "message": map[string]any{
692
+ "id": rid, "type": "message", "role": "assistant", "model": model,
693
+ "content": []any{}, "stop_reason": nil,
694
+ "usage": map[string]any{"input_tokens": 0, "output_tokens": 0},
695
+ },
696
+ }))
697
+ flush()
698
+
699
+ startBlock := func(btype string) {
700
+ block := map[string]any{"type": btype}
701
+ if btype == "thinking" {
702
+ block["thinking"] = ""
703
+ } else {
704
+ block["text"] = ""
705
+ }
706
+ w.Write(anthropicSSE("content_block_start", map[string]any{
707
+ "type": "content_block_start", "index": blockIdx, "content_block": block,
708
+ }))
709
+ flush()
710
+ }
711
+ stopBlock := func() {
712
+ w.Write(anthropicSSE("content_block_stop", map[string]any{
713
+ "type": "content_block_stop", "index": blockIdx,
714
+ }))
715
+ flush()
716
+ blockIdx++
717
+ }
718
+ finishMsg := func(reason string) {
719
+ if inThinking {
720
+ stopBlock()
721
+ inThinking = false
722
+ }
723
+ if inText {
724
+ stopBlock()
725
+ inText = false
726
+ }
727
+ w.Write(anthropicSSE("message_delta", map[string]any{
728
+ "type": "message_delta",
729
+ "delta": map[string]any{"stop_reason": reason},
730
+ "usage": map[string]any{"output_tokens": 0},
731
+ }))
732
+ w.Write(anthropicSSE("message_stop", map[string]any{"type": "message_stop"}))
733
+ flush()
734
+ }
735
+
736
+ for {
737
+ ev, ok := scan.Next()
738
+ if !ok {
739
+ break
740
+ }
741
+ if ev.Err != "" {
742
+ if !inText {
743
+ startBlock("text")
744
+ inText = true
745
+ }
746
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
747
+ "type": "content_block_delta", "index": blockIdx,
748
+ "delta": map[string]any{"type": "text_delta", "text": "\n[Error: " + ev.Err + "]"},
749
+ }))
750
+ flush()
751
+ finishMsg("end_turn")
752
+ return
753
+ }
754
+ obj := ev.Obj
755
+ switch ev.Type {
756
+ case "reasoning_start":
757
+ if !inThinking {
758
+ startBlock("thinking")
759
+ inThinking = true
760
+ }
761
+ case "reasoning_delta":
762
+ if s, _ := obj["reasoning"].(string); s != "" {
763
+ if !inThinking {
764
+ startBlock("thinking")
765
+ inThinking = true
766
+ }
767
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
768
+ "type": "content_block_delta", "index": blockIdx,
769
+ "delta": map[string]any{"type": "thinking_delta", "thinking": s},
770
+ }))
771
+ flush()
772
+ }
773
+ case "reasoning_done":
774
+ if inThinking {
775
+ stopBlock()
776
+ inThinking = false
777
+ }
778
+ case "message_start":
779
+ if !inText {
780
+ if inThinking {
781
+ stopBlock()
782
+ inThinking = false
783
+ }
784
+ startBlock("text")
785
+ inText = true
786
+ }
787
+ case "message_delta":
788
+ if c, _ := obj["content"].(string); c != "" {
789
+ if !inText {
790
+ if inThinking {
791
+ stopBlock()
792
+ inThinking = false
793
+ }
794
+ startBlock("text")
795
+ inText = true
796
+ }
797
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
798
+ "type": "content_block_delta", "index": blockIdx,
799
+ "delta": map[string]any{"type": "text_delta", "text": c},
800
+ }))
801
+ flush()
802
+ }
803
+
804
+ // ── Tool events → text deltas ──
805
+ case "search_tool_start":
806
+ if !inText {
807
+ if inThinking {
808
+ stopBlock()
809
+ inThinking = false
810
+ }
811
+ startBlock("text")
812
+ inText = true
813
+ }
814
+ label := "Internal Search"
815
+ if v, _ := obj["is_internet_search"].(bool); v {
816
+ label = "Web Search"
817
+ }
818
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
819
+ "type": "content_block_delta", "index": blockIdx,
820
+ "delta": map[string]any{"type": "text_delta", "text": "\n[" + label + "] "},
821
+ }))
822
+ flush()
823
+ case "search_tool_queries_delta":
824
+ if qs := toStrSlice(obj["queries"]); len(qs) > 0 {
825
+ var q []string
826
+ for _, s := range qs {
827
+ q = append(q, "\u201c"+s+"\u201d")
828
+ }
829
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
830
+ "type": "content_block_delta", "index": blockIdx,
831
+ "delta": map[string]any{"type": "text_delta", "text": "Searching: " + strings.Join(q, ", ") + "\n"},
832
+ }))
833
+ flush()
834
+ }
835
+ case "search_tool_documents_delta":
836
+ if docs, ok := obj["documents"].([]any); ok && len(docs) > 0 {
837
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
838
+ "type": "content_block_delta", "index": blockIdx,
839
+ "delta": map[string]any{"type": "text_delta", "text": fmt.Sprintf("Found %d results.\n", len(docs))},
840
+ }))
841
+ flush()
842
+ }
843
+ case "open_url_start", "python_tool_start", "custom_tool_start",
844
+ "image_generation_start", "file_reader_start", "deep_research_plan_start":
845
+ if !inText {
846
+ if inThinking {
847
+ stopBlock()
848
+ inThinking = false
849
+ }
850
+ startBlock("text")
851
+ inText = true
852
+ }
853
+ var label string
854
+ switch ev.Type {
855
+ case "open_url_start":
856
+ label = "[Opening URL] "
857
+ case "python_tool_start":
858
+ code, _ := obj["code"].(string)
859
+ label = "[Code Interpreter]\n"
860
+ if code != "" {
861
+ label += "```python\n" + code + "\n```\n"
862
+ }
863
+ case "custom_tool_start":
864
+ tn, _ := obj["tool_name"].(string)
865
+ if tn == "" {
866
+ tn = "custom_tool"
867
+ }
868
+ label = "[Tool: " + tn + "]\n"
869
+ case "image_generation_start":
870
+ label = "[Generating Image...]\n"
871
+ case "file_reader_start":
872
+ label = "[Reading File...]\n"
873
+ case "deep_research_plan_start":
874
+ label = "[Research Plan]\n"
875
+ }
876
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
877
+ "type": "content_block_delta", "index": blockIdx,
878
+ "delta": map[string]any{"type": "text_delta", "text": "\n" + label},
879
+ }))
880
+ flush()
881
+ case "open_url_urls":
882
+ if urls := toStrSlice(obj["urls"]); len(urls) > 0 {
883
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
884
+ "type": "content_block_delta", "index": blockIdx,
885
+ "delta": map[string]any{"type": "text_delta", "text": strings.Join(urls, ", ") + "\n"},
886
+ }))
887
+ flush()
888
+ }
889
+ case "open_url_documents":
890
+ if docs, ok := obj["documents"].([]any); ok && len(docs) > 0 {
891
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
892
+ "type": "content_block_delta", "index": blockIdx,
893
+ "delta": map[string]any{"type": "text_delta", "text": fmt.Sprintf("Loaded %d pages.\n", len(docs))},
894
+ }))
895
+ flush()
896
+ }
897
+ case "python_tool_delta":
898
+ var parts []string
899
+ if s, _ := obj["stdout"].(string); s != "" {
900
+ parts = append(parts, "Output: "+s)
901
+ }
902
+ if s, _ := obj["stderr"].(string); s != "" {
903
+ parts = append(parts, "Error: "+s)
904
+ }
905
+ if len(parts) > 0 {
906
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
907
+ "type": "content_block_delta", "index": blockIdx,
908
+ "delta": map[string]any{"type": "text_delta", "text": strings.Join(parts, "\n") + "\n"},
909
+ }))
910
+ flush()
911
+ }
912
+ case "custom_tool_delta":
913
+ if td := obj["data"]; td != nil {
914
+ var text string
915
+ if m, ok := td.(map[string]any); ok {
916
+ b, _ := json.Marshal(m)
917
+ text = string(b)
918
+ } else {
919
+ text = fmt.Sprint(td)
920
+ }
921
+ if text != "" {
922
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
923
+ "type": "content_block_delta", "index": blockIdx,
924
+ "delta": map[string]any{"type": "text_delta", "text": text + "\n"},
925
+ }))
926
+ flush()
927
+ }
928
+ }
929
+ case "image_generation_final":
930
+ if images, ok := obj["images"].([]any); ok {
931
+ for _, img := range images {
932
+ if m, ok := img.(map[string]any); ok {
933
+ u, _ := m["url"].(string)
934
+ p, _ := m["revised_prompt"].(string)
935
+ if u != "" {
936
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
937
+ "type": "content_block_delta", "index": blockIdx,
938
+ "delta": map[string]any{"type": "text_delta", "text": fmt.Sprintf("![%s](%s)\n", p, u)},
939
+ }))
940
+ flush()
941
+ }
942
+ }
943
+ }
944
+ }
945
+ case "file_reader_result":
946
+ if fn, _ := obj["file_name"].(string); fn != "" {
947
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
948
+ "type": "content_block_delta", "index": blockIdx,
949
+ "delta": map[string]any{"type": "text_delta", "text": "Read: " + fn + "\n"},
950
+ }))
951
+ flush()
952
+ }
953
+ case "deep_research_plan_delta", "intermediate_report_start",
954
+ "intermediate_report_delta":
955
+ if c, _ := obj["content"].(string); c != "" {
956
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
957
+ "type": "content_block_delta", "index": blockIdx,
958
+ "delta": map[string]any{"type": "text_delta", "text": c},
959
+ }))
960
+ flush()
961
+ }
962
+ case "research_agent_start":
963
+ task, _ := obj["research_task"].(string)
964
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
965
+ "type": "content_block_delta", "index": blockIdx,
966
+ "delta": map[string]any{"type": "text_delta", "text": "\n[Researching: " + task + "]\n"},
967
+ }))
968
+ flush()
969
+ case "image_generation_heartbeat", "memory_tool_start", "memory_tool_delta",
970
+ "memory_tool_no_access", "citation_info", "top_level_branching",
971
+ "intermediate_report_cited_docs", "tool_call_debug":
972
+ case "section_end":
973
+ // skip
974
+ case "stop":
975
+ finishMsg("end_turn")
976
+ return
977
+ case "error":
978
+ msg, _ := obj["error"].(string)
979
+ if msg == "" {
980
+ msg = "Unknown error"
981
+ }
982
+ if !inText {
983
+ startBlock("text")
984
+ inText = true
985
+ }
986
+ w.Write(anthropicSSE("content_block_delta", map[string]any{
987
+ "type": "content_block_delta", "index": blockIdx,
988
+ "delta": map[string]any{"type": "text_delta", "text": "\n[Error: " + msg + "]"},
989
+ }))
990
+ flush()
991
+ finishMsg("end_turn")
992
+ return
993
+ default:
994
+ if ev.Type != "" {
995
+ log.Printf("Unknown SSE event: %s", ev.Type)
996
+ }
997
+ }
998
+ }
999
+ finishMsg("end_turn")
1000
+ }
+
+ func collectAnthropic(body io.Reader) (text, thinking string) {
+ 	var textParts, thinkParts, toolCtx []string
+ 	scan := newOnyxScanner(body)
+ 	for {
+ 		ev, ok := scan.Next()
+ 		if !ok {
+ 			break
+ 		}
+ 		if ev.Err != "" {
+ 			textParts = append(textParts, "\n[Error: "+ev.Err+"]")
+ 			break
+ 		}
+ 		obj := ev.Obj
+ 		switch ev.Type {
+ 		case "reasoning_delta":
+ 			if s, _ := obj["reasoning"].(string); s != "" {
+ 				thinkParts = append(thinkParts, s)
+ 			}
+ 		case "message_delta":
+ 			if c, _ := obj["content"].(string); c != "" {
+ 				textParts = append(textParts, c)
+ 			}
+ 		case "search_tool_start":
+ 			label := "Internal Search"
+ 			if v, _ := obj["is_internet_search"].(bool); v {
+ 				label = "Web Search"
+ 			}
+ 			toolCtx = append(toolCtx, "["+label+"]")
+ 		case "search_tool_queries_delta":
+ 			if qs := toStrSlice(obj["queries"]); len(qs) > 0 {
+ 				toolCtx = append(toolCtx, "Searching: "+strings.Join(qs, ", "))
+ 			}
+ 		case "search_tool_documents_delta":
+ 			if docs, ok := obj["documents"].([]any); ok && len(docs) > 0 {
+ 				toolCtx = append(toolCtx, fmt.Sprintf("Found %d results.", len(docs)))
+ 			}
+ 		case "open_url_start":
+ 			toolCtx = append(toolCtx, "[Opening URL]")
+ 		case "open_url_urls":
+ 			if urls := toStrSlice(obj["urls"]); len(urls) > 0 {
+ 				toolCtx = append(toolCtx, strings.Join(urls, ", "))
+ 			}
+ 		case "python_tool_start":
+ 			code, _ := obj["code"].(string)
+ 			toolCtx = append(toolCtx, "[Code Interpreter]\n```python\n"+code+"\n```")
+ 		case "python_tool_delta":
+ 			if s, _ := obj["stdout"].(string); s != "" {
+ 				toolCtx = append(toolCtx, "Output: "+s)
+ 			}
+ 			if s, _ := obj["stderr"].(string); s != "" {
+ 				toolCtx = append(toolCtx, "Error: "+s)
+ 			}
+ 		case "image_generation_final":
+ 			if images, ok := obj["images"].([]any); ok {
+ 				for _, img := range images {
+ 					if m, ok := img.(map[string]any); ok {
+ 						u, _ := m["url"].(string)
+ 						p, _ := m["revised_prompt"].(string)
+ 						if u != "" {
+ 							toolCtx = append(toolCtx, fmt.Sprintf("![%s](%s)", p, u))
+ 						}
+ 					}
+ 				}
+ 			}
+ 		case "custom_tool_start":
+ 			tn, _ := obj["tool_name"].(string)
+ 			toolCtx = append(toolCtx, "[Tool: "+tn+"]")
+ 		case "custom_tool_delta":
+ 			if td := obj["data"]; td != nil {
+ 				toolCtx = append(toolCtx, fmt.Sprint(td))
+ 			}
+ 		case "error":
+ 			msg, _ := obj["error"].(string)
+ 			textParts = append(textParts, "\n[Error: "+msg+"]")
+ 		case "stop":
+ 			goto done
+ 		}
+ 	}
+ done:
+ 	var tb strings.Builder
+ 	if len(toolCtx) > 0 {
+ 		tb.WriteString(strings.Join(toolCtx, "\n") + "\n\n")
+ 	}
+ 	for _, p := range textParts {
+ 		tb.WriteString(p)
+ 	}
+ 	return tb.String(), strings.Join(thinkParts, "")
+ }
+
+ // ══════════════════════════════════════════════════════════════════════════
+ // HTTP HANDLERS
+ // ══════════════════════════════════════════════════════════════════════════
+
+ func handleModels(w http.ResponseWriter, r *http.Request) {
+ 	models := []string{
+ 		"claude-opus-4-6", "claude-opus-4-5",
+ 		"claude-sonnet-4-6", "claude-sonnet-4-5", "claude-sonnet-4",
+ 		"claude-3-5-sonnet", "claude-3-5-haiku", "claude-haiku-4-5",
+ 		"gpt-4o", "gpt-4o-mini", "o1", "o3-mini",
+ 		"gemini-2.0-flash", "gemini-2.5-pro",
+ 	}
+ 	data := make([]any, len(models))
+ 	for i, m := range models {
+ 		data[i] = map[string]any{"id": m, "object": "model", "created": 1700000000, "owned_by": "onyx"}
+ 	}
+ 	writeJSON(w, 200, map[string]any{"object": "list", "data": data})
+ }
+
+ func handleHealth(w http.ResponseWriter, r *http.Request) {
+ 	writeJSON(w, 200, map[string]any{"status": "ok", "version": ver, "keys": len(keys)})
+ }
+
+ func handleRoot(w http.ResponseWriter, r *http.Request) {
+ 	if r.URL.Path != "/" {
+ 		http.NotFound(w, r)
+ 		return
+ 	}
+ 	writeJSON(w, 200, map[string]any{
+ 		"name":      "onyx2api",
+ 		"version":   ver,
+ 		"endpoints": []string{"/v1/chat/completions", "/v1/messages", "/v1/models", "/health"},
+ 	})
+ }
+
+ // ── OpenAI: POST /v1/chat/completions ──
+
+ type openaiReq struct {
+ 	Model       string    `json:"model"`
+ 	Messages    []chatMsg `json:"messages"`
+ 	Stream      bool      `json:"stream"`
+ 	Temperature *float64  `json:"temperature,omitempty"`
+ 	PersonaID   *int      `json:"persona_id,omitempty"`
+ }
+
+ func handleOpenAI(w http.ResponseWriter, r *http.Request) {
+ 	if r.Method != http.MethodPost {
+ 		writeJSON(w, 405, map[string]any{"error": "method not allowed"})
+ 		return
+ 	}
+ 	if !checkClientAuth(r) {
+ 		writeJSON(w, 401, map[string]any{"error": "invalid api key"})
+ 		return
+ 	}
+ 	var req openaiReq
+ 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+ 		writeJSON(w, 400, map[string]any{"error": "invalid JSON"})
+ 		return
+ 	}
+ 	token := resolveAuth(r.Header.Get("Authorization"))
+ 	if token == "" {
+ 		writeJSON(w, 401, map[string]any{"error": "No auth. Set ONYX_KEYS env var."})
+ 		return
+ 	}
+ 	if len(req.Messages) == 0 {
+ 		writeJSON(w, 400, map[string]any{"error": "messages is required"})
+ 		return
+ 	}
+
+ 	model := req.Model
+ 	if model == "" {
+ 		model = "claude-opus-4-6"
+ 	}
+ 	temp := 0.5
+ 	if req.Temperature != nil {
+ 		temp = *req.Temperature
+ 	}
+ 	persona := defaultPersona
+ 	if req.PersonaID != nil {
+ 		persona = *req.PersonaID
+ 	}
+
+ 	rid := genID("chatcmpl-")
+ 	resp, err := doOnyxRequest(r.Context(), token, model, req.Messages, "", temp, persona)
+ 	if err != nil {
+ 		if oe, ok := err.(*onyxError); ok {
+ 			writeJSON(w, oe.Status, map[string]any{"error": oe.Error(), "detail": oe.Body})
+ 		} else {
+ 			writeJSON(w, 502, map[string]any{"error": err.Error()})
+ 		}
+ 		return
+ 	}
+ 	defer resp.Body.Close()
+
+ 	if req.Stream {
+ 		w.Header().Set("Content-Type", "text/event-stream")
+ 		w.Header().Set("Cache-Control", "no-cache")
+ 		w.Header().Set("Connection", "keep-alive")
+ 		w.Header().Set("X-Accel-Buffering", "no")
+ 		flusher, ok := w.(http.Flusher)
+ 		if !ok {
+ 			writeJSON(w, 500, map[string]any{"error": "streaming unsupported"})
+ 			return
+ 		}
+ 		streamOpenAI(w, flusher.Flush, resp.Body, model, rid)
+ 	} else {
+ 		content := collectOpenAI(resp.Body)
+ 		writeJSON(w, 200, map[string]any{
+ 			"id": rid, "object": "chat.completion", "created": time.Now().Unix(), "model": model,
+ 			"choices": []any{map[string]any{
+ 				"index": 0, "message": map[string]any{"role": "assistant", "content": content},
+ 				"finish_reason": "stop",
+ 			}},
+ 			"usage": map[string]any{"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
+ 		})
+ 	}
+ }
+
+ // ── Anthropic: POST /v1/messages ──
+
+ type anthropicReq struct {
+ 	Model       string    `json:"model"`
+ 	Messages    []chatMsg `json:"messages"`
+ 	System      any       `json:"system,omitempty"` // string or []content_block
+ 	Stream      bool      `json:"stream"`
+ 	MaxTokens   int       `json:"max_tokens,omitempty"`
+ 	Temperature *float64  `json:"temperature,omitempty"`
+ 	PersonaID   *int      `json:"persona_id,omitempty"`
+ }
+
+ func handleAnthropic(w http.ResponseWriter, r *http.Request) {
+ 	if r.Method != http.MethodPost {
+ 		writeJSON(w, 405, map[string]any{"type": "error", "error": map[string]any{"type": "invalid_request_error", "message": "method not allowed"}})
+ 		return
+ 	}
+ 	if !checkClientAuth(r) {
+ 		writeJSON(w, 401, map[string]any{"type": "error", "error": map[string]any{"type": "authentication_error", "message": "invalid api key"}})
+ 		return
+ 	}
+ 	var req anthropicReq
+ 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+ 		writeJSON(w, 400, map[string]any{"type": "error", "error": map[string]any{"type": "invalid_request_error", "message": "invalid JSON"}})
+ 		return
+ 	}
+ 	token := resolveAuth(r.Header.Get("x-api-key"), r.Header.Get("Authorization"))
+ 	if token == "" {
+ 		writeJSON(w, 401, map[string]any{"type": "error", "error": map[string]any{"type": "authentication_error", "message": "No auth. Set ONYX_KEYS env var."}})
+ 		return
+ 	}
+ 	if len(req.Messages) == 0 {
+ 		writeJSON(w, 400, map[string]any{"type": "error", "error": map[string]any{"type": "invalid_request_error", "message": "messages is required"}})
+ 		return
+ 	}
+
+ 	model := req.Model
+ 	if model == "" {
+ 		model = "claude-opus-4-6"
+ 	}
+ 	temp := 0.5
+ 	if req.Temperature != nil {
+ 		temp = *req.Temperature
+ 	}
+ 	persona := defaultPersona
+ 	if req.PersonaID != nil {
+ 		persona = *req.PersonaID
+ 	}
+
+ 	system := textContent(req.System)
+
+ 	rid := genID("msg_")
+ 	resp, err := doOnyxRequest(r.Context(), token, model, req.Messages, system, temp, persona)
+ 	if err != nil {
+ 		if oe, ok := err.(*onyxError); ok {
+ 			writeJSON(w, oe.Status, map[string]any{"type": "error", "error": map[string]any{"type": "api_error", "message": oe.Error()}})
+ 		} else {
+ 			writeJSON(w, 502, map[string]any{"type": "error", "error": map[string]any{"type": "api_error", "message": err.Error()}})
+ 		}
+ 		return
+ 	}
+ 	defer resp.Body.Close()
+
+ 	if req.Stream {
+ 		w.Header().Set("Content-Type", "text/event-stream")
+ 		w.Header().Set("Cache-Control", "no-cache")
+ 		w.Header().Set("Connection", "keep-alive")
+ 		w.Header().Set("X-Accel-Buffering", "no")
+ 		flusher, ok := w.(http.Flusher)
+ 		if !ok {
+ 			writeJSON(w, 500, map[string]any{"type": "error", "error": map[string]any{"type": "api_error", "message": "streaming unsupported"}})
+ 			return
+ 		}
+ 		streamAnthropic(w, flusher.Flush, resp.Body, model, rid)
+ 	} else {
+ 		text, thinking := collectAnthropic(resp.Body)
+ 		var content []any
+ 		if thinking != "" {
+ 			content = append(content, map[string]any{"type": "thinking", "thinking": thinking})
+ 		}
+ 		content = append(content, map[string]any{"type": "text", "text": text})
+ 		writeJSON(w, 200, map[string]any{
+ 			"id": rid, "type": "message", "role": "assistant", "model": model,
+ 			"content": content,
+ 			"stop_reason": "end_turn",
+ 			"usage": map[string]any{"input_tokens": 0, "output_tokens": 0},
+ 		})
+ 	}
+ }
+
+ // ── Main ────────────────────────────────────────────────────────────────
+
+ func main() {
+ 	if env := os.Getenv("ONYX_KEYS"); env != "" {
+ 		for _, k := range strings.Split(env, ",") {
+ 			if k = strings.TrimSpace(k); k != "" {
+ 				keys = append(keys, k)
+ 			}
+ 		}
+ 	}
+ 	if env := firstNonEmpty(os.Getenv("CLIENT_API_KEYS"), os.Getenv("CLIENT_API_KEY")); env != "" {
+ 		clientKeys = map[string]struct{}{}
+ 		for _, k := range strings.Split(env, ",") {
+ 			if k = strings.TrimSpace(k); k != "" {
+ 				clientKeys[k] = struct{}{}
+ 			}
+ 		}
+ 	}
+ 	if len(clientKeys) > 0 && len(keys) == 0 {
+ 		log.Fatal("CLIENT_API_KEYS/CLIENT_API_KEY requires ONYX_KEYS to be set")
+ 	}
+ 	listenAddr := resolveListenAddr()
+
+ 	mux := http.NewServeMux()
+ 	mux.HandleFunc("/", handleRoot)
+ 	mux.HandleFunc("/v1/chat/completions", handleOpenAI)
+ 	mux.HandleFunc("/v1/messages", handleAnthropic)
+ 	mux.HandleFunc("/v1/models", handleModels)
+ 	mux.HandleFunc("/health", handleHealth)
+
+ 	srv := &http.Server{
+ 		Addr:              listenAddr,
+ 		Handler:           mux,
+ 		ReadHeaderTimeout: 10 * time.Second,
+ 		ReadTimeout:       60 * time.Second,
+ 		IdleTimeout:       120 * time.Second,
+ 	}
+ 	log.Printf("onyx2api v%s | onyx_keys=%d | client_keys=%d | listen=%s", ver, len(keys), len(clientKeys), listenAddr)
+ 	log.Fatal(srv.ListenAndServe())
+ }
+
+ func firstNonEmpty(vals ...string) string {
+ 	for _, v := range vals {
+ 		if strings.TrimSpace(v) != "" {
+ 			return v
+ 		}
+ 	}
+ 	return ""
+ }
+
+ func resolveListenAddr() string {
+ 	if addr := strings.TrimSpace(os.Getenv("LISTEN_ADDR")); addr != "" {
+ 		return addr
+ 	}
+ 	port := strings.TrimSpace(os.Getenv("PORT"))
+ 	if port == "" {
+ 		port = "9898"
+ 	}
+ 	if strings.Contains(port, ":") {
+ 		return port
+ 	}
+ 	return ":" + port
+ }