Commit 6b8fdf6 (verified), committed by hotcopy
1 parent: 8c3da01

Initial org card — RLM CLI, benchmarks, install

  1. README.md +241 -6
README.md CHANGED
@@ -1,10 +1,245 @@
1
  ---
2
- title: README
3
- emoji: 📉
4
- colorFrom: purple
5
- colorTo: green
6
  sdk: static
7
- pinned: false
8
  ---
9
 
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: HotCopy.ai
3
+ emoji: 🔥
4
+ colorFrom: red
5
+ colorTo: gray
6
  sdk: static
7
+ pinned: true
8
+ short_description: Recursive Language Model CLI on Cloudflare Workers.
9
+ models:
10
+ - moonshotai/Kimi-K2-Instruct
11
+ - google/gemma-2-27b-it
12
+ tags:
13
+ - recursive-language-models
14
+ - agentic-coding
15
+ - cloudflare-workers
16
+ - cli
17
+ - rlm
18
+ - arxiv:2512.24601
19
  ---
20
 
21
+ HotCopy is a Recursive Language Model CLI that runs on Cloudflare Workers. It offers unbounded context, parallel agent swarms, and a two-model architecture (Kimi K2.6 orchestrator plus Gemma 4 workers, 256K context each), with no API keys and no config.
22
+
23
+ <div class="hc-org-card">
24
+ <style>
25
+ .hc-org-card {
26
+ background: #0a0a0a;
27
+ color: #e8e8e8;
28
+ font-family: 'IBM Plex Mono', ui-monospace, SFMono-Regular, monospace;
29
+ font-weight: 400;
30
+ padding: 32px 24px;
31
+ border-radius: 8px;
32
+ margin: 16px 0 32px;
33
+ }
34
+ .hc-org-card .hc-stamp {
35
+ display: inline-block;
36
+ font-size: 12px;
37
+ letter-spacing: 0.12em;
38
+ color: #d4bf8a;
39
+ border: 1px solid #d4bf8a;
40
+ padding: 4px 8px;
41
+ margin-bottom: 24px;
42
+ }
43
+ .hc-org-card .hc-wordmark {
44
+ font-size: 30px;
45
+ font-weight: 600;
46
+ letter-spacing: -0.02em;
47
+ margin: 0 0 16px;
48
+ line-height: 1.1;
49
+ }
50
+ .hc-org-card .hc-wordmark .glyph {
51
+ color: #e05030;
52
+ }
53
+ .hc-org-card .hc-tagline {
54
+ font-size: 16px;
55
+ line-height: 1.5;
56
+ color: #e8e8e8;
57
+ max-width: 56ch;
58
+ margin: 0 0 24px;
59
+ }
60
+ .hc-org-card .hc-tagline em {
61
+ font-style: normal;
62
+ color: #d4bf8a;
63
+ }
64
+ .hc-org-card .hc-cta-row {
65
+ display: flex;
66
+ flex-wrap: wrap;
67
+ gap: 16px;
68
+ margin-top: 8px;
69
+ }
70
+ .hc-org-card .hc-cta {
71
+ display: inline-block;
72
+ font-size: 14px;
73
+ font-weight: 600;
74
+ letter-spacing: 0.04em;
75
+ padding: 12px 24px;
76
+ text-decoration: none;
77
+ border: 1px solid #e05030;
78
+ border-radius: 4px;
79
+ }
80
+ .hc-org-card .hc-cta-primary {
81
+ background: #e05030;
82
+ color: #0a0a0a;
83
+ }
84
+ .hc-org-card .hc-cta-secondary {
85
+ background: transparent;
86
+ color: #e05030;
87
+ }
88
+ .hc-org-card .hc-meta {
89
+ font-size: 12px;
90
+ color: #d4bf8a;
91
+ letter-spacing: 0.08em;
92
+ margin-top: 24px;
93
+ }
94
+ </style>
95
+ <div class="hc-stamp">▸ OPS-BRIEF / SECTOR: DEV-INFRA</div>
96
+ <h1 class="hc-wordmark"><span class="glyph">h.</span>HotCopy.ai</h1>
97
+ <p class="hc-tagline">Your codebase is too big for any context window. <em>So we killed the context window.</em> Managed recursive AI coding CLI — agent swarms decompose impossible problems in parallel.</p>
98
+ <div class="hc-cta-row">
99
+ <a class="hc-cta hc-cta-primary" href="https://hotcopy.ai">Install</a>
100
+ <a class="hc-cta hc-cta-secondary" href="https://github.com/hotcopyai/hotcopy">GitHub</a>
101
+ </div>
102
+ <div class="hc-meta">▸ UNBOUNDED CONTEXT // 50+ PARALLEL AGENTS // 0 API KEYS</div>
103
+ </div>
104
+
105
+ ## What HotCopy is
106
+
107
+ Managed recursive AI coding CLI. Agent swarms decompose impossible problems in parallel. No context limits. No API keys. No config. Just install and go. HotCopy treats your project as an *environment* the model interacts with programmatically — never raw files in a prompt.
108
+
109
+ ## The Recursive Language Model paradigm
110
+
111
+ HotCopy implements the Recursive Language Model (RLM) paradigm introduced by Zhang, Kraska, and Khattab at MIT CSAIL ([arXiv:2512.24601](https://arxiv.org/abs/2512.24601), Jan 2026). Existing scaffolds (Claude Code, Gemini CLI, Cursor) place the user prompt directly into the LLM's context window, generate output autoregressively (so output length is capped by token limits), and verbalize sub-LLM calls (so only O(1) delegations are possible). RLMs invert all three: the prompt is loaded as a REPL variable, the model writes code to interrogate it, sub-calls are programmatic functions inside that code, and the answer is built up in variables and returned via `FINAL_VAR()`, so it is not bounded by any single model's generation length.
112
+
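As a rough illustration of the loop described above, here is a minimal Python sketch (HotCopy's own REPL runs JavaScript, so this is not its code). The `run_rlm` function, the scripted `fake_model`, and the 200-character truncation limit are hypothetical stand-ins chosen for the example; they are not taken from HotCopy or from the paper.

```python
import contextlib
import io

MAX_STDOUT_CHARS = 200  # assumed truncation limit, for illustration only


def run_rlm(prompt, model_steps):
    """Run a scripted 'model' against a REPL that holds the prompt as a variable."""
    final = {}

    def FINAL_VAR(name):
        # The model names the variable that holds the answer.
        final["name"] = name

    env = {"context": prompt, "FINAL_VAR": FINAL_VAR}
    history = [f"context loaded: {len(prompt)} chars"]  # metadata only

    for code in model_steps:
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, env)  # a real system would sandbox this step
        # Only a truncated view of stdout is appended to the history.
        history.append(buf.getvalue()[:MAX_STDOUT_CHARS])
        if "name" in final:
            return env[final["name"]], history
    raise RuntimeError("model never called FINAL_VAR")


# A scripted stand-in for the model: each string is one turn of generated code.
fake_model = [
    "lines = context.splitlines()\nprint(len(lines))",
    "hits = [l for l in lines if 'ERROR' in l]\nprint(hits)",
    "answer = '\\n'.join(hits)\nFINAL_VAR('answer')",
]

prompt = "\n".join(
    f"{'ERROR' if i % 100 == 0 else 'ok'} line {i}" for i in range(5000)
)
answer, history = run_rlm(prompt, fake_model)
```

In this sketch the returned `answer` is longer than any single entry in `history`, because the long state lives in the REPL variable while the history only ever receives truncated stdout.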
113
+ ## Benchmarks vs base model
114
+
115
+ | Benchmark | Input Size | Base GPT-5 | RLM(GPT-5) | Avg Cost |
116
+ |---|---|---|---|---|
117
+ | BrowseComp+ | 6–11M tokens | **0%** | **91.3%** | $0.99 |
118
+ | OOLONG | 131K tokens | 44% | 56.5% | $0.43 |
119
+ | OOLONG-Pairs | 32K tokens | 0.04% | 58.0% | $0.33 |
120
+ | CodeQA | 23K–4.2M tokens | 24% | 62% | $0.11 |
121
+
122
+ Median RLM cost ≤ base model cost; up to 3× cheaper than summary-agent baselines. Source: Zhang, Kraska, Khattab. *Recursive Language Models.* MIT CSAIL. [arXiv:2512.24601](https://arxiv.org/abs/2512.24601), Jan 2026.
123
+
124
+ ## How it works
125
+
126
+ - **Prompt as environment.** Project context is loaded into Durable Object memory as REPL variables. The orchestrator sees only metadata (file count, repo map, total chars) — never raw files.
127
+ - **Two-model architecture.** Kimi K2.6 (`@cf/moonshotai/kimi-k2.6`) is the RLM root model with a 256K context window. Gemma 4 (`@cf/google/gemma-4-26b-a4b-it`) is every sub-call worker, also 256K, with vision.
128
+ - **REPL with stdout truncation.** The orchestrator writes JavaScript that runs in a sandboxed V8 isolate (`@cloudflare/codemode`). Stdout is truncated before being appended to history — variables hold the long state, the model history holds the metadata.
129
+ - **Parallel sub-calls via `llm_batch()`.** Unlike the paper's sequential implementation, HotCopy fans out scouts in parallel through Cloudflare Workers, dramatically reducing latency on map-reduce decompositions.
130
+ - **Durable Objects + AI Gateway.** Every model call routes through `hotcopy-gateway` for cost tracking, prefix caching (per-session affinity), and rate-limit handling. State lives in DO SQL — trajectories, sub-calls, costs, all logged for debugging.
131
+ - **`FINAL_VAR()` for unbounded output.** The model returns the *name* of the variable holding the answer. The reply is whatever that variable holds — no autoregressive output cap.
132
+
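The parallel fan-out described in the `llm_batch()` bullet can be pictured with a short Python sketch built on `asyncio.gather`. The stubbed `llm_query`, its simulated latency, the in-flight counter, and the `map_reduce` helper are all assumptions made for illustration; HotCopy's real sub-calls go through Cloudflare Workers and are not reproduced here.

```python
import asyncio

in_flight = 0       # sub-calls currently running
peak_in_flight = 0  # highest concurrency observed


async def llm_query(prompt: str) -> str:
    """Stub sub-call: pretends to be a worker model with some latency."""
    global in_flight, peak_in_flight
    in_flight += 1
    peak_in_flight = max(peak_in_flight, in_flight)
    await asyncio.sleep(0.01)  # simulated model latency
    in_flight -= 1
    return f"summary({prompt})"


async def llm_batch(prompts: list[str]) -> list[str]:
    """Fan out all prompts concurrently; results keep the input order."""
    return await asyncio.gather(*(llm_query(p) for p in prompts))


async def map_reduce(chunks: list[str]) -> str:
    # Map: one scout per chunk, all in flight at once.
    partials = await llm_batch([f"chunk {i}: {c}" for i, c in enumerate(chunks)])
    # Reduce: a single follow-up call over the partial results.
    return await llm_query(" | ".join(partials))


chunks = ["auth module", "billing module", "search module", "ui module"]
result = asyncio.run(map_reduce(chunks))
```

With four chunks, all four map calls overlap, so the wall-clock time of the map phase is roughly one sub-call rather than four.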
133
+ ## Architecture deep-dive
134
+
135
+ <details>
136
+ <summary><strong>REPL scope, three-tier memory, and the permissions model</strong></summary>
137
+
138
+ ### REPL scope (root code)
139
+
140
+ | Category | Functions |
141
+ |---|---|
142
+ | Context | `context.read(path)`, `context.readAll(paths)`, `context.search(pattern)`, `context.preview(chars)`, `context.metadata`, `context.files` |
143
+ | Sub-calls | `llm_query(prompt)`, `llm_batch(prompts)` |
144
+ | Agents | `spawn_worker(role, task, ctx)`, `vision_analyze(base64, prompt)` |
145
+ | Memory | `memory_search(query, limit)`, `memory_failures(query)`, `memory_save(type, content)` |
146
+ | File tools | `write_file(path, content)`, `edit_file(path, old, new)`, `shell(cmd, cwd)` |
147
+ | Verification | `verify(claim, evidence)`, `verify_claims(...)` |
148
+ | Persistent state | `set_context(label, content)`, `load_context(label)` |
149
+ | Terminal | `FINAL(text)`, `FINAL_VAR(varName)`, `print(...)` |
150
+
151
+ ### Three-tier memory
152
+
153
+ - **Working memory** — per-agent context window. Compaction at 70–75% utilization. Budget: 12% system, 18% tools, 25% retrieved context, 25% history, 5% user input, 15% output reserve.
154
+ - **Session memory** — `SWARM_STATE.md` task manifest, continuously updated by the orchestrator. Exploits transformer recency bias by placing the live task list at the end of context.
155
+ - **Project memory** — D1 + Vectorize. Observations, decisions, failures indexed for semantic search. AI-compressed after N turns. Exposed via MCP for external tool access.
156
+
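The working-memory budget is easier to sanity-check as arithmetic. The Python sketch below assumes a 256,000-token window (the text says only "256K") and uses the lower 70% bound as the compaction trigger; the names `BUDGET`, `token_budget`, and `needs_compaction` are hypothetical.

```python
CONTEXT_TOKENS = 256_000  # assumed reading of "256K"

# Budget shares from the working-memory bullet (percent of the window).
BUDGET = {
    "system": 12,
    "tools": 18,
    "retrieved_context": 25,
    "history": 25,
    "user_input": 5,
    "output_reserve": 15,
}

assert sum(BUDGET.values()) == 100  # the shares cover the whole window

token_budget = {name: CONTEXT_TOKENS * pct // 100 for name, pct in BUDGET.items()}


def needs_compaction(used_tokens: int, threshold_pct: int = 70) -> bool:
    """True once utilization reaches the compaction threshold."""
    return used_tokens * 100 >= CONTEXT_TOKENS * threshold_pct


for name, tokens in token_budget.items():
    print(f"{name:18s} {tokens:>7,d} tokens")
```

Under these assumptions the six shares sum to exactly 100%, and compaction would start once roughly 179,200 tokens are in use.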
157
+ ### Permissions
158
+
159
+ Default-deny on `fetch_*`, `browse`, `crawl`, `shell`. Wildcard rules in D1 (`permission_rules`) let users selectively allow patterns (`api.github.com/*`, `git status`, etc.) at three persistence tiers — session, user, or global. Every quarantined call hits a tier-cascade evaluator; first match wins; "ask" rules round-trip to the CLI for a per-call decision.
160
+
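A tier-cascade evaluator along these lines could look like the following Python sketch. The `Rule` shape, the session-then-user-then-global order, the tool names, and the use of `fnmatchcase` for wildcards are assumptions; the text states only default-deny, three persistence tiers, first match wins, and "ask" rules.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

TIER_ORDER = ("session", "user", "global")  # assumed cascade order


@dataclass(frozen=True)
class Rule:
    tier: str     # "session", "user", or "global"
    tool: str     # hypothetical tool name, e.g. "fetch" or "shell"
    pattern: str  # wildcard pattern matched against the call target
    action: str   # "allow", "deny", or "ask"


def evaluate(rules: list[Rule], tool: str, target: str) -> str:
    """Return the action of the first matching rule, cascading through tiers."""
    for tier in TIER_ORDER:
        for rule in rules:
            if (
                rule.tier == tier
                and rule.tool == tool
                and fnmatchcase(target, rule.pattern)
            ):
                return rule.action
    return "deny"  # default-deny when nothing matches


rules = [
    Rule("user", "fetch", "api.github.com/*", "allow"),
    Rule("session", "shell", "git status", "allow"),
    Rule("global", "shell", "git *", "ask"),
]
```

Here `evaluate(rules, "shell", "git status")` resolves at the session tier, `git push` falls through to the global "ask" rule, and anything unmatched is denied.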
161
+ ### Stack
162
+
163
+ Cloudflare Workers + Agents SDK · Durable Objects (RLMOrchestrator, WorkerAgent, VisionAgent, MemoryAgent) · D1 · Vectorize · R2 · Dynamic Workers (`@cloudflare/codemode`) · AI Gateway (`hotcopy-gateway`).
164
+
165
+ </details>
166
+
167
+ ## Install
168
+
169
+ ```bash
170
+ npm i -g hotcopy && hotcopy
171
+ ```
172
+
173
+ Auth with GitHub on first run. No API keys. No config files. Managed inference is included.
174
+
175
+ ## What we publish on Hugging Face
176
+
177
+ <div class="hf-grid">
178
+ <style>
179
+ .hc-org-card + .hf-grid, .hf-grid {
180
+ display: grid;
181
+ grid-template-columns: repeat(auto-fit, minmax(220px, 1fr));
182
+ gap: 16px;
183
+ margin: 24px 0;
184
+ }
185
+ .hf-grid .hf-card {
186
+ background: #0a0a0a;
187
+ color: #e8e8e8;
188
+ border: 1px solid #d4bf8a;
189
+ border-radius: 4px;
190
+ padding: 16px;
191
+ text-decoration: none;
192
+ font-family: 'IBM Plex Mono', ui-monospace, SFMono-Regular, monospace;
193
+ font-weight: 400;
194
+ }
195
+ .hf-grid .hf-card:hover {
196
+ border-color: #e05030;
197
+ }
198
+ .hf-grid .hf-card .hf-emoji {
199
+ font-size: 20px;
200
+ display: block;
201
+ margin-bottom: 8px;
202
+ }
203
+ .hf-grid .hf-card .hf-title {
204
+ font-size: 14px;
205
+ font-weight: 600;
206
+ color: #e05030;
207
+ display: block;
208
+ margin-bottom: 4px;
209
+ }
210
+ .hf-grid .hf-card .hf-sub {
211
+ font-size: 12px;
212
+ color: #d4bf8a;
213
+ display: block;
214
+ }
215
+ </style>
216
+ <a class="hf-card" href="https://huggingface.co/spaces/HotCopyAI/hotcopy-rlm-demo">
217
+ <span class="hf-emoji">🚀</span>
218
+ <span class="hf-title">Showcase Space</span>
219
+ <span class="hf-sub">Live RLM demo</span>
220
+ </a>
221
+ <a class="hf-card" href="https://huggingface.co/datasets/HotCopyAI/rlm-trajectories-seed">
222
+ <span class="hf-emoji">📊</span>
223
+ <span class="hf-title">RLM Trajectories</span>
224
+ <span class="hf-sub">Open dataset</span>
225
+ </a>
226
+ <a class="hf-card" href="https://hotcopy.ai">
227
+ <span class="hf-emoji">🌐</span>
228
+ <span class="hf-title">hotcopy.ai</span>
229
+ <span class="hf-sub">Product site</span>
230
+ </a>
231
+ <a class="hf-card" href="https://github.com/hotcopyai/hotcopy">
232
+ <span class="hf-emoji">💻</span>
233
+ <span class="hf-title">GitHub</span>
234
+ <span class="hf-sub">Source + issues</span>
235
+ </a>
236
+ <a class="hf-card" href="https://arxiv.org/abs/2512.24601">
237
+ <span class="hf-emoji">📄</span>
238
+ <span class="hf-title">Paper</span>
239
+ <span class="hf-sub">arXiv:2512.24601</span>
240
+ </a>
241
+ </div>
242
+
243
+ ---
244
+
245
+ License: MIT. Contact: hello@hotcopy.ai · [hotcopy.ai](https://hotcopy.ai)