| --- |
| title: HotCopy.ai |
| emoji: 🔥 |
| colorFrom: red |
| colorTo: gray |
| sdk: static |
| pinned: true |
| short_description: Recursive Language Model CLI on Cloudflare Workers. |
| tags: |
| - recursive-language-models |
| - agentic-coding |
| - cloudflare-workers |
| - cli |
| - rlm |
| - arxiv:2512.24601 |
| --- |
| |
| HotCopy is a Recursive Language Model CLI on Cloudflare Workers. Unbounded context, agent swarms in parallel, two-tier orchestrator/worker architecture (256K context per agent), no API keys, no config. |
|
|
| <div class="hc-org-card"> |
| <style> |
| .hc-org-card { |
| background: #0a0a0a; |
| color: #e8e8e8; |
| font-family: 'IBM Plex Mono', ui-monospace, SFMono-Regular, monospace; |
| font-weight: 400; |
| padding: 32px 24px; |
| border-radius: 8px; |
| margin: 16px 0 32px; |
| } |
| .hc-org-card .hc-stamp { |
| display: inline-block; |
| font-size: 12px; |
| letter-spacing: 0.12em; |
| color: #d4bf8a; |
| border: 1px solid #d4bf8a; |
| padding: 4px 8px; |
| margin-bottom: 24px; |
| } |
| .hc-org-card .hc-wordmark { |
| font-size: 30px; |
| font-weight: 600; |
| letter-spacing: -0.02em; |
| margin: 0 0 16px; |
| line-height: 1.1; |
| } |
| .hc-org-card .hc-wordmark .glyph { |
| color: #e05030; |
| } |
| .hc-org-card .hc-tagline { |
| font-size: 16px; |
| line-height: 1.5; |
| color: #e8e8e8; |
| max-width: 56ch; |
| margin: 0 0 24px; |
| } |
| .hc-org-card .hc-tagline em { |
| font-style: normal; |
| color: #d4bf8a; |
| } |
| .hc-org-card .hc-cta-row { |
| display: flex; |
| flex-wrap: wrap; |
| gap: 16px; |
| margin-top: 8px; |
| } |
| .hc-org-card .hc-cta { |
| display: inline-block; |
| font-size: 14px; |
| font-weight: 600; |
| letter-spacing: 0.04em; |
| padding: 12px 24px; |
| text-decoration: none; |
| border: 1px solid #e05030; |
| border-radius: 4px; |
| } |
| .hc-org-card .hc-cta-primary { |
| background: #e05030; |
| color: #0a0a0a; |
| } |
| .hc-org-card .hc-cta-secondary { |
| background: transparent; |
| color: #e05030; |
| } |
| .hc-org-card .hc-meta { |
| font-size: 12px; |
| color: #d4bf8a; |
| letter-spacing: 0.08em; |
| margin-top: 24px; |
| } |
| </style> |
| <div class="hc-stamp">▸ OPS-BRIEF / SECTOR: DEV-INFRA</div> |
| <h1 class="hc-wordmark"><span class="glyph">h.</span>HotCopy.ai</h1> |
| <p class="hc-tagline">Your codebase is too big for any context window. <em>So we killed the context window.</em> Managed recursive AI coding CLI: agent swarms decompose impossible problems in parallel.</p> |
| <div class="hc-cta-row"> |
| <a class="hc-cta hc-cta-primary" href="https://hotcopy.ai">Install</a> |
| </div> |
| <div class="hc-meta">▸ UNBOUNDED CONTEXT // 50+ PARALLEL AGENTS // 0 API KEYS</div> |
| </div> |
| |
| ## What HotCopy is |
|
|
| Managed recursive AI coding CLI. Agent swarms decompose impossible problems in parallel. No context limits. No API keys. No config. Just install and go. HotCopy treats your project as an *environment* the model interacts with programmatically, never as raw files in a prompt. |
|
|
| ## The Recursive Language Model paradigm |
|
|
| HotCopy implements the Recursive Language Model (RLM) paradigm introduced by Zhang, Kraska, and Khattab at MIT CSAIL ([arXiv:2512.24601](https://arxiv.org/abs/2512.24601), Jan 2026). Existing scaffolds (Claude Code, Gemini CLI, Cursor) place the user prompt directly into the LLM's context window, generate output autoregressively (capped by token limits), and verbalize sub-LLM calls (only O(1) delegations). RLMs invert this: the prompt is loaded as a REPL variable, the model writes code to interrogate it, sub-calls are programmatic functions inside that code, and the answer is built up in variables and returned via `FINAL_VAR()`, so it is not bounded by any single model's generation length. |
|
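The inversion described above can be illustrated with a minimal offline sketch. It is not HotCopy's implementation: `env`, `chunk`, and `rootCode` are hypothetical names, and `llm_query` is a synchronous stub standing in for a real sub-call to a worker model.

```javascript
// Sketch only: `env`, `chunk`, and `rootCode` are hypothetical names, and
// `llm_query` is a synchronous offline stub, not a real model call.

// The long prompt is a variable in the REPL, never pasted into model context.
const env = {
  prompt: Array.from(
    { length: 1000 },
    (_, i) => `line ${i}: status=${i % 250 === 0 ? "ERROR" : "ok"}`
  ).join("\n"),
};

// Stub sub-call: counts ERROR lines, as a worker model might be asked to do.
function llm_query(text) {
  return text.split("\n").filter((line) => line.includes("status=ERROR")).length;
}

// Split a string into pieces of at most `size` lines.
function chunk(text, size) {
  const lines = text.split("\n");
  const out = [];
  for (let i = 0; i < lines.length; i += size) {
    out.push(lines.slice(i, i + size).join("\n"));
  }
  return out;
}

// FINAL_VAR returns the *name* of a variable; the runtime looks it up later.
function FINAL_VAR(name) {
  return { finalVar: name };
}

// Code the root model might write: interrogate the prompt programmatically.
function rootCode(vars) {
  vars.pieces = chunk(vars.prompt, 100);
  vars.counts = vars.pieces.map((piece) => llm_query(piece)); // programmatic sub-calls
  vars.answer = `ERROR lines: ${vars.counts.reduce((a, b) => a + b, 0)}`;
  return FINAL_VAR("answer");
}

const signal = rootCode(env);
const reply = env[signal.finalVar]; // the reply is whatever the variable holds
console.log(reply); // prints "ERROR lines: 4"
```

The root code never prints the prompt itself; it only keeps intermediate state in variables and hands back a variable name, which is why the reply length does not depend on how much text the root model can generate.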
|
| ## Benchmarks vs base model |
|
|
| | Benchmark | Input Size | Base GPT-5 | RLM(GPT-5) | Avg Cost | |
| |---|---|---|---|---| |
| | BrowseComp+ | 6–11M tokens | **0%** | **91.3%** | $0.99 | |
| | OOLONG | 131K tokens | 44% | 56.5% | $0.43 | |
| | OOLONG-Pairs | 32K tokens | 0.04% | 58.0% | $0.33 | |
| | CodeQA | 23K–4.2M tokens | 24% | 62% | $0.11 | |
|
|
| Median RLM cost ≤ base model cost; up to 3× cheaper than summary-agent baselines. Source: Zhang, Kraska, Khattab. *Recursive Language Models.* MIT CSAIL. [arXiv:2512.24601](https://arxiv.org/abs/2512.24601), Jan 2026. |
|
|
| ## How it works |
|
|
| - **Prompt as environment.** Project context is loaded into Durable Object memory as REPL variables. The orchestrator sees only metadata (file count, repo map, total chars), never raw files. |
| - **Two-tier orchestrator/worker architecture.** The RLM root model owns a 256K orchestration window. Every sub-call routes to a separate worker model with its own 256K window and multimodal capability. This is managed for you, with no provider keys to wire up. |
| - **REPL with stdout truncation.** The orchestrator writes JavaScript that runs in a sandboxed V8 isolate (`@cloudflare/codemode`). Stdout is truncated before being appended to history: variables hold the long state, and the model history holds only the metadata. |
| - **Parallel sub-calls via `llm_batch()`.** Unlike the paper's sequential implementation, HotCopy fans out scouts in parallel through Cloudflare Workers, which reduces wall-clock latency on map-reduce decompositions. |
| - **Durable Objects + AI Gateway.** Every model call routes through `hotcopy-gateway` for cost tracking, prefix caching (per-session affinity), and rate-limit handling. State lives in DO SQL: trajectories, sub-calls, and costs are all logged for debugging. |
| - **`FINAL_VAR()` for unbounded output.** The model returns the *name* of the variable holding the answer. The reply is whatever that variable holds, with no autoregressive output cap. |
|
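Two of the mechanisms in the list above, stdout truncation and parallel fan-out, can be sketched in a few lines. This is an illustration under stated assumptions: `truncateStdout` and `MAX_STDOUT_CHARS` are hypothetical, the real truncation limit is not given in this document, and `llm_query` is a timer-based stub rather than a call to a worker model.

```javascript
// Sketch only: `truncateStdout` and `MAX_STDOUT_CHARS` are hypothetical, and
// `llm_query` is a stub; HotCopy's real limits and internals are not shown here.

const MAX_STDOUT_CHARS = 200; // assumed cap, for illustration

// Keep the head of long output and note how much was dropped.
function truncateStdout(text, max = MAX_STDOUT_CHARS) {
  if (text.length <= max) return text;
  return `${text.slice(0, max)}\n[truncated ${text.length - max} chars]`;
}

// Stub worker call that resolves after a short delay.
function llm_query(prompt) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`summary(${prompt})`), 20);
  });
}

// Fan out every prompt at once; Promise.all returns results in input order.
function llm_batch(prompts) {
  return Promise.all(prompts.map((prompt) => llm_query(prompt)));
}

const prompts = ["a.js", "b.js", "c.js"].map((file) => `Summarize ${file}`);

const batchDone = llm_batch(prompts).then((summaries) => {
  // Full results stay in a variable; only a truncated view would reach history.
  const stdout = truncateStdout(JSON.stringify(summaries), 40);
  console.log(stdout);
  return { summaries, stdout };
});
```

Because the three stubbed calls run concurrently, the batch takes roughly one call's delay rather than three; a sequential loop with `await` inside would take the sum.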
|
| ## Architecture deep-dive |
|
|
| <details> |
| <summary><strong>REPL scope, three-tier memory, and the permissions model</strong></summary> |
|
|
| ### REPL scope (root code) |
|
|
| | Category | Functions | |
| |---|---| |
| | Context | `context.read(path)`, `context.readAll(paths)`, `context.search(pattern)`, `context.preview(chars)`, `context.metadata`, `context.files` | |
| | Sub-calls | `llm_query(prompt)`, `llm_batch(prompts)` | |
| | Agents | `spawn_worker(role, task, ctx)`, `vision_analyze(base64, prompt)` | |
| | Memory | `memory_search(query, limit)`, `memory_failures(query)`, `memory_save(type, content)` | |
| | File tools | `write_file(path, content)`, `edit_file(path, old, new)`, `shell(cmd, cwd)` | |
| | Verification | `verify(claim, evidence)`, `verify_claims(...)` | |
| | Persistent state | `set_context(label, content)`, `load_context(label)` | |
| | Terminal | `FINAL(text)`, `FINAL_VAR(varName)`, `print(...)` | |
|
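As a rough illustration of how root code might combine a few of the functions in the table, the sketch below uses in-memory stand-ins. The function names come from the table; their return shapes and behavior are assumptions made for this example, and `llm_batch` is a synchronous stub here for brevity.

```javascript
// Sketch only: in-memory stand-ins for a few REPL-scope functions. Names come
// from the table above; return shapes and behavior are assumptions.
const files = {
  "src/auth.js": "export function login(user) { /* TODO: rate limit */ }",
  "src/db.js": "export function connect(url) { return url; }",
  "README.md": "# Demo project",
};

const context = {
  files: Object.keys(files),
  metadata: {
    fileCount: Object.keys(files).length,
    totalChars: Object.values(files).join("").length,
  },
  read: (path) => files[path],
  // Assumed to return the paths whose contents match the pattern.
  search: (pattern) =>
    Object.keys(files).filter((path) => new RegExp(pattern).test(files[path])),
};

// Stubbed batch sub-call; a real one would be asynchronous and reach workers.
const llm_batch = (prompts) => prompts.map((p) => `reviewed ${p.length} chars`);

// FINAL(text) is modeled here as simply wrapping the reply text.
const FINAL = (text) => ({ final: text });

// What root code might look like: metadata first, targeted reads second.
console.log(`files: ${context.metadata.fileCount}`);
const hits = context.search("TODO");
const reviews = llm_batch(hits.map((path) => `Review:\n${context.read(path)}`));
const report = hits.map((path, i) => `${path}: ${reviews[i]}`).join("\n");
const result = FINAL(report);
console.log(result.final);
```

The point of the pattern is that only the files matched by `context.search` are ever read and sent to sub-calls; the rest of the project stays out of every model's window.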
|
| ### Three-tier memory |
|
|
| - **Working memory**: per-agent context window. Compaction at 70–75% utilization. Budget: 12% system, 18% tools, 25% retrieved context, 25% history, 5% user input, 15% output reserve. |
| - **Session memory**: `SWARM_STATE.md` task manifest, continuously updated by the orchestrator. Exploits transformer recency bias by placing the live task list at the end of context. |
| - **Project memory**: D1 + Vectorize. Observations, decisions, failures indexed for semantic search. AI-compressed after N turns. Exposed via MCP for external tool access. |
|
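The working-memory budget can be turned into concrete token counts. The arithmetic below assumes "256K" means 256,000 tokens (it could equally mean 262,144) and uses the lower bound of the stated compaction range; the constant and function names are illustrative only.

```javascript
// Worked arithmetic for the working-memory budget. Assumes a 256,000-token
// window; all names here are illustrative, not HotCopy identifiers.
const WINDOW_TOKENS = 256000;

const BUDGET_PERCENT = {
  system: 12,
  tools: 18,
  retrievedContext: 25,
  history: 25,
  userInput: 5,
  outputReserve: 15,
};

// The six shares should account for the whole window.
const totalPercent = Object.values(BUDGET_PERCENT).reduce((a, b) => a + b, 0);

// Convert each percentage into a token allowance.
const budgetTokens = Object.fromEntries(
  Object.entries(BUDGET_PERCENT).map(([name, pct]) => [name, (WINDOW_TOKENS * pct) / 100])
);

// Compaction is described as triggering at 70-75% utilization; use 70% here.
const COMPACTION_PERCENT = 70;
function needsCompaction(usedTokens) {
  // Integer comparison avoids floating-point edge cases at the threshold.
  return usedTokens * 100 >= WINDOW_TOKENS * COMPACTION_PERCENT;
}

console.log(totalPercent);          // 100
console.log(budgetTokens.system);   // 30720
console.log(budgetTokens.history);  // 64000
console.log(needsCompaction(179200)); // true (exactly 70% of 256,000)
```

Under these assumptions the output reserve is 38,400 tokens, and compaction would start once an agent has used 179,200 tokens of its window.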
|
| ### Permissions |
|
|
| Default-deny on `fetch_*`, `browse`, `crawl`, `shell`. Wildcard rules in D1 (`permission_rules`) let users selectively allow patterns (`api.github.com/*`, `git status`, etc.) at three persistence tiers: session, user, or global. Every quarantined call hits a tier-cascade evaluator; first match wins; "ask" rules round-trip to the CLI for a per-call decision. |
|
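A tier-cascade evaluator of the kind described above might look like the following sketch. The rule shape, the session-then-user-then-global order, and the wildcard semantics are all assumptions for illustration; the document states only that rules use wildcards, tiers cascade, the first match wins, and the default is deny.

```javascript
// Sketch only: rule shape, tier order, and wildcard semantics are assumptions.
const TIER_ORDER = ["session", "user", "global"]; // assumed cascade order

// Convert a wildcard pattern like "api.github.com/*" into an anchored RegExp.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*");                 // "*" matches any run of characters
  return new RegExp(`^${escaped}$`);
}

// Walk the tiers in order; the first matching rule decides the outcome.
function evaluate(rules, tool, target) {
  for (const tier of TIER_ORDER) {
    for (const rule of rules) {
      const matches =
        rule.tier === tier &&
        rule.tool === tool &&
        wildcardToRegExp(rule.pattern).test(target);
      if (matches) return rule.action; // "allow" | "deny" | "ask"
    }
  }
  return "deny"; // default-deny when no rule matches
}

// Hypothetical rules, in the spirit of the examples in the text.
const rules = [
  { tier: "session", tool: "shell", pattern: "git status", action: "allow" },
  { tier: "user", tool: "browse", pattern: "api.github.com/*", action: "allow" },
  { tier: "global", tool: "shell", pattern: "rm *", action: "ask" },
];

console.log(evaluate(rules, "shell", "git status"));              // allow
console.log(evaluate(rules, "browse", "api.github.com/repos/x")); // allow
console.log(evaluate(rules, "shell", "rm -rf build"));            // ask
console.log(evaluate(rules, "shell", "curl example.com"));        // deny
```

Anchoring the pattern with `^` and `$` and escaping the dot keeps `api.github.com/*` from matching look-alike hosts; an "ask" result would be where a real system pauses and round-trips to the CLI.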
|
| ### Stack |
|
|
| Cloudflare Workers + Agents SDK · Durable Objects (RLMOrchestrator, WorkerAgent, VisionAgent, MemoryAgent) · D1 · Vectorize · R2 · Dynamic Workers (`@cloudflare/codemode`) · AI Gateway (`hotcopy-gateway`). |
|
|
| </details> |
|
|
| ## Install |
|
|
| ```bash |
| npm i -g hotcopy && hotcopy |
| ``` |
|
|
| Sign in on first run. No API keys. No config files. Managed inference is included. |
|
|
| ## What we publish on Hugging Face |
|
|
| <div class="hf-grid"> |
| <style> |
| .hc-org-card + .hf-grid, .hf-grid { |
| display: grid; |
| grid-template-columns: repeat(auto-fit, minmax(220px, 1fr)); |
| gap: 16px; |
| margin: 24px 0; |
| } |
| .hf-grid .hf-card { |
| background: #0a0a0a; |
| color: #e8e8e8; |
| border: 1px solid #d4bf8a; |
| border-radius: 4px; |
| padding: 16px; |
| text-decoration: none; |
| font-family: 'IBM Plex Mono', ui-monospace, SFMono-Regular, monospace; |
| font-weight: 400; |
| } |
| .hf-grid .hf-card:hover { |
| border-color: #e05030; |
| } |
| .hf-grid .hf-card .hf-emoji { |
| font-size: 20px; |
| display: block; |
| margin-bottom: 8px; |
| } |
| .hf-grid .hf-card .hf-title { |
| font-size: 14px; |
| font-weight: 600; |
| color: #e05030; |
| display: block; |
| margin-bottom: 4px; |
| } |
| .hf-grid .hf-card .hf-sub { |
| font-size: 12px; |
| color: #d4bf8a; |
| display: block; |
| } |
| </style> |
| <a class="hf-card" href="https://huggingface.co/spaces/HotCopyAI/hotcopy-rlm-demo"> |
| <span class="hf-emoji">π</span> |
| <span class="hf-title">Showcase Space</span> |
| <span class="hf-sub">Live RLM demo</span> |
| </a> |
| <a class="hf-card" href="https://huggingface.co/datasets/HotCopyAI/rlm-trajectories-seed"> |
| <span class="hf-emoji">π</span> |
| <span class="hf-title">RLM Trajectories</span> |
| <span class="hf-sub">Open dataset</span> |
| </a> |
| <a class="hf-card" href="https://hotcopy.ai"> |
| <span class="hf-emoji">π</span> |
| <span class="hf-title">hotcopy.ai</span> |
| <span class="hf-sub">Product site</span> |
| </a> |
| <a class="hf-card" href="https://arxiv.org/abs/2512.24601"> |
| <span class="hf-emoji">π</span> |
| <span class="hf-title">Paper</span> |
| <span class="hf-sub">arXiv:2512.24601</span> |
| </a> |
| </div> |
| |
| --- |
|
|
| Contact: hello@hotcopy.ai · [hotcopy.ai](https://hotcopy.ai) |
|
|