---
title: OpenCode Environment Server
emoji: 🛠️
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
- openenv
short_description: OpenCode coding agent in an E2B sandbox with logprob capture
---
# OpenCode Environment for OpenEnv
`opencode_env` runs the [OpenCode](https://opencode.ai) coding agent inside
an isolated [E2B](https://e2b.dev) sandbox against any OpenAI-compatible
LLM endpoint, optionally capturing per-token logprobs for GRPO training.
**Try it live**: [`AdithyaSK/opencode-env`](https://huggingface.co/spaces/AdithyaSK/opencode-env)
The deployed Space exposes:
- **Web UI** at [`/web`](https://adithyask-opencode-env.hf.space/web) – pick an endpoint, write a task, hit Run, watch the live phase log + reward + logprobs.
- **MCP tool API** at [`/mcp`](https://adithyask-opencode-env.hf.space/mcp) – programmatic `run_rollout` calls.
- **OpenAPI docs** at [`/docs`](https://adithyask-opencode-env.hf.space/docs).
- **Health** at [`/health`](https://adithyask-opencode-env.hf.space/health).
The env is **task-agnostic** – every rollout is configured at call time
with a uniform Task shape:
- **`instruction`** – prompt for the agent
- **`setup`** – list of bash commands run *before* the agent (pip
  install, git clone, file downloads – anything you need staged in the
  sandbox)
- **`verify`** – list of bash commands run *after* the agent (asserts,
  pytest invocations, score-file writes)
Reward = `passed_verify / total_verify`, unless any `verify` command writes
a float to `/home/user/logs/verifier/reward.txt`, in which case that value
overrides the computed fraction.
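For example, a task that grants partial credit can compute its own score in `verify` and write it to the override path. A minimal sketch (the instruction, commands, and score here are illustrative):

```python
# Illustrative Task shape: a pass/fail check plus a custom reward override.
task = dict(
    instruction="Implement fizzbuzz.py in the working directory.",
    setup=["pip install pytest"],                  # staged before the agent runs
    verify=[
        "test -f /home/user/workdir/fizzbuzz.py",  # counts toward passed/total...
        # ...unless a command writes a float here, which overrides the fraction:
        "echo 0.75 > /home/user/logs/verifier/reward.txt",
    ],
)
```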
## Quick Start
### Async (default – talk to the deployed Space)
```python
import asyncio
import os
from opencode_env import OpenCodeEnv
from opencode_env.client import _extract_text
from opencode_env.models import RolloutResult
async def main():
    SPACE = "https://adithyask-opencode-env.hf.space"
    async with OpenCodeEnv(base_url=SPACE) as env:
        await env.reset()
        # The MCP tool returns JSON; deserialize via the typed model.
        raw = await env.call_tool(
            "run_rollout",
            endpoint="openai",  # vllm | openai | hf_router
            api_key=os.environ["OPENAI_API_KEY"],  # or set as a Space secret
            instruction=(
                "Create binary_search.py exposing def binary_search(arr, target) -> int "
                "that returns the index of target in arr, or -1 if absent. Use a "
                "relative path."
            ),
            setup=[],
            verify=[
                "test -f /home/user/workdir/binary_search.py",
                "python -c \"import sys; sys.path.insert(0, '/home/user/workdir'); "
                "import binary_search; "
                "assert binary_search.binary_search([1,2,3], 2) == 1; print('OK')\"",
            ],
            template="opencode-rl",  # prebaked E2B template
            task_id="binary_search_v1",
        )
        result = RolloutResult.model_validate_json(_extract_text(raw))
        print("reward:", result.reward)
        print("turns:", len(result.proxy_turns))
        print("files:", list(result.files.keys()))
        print("wall:", result.wall_s, "s")

asyncio.run(main())
```
Expected output (~20s with the prebaked template):
```
reward: 1.0
turns: 3
files: ['/home/user/workdir/binary_search.py', ...]
wall: 19.8 s
```
### Sync wrapper
```python
import os
from opencode_env import OpenCodeEnv
# .sync() returns a synchronous wrapper around the async client.
with OpenCodeEnv(base_url="https://adithyask-opencode-env.hf.space").sync() as env:
    env.reset()
    # MCP tools are reachable via env.call_tool(...) / env.step(...), sync-wrapped.
    # See the async example above for the full run_rollout signature.
```
Point `base_url` at `http://localhost:8000` to talk to a local container
instead of the public Space.
### In-process primitive (no HTTP)
For trainers that want to drive a sandbox directly without an HTTP boundary:
```python
import os
from opencode_env import (
    OpenCodeConfig, OpenCodeSessionFactory, OpenCodeTask, E2BSandboxBackend,
)

factory = OpenCodeSessionFactory(
    config=OpenCodeConfig(
        provider="openai_compatible",
        base_url="https://api.openai.com/v1",
        api_key=os.environ["OPENAI_API_KEY"],
        model="gpt-4o-mini",
    ),
    sandbox_backend=E2BSandboxBackend(),
    mode="transparent_proxy",  # captures per-token logprobs
)

session = factory.create(task=OpenCodeTask(instruction="..."))
session.wait_for_completion()
turns = session.fetch_proxy_trace()  # per-turn (tokens, logprobs)
session.close()
```
## Building the Docker Image
The Dockerfile lives at `server/Dockerfile`. Use the `openenv` CLI from
the env root:
```bash
cd envs/opencode_env
openenv validate # check pyproject.toml + openenv.yaml + server/app.py + uv.lock
openenv build -t opencode-env # builds the image (uses server/Dockerfile)
# run locally with E2B credentials
docker run -p 8000:8000 -e E2B_API_KEY=e2b_... opencode-env
# push to HF Spaces (Docker variant)
openenv push --repo-id <user>/opencode-env
```
Or build directly without the CLI:
```bash
docker build -t opencode-env -f envs/opencode_env/server/Dockerfile envs/opencode_env
```
The image:
- Runs `uvicorn server.app:app --host 0.0.0.0 --port 8000`
- Exposes the MCP API at `/mcp` and `/step`, the Gradio UI at `/web`,
health at `/health`, and OpenAPI docs at `/docs`.
- Reads `E2B_API_KEY` and (optionally) endpoint-specific env vars at
runtime (see [Environment Variables](#environment-variables)).
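As a quick smoke test against a running container (a sketch using `httpx`, which is not necessarily installed alongside this package):

```python
import httpx

# Assumes the container from `docker run` above is listening on localhost:8000.
resp = httpx.get("http://localhost:8000/health", timeout=10)
print(resp.status_code, resp.text)
```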
## The MCP Tool: `run_rollout`
Single tool, two ways to specify the LLM endpoint:
**Option A – endpoint shorthand (recommended)**: pass
`endpoint="vllm"` (or `"openai"` / `"hf_router"`). The server resolves
`base_url`, `api_key`, and `model` from env vars + catalog defaults.
Any explicit field overrides the catalog.
**Option B – fully explicit**: pass `base_url` + `api_key` + `model`
directly.
| Arg | Type | Default | Notes |
|---|---|---|---|
| `endpoint` | `str` | `""` | One of `"vllm"` / `"openai"` / `"hf_router"`. |
| `base_url` / `api_key` / `model` | `str` | `""` | Override / supply explicitly. |
| `instruction` | `str` | required | Prompt passed to `opencode run`. |
| `setup` | `list[str]` | `[]` | Bash commands run **before** the agent. |
| `verify` | `list[str]` | `[]` | Bash commands run **after** the agent. |
| `task_id` | `str` | `""` | Echoed back in result. |
| `mode` | `str` | `"transparent_proxy"` | Or `"black_box"` (no logprobs). |
| `disable_thinking` | `bool \| None` | `None` (catalog default) | Inject `chat_template_kwargs.enable_thinking=false`. |
| `max_tokens_cap` | `int` | `4096` | Per-turn `max_tokens` clamp. |
| `top_logprobs` | `int` | `5` | HF Router cap is 5; OpenAI 0–20; vLLM unbounded. |
| `agent_timeout_s` | `float` | `600.0` | Hard wall-clock budget for the opencode run. |
| `template` | `str` | `""` | E2B template name; `"opencode-rl"` skips ~2 min of install per rollout. |
Returns `RolloutResult` JSON with: `reward`, `setup_results[]`,
`verify_results[]`, `proxy_turns[]`, `files{}`, `agent_log_tail`,
`proxy_log_tail`, `wall_s`, `agent_exit_code`, `sandbox_id`, `error`.
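For example, an Option B call against a self-hosted vLLM server might look like this (inside the same `async with OpenCodeEnv(...)` block as the Quick Start; the URL and model name are placeholders):

```python
raw = await env.call_tool(
    "run_rollout",
    base_url="http://my-vllm-host:8000/v1",  # placeholder OAI-compatible server
    api_key="intercepted",                   # vLLM catalog default; use a real key if required
    model="Qwen/Qwen3-4B",                   # placeholder model id
    instruction="Write hello.py that prints 'hello'. Use a relative path.",
    verify=["test -f /home/user/workdir/hello.py"],
    mode="transparent_proxy",                # capture per-token logprobs
    top_logprobs=5,
)
```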
## Two Operating Modes
| Mode | What it does | Best for |
|---|---|---|
| **`transparent_proxy`** (default) | In-sandbox proxy at `localhost:7000` forwards opencode's LLM calls to `base_url`, injects `logprobs=true`, captures per-turn `(messages, completion_tokens, logprobs)` to `proxy_trace.jsonl`. | GRPO / RL training, observability, top-k distillation. |
| **`black_box`** | No proxy. opencode talks straight to `base_url`. | Smoke tests, eval, SFT data collection. |
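For GRPO-style training, a trainer typically reduces the captured trace to a per-sequence log-probability. A minimal sketch, assuming each entry in `proxy_turns` exposes a list of per-token logprob floats (the field name below is hypothetical; check `models.py` for the actual `RolloutTurn` schema):

```python
from opencode_env.models import RolloutResult

def sequence_logprob(result: RolloutResult) -> float:
    """Sum per-token logprobs across every LLM call in the rollout."""
    total = 0.0
    for turn in result.proxy_turns:  # one entry per proxied LLM call
        total += sum(turn.logprobs)  # hypothetical field: list[float], one per sampled token
    return total
```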
## Environment Variables
The server reads these at runtime. Local dev auto-loads them from a
sibling `.env` file; on HF Spaces, set them as **Space secrets**.
| Variable | Required | Purpose |
|---|---|---|
| `E2B_API_KEY` | **yes** for any rollout | E2B sandbox credentials. |
| `MAX_CONCURRENT_ENVS` | no | Env-instance pool size. Default `4`. |
| `ENABLE_WEB_INTERFACE` | no | Set `false` to disable the `/web` Gradio mount. Default `true`. |
| **vLLM endpoint** | | |
| `VLLM_URL` | required for `endpoint="vllm"` | OAI-compatible base URL. |
| `VLLM_API_KEY` | no | Defaults to `intercepted`. |
| `VLLM_MODEL` | no | Defaults to `Qwen/Qwen3-4B`. |
| **OpenAI endpoint** | | |
| `OPENAI_API_KEY` | required for `endpoint="openai"` | Standard OpenAI key. |
| `OPENAI_BASE_URL` | no | Defaults to `https://api.openai.com/v1`. |
| `OPENAI_MODEL` | no | Defaults to `gpt-4o-mini` (gpt-5.x and o-series refuse logprobs). |
| **HF Router endpoint** | | |
| `HF_ROUTER_API_KEY` | required for `endpoint="hf_router"` | HF user token. |
| `HF_ROUTER_BASE_URL` | no | Defaults to `https://router.huggingface.co/v1`. |
| `HF_ROUTER_MODEL` | no | Defaults to `Qwen/Qwen3-4B-Instruct-2507:nscale`. |
Pick `provider:` suffixes that actually return logprobs:
**Together / Nscale / Scaleway / SambaNova / Cerebras**. Avoid Novita /
Hyperbolic / Featherless (silent drop) and Groq (HTTP 400).
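A minimal preflight check before launching rollouts (illustrative; it simply mirrors the requirements in the table above):

```python
import os

ENDPOINT = "openai"  # or "vllm" / "hf_router"

# E2B_API_KEY is always required; each endpoint shorthand adds one more variable.
required = {"E2B_API_KEY"} | {
    "vllm": {"VLLM_URL"},
    "openai": {"OPENAI_API_KEY"},
    "hf_router": {"HF_ROUTER_API_KEY"},
}[ENDPOINT]

missing = sorted(v for v in required if not os.environ.get(v))
if missing:
    raise RuntimeError(f"Missing environment variables: {missing}")
```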
## Pre-baked E2B Template
The first rollout in a fresh E2B sandbox spends ~2 min installing
opencode and the proxy's Python deps. Build a one-time template that
ships those pre-installed:
```bash
.venv/bin/python envs/opencode_env/sandbox/build_template.py
# builds the `opencode-rl` template in your E2B account (~1m20s, one-time)
```
After this, pass `template="opencode-rl"` on every `run_rollout` call –
each rollout drops to ~20–30s end-to-end.
## Project Structure
```
opencode_env/
├── README.md                   # this file
├── openenv.yaml                # OpenEnv space spec
├── pyproject.toml              # deps + ``server`` entrypoint
├── uv.lock                     # frozen deps (required by ``openenv validate``)
├── .gitignore / .dockerignore  # excludes .env / __pycache__
├── __init__.py                 # re-exports primitive + client + models
│
├── client.py                   # OpenCodeEnv(MCPToolClient)
├── models.py                   # RolloutResult / RolloutTurn / OpenCodeState
│
├── config.py                   # OpenCodeConfig (primitive)
├── harness.py                  # OpenCodeSession / OpenCodeSessionFactory (CLI-only)
├── opencode_runtime.py         # opencode.json builder + cmds
├── task.py                     # OpenCodeTask
│
├── server/
│   ├── __init__.py
│   ├── app.py                  # FastAPI factory; mounts Gradio at /web
│   ├── opencode_environment.py # MCPEnvironment with single ``run_rollout`` tool
│   ├── gradio_ui.py            # the /web Gradio Blocks UI
│   ├── catalog.py              # endpoint shorthand resolver
│   └── Dockerfile              # multi-stage uv build (used by ``openenv build``)
│
└── sandbox/
    ├── __init__.py
    ├── base.py                 # SandboxBackend / SandboxHandle Protocols
    ├── e2b.py                  # E2B implementation
    ├── interception.py         # in-sandbox FastAPI proxy (logprob capture)
    └── build_template.py       # one-time E2B template builder
```
## References
- [OpenEnv docs](https://meta-pytorch.org/OpenEnv/)
- [OpenCode CLI](https://opencode.ai/docs/cli/)
- [E2B Python SDK](https://e2b.dev/docs)
- [HF Inference Providers logprob matrix](../../../DOCS/HF/hf_inference_providers_logprobs.md)