""" FastAPI application for the Dalaal Browser-Use Environment. Endpoints: - POST /reset: Reset the environment (pass task name in body) - POST /step: Execute a browser action - GET /state: Get current environment state - GET /schema: Get action/observation schemas - WS /ws: WebSocket endpoint for persistent sessions """ try: from openenv.core.env_server.http_server import create_app except Exception as e: raise ImportError( "openenv is required. Install with: uv sync" ) from e try: from ..models import DalaalEnvAction, DalaalEnvObservation from .dalaal_env_environment import DalaalEnvEnvironment except (ImportError, SystemError): try: from models import DalaalEnvAction, DalaalEnvObservation from server.dalaal_env_environment import DalaalEnvEnvironment except ImportError: from dalaal_env.models import DalaalEnvAction, DalaalEnvObservation from dalaal_env.server.dalaal_env_environment import DalaalEnvEnvironment app = create_app( DalaalEnvEnvironment, DalaalEnvAction, DalaalEnvObservation, env_name="dalaal_env", max_concurrent_envs=1, ) # ── Landing page & info endpoints ──────────────────────────────────── from fastapi.responses import HTMLResponse, JSONResponse try: from server.tasks import TASKS except ImportError: try: from .tasks import TASKS except (ImportError, SystemError): from dalaal_env.server.tasks import TASKS @app.get("/", response_class=HTMLResponse) async def landing_page(): task_rows = "" for tid in sorted(TASKS): t = TASKS[tid] task_rows += f"""
{t.id}
A reinforcement learning environment where LLM agents learn to navigate and interact with web pages through accessibility tree observations.
OpenEnv Framework | Playwright + CDP | 19 Tasks
The agent observes a numbered accessibility tree (extracted via CDP) and emits structured actions (click, type, select, scroll, etc.). The environment executes actions in a headless browser and evaluates task-specific JavaScript success criteria.
- /ws: WebSocket for persistent sessions (primary)
- /reset: Reset environment with a task
- /step: Execute a browser action
- /state: Get current observation
- /tasks: List all available tasks (JSON)
- /docs: Interactive API documentation (Swagger)

Each action is a JSON object with action_type and relevant parameters:
| Action | Parameters | Description |
|---|---|---|
| click | element_id | Click an element by its accessibility tree ID |
| type | element_id, text | Type text into an input field |
| select | element_id, text | Select a dropdown option by visible text |
| press_key | key | Press a keyboard key (Enter, Tab, etc.) |
| scroll | direction | Scroll the page (up/down) |
| wait | (none) | Wait for page to settle |
| done | (none) | Signal task completion |
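
The action table above can be turned into a small client-side helper. The following is a minimal sketch, not the environment's own validation: `make_action` and `REQUIRED_PARAMS` are hypothetical names introduced here, the required parameters are copied from the table, and the use of an integer `element_id` is an assumption based on the accessibility tree being numbered.

```python
import json
from typing import Any

# Required parameters per action type, copied from the action table.
# This mapping is illustrative; the server's schema (GET /schema) is authoritative.
REQUIRED_PARAMS: dict[str, tuple[str, ...]] = {
    "click": ("element_id",),
    "type": ("element_id", "text"),
    "select": ("element_id", "text"),
    "press_key": ("key",),
    "scroll": ("direction",),
    "wait": (),
    "done": (),
}


def make_action(action_type: str, **params: Any) -> dict[str, Any]:
    """Build an action dict, rejecting unknown types and missing parameters."""
    if action_type not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action_type: {action_type!r}")
    missing = [name for name in REQUIRED_PARAMS[action_type] if name not in params]
    if missing:
        raise ValueError(f"{action_type} is missing parameters: {missing}")
    if action_type == "scroll" and params["direction"] not in ("up", "down"):
        raise ValueError("scroll direction must be 'up' or 'down'")
    return {"action_type": action_type, **params}


# Example: type into element 7 (assumed integer ID), then serialize to JSON.
payload = json.dumps(make_action("type", element_id=7, text="buy milk"))
```

Checking parameters before sending keeps malformed actions from consuming a step, which matters given the per-step penalty.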
| Task ID | Description | Mock Site | Max Steps |
|---|---|---|---|
+1.0 on task success | -0.01 per step penalty | Clamped to [0, 1]
Example: completing a task in 4 steps → reward = max(0, 1.0 - 0.04) = 0.96
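
The reward rule above can be written as a short function. This is a sketch under stated assumptions: `compute_reward` is a hypothetical name, and it assumes the step penalty is subtracted from the success bonus once at the end of the episode and the result is then clamped, which matches the worked example but may differ from how the environment accumulates reward internally.

```python
def compute_reward(success: bool, steps: int, step_penalty: float = 0.01) -> float:
    """Return the clamped episode reward: success bonus minus per-step penalty."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # +1.0 on success, 0.0 otherwise (assumed), minus 0.01 for each step taken.
    raw = (1.0 if success else 0.0) - step_penalty * steps
    # Clamp to the documented [0, 1] range.
    return min(1.0, max(0.0, raw))


# The worked example: success in 4 steps gives 1.0 - 0.04.
example = compute_reward(success=True, steps=4)
```

Under this reading, a failed episode always scores 0 because the penalty alone is negative and is clamped away.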
Run inference against this environment:
API_BASE_URL=https://router.huggingface.co/v1 \\
MODEL_NAME=Qwen/Qwen3.5-27B \\
HF_TOKEN=hf_... \\
DALAAL_TASK=todo_add \\
python inference.py