""" FastAPI application for the Dalaal Browser-Use Environment. Endpoints: - POST /reset: Reset the environment (pass task name in body) - POST /step: Execute a browser action - GET /state: Get current environment state - GET /schema: Get action/observation schemas - WS /ws: WebSocket endpoint for persistent sessions """ try: from openenv.core.env_server.http_server import create_app except Exception as e: raise ImportError( "openenv is required. Install with: uv sync" ) from e try: from ..models import DalaalEnvAction, DalaalEnvObservation from .dalaal_env_environment import DalaalEnvEnvironment except (ImportError, SystemError): try: from models import DalaalEnvAction, DalaalEnvObservation from server.dalaal_env_environment import DalaalEnvEnvironment except ImportError: from dalaal_env.models import DalaalEnvAction, DalaalEnvObservation from dalaal_env.server.dalaal_env_environment import DalaalEnvEnvironment app = create_app( DalaalEnvEnvironment, DalaalEnvAction, DalaalEnvObservation, env_name="dalaal_env", max_concurrent_envs=1, ) # ── Landing page & info endpoints ──────────────────────────────────── from fastapi.responses import HTMLResponse, JSONResponse try: from server.tasks import TASKS except ImportError: try: from .tasks import TASKS except (ImportError, SystemError): from dalaal_env.server.tasks import TASKS @app.get("/", response_class=HTMLResponse) async def landing_page(): task_rows = "" for tid in sorted(TASKS): t = TASKS[tid] task_rows += f""" {t.id} {t.description} {t.site_file} {t.max_steps} """ return f""" Dalaal Env — Browser-Use RL Environment

Dalaal Env

A reinforcement learning environment where LLM agents learn to navigate and interact with web pages through accessibility tree observations.

OpenEnv Framework | Playwright + CDP | {len(TASKS)} Tasks

Overview

{len(TASKS)} Browser Tasks
12 Mock Websites
7 Action Types
6 Benchmark Sources

Architecture

LLM Agent (Qwen / GPT / etc.)
Dalaal Env (FastAPI + OpenEnv)
Browser (Playwright + Chromium)

The agent observes a numbered accessibility tree (extracted via CDP) and emits structured actions (click, type, select, scroll, etc.). The environment executes actions in a headless browser and evaluates task-specific JavaScript success criteria.

API Endpoints

WS /ws — WebSocket for persistent sessions (primary)
POST /reset — Reset environment with a task
POST /step — Execute a browser action
GET /state — Get current observation
GET /tasks — List all available tasks (JSON)
GET /docs — Interactive API documentation (Swagger)

Action Space

Each action is a JSON object with action_type and relevant parameters:

Action | Parameters | Description
click | element_id | Click an element by its accessibility tree ID
type | element_id, text | Type text into an input field
select | element_id, text | Select a dropdown option by visible text
press_key | key | Press a keyboard key (Enter, Tab, etc.)
scroll | direction | Scroll the page (up/down)
wait | (none) | Wait for page to settle
done | (none) | Signal task completion

Available Tasks

Task ID | Description | Mock Site | Max Steps
{task_rows}

Reward Structure

+1.0 on task success  |  -0.01 per step penalty  |  Clamped to [0, 1]

Example: completing a task in 4 steps → reward = max(0, 1.0 - 0.04) = 0.96

Quick Start

Run inference against this environment:

API_BASE_URL=https://router.huggingface.co/v1 \\
MODEL_NAME=Qwen/Qwen3.5-27B \\
HF_TOKEN=hf_... \\
DALAAL_TASK=todo_add \\
python inference.py
""" @app.get("/tasks") async def list_tasks_endpoint(): return JSONResponse({ tid: {"description": t.description, "site_file": t.site_file, "max_steps": t.max_steps} for tid, t in sorted(TASKS.items()) }) def main(host: str = "0.0.0.0", port: int = 8000): """Entry point for direct execution.""" import uvicorn uvicorn.run(app, host=host, port=port) if __name__ == "__main__": main()