Spaces:
Sleeping
Sleeping
Upload folder using huggingface_hub
Browse files- Dockerfile +81 -0
- README.md +244 -4
- __init__.py +13 -0
- client.py +104 -0
- models.py +35 -0
- openenv.yaml +7 -0
- openenv_benchmark.egg-info/PKG-INFO +9 -0
- openenv_benchmark.egg-info/SOURCES.txt +14 -0
- openenv_benchmark.egg-info/dependency_links.txt +1 -0
- openenv_benchmark.egg-info/entry_points.txt +2 -0
- openenv_benchmark.egg-info/requires.txt +5 -0
- openenv_benchmark.egg-info/top_level.txt +1 -0
- pyproject.toml +43 -0
- server/__init__.py +12 -0
- server/app.py +74 -0
- server/benchmark_environment.py +153 -0
- server/requirements.txt +6 -0
- test_concurrency.py +98 -0
Dockerfile
ADDED
|
@@ -0,0 +1,81 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
# Multi-stage build using openenv-base
|
| 8 |
+
# This Dockerfile is flexible and works for both:
|
| 9 |
+
# - In-repo environments (with local OpenEnv sources)
|
| 10 |
+
# - Standalone environments (with openenv from PyPI/Git)
|
| 11 |
+
# The build script (openenv build) handles context detection and sets appropriate build args.
|
| 12 |
+
|
| 13 |
+
ARG BASE_IMAGE=ghcr.io/meta-pytorch/openenv-base:latest
|
| 14 |
+
FROM ${BASE_IMAGE} AS builder
|
| 15 |
+
|
| 16 |
+
WORKDIR /app
|
| 17 |
+
|
| 18 |
+
# Ensure git is available (required for installing dependencies from VCS)
|
| 19 |
+
RUN apt-get update && \
|
| 20 |
+
apt-get install -y --no-install-recommends git && \
|
| 21 |
+
rm -rf /var/lib/apt/lists/*
|
| 22 |
+
|
| 23 |
+
# Build argument to control whether we're building standalone or in-repo
|
| 24 |
+
ARG BUILD_MODE=in-repo
|
| 25 |
+
ARG ENV_NAME=benchmark
|
| 26 |
+
|
| 27 |
+
# Copy environment code (always at root of build context)
|
| 28 |
+
COPY . /app/env
|
| 29 |
+
|
| 30 |
+
# For in-repo builds, openenv is already vendored in the build context
|
| 31 |
+
# For standalone builds, openenv will be installed via pyproject.toml
|
| 32 |
+
WORKDIR /app/env
|
| 33 |
+
|
| 34 |
+
# Ensure uv is available (for local builds where base image lacks it)
|
| 35 |
+
RUN if ! command -v uv >/dev/null 2>&1; then \
|
| 36 |
+
curl -LsSf https://astral.sh/uv/install.sh | sh && \
|
| 37 |
+
mv /root/.local/bin/uv /usr/local/bin/uv && \
|
| 38 |
+
mv /root/.local/bin/uvx /usr/local/bin/uvx; \
|
| 39 |
+
fi
|
| 40 |
+
|
| 41 |
+
# Install dependencies using uv sync
|
| 42 |
+
# If uv.lock exists, use it; otherwise resolve on the fly
|
| 43 |
+
RUN --mount=type=cache,target=/root/.cache/uv \
|
| 44 |
+
if [ -f uv.lock ]; then \
|
| 45 |
+
uv sync --frozen --no-install-project --no-editable; \
|
| 46 |
+
else \
|
| 47 |
+
uv sync --no-install-project --no-editable; \
|
| 48 |
+
fi
|
| 49 |
+
|
| 50 |
+
RUN --mount=type=cache,target=/root/.cache/uv \
|
| 51 |
+
if [ -f uv.lock ]; then \
|
| 52 |
+
uv sync --frozen --no-editable; \
|
| 53 |
+
else \
|
| 54 |
+
uv sync --no-editable; \
|
| 55 |
+
fi
|
| 56 |
+
|
| 57 |
+
# Final runtime stage
|
| 58 |
+
FROM ${BASE_IMAGE}
|
| 59 |
+
|
| 60 |
+
WORKDIR /app
|
| 61 |
+
|
| 62 |
+
# Copy the virtual environment from builder
|
| 63 |
+
COPY --from=builder /app/env/.venv /app/.venv
|
| 64 |
+
|
| 65 |
+
# Copy the environment code
|
| 66 |
+
COPY --from=builder /app/env /app/env
|
| 67 |
+
|
| 68 |
+
# Set PATH to use the virtual environment
|
| 69 |
+
ENV PATH="/app/.venv/bin:$PATH"
|
| 70 |
+
|
| 71 |
+
# Set PYTHONPATH so imports work correctly
|
| 72 |
+
ENV PYTHONPATH="/app/env:$PYTHONPATH"
|
| 73 |
+
|
| 74 |
+
# Health check
|
| 75 |
+
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
|
| 76 |
+
CMD curl -f http://localhost:8000/health || exit 1
|
| 77 |
+
|
| 78 |
+
# Run the FastAPI server
|
| 79 |
+
# The module path is constructed to work with the /app/env structure
|
| 80 |
+
ENV ENABLE_WEB_INTERFACE=true
|
| 81 |
+
CMD ["sh", "-c", "cd /app/env && uvicorn server.app:app --host 0.0.0.0 --port 8000"]
|
README.md
CHANGED
|
@@ -1,10 +1,250 @@
|
|
| 1 |
---
|
| 2 |
-
title:
|
| 3 |
-
emoji:
|
| 4 |
-
colorFrom:
|
| 5 |
colorTo: blue
|
| 6 |
sdk: docker
|
| 7 |
pinned: false
|
|
|
|
|
|
|
|
|
|
|
|
|
| 8 |
---
|
| 9 |
|
| 10 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
---
|
| 2 |
+
title: Benchmark Environment Server
|
| 3 |
+
emoji: 🕹️
|
| 4 |
+
colorFrom: purple
|
| 5 |
colorTo: blue
|
| 6 |
sdk: docker
|
| 7 |
pinned: false
|
| 8 |
+
app_port: 8000
|
| 9 |
+
base_path: /web
|
| 10 |
+
tags:
|
| 11 |
+
- openenv
|
| 12 |
---
|
| 13 |
|
| 14 |
+
# Benchmark Environment
|
| 15 |
+
|
| 16 |
+
A test environment for benchmarking infrastructure and concurrency. Actions specify how many seconds to wait (sleep), making it ideal for testing parallel execution and server scaling. Returns server identity information to verify which instance handled each request.
|
| 17 |
+
|
| 18 |
+
## Quick Start
|
| 19 |
+
|
| 20 |
+
The simplest way to use the Benchmark environment is through the `BenchmarkEnv` class:
|
| 21 |
+
|
| 22 |
+
```python
|
| 23 |
+
from benchmark import BenchmarkAction, BenchmarkEnv
|
| 24 |
+
|
| 25 |
+
try:
|
| 26 |
+
# Create environment from Docker image
|
| 27 |
+
benchmarkenv = BenchmarkEnv.from_docker_image("benchmark-env:latest")
|
| 28 |
+
|
| 29 |
+
# Reset - get server identity
|
| 30 |
+
result = benchmarkenv.reset()
|
| 31 |
+
print(f"Host URL: {result.observation.host_url}")
|
| 32 |
+
print(f"PID: {result.observation.pid}")
|
| 33 |
+
print(f"Session Hash: {result.observation.session_hash}")
|
| 34 |
+
|
| 35 |
+
# Test concurrency with different wait times
|
| 36 |
+
wait_times = [0.5, 1.0, 2.0]
|
| 37 |
+
|
| 38 |
+
for seconds in wait_times:
|
| 39 |
+
result = benchmarkenv.step(BenchmarkAction(wait_seconds=seconds))
|
| 40 |
+
print(f"Waited: {result.observation.waited_seconds}s")
|
| 41 |
+
print(f" → Timestamp: {result.observation.timestamp}")
|
| 42 |
+
print(f" → Reward: {result.reward}")
|
| 43 |
+
print(f" → Server PID: {result.observation.pid}")
|
| 44 |
+
|
| 45 |
+
finally:
|
| 46 |
+
# Always clean up
|
| 47 |
+
benchmarkenv.close()
|
| 48 |
+
```
|
| 49 |
+
|
| 50 |
+
That's it! The `BenchmarkEnv.from_docker_image()` method handles:
|
| 51 |
+
- Starting the Docker container
|
| 52 |
+
- Waiting for the server to be ready
|
| 53 |
+
- Connecting to the environment
|
| 54 |
+
- Container cleanup when you call `close()`
|
| 55 |
+
|
| 56 |
+
## Testing Concurrency
|
| 57 |
+
|
| 58 |
+
The benchmark environment is designed to test concurrent execution:
|
| 59 |
+
|
| 60 |
+
```python
|
| 61 |
+
import asyncio
|
| 62 |
+
from benchmark import BenchmarkAction, BenchmarkEnv
|
| 63 |
+
|
| 64 |
+
async def parallel_requests():
|
| 65 |
+
# Connect to multiple servers or same server
|
| 66 |
+
clients = [
|
| 67 |
+
BenchmarkEnv(base_url="http://localhost:8000"),
|
| 68 |
+
BenchmarkEnv(base_url="http://localhost:8001"),
|
| 69 |
+
BenchmarkEnv(base_url="http://localhost:8002"),
|
| 70 |
+
]
|
| 71 |
+
|
| 72 |
+
# Reset all clients
|
| 73 |
+
for client in clients:
|
| 74 |
+
result = client.reset()
|
| 75 |
+
print(f"Server {result.observation.session_hash}: PID {result.observation.pid}")
|
| 76 |
+
|
| 77 |
+
# Send concurrent requests with different wait times
|
| 78 |
+
import concurrent.futures
|
| 79 |
+
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
|
| 80 |
+
futures = []
|
| 81 |
+
for i, client in enumerate(clients):
|
| 82 |
+
future = executor.submit(
|
| 83 |
+
client.step,
|
| 84 |
+
BenchmarkAction(wait_seconds=i + 1)
|
| 85 |
+
)
|
| 86 |
+
futures.append((client, future))
|
| 87 |
+
|
| 88 |
+
for client, future in futures:
|
| 89 |
+
result = future.result()
|
| 90 |
+
print(f"Server {result.observation.session_hash} waited {result.observation.waited_seconds}s")
|
| 91 |
+
|
| 92 |
+
# Clean up
|
| 93 |
+
for client in clients:
|
| 94 |
+
client.close()
|
| 95 |
+
```
|
| 96 |
+
|
| 97 |
+
## Building the Docker Image
|
| 98 |
+
|
| 99 |
+
Before using the environment, you need to build the Docker image:
|
| 100 |
+
|
| 101 |
+
```bash
|
| 102 |
+
# From project root
|
| 103 |
+
docker build -t benchmark-env:latest -f server/Dockerfile .
|
| 104 |
+
```
|
| 105 |
+
|
| 106 |
+
## Deploying to Hugging Face Spaces
|
| 107 |
+
|
| 108 |
+
You can easily deploy your OpenEnv environment to Hugging Face Spaces using the `openenv push` command:
|
| 109 |
+
|
| 110 |
+
```bash
|
| 111 |
+
# From the environment directory (where openenv.yaml is located)
|
| 112 |
+
openenv push
|
| 113 |
+
|
| 114 |
+
# Or specify options
|
| 115 |
+
openenv push --namespace my-org --private
|
| 116 |
+
```
|
| 117 |
+
|
| 118 |
+
The `openenv push` command will:
|
| 119 |
+
1. Validate that the directory is an OpenEnv environment (checks for `openenv.yaml`)
|
| 120 |
+
2. Prepare a custom build for Hugging Face Docker space (enables web interface)
|
| 121 |
+
3. Upload to Hugging Face (ensuring you're logged in)
|
| 122 |
+
|
| 123 |
+
### Prerequisites
|
| 124 |
+
|
| 125 |
+
- Authenticate with Hugging Face: The command will prompt for login if not already authenticated
|
| 126 |
+
|
| 127 |
+
### Options
|
| 128 |
+
|
| 129 |
+
- `--directory`, `-d`: Directory containing the OpenEnv environment (defaults to current directory)
|
| 130 |
+
- `--repo-id`, `-r`: Repository ID in format 'username/repo-name' (defaults to 'username/env-name' from openenv.yaml)
|
| 131 |
+
- `--base-image`, `-b`: Base Docker image to use (overrides Dockerfile FROM)
|
| 132 |
+
- `--private`: Deploy the space as private (default: public)
|
| 133 |
+
|
| 134 |
+
### Examples
|
| 135 |
+
|
| 136 |
+
```bash
|
| 137 |
+
# Push to your personal namespace (defaults to username/env-name from openenv.yaml)
|
| 138 |
+
openenv push
|
| 139 |
+
|
| 140 |
+
# Push to a specific repository
|
| 141 |
+
openenv push --repo-id my-org/my-env
|
| 142 |
+
|
| 143 |
+
# Push with a custom base image
|
| 144 |
+
openenv push --base-image ghcr.io/meta-pytorch/openenv-base:latest
|
| 145 |
+
|
| 146 |
+
# Push as a private space
|
| 147 |
+
openenv push --private
|
| 148 |
+
|
| 149 |
+
# Combine options
|
| 150 |
+
openenv push --repo-id my-org/my-env --base-image custom-base:latest --private
|
| 151 |
+
```
|
| 152 |
+
|
| 153 |
+
After deployment, your space will be available at:
|
| 154 |
+
`https://huggingface.co/spaces/<repo-id>`
|
| 155 |
+
|
| 156 |
+
The deployed space includes:
|
| 157 |
+
- **Web Interface** at `/web` - Interactive UI for exploring the environment
|
| 158 |
+
- **API Documentation** at `/docs` - Full OpenAPI/Swagger interface
|
| 159 |
+
- **Health Check** at `/health` - Container health monitoring
|
| 160 |
+
|
| 161 |
+
## Environment Details
|
| 162 |
+
|
| 163 |
+
### Action
|
| 164 |
+
**BenchmarkAction**: Contains a single field
|
| 165 |
+
- `wait_seconds` (float) - Seconds to wait/sleep before returning (default: 0.0)
|
| 166 |
+
|
| 167 |
+
### Observation
|
| 168 |
+
**BenchmarkObservation**: Contains server identity and timing information
|
| 169 |
+
- `host_url` (str) - The URL of the server that handled the request
|
| 170 |
+
- `pid` (int) - Process ID of the server
|
| 171 |
+
- `session_hash` (str) - Unique 16-character hash identifying this server session
|
| 172 |
+
- `waited_seconds` (float) - Actual seconds waited
|
| 173 |
+
- `timestamp` (float) - Unix timestamp when observation was created
|
| 174 |
+
- `reward` (float) - Reward based on wait time
|
| 175 |
+
- `done` (bool) - Always False for benchmark environment
|
| 176 |
+
- `metadata` (dict) - Additional info
|
| 177 |
+
|
| 178 |
+
### Reward
|
| 179 |
+
The reward is calculated as: `1.0 / (1.0 + wait_seconds)`
|
| 180 |
+
- 0 seconds → reward: 1.0
|
| 181 |
+
- 1 second → reward: 0.5
|
| 182 |
+
- 2 seconds → reward: 0.33
|
| 183 |
+
- Encourages faster responses
|
| 184 |
+
|
| 185 |
+
## Advanced Usage
|
| 186 |
+
|
| 187 |
+
### Connecting to an Existing Server
|
| 188 |
+
|
| 189 |
+
If you already have a Benchmark environment server running, you can connect directly:
|
| 190 |
+
|
| 191 |
+
```python
|
| 192 |
+
from benchmark import BenchmarkEnv, BenchmarkAction
|
| 193 |
+
|
| 194 |
+
# Connect to existing server
|
| 195 |
+
benchmarkenv = BenchmarkEnv(base_url="<ENV_HTTP_URL_HERE>")
|
| 196 |
+
|
| 197 |
+
# Use as normal
|
| 198 |
+
result = benchmarkenv.reset()
|
| 199 |
+
print(f"Connected to server: {result.observation.host_url}")
|
| 200 |
+
print(f"Session: {result.observation.session_hash}")
|
| 201 |
+
|
| 202 |
+
result = benchmarkenv.step(BenchmarkAction(wait_seconds=1.5))
|
| 203 |
+
print(f"Waited {result.observation.waited_seconds}s")
|
| 204 |
+
```
|
| 205 |
+
|
| 206 |
+
Note: When connecting to an existing server, `benchmarkenv.close()` will NOT stop the server.
|
| 207 |
+
|
| 208 |
+
## Development & Testing
|
| 209 |
+
|
| 210 |
+
### Direct Environment Testing
|
| 211 |
+
|
| 212 |
+
Test the environment logic directly without starting the HTTP server:
|
| 213 |
+
|
| 214 |
+
```bash
|
| 215 |
+
# From the server directory
|
| 216 |
+
python3 server/benchmark_environment.py
|
| 217 |
+
```
|
| 218 |
+
|
| 219 |
+
This verifies that:
|
| 220 |
+
- Environment resets correctly
|
| 221 |
+
- Step executes actions properly
|
| 222 |
+
- State tracking works
|
| 223 |
+
- Server identity is returned correctly
|
| 224 |
+
|
| 225 |
+
### Running Locally
|
| 226 |
+
|
| 227 |
+
Run the server locally for development:
|
| 228 |
+
|
| 229 |
+
```bash
|
| 230 |
+
uvicorn server.app:app --reload
|
| 231 |
+
```
|
| 232 |
+
|
| 233 |
+
## Project Structure
|
| 234 |
+
|
| 235 |
+
```
|
| 236 |
+
benchmark/
|
| 237 |
+
├── .dockerignore # Docker build exclusions
|
| 238 |
+
├── __init__.py # Module exports
|
| 239 |
+
├── README.md # This file
|
| 240 |
+
├── openenv.yaml # OpenEnv manifest
|
| 241 |
+
├── pyproject.toml # Project metadata and dependencies
|
| 242 |
+
├── uv.lock # Locked dependencies (generated)
|
| 243 |
+
├── client.py # BenchmarkEnv client implementation
|
| 244 |
+
├── models.py # Action and Observation models
|
| 245 |
+
└── server/
|
| 246 |
+
├── __init__.py # Server module exports
|
| 247 |
+
├── benchmark_environment.py # Core environment logic
|
| 248 |
+
├── app.py # FastAPI application
|
| 249 |
+
└── Dockerfile # Container image definition
|
| 250 |
+
```
|
__init__.py
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
"""Benchmark Environment - Test environment for infrastructure and concurrency benchmarking."""
|
| 8 |
+
|
| 9 |
+
from .client import BenchmarkEnv
|
| 10 |
+
from .models import BenchmarkAction, BenchmarkObservation
|
| 11 |
+
|
| 12 |
+
__all__ = ["BenchmarkAction", "BenchmarkObservation", "BenchmarkEnv"]
|
| 13 |
+
|
client.py
ADDED
|
@@ -0,0 +1,104 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
"""
|
| 8 |
+
Benchmark Environment HTTP Client.
|
| 9 |
+
|
| 10 |
+
This module provides the client for connecting to a Benchmark Environment server
|
| 11 |
+
over HTTP. Useful for testing concurrency and infrastructure.
|
| 12 |
+
"""
|
| 13 |
+
|
| 14 |
+
from typing import Dict
|
| 15 |
+
|
| 16 |
+
from openenv.core.client_types import StepResult
|
| 17 |
+
from openenv.core.env_server.types import State
|
| 18 |
+
from openenv.core.http_env_client import HTTPEnvClient
|
| 19 |
+
|
| 20 |
+
from .models import BenchmarkAction, BenchmarkObservation
|
| 21 |
+
|
| 22 |
+
|
| 23 |
+
class BenchmarkEnv(HTTPEnvClient[BenchmarkAction, BenchmarkObservation]):
|
| 24 |
+
"""
|
| 25 |
+
HTTP client for the Benchmark Environment.
|
| 26 |
+
|
| 27 |
+
This client connects to a BenchmarkEnvironment HTTP server and provides
|
| 28 |
+
methods to interact with it: reset(), step(), and state access.
|
| 29 |
+
|
| 30 |
+
Example:
|
| 31 |
+
>>> # Connect to a running server
|
| 32 |
+
>>> client = BenchmarkEnv(base_url="http://localhost:8000")
|
| 33 |
+
>>> result = client.reset()
|
| 34 |
+
>>> print(result.observation.host_url)
|
| 35 |
+
>>> print(result.observation.pid)
|
| 36 |
+
>>> print(result.observation.session_hash)
|
| 37 |
+
>>>
|
| 38 |
+
>>> # Test concurrency by waiting
|
| 39 |
+
>>> result = client.step(BenchmarkAction(wait_seconds=2.0))
|
| 40 |
+
>>> print(result.observation.waited_seconds)
|
| 41 |
+
|
| 42 |
+
Example with Docker:
|
| 43 |
+
>>> # Automatically start container and connect
|
| 44 |
+
>>> client = BenchmarkEnv.from_docker_image("benchmark-env:latest")
|
| 45 |
+
>>> result = client.reset()
|
| 46 |
+
>>> result = client.step(BenchmarkAction(wait_seconds=1.0))
|
| 47 |
+
"""
|
| 48 |
+
|
| 49 |
+
def _step_payload(self, action: BenchmarkAction) -> Dict:
|
| 50 |
+
"""
|
| 51 |
+
Convert BenchmarkAction to JSON payload for step request.
|
| 52 |
+
|
| 53 |
+
Args:
|
| 54 |
+
action: BenchmarkAction instance
|
| 55 |
+
|
| 56 |
+
Returns:
|
| 57 |
+
Dictionary representation suitable for JSON encoding
|
| 58 |
+
"""
|
| 59 |
+
return {
|
| 60 |
+
"wait_seconds": action.wait_seconds,
|
| 61 |
+
}
|
| 62 |
+
|
| 63 |
+
def _parse_result(self, payload: Dict) -> StepResult[BenchmarkObservation]:
|
| 64 |
+
"""
|
| 65 |
+
Parse server response into StepResult[BenchmarkObservation].
|
| 66 |
+
|
| 67 |
+
Args:
|
| 68 |
+
payload: JSON response from server
|
| 69 |
+
|
| 70 |
+
Returns:
|
| 71 |
+
StepResult with BenchmarkObservation
|
| 72 |
+
"""
|
| 73 |
+
obs_data = payload.get("observation", {})
|
| 74 |
+
observation = BenchmarkObservation(
|
| 75 |
+
host_url=obs_data.get("host_url", ""),
|
| 76 |
+
pid=obs_data.get("pid", 0),
|
| 77 |
+
session_hash=obs_data.get("session_hash", ""),
|
| 78 |
+
waited_seconds=obs_data.get("waited_seconds", 0.0),
|
| 79 |
+
timestamp=obs_data.get("timestamp", 0.0),
|
| 80 |
+
done=payload.get("done", False),
|
| 81 |
+
reward=payload.get("reward"),
|
| 82 |
+
metadata=obs_data.get("metadata", {}),
|
| 83 |
+
)
|
| 84 |
+
|
| 85 |
+
return StepResult(
|
| 86 |
+
observation=observation,
|
| 87 |
+
reward=payload.get("reward"),
|
| 88 |
+
done=payload.get("done", False),
|
| 89 |
+
)
|
| 90 |
+
|
| 91 |
+
def _parse_state(self, payload: Dict) -> State:
|
| 92 |
+
"""
|
| 93 |
+
Parse server response into State object.
|
| 94 |
+
|
| 95 |
+
Args:
|
| 96 |
+
payload: JSON response from /state endpoint
|
| 97 |
+
|
| 98 |
+
Returns:
|
| 99 |
+
State object with episode_id and step_count
|
| 100 |
+
"""
|
| 101 |
+
return State(
|
| 102 |
+
episode_id=payload.get("episode_id"),
|
| 103 |
+
step_count=payload.get("step_count", 0),
|
| 104 |
+
)
|
models.py
ADDED
|
@@ -0,0 +1,35 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
"""
|
| 8 |
+
Data models for the Benchmark Environment.
|
| 9 |
+
|
| 10 |
+
The benchmark environment is designed for testing concurrency and infrastructure.
|
| 11 |
+
Actions specify a wait time in seconds, allowing testing of parallel execution.
|
| 12 |
+
"""
|
| 13 |
+
|
| 14 |
+
from pydantic import Field
|
| 15 |
+
|
| 16 |
+
from openenv.core.env_server.types import Action, Observation
|
| 17 |
+
|
| 18 |
+
|
| 19 |
+
class BenchmarkAction(Action):
|
| 20 |
+
"""Action for the Benchmark environment - specifies seconds to wait."""
|
| 21 |
+
|
| 22 |
+
wait_seconds: float = Field(default=0.0, ge=0.0, description="Seconds to wait/sleep")
|
| 23 |
+
|
| 24 |
+
|
| 25 |
+
class BenchmarkObservation(Observation):
|
| 26 |
+
"""Observation from the Benchmark environment with server identity info."""
|
| 27 |
+
|
| 28 |
+
# Server identity
|
| 29 |
+
host_url: str = Field(default="", description="URL of the server that handled the request")
|
| 30 |
+
pid: int = Field(default=0, description="Process ID of the server")
|
| 31 |
+
session_hash: str = Field(default="", description="Unique hash identifying this server session")
|
| 32 |
+
|
| 33 |
+
# Timing info
|
| 34 |
+
waited_seconds: float = Field(default=0.0, description="Actual seconds waited")
|
| 35 |
+
timestamp: float = Field(default=0.0, description="Unix timestamp when observation was created")
|
openenv.yaml
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
spec_version: 1
|
| 2 |
+
name: benchmark
|
| 3 |
+
type: space
|
| 4 |
+
runtime: fastapi
|
| 5 |
+
app: server.app:app
|
| 6 |
+
port: 8000
|
| 7 |
+
|
openenv_benchmark.egg-info/PKG-INFO
ADDED
|
@@ -0,0 +1,9 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
Metadata-Version: 2.4
|
| 2 |
+
Name: openenv-benchmark
|
| 3 |
+
Version: 0.1.0
|
| 4 |
+
Summary: Benchmark environment for OpenEnv
|
| 5 |
+
Requires-Python: >=3.10
|
| 6 |
+
Requires-Dist: openenv[core]>=0.2.0
|
| 7 |
+
Provides-Extra: dev
|
| 8 |
+
Requires-Dist: pytest>=8.0.0; extra == "dev"
|
| 9 |
+
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
|
openenv_benchmark.egg-info/SOURCES.txt
ADDED
|
@@ -0,0 +1,14 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
README.md
|
| 2 |
+
pyproject.toml
|
| 3 |
+
./__init__.py
|
| 4 |
+
./client.py
|
| 5 |
+
./models.py
|
| 6 |
+
openenv_benchmark.egg-info/PKG-INFO
|
| 7 |
+
openenv_benchmark.egg-info/SOURCES.txt
|
| 8 |
+
openenv_benchmark.egg-info/dependency_links.txt
|
| 9 |
+
openenv_benchmark.egg-info/entry_points.txt
|
| 10 |
+
openenv_benchmark.egg-info/requires.txt
|
| 11 |
+
openenv_benchmark.egg-info/top_level.txt
|
| 12 |
+
server/__init__.py
|
| 13 |
+
server/app.py
|
| 14 |
+
server/benchmark_environment.py
|
openenv_benchmark.egg-info/dependency_links.txt
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
|
openenv_benchmark.egg-info/entry_points.txt
ADDED
|
@@ -0,0 +1,2 @@
|
|
|
|
|
|
|
|
|
|
| 1 |
+
[console_scripts]
|
| 2 |
+
server = benchmark.server.app:main
|
openenv_benchmark.egg-info/requires.txt
ADDED
|
@@ -0,0 +1,5 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
openenv[core]>=0.2.0
|
| 2 |
+
|
| 3 |
+
[dev]
|
| 4 |
+
pytest>=8.0.0
|
| 5 |
+
pytest-cov>=4.0.0
|
openenv_benchmark.egg-info/top_level.txt
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
benchmark
|
pyproject.toml
ADDED
|
@@ -0,0 +1,43 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
[build-system]
|
| 8 |
+
requires = ["setuptools>=45", "wheel"]
|
| 9 |
+
build-backend = "setuptools.build_meta"
|
| 10 |
+
|
| 11 |
+
[project]
|
| 12 |
+
name = "openenv-benchmark"
|
| 13 |
+
version = "0.1.0"
|
| 14 |
+
description = "Benchmark environment for OpenEnv"
|
| 15 |
+
requires-python = ">=3.10"
|
| 16 |
+
dependencies = [
|
| 17 |
+
# Core OpenEnv runtime (provides FastAPI server + HTTP client types)
|
| 18 |
+
"openenv[core]>=0.2.0",
|
| 19 |
+
# Environment-specific dependencies
|
| 20 |
+
# Add all dependencies needed for your environment here
|
| 21 |
+
# Examples:
|
| 22 |
+
# "numpy>=1.19.0",
|
| 23 |
+
# "torch>=2.0.0",
|
| 24 |
+
# "gymnasium>=0.29.0",
|
| 25 |
+
# "openspiel>=1.0.0",
|
| 26 |
+
# "smolagents>=1.22.0,<2",
|
| 27 |
+
]
|
| 28 |
+
|
| 29 |
+
[project.optional-dependencies]
|
| 30 |
+
dev = [
|
| 31 |
+
"pytest>=8.0.0",
|
| 32 |
+
"pytest-cov>=4.0.0",
|
| 33 |
+
]
|
| 34 |
+
|
| 35 |
+
[project.scripts]
|
| 36 |
+
# Server entry point - enables running via: uv run --project . server
|
| 37 |
+
# or: python -m benchmark.server.app
|
| 38 |
+
server = "benchmark.server.app:main"
|
| 39 |
+
|
| 40 |
+
[tool.setuptools]
|
| 41 |
+
include-package-data = true
|
| 42 |
+
packages = ["benchmark", "benchmark.server"]
|
| 43 |
+
package-dir = { "benchmark" = ".", "benchmark.server" = "server" }
|
server/__init__.py
ADDED
|
@@ -0,0 +1,12 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
"""Benchmark environment server components."""
|
| 8 |
+
|
| 9 |
+
from .benchmark_environment import BenchmarkEnvironment
|
| 10 |
+
|
| 11 |
+
__all__ = ["BenchmarkEnvironment"]
|
| 12 |
+
|
server/app.py
ADDED
|
@@ -0,0 +1,74 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
"""
|
| 8 |
+
FastAPI application for the Benchmark Environment.
|
| 9 |
+
|
| 10 |
+
This module creates an HTTP server that exposes the BenchmarkEnvironment
|
| 11 |
+
over HTTP endpoints, making it compatible with HTTPEnvClient.
|
| 12 |
+
|
| 13 |
+
Usage:
|
| 14 |
+
# Development (with auto-reload):
|
| 15 |
+
uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
|
| 16 |
+
|
| 17 |
+
# Production:
|
| 18 |
+
uvicorn server.app:app --host 0.0.0.0 --port 8000 --workers 4
|
| 19 |
+
|
| 20 |
+
# Or run directly:
|
| 21 |
+
python -m server.app
|
| 22 |
+
"""
|
| 23 |
+
|
| 24 |
+
try:
|
| 25 |
+
from openenv.core.env_server.http_server import create_app
|
| 26 |
+
except Exception as e: # pragma: no cover
|
| 27 |
+
raise ImportError(
|
| 28 |
+
"openenv is required for the web interface. Install dependencies with '\n uv sync\n'"
|
| 29 |
+
) from e
|
| 30 |
+
|
| 31 |
+
from benchmark.models import BenchmarkAction, BenchmarkObservation
|
| 32 |
+
from .benchmark_environment import BenchmarkEnvironment
|
| 33 |
+
|
| 34 |
+
# Create the environment instance
|
| 35 |
+
env = BenchmarkEnvironment()
|
| 36 |
+
|
| 37 |
+
# Create the app with web interface and README integration
|
| 38 |
+
app = create_app(
|
| 39 |
+
env,
|
| 40 |
+
BenchmarkAction,
|
| 41 |
+
BenchmarkObservation,
|
| 42 |
+
env_name="benchmark",
|
| 43 |
+
)
|
| 44 |
+
|
| 45 |
+
|
| 46 |
+
def main(host: str = "0.0.0.0", port: int = 8000):
|
| 47 |
+
"""
|
| 48 |
+
Entry point for direct execution via uv run or python -m.
|
| 49 |
+
|
| 50 |
+
This function enables running the server without Docker:
|
| 51 |
+
uv run --project . server
|
| 52 |
+
uv run --project . server --port 8001
|
| 53 |
+
python -m benchmark.server.app
|
| 54 |
+
|
| 55 |
+
Args:
|
| 56 |
+
host: Host address to bind to (default: "0.0.0.0")
|
| 57 |
+
port: Port number to listen on (default: 8000)
|
| 58 |
+
|
| 59 |
+
For production deployments, consider using uvicorn directly with
|
| 60 |
+
multiple workers:
|
| 61 |
+
uvicorn benchmark.server.app:app --workers 4
|
| 62 |
+
"""
|
| 63 |
+
import uvicorn
|
| 64 |
+
|
| 65 |
+
uvicorn.run(app, host=host, port=port)
|
| 66 |
+
|
| 67 |
+
|
| 68 |
+
if __name__ == "__main__":
|
| 69 |
+
import argparse
|
| 70 |
+
|
| 71 |
+
parser = argparse.ArgumentParser()
|
| 72 |
+
parser.add_argument("--port", type=int, default=8000)
|
| 73 |
+
args = parser.parse_args()
|
| 74 |
+
main(port=args.port)
|
server/benchmark_environment.py
ADDED
|
@@ -0,0 +1,153 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
| 2 |
+
# All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# This source code is licensed under the BSD-style license found in the
|
| 5 |
+
# LICENSE file in the root directory of this source tree.
|
| 6 |
+
|
| 7 |
+
"""
|
| 8 |
+
Benchmark Environment Implementation.
|
| 9 |
+
|
| 10 |
+
A test environment for benchmarking infrastructure and concurrency.
|
| 11 |
+
Actions specify how many seconds to wait, allowing testing of parallel execution.
|
| 12 |
+
"""
|
| 13 |
+
|
| 14 |
+
import asyncio
|
| 15 |
+
import hashlib
|
| 16 |
+
import os
|
| 17 |
+
import socket
|
| 18 |
+
import time
|
| 19 |
+
from uuid import uuid4
|
| 20 |
+
|
| 21 |
+
from openenv.core.env_server.interfaces import Environment
|
| 22 |
+
from openenv.core.env_server.types import State
|
| 23 |
+
|
| 24 |
+
from models import BenchmarkAction, BenchmarkObservation
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
def _get_host_url() -> str:
|
| 28 |
+
"""Get the host URL for this server."""
|
| 29 |
+
hostname = socket.gethostname()
|
| 30 |
+
port = os.environ.get("PORT", "8000")
|
| 31 |
+
# Try to get the actual IP if possible
|
| 32 |
+
try:
|
| 33 |
+
ip = socket.gethostbyname(hostname)
|
| 34 |
+
except socket.gaierror:
|
| 35 |
+
ip = "127.0.0.1"
|
| 36 |
+
return f"http://{ip}:{port}"
|
| 37 |
+
|
| 38 |
+
|
| 39 |
+
class BenchmarkEnvironment(Environment):
    """
    A benchmark environment for testing concurrency and infrastructure.

    Actions specify a number of seconds to wait (sleep), which is useful for
    testing parallel execution and concurrency limits. The environment returns
    identity information (host_url, pid, session_hash) to help verify which
    server instance handled the request.

    Example:
        >>> env = BenchmarkEnvironment()
        >>> obs = env.reset()
        >>> print(obs.host_url)      # "http://192.168.1.1:8000"
        >>> print(obs.pid)           # 12345
        >>> print(obs.session_hash)  # "a1b2c3d4..."
        >>>
        >>> obs = env.step(BenchmarkAction(wait_seconds=2.0))
        >>> print(obs.waited_seconds)  # 2.0
    """

    def __init__(self):
        """Initialize the benchmark environment."""
        self._state = State(episode_id=str(uuid4()), step_count=0)
        # Short, effectively-unique identifier for this server instance:
        # mixes a UUID, the wall clock, and the process id.
        self._session_hash = hashlib.sha256(
            f"{uuid4()}-{time.time()}-{os.getpid()}".encode()
        ).hexdigest()[:16]
        self._pid = os.getpid()
        self._host_url = _get_host_url()

    def _make_observation(
        self, waited_seconds: float = 0.0, done: bool = False, reward: float = 0.0
    ) -> BenchmarkObservation:
        """Create an observation with current server identity."""
        return BenchmarkObservation(
            host_url=self._host_url,
            pid=self._pid,
            session_hash=self._session_hash,
            waited_seconds=waited_seconds,
            timestamp=time.time(),
            done=done,
            reward=reward,
        )

    def _begin_step(self, action: BenchmarkAction) -> float:
        """Record one step in the state and return the clamped wait time.

        Shared by step() and step_async() so their bookkeeping stays
        consistent. Negative wait requests are treated as "no wait".
        """
        self._state.step_count += 1
        return max(0.0, action.wait_seconds)

    def _finish_step(self, wait_time: float) -> BenchmarkObservation:
        """Build the post-wait observation.

        Reward is inverse to wait time (faster is better); episodes never
        terminate in this benchmark, so done is always False.
        """
        return self._make_observation(
            waited_seconds=wait_time,
            done=False,
            reward=1.0 / (1.0 + wait_time),
        )

    def reset(self) -> BenchmarkObservation:
        """
        Reset the environment.

        Returns:
            BenchmarkObservation with server identity info
        """
        self._state = State(episode_id=str(uuid4()), step_count=0)

        return self._make_observation(waited_seconds=0.0, done=False, reward=0.0)

    def step(self, action: BenchmarkAction) -> BenchmarkObservation:  # type: ignore[override]
        """
        Execute a step by waiting for the specified seconds.

        Args:
            action: BenchmarkAction containing wait_seconds

        Returns:
            BenchmarkObservation with server identity and timing info
        """
        wait_time = self._begin_step(action)

        # Synchronous sleep - for async version, use step_async
        if wait_time > 0:
            time.sleep(wait_time)

        return self._finish_step(wait_time)

    async def step_async(self, action: BenchmarkAction) -> BenchmarkObservation:
        """
        Async version of step - uses asyncio.sleep for better concurrency.

        Args:
            action: BenchmarkAction containing wait_seconds

        Returns:
            BenchmarkObservation with server identity and timing info
        """
        wait_time = self._begin_step(action)

        if wait_time > 0:
            await asyncio.sleep(wait_time)

        return self._finish_step(wait_time)

    @property
    def state(self) -> State:
        """
        Get the current environment state.

        Returns:
            Current State with episode_id and step_count
        """
        return self._state
|
server/requirements.txt
ADDED
|
@@ -0,0 +1,6 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
openenv[core]>=0.2.0
|
| 2 |
+
fastapi>=0.115.0
|
| 3 |
+
uvicorn>=0.24.0
|
| 4 |
+
|
| 5 |
+
|
| 6 |
+
|
test_concurrency.py
ADDED
|
@@ -0,0 +1,98 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
#!/usr/bin/env python3
|
| 2 |
+
"""
|
| 3 |
+
Test script for benchmark environment concurrency.
|
| 4 |
+
|
| 5 |
+
Run the server first:
|
| 6 |
+
cd benchmark && uvicorn server.app:app --reload --port 8000
|
| 7 |
+
|
| 8 |
+
Then run this script:
|
| 9 |
+
python test_concurrency.py --requests 10 --wait 1.0
|
| 10 |
+
"""
|
| 11 |
+
|
| 12 |
+
import argparse
|
| 13 |
+
import asyncio
|
| 14 |
+
import time
|
| 15 |
+
|
| 16 |
+
import httpx
|
| 17 |
+
|
| 18 |
+
|
| 19 |
+
BASE_URL = "http://localhost:8000"
|
| 20 |
+
|
| 21 |
+
|
| 22 |
+
async def reset(client: httpx.AsyncClient) -> dict:
    """POST to the /reset endpoint and return the decoded JSON body."""
    resp = await client.post(f"{BASE_URL}/reset")
    resp.raise_for_status()
    return resp.json()
|
| 27 |
+
|
| 28 |
+
|
| 29 |
+
async def step(client: httpx.AsyncClient, wait_seconds: float) -> dict:
    """POST a step action with the given wait time and return the JSON body."""
    payload = {"action": {"wait_seconds": wait_seconds}}
    resp = await client.post(f"{BASE_URL}/step", json=payload)
    resp.raise_for_status()
    return resp.json()
|
| 37 |
+
|
| 38 |
+
|
| 39 |
+
async def timed_request(client: httpx.AsyncClient, wait_seconds: float, request_id: int) -> dict:
    """Run one step request and report its wall-clock latency plus server identity."""
    t0 = time.perf_counter()
    body = await step(client, wait_seconds)
    duration = time.perf_counter() - t0

    observation = body["observation"]
    return {
        "request_id": request_id,
        "wait_requested": wait_seconds,
        "elapsed": duration,
        "pid": observation["pid"],
        "session_hash": observation["session_hash"],
    }
|
| 53 |
+
|
| 54 |
+
|
| 55 |
+
async def test_concurrent(num_requests: int, wait_seconds: float) -> dict:
    """Fire num_requests simultaneous step calls and return timing statistics."""
    async with httpx.AsyncClient(timeout=60.0) as client:
        # Reset once up front so the server identity can be displayed.
        obs = (await reset(client))["observation"]
        print(f"Server: {obs['host_url']} | PID: {obs['pid']} | Session: {obs['session_hash']}")
        print(f"Running {num_requests} concurrent requests, each waiting {wait_seconds}s...")

        t0 = time.perf_counter()

        # All requests run concurrently; if the server handles them in
        # parallel, total time should stay near wait_seconds.
        outcomes = await asyncio.gather(
            *(timed_request(client, wait_seconds, i) for i in range(num_requests))
        )

        wall_time = time.perf_counter() - t0
        mean_latency = sum(o["elapsed"] for o in outcomes) / len(outcomes)

        return {
            "num_requests": num_requests,
            "wait_seconds": wait_seconds,
            "total_time": wall_time,
            "avg_time": mean_latency,
        }
|
| 79 |
+
|
| 80 |
+
|
| 81 |
+
async def main():
    """Parse CLI options, point BASE_URL at the target server, and run the test."""
    parser = argparse.ArgumentParser(description="Test benchmark environment concurrency")
    parser.add_argument("--requests", "-n", type=int, default=10, help="Number of concurrent requests")
    parser.add_argument("--wait", "-w", type=float, default=1.0, help="Wait time per request (seconds)")
    parser.add_argument("--url", "-u", type=str, default="http://localhost:8000", help="Server URL")
    opts = parser.parse_args()

    # The request helpers read the module-level BASE_URL, so rebind it here.
    global BASE_URL
    BASE_URL = opts.url

    stats = await test_concurrent(opts.requests, opts.wait)

    print(f"\nTotal time: {stats['total_time']:.3f}s")
    print(f"Avg time: {stats['avg_time']:.3f}s")


if __name__ == "__main__":
    asyncio.run(main())
|