Spaces:

openenv-testing
/

julia_env-pr-170

Runtime error

App Files Files Community

burtenshaw HF Staff commited on Nov 9, 2025

Commit

be32845

verified ·

1 Parent(s): 1c99cba

Upload folder using huggingface_hub

Browse files

Files changed (34) hide show

Dockerfile +20 -0
README.md +34 -5
src/core/README.md +180 -0
src/core/__init__.py +19 -0
src/core/client_types.py +22 -0
src/core/containers/__init__.py +7 -0
src/core/containers/images/Dockerfile +47 -0
src/core/containers/images/README.md +92 -0
src/core/containers/runtime/__init__.py +15 -0
src/core/containers/runtime/providers.py +359 -0
src/core/containers/test_local_docker_provider.py +258 -0
src/core/env_server/__init__.py +35 -0
src/core/env_server/base_transforms.py +29 -0
src/core/env_server/http_server.py +233 -0
src/core/env_server/interfaces.py +118 -0
src/core/env_server/types.py +57 -0
src/core/env_server/web_interface.py +1613 -0
src/core/http_env_client.py +207 -0
src/core/pyproject.toml +46 -0
src/core/tools/__init__.py +19 -0
src/core/tools/git_server_client.py +362 -0
src/core/tools/julia_process_pool.py +509 -0
src/core/tools/julia_repl_worker.jl +159 -0
src/core/tools/local_julia_executor.py +474 -0
src/core/tools/local_python_executor.py +105 -0
src/envs/julia_env/__init__.py +13 -0
src/envs/julia_env/julia_env_client.py +117 -0
src/envs/julia_env/models.py +70 -0
src/envs/julia_env/server/Dockerfile +54 -0
src/envs/julia_env/server/README.md +436 -0
src/envs/julia_env/server/__init__.py +8 -0
src/envs/julia_env/server/app.py +455 -0
src/envs/julia_env/server/julia_codeact_env.py +276 -0
src/envs/julia_env/server/julia_transforms.py +87 -0

Dockerfile ADDED Viewed

	@@ -0,0 +1,20 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+# Use the specified openenv-base image
+FROM ghcr.io/meta-pytorch/openenv-base:latest
+# Copy only what's needed for this environment
+COPY src/core/ /app/src/core/
+COPY src/envs/julia_env/ /app/src/envs/julia_env/
+# Health check
+HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
+    CMD curl -f http://localhost:8000/health || exit 1
+# Run the FastAPI server
+CMD ["uvicorn", "envs.julia_env.server.app:app", "--host", "0.0.0.0", "--port", "8000"]
+ENV ENABLE_WEB_INTERFACE=true

README.md CHANGED Viewed

@@ -1,10 +1,39 @@
 ---
-title: Julia Env-pr-170
-emoji: 🐨
-colorFrom: red
-colorTo: purple
 sdk: docker
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: Julia_env Environment Server
+emoji: 🐳
+colorFrom: blue
+colorTo: green
 sdk: docker
 pinned: false
+app_port: 8000
+base_path: /web
+tags:
+  - openenv-pr
 ---
+# Julia_env Environment Server
+FastAPI server for julia_env environment powered by Meta's OpenEnv.
+## About
+This Space provides a containerized environment for julia_env interactions.
+Built with FastAPI and OpenEnv framework.
+## Web Interface
+This deployment includes an interactive web interface for exploring the environment:
+- **HumanAgent Interface**: Interact with the environment using a web form
+- **State Observer**: Real-time view of environment state and action history
+- **Live Updates**: WebSocket-based real-time updates
+Access the web interface at: `/web`
+## API Documentation
+Visit `/docs` for interactive API documentation.
+## Health Check
+The environment provides a health check endpoint at `/health`.

src/core/README.md ADDED Viewed

	@@ -0,0 +1,180 @@

+# <img width="35" height="35" alt="image" src="https://github.com/user-attachments/assets/2700a971-e5d6-4036-b03f-2f89c9791609" /> OpenEnv: Agentic Execution Environments
+An e2e framework for creating, deploying and using isolated execution environments for agentic RL training, built using Gymnasium style simple APIs. OpenEnv provides a standard for interacting with agentic execution environments via simple Gymnasium style APIs - step(), reset(), state(). Users of agentic execution environments can interact with the environment during RL training loops using these simple APIs.
+In addition to making it easier for researchers and RL framework writers, we also provide tools for environment creators making it easier for them to create richer environments and make them available over familiar protocols like HTTP and packaged using canonical technologies like docker. Environment creators can use the OpenEnv framework to create environments that are isolated, secure, and easy to deploy and use.
+## Overview
+`openenv-core` provides the foundational building blocks for creating and interacting with containerized environments over HTTP. It enables you to build agent environments that can be deployed as Docker containers and accessed via a simple HTTP API.
+> ⚠️ **Early Development Warning** OpenEnv is currently in an experimental
+> stage. You should expect bugs, incomplete features, and APIs that may change
+> in future versions. The project welcomes bugfixes, but to make sure things are
+> well coordinated you should discuss any significant change before starting the
+> work. It's recommended that you signal your intention to contribute in the
+> issue tracker, either by filing a new issue or by claiming an existing one.
+# OpenEnv Core
+Core components for OpenEnv - a framework for building HTTP-based agentic environments.
+## Features
+- **HTTPEnvClient**: Generic HTTP client for interacting with remote environments
+- **HTTPEnvServer**: FastAPI-based server wrapper for exposing environments over HTTP
+- **Container Providers**: Pluggable architecture for running containers (Docker, Kubernetes, etc.)
+- **Type System**: Strongly-typed Action/Observation/State interfaces
+- **Web Interface**: Optional web UI for interacting with environments
+## Installation
+```bash
+pip install openenv-core
+```
+For development:
+```bash
+pip install openenv-core[dev]
+```
+## Quick Start
+### Creating an Environment Client
+```python
+from openenv_core import HTTPEnvClient, StepResult
+from dataclasses import dataclass
+@dataclass
+class MyAction:
+    text: str
+@dataclass
+class MyObservation:
+    response: str
+class MyEnvClient(HTTPEnvClient[MyAction, MyObservation]):
+    def _step_payload(self, action: MyAction) -> dict:
+        return {"text": action.text}
+    def _parse_result(self, payload: dict) -> StepResult[MyObservation]:
+        obs_data = payload["observation"]
+        return StepResult(
+            observation=MyObservation(**obs_data),
+            reward=payload.get("reward"),
+            done=payload.get("done", False)
+        )
+    def _parse_state(self, payload: dict) -> Any:
+        return payload
+# Use with Docker
+env = MyEnvClient.from_docker_image("my-env:latest")
+result = env.reset()
+step_result = env.step(MyAction(text="hello"))
+env.close()
+```
+### Creating an Environment Server
+```python
+from openenv_core.env_server import Environment, HTTPEnvServer, create_app
+from dataclasses import dataclass
+@dataclass
+class MyAction:
+    text: str
+@dataclass
+class MyObservation:
+    response: str
+    reward: float = 0.0
+    done: bool = False
+class MyEnvironment(Environment):
+    def reset(self) -> MyObservation:
+        return MyObservation(response="Ready")
+    def step(self, action: MyAction) -> MyObservation:
+        return MyObservation(
+            response=f"Echo: {action.text}",
+            reward=1.0,
+            done=False
+        )
+# Create FastAPI app
+env = MyEnvironment()
+app = create_app(env, MyAction, MyObservation)
+# Run with: uvicorn module:app --host 0.0.0.0 --port 8000
+```
+## Container Providers
+OpenEnv Core supports multiple container providers:
+### Local Docker Provider
+```python
+from openenv_core.containers.runtime import LocalDockerProvider
+provider = LocalDockerProvider()
+base_url = provider.start_container("my-env:latest")
+provider.wait_for_ready(base_url)
+# Use environment...
+provider.stop_container()
+```
+### Kubernetes Provider (Coming Soon)
+```python
+from openenv_core.containers.runtime import KubernetesProvider
+provider = KubernetesProvider(namespace="envs")
+base_url = provider.start_container("my-env:latest")
+# Use environment...
+provider.stop_container()
+```
+## API Reference
+### HTTPEnvClient
+Base class for environment clients with these abstract methods:
+- `_step_payload(action)`: Convert action to JSON
+- `_parse_result(payload)`: Parse response to StepResult
+- `_parse_state(payload)`: Parse state response
+### HTTPEnvServer
+Server wrapper with these methods:
+- `register_routes(app)`: Register endpoints on FastAPI app
+- `_deserialize_action(data)`: Convert JSON to Action
+- `_serialize_observation(obs)`: Convert Observation to JSON
+### Environment Interface
+Base interface for environment implementations:
+- `reset()`: Reset environment and return initial observation
+- `step(action)`: Execute action and return observation
+- `state`: Property returning current environment state
+## License
+This project is licensed under the BSD-3-Clause License - see the LICENSE file for details.
+## Contributing
+Contributions are welcome! Please see the main OpenEnv repository for contribution guidelines.
+## Links
+- **Homepage**: https://github.com/meta-pytorch/OpenEnv
+- **Documentation**: https://github.com/meta-pytorch/OpenEnv/blob/main/README.md
+- **Bug Tracker**: https://github.com/meta-pytorch/OpenEnv/issues

src/core/__init__.py ADDED Viewed

	@@ -0,0 +1,19 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Core components for agentic environments."""
+# Re-export main components from submodules for convenience
+from .env_server import *
+from .client_types import StepResult
+from .http_env_client import HTTPEnvClient
+# Note: MCP module doesn't export anything yet
+__all__ = [
+    "HTTPEnvClient",
+    "StepResult",
+]

src/core/client_types.py ADDED Viewed

	@@ -0,0 +1,22 @@

+# Type definitions for EnvTorch
+from dataclasses import dataclass
+from typing import Any, Generic, Optional, TypeVar
+# Generic type for observations
+ObsT = TypeVar("ObsT")  # TypeVar for typehinting in IDEs
+@dataclass
+class StepResult(Generic[ObsT]):
+    """
+    Represents the result of one environment step.
+    Attributes:
+        observation: The environment's observation after the action.
+        reward: Scalar reward for this step (optional).
+        done: Whether the episode is finished.
+    """
+    observation: ObsT
+    reward: Optional[float] = None
+    done: bool = False

src/core/containers/__init__.py ADDED Viewed

	@@ -0,0 +1,7 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Container management for environment servers."""

src/core/containers/images/Dockerfile ADDED Viewed

	@@ -0,0 +1,47 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+#
+# OpenEnv Base Image
+#
+# This is the standard base image for all OpenEnv environment servers.
+# It includes the minimal dependencies needed to run HTTP environment servers.
+#
+# Build: docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+# Tag:   docker tag openenv-base:latest openenv-base:0.1.0
+#
+FROM python:3.11-slim
+# Set metadata
+LABEL maintainer="OpenEnv Team"
+LABEL description="Base image for OpenEnv based environment servers"
+LABEL version="0.1.0"
+# Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+# Install Python dependencies that all environments need
+RUN pip install --no-cache-dir \
+    "fastapi>=0.104.0" \
+    "uvicorn[standard]>=0.24.0" \
+    "requests>=2.25.0" \
+    "wsproto>=1.0.0" \
+    smolagents
+# Set working directory
+WORKDIR /app
+# Default environment variables
+ENV PYTHONPATH=/app/src
+ENV PYTHONUNBUFFERED=1
+# Default expose port (can be overridden)
+EXPOSE 8000
+# Note: CMD should be specified in child Dockerfiles

src/core/containers/images/README.md ADDED Viewed

	@@ -0,0 +1,92 @@

+# OpenEnv Base Image
+Standard base image for all OpenEnv environment servers.
+## What's Included
+| Layer | Size | Contents |
+|-------|------|----------|
+| python:3.11-slim | 200 MB  | Base Python runtime |
+| + Dependencies   | 100 MB  | FastAPI, uvicorn, requests |
+| **Total**        | **~300 MB** | Ready for environment servers |
+## Image Sizes
+```
+openenv-base:latest   300 MB  (python + fastapi + uvicorn)
+```
+echo-env:latest        500 MB  (python + fastapi + uvicorn + app)
+coding-env:latest      520 MB  (python + fastapi + uvicorn + app + tools)
+another-env:latest     510 MB  (python + fastapi + uvicorn + app)
+---
+Total: 1.5 GB (with lots of duplication)
+```
+### With Base Images (✅ Solution)
+```
+openenv-base:latest    300 MB  (python + fastapi + uvicorn)
+echo-env:latest         50 MB  (app only, uses base)
+coding-env:latest       70 MB  (app + tools, uses base)
+another-env:latest      45 MB  (app only, uses base)
+---
+Total: 465 MB (base shared, minimal duplication)
+```
+## Building the Base Image
+```bash
+# From project root
+docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+```
+## Usage in Environment Dockerfiles
+Each environment Dockerfile should start with:
+```dockerfile
+FROM openenv-base:latest
+# Copy only environment-specific files
+COPY src/core/ /app/src/core/
+COPY src/envs/my_env/ /app/src/envs/my_env/
+# Run the server
+CMD ["uvicorn", "envs.my_env.server.app:app", "--host", "0.0.0.0", "--port", "8000"]
+```
+## Base Image Contents
+- Python 3.11-slim
+- FastAPI >= 0.104.0
+- Uvicorn >= 0.24.0
+- Requests >= 2.25.0
+- curl (for health checks)
+## Example: Building Echo Environment
+```bash
+# Step 1: Build base image (do this once)
+docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+# Step 2: Build echo environment (uses base)
+docker build -t echo-env:latest -f src/envs/echo_env/server/Dockerfile .
+# Step 3: Run echo environment
+docker run -p 8000:8000 echo-env:latest
+```
+## Updating the Base
+When dependencies need updating:
+1. Update `src/core/containers/images/Dockerfile`
+2. Rebuild base image
+3. Rebuild all environment images (they'll use new base)
+```bash
+# Update base
+docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+# Rebuild environments (they automatically use new base)
+docker build -t echo-env:latest -f src/envs/echo_env/server/Dockerfile .
+```

src/core/containers/runtime/__init__.py ADDED Viewed

	@@ -0,0 +1,15 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Container runtime providers."""
+from .providers import ContainerProvider, KubernetesProvider, LocalDockerProvider
+__all__ = [
+    "ContainerProvider",
+    "LocalDockerProvider",
+    "KubernetesProvider",
+]

src/core/containers/runtime/providers.py ADDED Viewed

	@@ -0,0 +1,359 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Container provider abstractions for running environment servers.
+This module provides a pluggable architecture for different container providers
+(local Docker, Kubernetes, cloud providers, etc.) to be used with HTTPEnvClient.
+"""
+from __future__ import annotations
+from abc import ABC, abstractmethod
+from typing import Any, Dict, Optional
+class ContainerProvider(ABC):
+    """
+    Abstract base class for container providers.
+    Providers implement this interface to support different container platforms:
+    - LocalDockerProvider: Runs containers on local Docker daemon
+    - KubernetesProvider: Runs containers in Kubernetes cluster
+    - FargateProvider: Runs containers on AWS Fargate
+    - CloudRunProvider: Runs containers on Google Cloud Run
+    The provider manages a single container lifecycle and provides the base URL
+    for connecting to it.
+    Example:
+        >>> provider = LocalDockerProvider()
+        >>> base_url = provider.start_container("echo-env:latest")
+        >>> print(base_url)  # http://localhost:8000
+        >>> # Use the environment via base_url
+        >>> provider.stop_container()
+    """
+    @abstractmethod
+    def start_container(
+        self,
+        image: str,
+        port: Optional[int] = None,
+        env_vars: Optional[Dict[str, str]] = None,
+        **kwargs: Any,
+    ) -> str:
+        """
+        Start a container from the specified image.
+        Args:
+            image: Container image name (e.g., "echo-env:latest")
+            port: Port to expose (if None, provider chooses)
+            env_vars: Environment variables to pass to container
+            **kwargs: Provider-specific options
+        Returns:
+            Base URL to connect to the container (e.g., "http://localhost:8000")
+        Raises:
+            RuntimeError: If container fails to start
+        """
+        pass
+    @abstractmethod
+    def stop_container(self) -> None:
+        """
+        Stop and remove the running container.
+        This cleans up the container that was started by start_container().
+        """
+        pass
+    @abstractmethod
+    def wait_for_ready(self, base_url: str, timeout_s: float = 30.0) -> None:
+        """
+        Wait for the container to be ready to accept requests.
+        This typically polls the /health endpoint until it returns 200.
+        Args:
+            base_url: Base URL of the container
+            timeout_s: Maximum time to wait
+        Raises:
+            TimeoutError: If container doesn't become ready in time
+        """
+        pass
+class LocalDockerProvider(ContainerProvider):
+    """
+    Container provider for local Docker daemon.
+    This provider runs containers on the local machine using Docker.
+    Useful for development and testing.
+    Example:
+        >>> provider = LocalDockerProvider()
+        >>> base_url = provider.start_container("echo-env:latest")
+        >>> # Container running on http://localhost:<random-port>
+        >>> provider.stop_container()
+    """
+    def __init__(self):
+        """Initialize the local Docker provider."""
+        self._container_id: Optional[str] = None
+        self._container_name: Optional[str] = None
+        # Check if Docker is available
+        import subprocess
+        try:
+            subprocess.run(
+                ["docker", "version"],
+                check=True,
+                capture_output=True,
+                timeout=5,
+            )
+        except (subprocess.CalledProcessError, FileNotFoundError, subprocess.TimeoutExpired):
+            raise RuntimeError(
+                "Docker is not available. Please install Docker Desktop or Docker Engine."
+            )
+    def start_container(
+        self,
+        image: str,
+        port: Optional[int] = None,
+        env_vars: Optional[Dict[str, str]] = None,
+        **kwargs: Any,
+    ) -> str:
+        """
+        Start a Docker container locally.
+        Args:
+            image: Docker image name
+            port: Port to expose (if None, finds available port)
+            env_vars: Environment variables for the container
+            **kwargs: Additional Docker run options
+                - memory_gb: Memory limit in GB (default: 4GB)
+                - command_override: List of command args to override container CMD
+        Returns:
+            Base URL to connect to the container
+        """
+        import subprocess
+        import time
+        import logging
+        logger = logging.getLogger(__name__)
+        # Find available port if not specified
+        if port is None:
+            port = self._find_available_port()
+        # Use default memory limit if not specified
+        memory_gb = kwargs.get("memory_gb", 16)
+        # Generate container name
+        self._container_name = self._generate_container_name(image)
+        # Build docker run command
+        # Use host networking for better performance and consistency with podman
+        # NOTE: Do NOT use --rm initially - if container fails to start, we need logs
+        cmd = [
+            "docker", "run",
+            "-d",  # Detached
+            "--name", self._container_name,
+            "--network", "host",  # Use host network
+            "--memory", f"{memory_gb}g",  # Limit container memory
+            "--memory-swap", f"{memory_gb}g",  # Prevent swap usage (set equal to --memory)
+            "--oom-kill-disable=false",  # Allow OOM killer (exit gracefully)
+        ]
+        # Add environment variables
+        if env_vars:
+            for key, value in env_vars.items():
+                cmd.extend(["-e", f"{key}={value}"])
+        # Pass custom port via environment variable instead of overriding command
+        # This allows the container to use its proper entrypoint/CMD
+        if port != 8000:
+            cmd.extend(["-e", f"PORT={port}"])
+        # Add image
+        cmd.append(image)
+        # Add command override if provided (explicit override by user)
+        if "command_override" in kwargs:
+            cmd.extend(kwargs["command_override"])
+        # Run container
+        try:
+            logger.debug(f"Starting container with command: {' '.join(cmd)}")
+            result = subprocess.run(cmd, capture_output=True, text=True, check=True)
+            self._container_id = result.stdout.strip()
+            logger.debug(f"Container started with ID: {self._container_id}")
+        except subprocess.CalledProcessError as e:
+            error_msg = f"Failed to start Docker container.\nCommand: {' '.join(cmd)}\nExit code: {e.returncode}\nStderr: {e.stderr}\nStdout: {e.stdout}"
+            raise RuntimeError(error_msg) from e
+        # Wait a moment for container to start
+        time.sleep(1)
+        base_url = f"http://127.0.0.1:{port}"
+        return base_url
+    def stop_container(self) -> None:
+        """
+        Stop and remove the Docker container.
+        """
+        if self._container_id is None:
+            return
+        import subprocess
+        try:
+            # Stop container
+            subprocess.run(
+                ["docker", "stop", self._container_id],
+                capture_output=True,
+                check=True,
+                timeout=10,
+            )
+            # Remove container
+            subprocess.run(
+                ["docker", "rm", self._container_id],
+                capture_output=True,
+                check=True,
+                timeout=10,
+            )
+        except subprocess.CalledProcessError:
+            # Container might already be stopped/removed
+            pass
+        finally:
+            self._container_id = None
+            self._container_name = None
+    def wait_for_ready(self, base_url: str, timeout_s: float = 30.0) -> None:
+        """
+        Wait for container to be ready by polling /health endpoint.
+        Args:
+            base_url: Base URL of the container
+            timeout_s: Maximum time to wait
+        Raises:
+            TimeoutError: If container doesn't become ready
+        """
+        import time
+        import requests
+        import subprocess
+        import logging
+        start_time = time.time()
+        health_url = f"{base_url}/health"
+        last_error = None
+        while time.time() - start_time < timeout_s:
+            try:
+                response = requests.get(health_url, timeout=2.0)
+                if response.status_code == 200:
+                    return
+            except requests.RequestException as e:
+                last_error = str(e)
+            time.sleep(0.5)
+        # If we timeout, provide diagnostic information
+        error_msg = f"Container at {base_url} did not become ready within {timeout_s}s"
+        if self._container_id:
+            try:
+                # First check if container exists
+                inspect_result = subprocess.run(
+                    ["docker", "inspect", self._container_id],
+                    capture_output=True,
+                    text=True,
+                    timeout=5,
+                )
+                if inspect_result.returncode != 0:
+                    # Container doesn't exist - likely exited and auto-removed due to --rm flag
+                    error_msg += f"\n\nContainer was auto-removed (likely exited immediately)."
+                    error_msg += f"\nThis typically means:"
+                    error_msg += f"\n  1. The container image has an error in its startup script"
+                    error_msg += f"\n  2. Required dependencies are missing in the container"
+                    error_msg += f"\n  3. Port {base_url.split(':')[-1]} might be in use by another process"
+                    error_msg += f"\n  4. Container command/entrypoint is misconfigured"
+                    error_msg += f"\nTry running the container manually to debug:"
+                    error_msg += f"\n  docker run -it --rm <IMAGE_NAME>"
+                else:
+                    # Container exists, try to get logs
+                    result = subprocess.run(
+                        ["docker", "logs", "--tail", "50", self._container_id],
+                        capture_output=True,
+                        text=True,
+                        timeout=5,
+                    )
+                    if result.stdout or result.stderr:
+                        error_msg += f"\n\nContainer logs (last 50 lines):\n{result.stdout}\n{result.stderr}"
+            except subprocess.TimeoutExpired:
+                error_msg += f"\n\nTimeout while trying to inspect container"
+            except Exception as e:
+                error_msg += f"\n\nFailed to get container diagnostics: {e}"
+        if last_error:
+            error_msg += f"\n\nLast connection error: {last_error}"
+        raise TimeoutError(error_msg)
+    def _find_available_port(self) -> int:
+        """
+        Find an available port on localhost.
+        Returns:
+            An available port number
+        """
+        import socket
+        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+            s.bind(("", 0))
+            s.listen(1)
+            port = s.getsockname()[1]
+        return port
+    def _generate_container_name(self, image: str) -> str:
+        """
+        Generate a unique container name based on image name and timestamp.
+        Args:
+            image: Docker image name
+        Returns:
+            A unique container name
+        """
+        import time
+        clean_image = image.split("/")[-1].split(":")[0]
+        timestamp = int(time.time() * 1000)
+        return f"{clean_image}-{timestamp}"
+class KubernetesProvider(ContainerProvider):
+    """
+    Container provider for Kubernetes clusters.
+    This provider creates pods in a Kubernetes cluster and exposes them
+    via services or port-forwarding.
+    Example:
+        >>> provider = KubernetesProvider(namespace="envtorch-dev")
+        >>> base_url = provider.start_container("echo-env:latest")
+        >>> # Pod running in k8s, accessible via service or port-forward
+        >>> provider.stop_container()
+    """
+    pass

src/core/containers/test_local_docker_provider.py ADDED Viewed

	@@ -0,0 +1,258 @@

+#!/usr/bin/env python3
+"""
+End-to-end test for LocalDockerProvider.
+This script tests the complete flow:
+1. Start a container using LocalDockerProvider
+2. Wait for it to be ready
+3. Make HTTP requests to test the environment
+4. Clean up the container
+"""
+import sys
+from pathlib import Path
+# Add src to path
+sys.path.insert(0, str(Path(__file__).parent.parent.parent))
+import requests
+from core.containers.runtime import LocalDockerProvider
+# TODO: Remove this test or make it a functional test sicne this will be tested in e2e test for echo env
+def test_local_docker_provider():
+    """Test LocalDockerProvider end-to-end."""
+    print("=" * 60)
+    print("LocalDockerProvider End-to-End Test")
+    print("=" * 60)
+    print()
+    provider = None
+    try:
+        # Step 1: Create provider
+        print("Step 1: Creating LocalDockerProvider...")
+        provider = LocalDockerProvider()
+        print("✓ Provider created\n")
+        # Step 2: Start container
+        print("Step 2: Starting echo-env container...")
+        base_url = provider.start_container("echo-env:latest")
+        print(f"✓ Container started at: {base_url}")
+        if provider._container_id:
+            print(f"  Container ID: {provider._container_id[:12]}...")
+        if provider._container_name:
+            print(f"  Container name: {provider._container_name}\n")
+        # Step 3: Wait for ready
+        print("Step 3: Waiting for container to be ready...")
+        provider.wait_for_ready(base_url, timeout_s=30.0)
+        print("✓ Container is ready!\n")
+        # Step 4: Test health endpoint
+        print("Step 4: Testing /health endpoint...")
+        response = requests.get(f"{base_url}/health")
+        print(f"  Status: {response.status_code}")
+        print(f"  Response: {response.json()}")
+        assert response.status_code == 200
+        assert response.json()["status"] == "healthy"
+        print("✓ Health check passed\n")
+        # Step 5: Test reset endpoint
+        print("Step 5: Testing /reset endpoint...")
+        response = requests.post(
+            f"{base_url}/reset",
+            json={},
+            headers={"Content-Type": "application/json"},
+        )
+        print(f"  Status: {response.status_code}")
+        data = response.json()
+        print(f"  Message: {data['observation']['echoed_message']}")
+        print(f"  Reward: {data['reward']}")
+        print(f"  Done: {data['done']}")
+        assert response.status_code == 200
+        assert data["observation"]["echoed_message"] == "Echo environment ready!"
+        print("✓ Reset test passed\n")
+        # Step 6: Test step endpoint
+        print("Step 6: Testing /step endpoint...")
+        response = requests.post(
+            f"{base_url}/step",
+            json={"action": {"message": "Hello from LocalDockerProvider!"}},
+            headers={"Content-Type": "application/json"},
+        )
+        print(f"  Status: {response.status_code}")
+        data = response.json()
+        print(f"  Echoed: {data['observation']['echoed_message']}")
+        print(f"  Length: {data['observation']['message_length']}")
+        print(f"  Reward: {data['reward']}")
+        assert response.status_code == 200
+        assert data["observation"]["echoed_message"] == "Hello from LocalDockerProvider!"
+        assert data["observation"]["message_length"] == 31
+        print("✓ Step test passed\n")
+        # Step 7: Test state endpoint
+        print("Step 7: Testing /state endpoint...")
+        response = requests.get(f"{base_url}/state")
+        print(f"  Status: {response.status_code}")
+        data = response.json()
+        print(f"  Episode ID: {data['episode_id']}")
+        print(f"  Step count: {data['step_count']}")
+        assert response.status_code == 200
+        assert data["step_count"] == 1  # One step from above
+        print("✓ State test passed\n")
+        # Step 8: Multiple steps
+        print("Step 8: Testing multiple steps...")
+        for i in range(3):
+            response = requests.post(
+                f"{base_url}/step",
+                json={"action": {"message": f"Message {i+1}"}},
+                headers={"Content-Type": "application/json"},
+            )
+            assert response.status_code == 200
+            print(f"  Step {i+1}: ✓")
+        # Check state updated
+        response = requests.get(f"{base_url}/state")
+        data = response.json()
+        assert data["step_count"] == 4  # 1 + 3 more steps
+        print(f"  Final step count: {data['step_count']}")
+        print("✓ Multiple steps test passed\n")
+        print("=" * 60)
+        print("✓ All tests passed!")
+        print("=" * 60)
+        print()
+        return True
+    except Exception as e:
+        print(f"\n❌ Test failed: {e}")
+        import traceback
+        traceback.print_exc()
+        return False
+    finally:
+        # Step 9: Cleanup
+        if provider is not None:
+            print("\nStep 9: Cleaning up container...")
+            try:
+                provider.stop_container()
+                print("✓ Container stopped and removed\n")
+            except Exception as e:
+                print(f"⚠️  Cleanup warning: {e}\n")
+def test_provider_with_custom_port():
+    """Test provider with custom port."""
+    print("=" * 60)
+    print("LocalDockerProvider with Custom Port Test")
+    print("=" * 60)
+    print()
+    provider = None
+    try:
+        provider = LocalDockerProvider()
+        print("Starting container on custom port 8123...")
+        base_url = provider.start_container("echo-env:latest", port=8123)
+        print(f"✓ Started at: {base_url}")
+        assert ":8123" in base_url
+        print("Waiting for ready...")
+        provider.wait_for_ready(base_url)
+        print("✓ Ready!")
+        print("Testing health...")
+        response = requests.get(f"{base_url}/health")
+        assert response.status_code == 200
+        print("✓ Health check passed")
+        print("\n✓ Custom port test passed!\n")
+        return True
+    except Exception as e:
+        print(f"\n❌ Test failed: {e}")
+        return False
+    finally:
+        if provider is not None:
+            provider.stop_container()
+            print("✓ Cleaned up\n")
+def test_provider_with_env_vars():
+    """Test provider with environment variables."""
+    print("=" * 60)
+    print("LocalDockerProvider with Environment Variables Test")
+    print("=" * 60)
+    print()
+    provider = None
+    try:
+        provider = LocalDockerProvider()
+        print("Starting container with environment variables...")
+        base_url = provider.start_container(
+            "echo-env:latest",
+            env_vars={"DEBUG": "true", "LOG_LEVEL": "info"}
+        )
+        print(f"✓ Started at: {base_url}")
+        print("Waiting for ready...")
+        provider.wait_for_ready(base_url)
+        print("✓ Ready!")
+        print("Testing health...")
+        response = requests.get(f"{base_url}/health")
+        assert response.status_code == 200
+        print("✓ Health check passed")
+        print("\n✓ Environment variables test passed!\n")
+        return True
+    except Exception as e:
+        print(f"\n❌ Test failed: {e}")
+        return False
+    finally:
+        if provider is not None:
+            provider.stop_container()
+            print("✓ Cleaned up\n")
+if __name__ == "__main__":
+    print()
+    print("🐳 LocalDockerProvider Test Suite")
+    print()
+    results = []
+    # Run basic test
+    results.append(("Basic End-to-End", test_local_docker_provider()))
+    # Run custom port test
+    results.append(("Custom Port", test_provider_with_custom_port()))
+    # Run environment variables test
+    results.append(("Environment Variables", test_provider_with_env_vars()))
+    # Summary
+    print("=" * 60)
+    print("Test Summary")
+    print("=" * 60)
+    for name, passed in results:
+        status = "✓ PASSED" if passed else "✗ FAILED"
+        print(f"{name:25} {status}")
+    print("=" * 60)
+    all_passed = all(result for _, result in results)
+    if all_passed:
+        print("\n🎉 All tests passed!")
+        exit(0)
+    else:
+        print("\n❌ Some tests failed")
+        exit(1)

src/core/env_server/__init__.py ADDED Viewed

	@@ -0,0 +1,35 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Core environment interfaces and types."""
+from .base_transforms import CompositeTransform, NullTransform
+from .http_server import HTTPEnvServer, create_app, create_fastapi_app
+from .interfaces import Environment, Message, ModelTokenizer, Transform
+from .types import Action, Observation, State
+from .web_interface import create_web_interface_app, WebInterfaceManager
+__all__ = [
+    # Core interfaces
+    "Environment",
+    "Transform",
+    "Message",
+    "ModelTokenizer",
+    # Types
+    "Action",
+    "Observation",
+    "State",
+    # Base transforms
+    "CompositeTransform",
+    "NullTransform",
+    # HTTP Server
+    "HTTPEnvServer",
+    "create_app",
+    "create_fastapi_app",
+    # Web Interface
+    "create_web_interface_app",
+    "WebInterfaceManager",
+]

src/core/env_server/base_transforms.py ADDED Viewed

	@@ -0,0 +1,29 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Base transform implementations for composing environment-specific transforms."""
+from .interfaces import Transform
+from .types import Observation
+class CompositeTransform(Transform):
+    """Combines multiple transforms into a single transform."""
+    def __init__(self, transforms: list[Transform]):
+        self.transforms = transforms
+    def __call__(self, observation: Observation) -> Observation:
+        for transform in self.transforms:
+            observation = transform(observation)
+        return observation
+class NullTransform(Transform):
+    """Default transform that passes through unchanged."""
+    def __call__(self, observation: Observation) -> Observation:
+        return observation

src/core/env_server/http_server.py ADDED Viewed

	@@ -0,0 +1,233 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+HTTP server wrapper for Environment instances.
+This module provides utilities to wrap any Environment subclass and expose it
+over HTTP endpoints that HTTPEnvClient can consume.
+"""
+from __future__ import annotations
+import os
+from dataclasses import asdict
+from typing import Any, Dict, Type
+from .interfaces import Environment
+from .types import Action, Observation
+from fastapi import Body, FastAPI
+class HTTPEnvServer:
+    """
+    HTTP server wrapper for Environment instances.
+    This class wraps an Environment and exposes its reset(), step(), and state
+    methods as HTTP endpoints compatible with HTTPEnvClient.
+    The server expects:
+    - Action deserialization: Converts JSON dict to Action subclass
+    - Observation serialization: Converts Observation subclass to JSON dict
+    Example:
+        >>> from core.env_server import HTTPEnvServer
+        >>> from envs.coding_env.server import CodeExecutionEnvironment
+        >>>
+        >>> env = CodeExecutionEnvironment()
+        >>> server = HTTPEnvServer(env)
+        >>>
+        >>> # Register routes with FastAPI
+        >>> from fastapi import FastAPI
+        >>> app = FastAPI()
+        >>> server.register_routes(app)
+    """
+    def __init__(
+        self,
+        env: Environment,
+        action_cls: Type[Action],
+        observation_cls: Type[Observation],
+    ):
+        """
+        Initialize HTTP server wrapper.
+        Args:
+            env: The Environment instance to wrap
+            action_cls: The Action subclass this environment expects
+            observation_cls: The Observation subclass this environment returns
+        """
+        self.env = env
+        self.action_cls = action_cls
+        self.observation_cls = observation_cls
+    def register_routes(self, app: Any) -> None:
+        """
+        Register HTTP routes on a FastAPI application.
+        Args:
+            app: FastAPI application instance
+        """
+        if not isinstance(app, FastAPI):
+            raise TypeError("app must be a FastAPI instance")
+        @app.post("/reset")
+        async def reset(request: Dict[str, Any] = Body(default={})) -> Dict[str, Any]:
+            """Reset endpoint - returns initial observation."""
+            # TODO: Handle seed, episode_id from request if provided
+            observation = self.env.reset()
+            return self._serialize_observation(observation)
+        @app.post("/step")
+        async def step(request: Dict[str, Any]) -> Dict[str, Any]:
+            """Step endpoint - executes action and returns observation."""
+            action_data = request.get("action", {})
+            # TODO: Handle timeout_s, request_id, episode_id from request if provided
+            # Deserialize action
+            action = self._deserialize_action(action_data)
+            # Execute step
+            observation = self.env.step(action)
+            # Return serialized observation
+            return self._serialize_observation(observation)
+        @app.get("/state")
+        async def get_state() -> Dict[str, Any]:
+            """State endpoint - returns current environment state."""
+            state = self.env.state
+            return asdict(state)
+        @app.get("/health")
+        async def health() -> Dict[str, str]:
+            """Health check endpoint."""
+            return {"status": "healthy"}
+    def _deserialize_action(self, action_data: Dict[str, Any]) -> Action:
+        """
+        Convert JSON dict to Action instance.
+        Args:
+            action_data: Dictionary containing action data
+        Returns:
+            Action instance
+        Note:
+            This is a simple implementation. Subclasses may need to override
+            for more complex deserialization logic.
+        """
+        # Remove metadata if present (it will be set via kw_only field)
+        metadata = action_data.pop("metadata", {})
+        action = self.action_cls(**action_data)
+        action.metadata = metadata
+        return action
+    def _serialize_observation(self, observation: Observation) -> Dict[str, Any]:
+        """
+        Convert Observation instance to JSON-compatible dict.
+        Args:
+            observation: Observation instance
+        Returns:
+            Dictionary compatible with HTTPEnvClient._parse_result()
+        The format matches what HTTPEnvClient expects:
+        {
+            "observation": {...},  # Observation fields
+            "reward": float | None,
+            "done": bool,
+        }
+        """
+        obs_dict = asdict(observation)
+        # Extract reward and done (these are part of StepResult on client side)
+        reward = obs_dict.pop("reward", None)
+        done = obs_dict.pop("done", False)
+        obs_dict.pop("metadata", None)  # Remove metadata from observation
+        # Return in HTTPEnvClient expected format
+        return {
+            "observation": obs_dict,
+            "reward": reward,
+            "done": done,
+        }
+def create_app(
+    env: Environment,
+    action_cls: Type[Action],
+    observation_cls: Type[Observation],
+    env_name: Optional[str] = None,
+) -> Any:
+    """
+    Create a FastAPI application with or without web interface.
+    This function creates a FastAPI app with the web interface enabled by default,
+    including README integration for better user experience.
+    Args:
+        env: The Environment instance to serve
+        action_cls: The Action subclass this environment expects
+        observation_cls: The Observation subclass this environment returns
+        env_name: Optional environment name for README loading
+    Returns:
+        FastAPI application instance with or without web interface and README integration
+    """
+    # Check if web interface should be enabled
+    # This can be controlled via environment variable or build argument
+    enable_web = (
+        os.getenv("ENABLE_WEB_INTERFACE", "false").lower() in ("true", "1", "yes")
+    )
+    if enable_web:
+        # Import web interface only when needed
+        from .web_interface import create_web_interface_app
+        return create_web_interface_app(env, action_cls, observation_cls, env_name)
+    else:
+        # Use standard FastAPI app without web interface
+        return create_fastapi_app(env, action_cls, observation_cls)
+def create_fastapi_app(
+    env: Environment,
+    action_cls: Type[Action],
+    observation_cls: Type[Observation],
+) -> Any:
+    """
+    Create a FastAPI application with routes for the given environment.
+    Args:
+        env: The Environment instance to serve
+        action_cls: The Action subclass this environment expects
+        observation_cls: The Observation subclass this environment returns
+    Returns:
+        FastAPI application instance with routes registered
+    Example:
+        >>> from envs.coding_env.server import CodeExecutionEnvironment
+        >>> from envs.coding_env.models import CodeAction, CodeObservation
+        >>>
+        >>> env = CodeExecutionEnvironment()
+        >>> app = create_fastapi_app(env, CodeAction, CodeObservation)
+        >>>
+        >>> # Run with: uvicorn module:app --host 0.0.0.0 --port 8000
+    """
+    try:
+        from fastapi import FastAPI
+    except ImportError:
+        raise ImportError(
+            "FastAPI is required. Install with: pip install fastapi uvicorn"
+        )
+    app = FastAPI(title="Environment HTTP Server")
+    server = HTTPEnvServer(env, action_cls, observation_cls)
+    server.register_routes(app)
+    return app

src/core/env_server/interfaces.py ADDED Viewed

	@@ -0,0 +1,118 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+from abc import ABC, abstractmethod
+from typing import Any, Protocol, TypedDict
+from .types import Action, Observation, State
+class Message(TypedDict):
+    """A message in a conversation.
+    Compatible with Huggingface chat template format.
+    """
+    role: str
+    content: str
+class ModelTokenizer(Protocol):
+    """Protocol for tokenizers that support chat templates.
+    This protocol defines the interface that tokenizers must implement
+    to work with chat-based environments. It's compatible with
+    Huggingface transformers tokenizers.
+    """
+    def apply_chat_template(
+        self,
+        conversation: list[Message],
+        tokenize: bool = True,
+        return_tensors: str | None = None,
+        **kwargs: Any,
+    ) -> Any:
+        """Apply a chat template to format and optionally tokenize a conversation.
+        Args:
+            conversation: List of message dictionaries with 'role' and 'content'
+            tokenize: Whether to tokenize the output
+            return_tensors: Format for returned tensors ('pt' for PyTorch)
+            **kwargs: Additional arguments
+        Returns:
+            Formatted and optionally tokenized conversation
+        """
+        ...
+    def decode(
+        self, token_ids: Any, skip_special_tokens: bool = False, **kwargs: Any
+    ) -> str:
+        """Decode token IDs back to text.
+        Args:
+            token_ids: Token IDs to decode
+            skip_special_tokens: Whether to skip special tokens in output
+            **kwargs: Additional arguments
+        Returns:
+            Decoded text string
+        """
+        ...
+class Transform(ABC):
+    """Transform observations to add rewards, metrics, or other modifications.
+    Transforms follow the TorchRL pattern where they take an observation
+    and return a (potentially modified) observation. This allows for
+    flexible reward computation and observation augmentation.
+    """
+    @abstractmethod
+    def __call__(self, observation: Observation) -> Observation:
+        """Transform an observation.
+        Args:
+            observation: The input observation
+        Returns:
+            The transformed observation
+        """
+        pass
+class Environment(ABC):
+    """Base class for all environment servers following Gym/Gymnasium API.
+    Args:
+        transform: Optional transform to apply to observations
+    """
+    def __init__(self, transform: Transform | None = None):
+        self.transform = transform
+    @abstractmethod
+    def reset(self) -> Observation:
+        """Reset the environment and return initial observation."""
+        pass
+    @abstractmethod
+    def step(self, action: Action) -> Observation:
+        """Take a step in the environment."""
+        pass
+    @property
+    @abstractmethod
+    def state(self) -> State:
+        """Get the current environment state."""
+        pass
+    def _apply_transform(self, observation: Observation) -> Observation:
+        """Apply transform if one is provided."""
+        if self.transform is not None:
+            return self.transform(observation)
+        return observation

src/core/env_server/types.py ADDED Viewed

	@@ -0,0 +1,57 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+from dataclasses import dataclass, field
+from typing import Any, Dict, List, Optional, Union
+# Type aliases
+Scalar = Union[int, float, bool]
+@dataclass(kw_only=True)
+class Action:
+    """Base class for all environment actions."""
+    metadata: Dict[str, Any] = field(default_factory=dict)
+@dataclass(kw_only=True)
+class Observation:
+    """Base class for all environment observations."""
+    done: bool = False
+    reward: Union[bool, int, float, None] = None
+    metadata: Dict[str, Any] = field(default_factory=dict)
+@dataclass
+class State:
+    """Base class for environment state."""
+    episode_id: Optional[str] = None
+    step_count: int = 0
+@dataclass
+class CodeExecResult:
+    """Result of code execution containing stdout, stderr, and exit code."""
+    stdout: str
+    stderr: str
+    exit_code: int
+@dataclass
+class EnvironmentMetadata:
+    """Metadata about an environment for documentation and UI purposes."""
+    name: str
+    description: str
+    readme_content: Optional[str] = None
+    version: Optional[str] = None
+    author: Optional[str] = None
+    documentation_url: Optional[str] = None

src/core/env_server/web_interface.py ADDED Viewed

	@@ -0,0 +1,1613 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Web interface for OpenEnv environments.
+This module provides a web-based interface for interacting with OpenEnv environments,
+including a two-pane layout for HumanAgent interaction and state observation.
+"""
+from __future__ import annotations
+import json
+import time
+from dataclasses import asdict, dataclass
+from typing import Any, Dict, List, Optional, Type
+from datetime import datetime
+from fastapi import FastAPI, WebSocket, WebSocketDisconnect, Request
+from fastapi.responses import HTMLResponse, FileResponse
+from fastapi.staticfiles import StaticFiles
+from pydantic import BaseModel
+from .interfaces import Environment
+from .types import Action, Observation, State, EnvironmentMetadata
+def load_environment_metadata(env: Environment, env_name: Optional[str] = None) -> EnvironmentMetadata:
+    """
+    Load environment metadata including README content.
+    Args:
+        env: The environment instance
+        env_name: Optional environment name for README file lookup
+    Returns:
+        EnvironmentMetadata with loaded information
+    """
+    # Try to get metadata from environment if it has a method for it
+    if hasattr(env, 'get_metadata'):
+        return env.get_metadata()
+    # Default metadata
+    metadata = EnvironmentMetadata(
+        name=env_name or env.__class__.__name__,
+        description=f"{env.__class__.__name__} environment",
+        version="1.0.0"
+    )
+    # Try to load README from file system
+    readme_content = _load_readme_from_filesystem(env_name)
+    if readme_content:
+        metadata.readme_content = readme_content
+    return metadata
+def _load_readme_from_filesystem(env_name: Optional[str]) -> Optional[str]:
+    """
+    Load README content from the filesystem.
+    Tries multiple locations:
+    1. Container filesystem: /app/README.md
+    2. Local development: src/envs/{env_name}/README.md
+    3. Environment variable: ENV_README_PATH
+    """
+    import os
+    from pathlib import Path
+    # Try container filesystem first
+    container_readme = Path("/app/README.md")
+    if container_readme.exists():
+        try:
+            return container_readme.read_text(encoding='utf-8')
+        except Exception:
+            pass
+    # Try environment variable path
+    custom_path = os.environ.get("ENV_README_PATH")
+    if custom_path and Path(custom_path).exists():
+        try:
+            return Path(custom_path).read_text(encoding='utf-8')
+        except Exception:
+            pass
+    # Try local development path
+    if env_name:
+        local_readme = Path(f"src/envs/{env_name}/README.md")
+        if local_readme.exists():
+            try:
+                return local_readme.read_text(encoding='utf-8')
+            except Exception:
+                pass
+    return None
+@dataclass
+class ActionLog:
+    """Log entry for an action taken."""
+    timestamp: str
+    action: Dict[str, Any]
+    observation: Dict[str, Any]
+    reward: Optional[float]
+    done: bool
+    step_count: int
+@dataclass
+class EpisodeState:
+    """Current episode state for the web interface."""
+    episode_id: Optional[str]
+    step_count: int
+    current_observation: Optional[Dict[str, Any]]
+    action_logs: List[ActionLog]
+    is_reset: bool = True
+class WebInterfaceManager:
+    """Manages the web interface for an environment."""
+    def __init__(
+        self,
+        env: Environment,
+        action_cls: Type[Action],
+        observation_cls: Type[Observation],
+        metadata: Optional[EnvironmentMetadata] = None,
+    ):
+        self.env = env
+        self.action_cls = action_cls
+        self.observation_cls = observation_cls
+        self.metadata = metadata or EnvironmentMetadata(
+            name=env.__class__.__name__,
+            description=f"{env.__class__.__name__} environment"
+        )
+        self.episode_state = EpisodeState(
+            episode_id=None,
+            step_count=0,
+            current_observation=None,
+            action_logs=[]
+        )
+        self.connected_clients: List[WebSocket] = []
+    async def connect_websocket(self, websocket: WebSocket):
+        """Connect a new WebSocket client."""
+        await websocket.accept()
+        self.connected_clients.append(websocket)
+        # Send current state to the new client
+        await self._send_state_update()
+    async def disconnect_websocket(self, websocket: WebSocket):
+        """Disconnect a WebSocket client."""
+        if websocket in self.connected_clients:
+            self.connected_clients.remove(websocket)
+    async def _send_state_update(self):
+        """Send current state to all connected clients."""
+        if not self.connected_clients:
+            return
+        state_data = {
+            "type": "state_update",
+            "episode_state": asdict(self.episode_state)
+        }
+        # Send to all connected clients
+        disconnected_clients = []
+        for client in self.connected_clients:
+            try:
+                await client.send_text(json.dumps(state_data))
+            except:
+                disconnected_clients.append(client)
+        # Remove disconnected clients
+        for client in disconnected_clients:
+            self.connected_clients.remove(client)
+    async def reset_environment(self) -> Dict[str, Any]:
+        """Reset the environment and update state."""
+        observation = self.env.reset()
+        state = self.env.state
+        # Update episode state
+        self.episode_state.episode_id = state.episode_id
+        self.episode_state.step_count = 0
+        self.episode_state.current_observation = asdict(observation)
+        self.episode_state.action_logs = []
+        self.episode_state.is_reset = True
+        # Send state update
+        await self._send_state_update()
+        return {
+            "observation": asdict(observation),
+            "reward": observation.reward,
+            "done": observation.done,
+        }
+    async def step_environment(self, action_data: Dict[str, Any]) -> Dict[str, Any]:
+        """Execute a step in the environment and update state."""
+        # Deserialize action
+        action = self._deserialize_action(action_data)
+        # Execute step
+        observation = self.env.step(action)
+        state = self.env.state
+        # Create action log
+        action_log = ActionLog(
+            timestamp=datetime.now().isoformat(),
+            action=asdict(action),
+            observation=asdict(observation),
+            reward=observation.reward,
+            done=observation.done,
+            step_count=state.step_count
+        )
+        # Update episode state
+        self.episode_state.episode_id = state.episode_id
+        self.episode_state.step_count = state.step_count
+        self.episode_state.current_observation = asdict(observation)
+        self.episode_state.action_logs.append(action_log)
+        self.episode_state.is_reset = False
+        # Send state update
+        await self._send_state_update()
+        return {
+            "observation": asdict(observation),
+            "reward": observation.reward,
+            "done": observation.done,
+        }
+    def get_state(self) -> Dict[str, Any]:
+        """Get current environment state."""
+        state = self.env.state
+        return asdict(state)
+    def _deserialize_action(self, action_data: Dict[str, Any]) -> Action:
+        """Convert JSON dict to Action instance."""
+        metadata = action_data.pop("metadata", {})
+        # Handle tensor fields that come from JSON as lists
+        processed_data = {}
+        for key, value in action_data.items():
+            if key == "tokens" and isinstance(value, (list, str)):
+                # Convert list or string to tensor
+                if isinstance(value, str):
+                    # If it's a string, try to parse it as a list of numbers
+                    try:
+                        import json
+                        value = json.loads(value)
+                    except:
+                        # If parsing fails, treat as empty list
+                        value = []
+                if isinstance(value, list):
+                    import torch
+                    processed_data[key] = torch.tensor(value, dtype=torch.long)
+                else:
+                    processed_data[key] = value
+            elif key == "action_id" and isinstance(value, str):
+                # Convert action_id from string to int
+                try:
+                    processed_data[key] = int(value)
+                except ValueError:
+                    # If conversion fails, keep original value
+                    processed_data[key] = value
+            else:
+                processed_data[key] = value
+        action = self.action_cls(**processed_data)
+        action.metadata = metadata
+        return action
+def create_web_interface_app(
+    env: Environment,
+    action_cls: Type[Action],
+    observation_cls: Type[Observation],
+    env_name: Optional[str] = None,
+) -> FastAPI:
+    """
+    Create a FastAPI application with web interface for the given environment.
+    Args:
+        env: The Environment instance to serve
+        action_cls: The Action subclass this environment expects
+        observation_cls: The Observation subclass this environment returns
+        env_name: Optional environment name for README loading
+    Returns:
+        FastAPI application instance with web interface
+    """
+    from .http_server import create_fastapi_app
+    # Create the base environment app
+    app = create_fastapi_app(env, action_cls, observation_cls)
+    # Load environment metadata
+    metadata = load_environment_metadata(env, env_name)
+    # Create web interface manager
+    web_manager = WebInterfaceManager(env, action_cls, observation_cls, metadata)
+    # Add web interface routes
+    @app.get("/web", response_class=HTMLResponse)
+    async def web_interface():
+        """Serve the web interface."""
+        return get_web_interface_html(action_cls, web_manager.metadata)
+    @app.get("/web/metadata")
+    async def web_metadata():
+        """Get environment metadata."""
+        return asdict(web_manager.metadata)
+    @app.websocket("/ws")
+    async def websocket_endpoint(websocket: WebSocket):
+        """WebSocket endpoint for real-time updates."""
+        await web_manager.connect_websocket(websocket)
+        try:
+            while True:
+                # Keep connection alive
+                await websocket.receive_text()
+        except WebSocketDisconnect:
+            await web_manager.disconnect_websocket(websocket)
+    @app.post("/web/reset")
+    async def web_reset():
+        """Reset endpoint for web interface."""
+        return await web_manager.reset_environment()
+    @app.post("/web/step")
+    async def web_step(request: Dict[str, Any]):
+        """Step endpoint for web interface."""
+        # Check if this is a message-based request (chat environment)
+        if "message" in request:
+            message = request["message"]
+            # Convert message to action using the environment's message_to_action method
+            action = web_manager.env.message_to_action(message)
+            action_data = {"tokens": action.tokens.tolist()}
+        else:
+            action_data = request.get("action", {})
+        return await web_manager.step_environment(action_data)
+    @app.get("/web/state")
+    async def web_state():
+        """State endpoint for web interface."""
+        return web_manager.get_state()
+    return app
+def get_web_interface_html(action_cls: Type[Action], metadata: Optional[EnvironmentMetadata] = None) -> str:
+    """Generate the HTML for the web interface."""
+    # Check if this is a chat environment by looking for tokens field
+    is_chat_env = False
+    if hasattr(action_cls, '__dataclass_fields__'):
+        for field_name, field_info in action_cls.__dataclass_fields__.items():
+            if field_name == 'tokens' and hasattr(field_info.type, '__name__') and 'Tensor' in field_info.type.__name__:
+                is_chat_env = True
+                break
+    # Get action fields for dynamic form generation with enhanced metadata
+    action_fields = _extract_action_fields(action_cls)
+    return f"""
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>OpenEnv Web Interface</title>
+    <style>
+        * {{
+            margin: 0;
+            padding: 0;
+            box-sizing: border-box;
+        }}
+        body {{
+            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
+            background-color: #f5f5f5;
+            height: 100vh;
+            overflow: hidden;
+        }}
+        .container {{
+            display: flex;
+            height: 100vh;
+        }}
+        .left-pane {{
+            width: 50%;
+            background: white;
+            border-right: 1px solid #e0e0e0;
+            display: flex;
+            flex-direction: column;
+        }}
+        .right-pane {{
+            width: 50%;
+            background: #fafafa;
+            display: flex;
+            flex-direction: column;
+        }}
+        .pane-header {{
+            padding: 20px;
+            border-bottom: 1px solid #e0e0e0;
+            background: #f8f9fa;
+            font-weight: 600;
+            font-size: 16px;
+        }}
+        .pane-content {{
+            flex: 1;
+            padding: 20px;
+            overflow-y: auto;
+        }}
+        .action-form {{
+            background: white;
+            border: 1px solid #e0e0e0;
+            border-radius: 8px;
+            padding: 20px;
+            margin-bottom: 20px;
+        }}
+        .form-group {{
+            margin-bottom: 15px;
+        }}
+        .form-group label {{
+            display: block;
+            margin-bottom: 5px;
+            font-weight: 500;
+            color: #333;
+        }}
+        .form-group input, .form-group textarea {{
+            width: 100%;
+            padding: 8px 12px;
+            border: 1px solid #ddd;
+            border-radius: 4px;
+            font-size: 14px;
+        }}
+        .form-group input:focus, .form-group textarea:focus {{
+            outline: none;
+            border-color: #007bff;
+            box-shadow: 0 0 0 2px rgba(0, 123, 255, 0.25);
+        }}
+        .btn {{
+            background: #007bff;
+            color: white;
+            border: none;
+            padding: 10px 20px;
+            border-radius: 4px;
+            cursor: pointer;
+            font-size: 14px;
+            margin-right: 10px;
+            margin-bottom: 10px;
+        }}
+        .btn:hover {{
+            background: #0056b3;
+        }}
+        .btn:disabled {{
+            background: #6c757d;
+            cursor: not-allowed;
+        }}
+        .btn-secondary {{
+            background: #6c757d;
+        }}
+        .btn-secondary:hover {{
+            background: #545b62;
+        }}
+        .state-display {{
+            background: white;
+            border: 1px solid #e0e0e0;
+            border-radius: 8px;
+            padding: 15px;
+            margin-bottom: 20px;
+        }}
+        .state-item {{
+            margin-bottom: 8px;
+        }}
+        .state-label {{
+            font-weight: 500;
+            color: #666;
+        }}
+        .state-value {{
+            color: #333;
+            font-family: monospace;
+        }}
+        .logs-container {{
+            background: white;
+            border: 1px solid #e0e0e0;
+            border-radius: 8px;
+            padding: 15px;
+            max-height: 400px;
+            overflow-y: auto;
+        }}
+        .log-entry {{
+            border-bottom: 1px solid #f0f0f0;
+            padding: 10px 0;
+        }}
+        .log-entry:last-child {{
+            border-bottom: none;
+        }}
+        .log-timestamp {{
+            font-size: 12px;
+            color: #666;
+            margin-bottom: 5px;
+        }}
+        .log-action {{
+            background: #e3f2fd;
+            padding: 8px;
+            border-radius: 4px;
+            margin-bottom: 5px;
+            font-family: monospace;
+            font-size: 12px;
+        }}
+        .log-observation {{
+            background: #f3e5f5;
+            padding: 8px;
+            border-radius: 4px;
+            font-family: monospace;
+            font-size: 12px;
+        }}
+        .log-reward {{
+            font-weight: 600;
+            color: #28a745;
+        }}
+        .log-done {{
+            font-weight: 600;
+            color: #dc3545;
+        }}
+        .status-indicator {{
+            display: inline-block;
+            width: 8px;
+            height: 8px;
+            border-radius: 50%;
+            margin-right: 8px;
+        }}
+        .status-connected {{
+            background: #28a745;
+        }}
+        .status-disconnected {{
+            background: #dc3545;
+        }}
+        .json-display {{
+            background: #f8f9fa;
+            border: 1px solid #e9ecef;
+            border-radius: 4px;
+            padding: 10px;
+            font-family: monospace;
+            font-size: 12px;
+            white-space: pre-wrap;
+            max-height: 200px;
+            overflow-y: auto;
+        }}
+        /* Chat Interface Styles */
+        .chat-interface {{
+            background: white;
+            border: 1px solid #e0e0e0;
+            border-radius: 8px;
+            padding: 20px;
+            margin-bottom: 20px;
+        }}
+        .chat-messages {{
+            background: #f8f9fa;
+            border: 1px solid #e0e0e0;
+            border-radius: 8px;
+            padding: 15px;
+            margin-bottom: 15px;
+            max-height: 400px;
+            overflow-y: auto;
+        }}
+        .chat-message {{
+            margin-bottom: 15px;
+            padding: 10px;
+            border-radius: 8px;
+        }}
+        .chat-message:last-child {{
+            margin-bottom: 0;
+        }}
+        .chat-message.user {{
+            background: #e3f2fd;
+            margin-left: 20px;
+        }}
+        .chat-message.assistant {{
+            background: #f3e5f5;
+            margin-right: 20px;
+        }}
+        .chat-message.system {{
+            background: #e8f5e8;
+            font-style: italic;
+        }}
+        .message-role {{
+            font-weight: 600;
+            font-size: 12px;
+            color: #666;
+            margin-bottom: 5px;
+        }}
+        .message-content {{
+            font-size: 14px;
+            line-height: 1.4;
+        }}
+        .chat-input-container {{
+            border-top: 1px solid #e0e0e0;
+            padding-top: 15px;
+        }}
+        .role-selector {{
+            margin-bottom: 10px;
+        }}
+        .role-selector label {{
+            font-weight: 500;
+            margin-right: 10px;
+        }}
+        .role-selector select {{
+            padding: 5px 10px;
+            border: 1px solid #ddd;
+            border-radius: 4px;
+        }}
+        .message-input {{
+            display: flex;
+            gap: 10px;
+            align-items: flex-end;
+        }}
+        .message-input textarea {{
+            flex: 1;
+            padding: 10px;
+            border: 1px solid #ddd;
+            border-radius: 4px;
+            resize: vertical;
+            font-family: inherit;
+        }}
+        .message-input textarea:focus {{
+            outline: none;
+            border-color: #007bff;
+            box-shadow: 0 0 0 2px rgba(0, 123, 255, 0.25);
+        }}
+        /* Instructions Section Styles */
+        .instructions-section {{
+            background: white;
+            border: 1px solid #e0e0e0;
+            border-radius: 8px;
+            padding: 20px;
+            margin-bottom: 20px;
+        }}
+        .instructions-header {{
+            display: flex;
+            justify-content: space-between;
+            align-items: center;
+            margin-bottom: 15px;
+        }}
+        .instructions-title {{
+            font-size: 18px;
+            font-weight: 600;
+            color: #333;
+            margin: 0;
+        }}
+        .instructions-toggle {{
+            background: #f8f9fa;
+            border: 1px solid #dee2e6;
+            border-radius: 4px;
+            padding: 5px 10px;
+            cursor: pointer;
+            font-size: 12px;
+            color: #6c757d;
+        }}
+        .instructions-toggle:hover {{
+            background: #e9ecef;
+        }}
+        .instructions-content {{
+            display: none;
+            max-height: 400px;
+            overflow-y: auto;
+            border-top: 1px solid #e0e0e0;
+            padding-top: 15px;
+        }}
+        .instructions-content.expanded {{
+            display: block;
+        }}
+        .instructions-content h1,
+        .instructions-content h2,
+        .instructions-content h3 {{
+            color: #333;
+            margin-top: 20px;
+            margin-bottom: 10px;
+        }}
+        .instructions-content h1 {{
+            font-size: 24px;
+            border-bottom: 2px solid #007bff;
+            padding-bottom: 10px;
+        }}
+        .instructions-content h2 {{
+            font-size: 20px;
+        }}
+        .instructions-content h3 {{
+            font-size: 16px;
+        }}
+        .instructions-content p {{
+            margin-bottom: 10px;
+            line-height: 1.6;
+        }}
+        .instructions-content code {{
+            background: #f8f9fa;
+            padding: 2px 4px;
+            border-radius: 3px;
+            font-family: monospace;
+            font-size: 14px;
+        }}
+        .instructions-content pre {{
+            background: #f8f9fa;
+            border: 1px solid #e9ecef;
+            border-radius: 4px;
+            padding: 15px;
+            overflow-x: auto;
+            margin: 10px 0;
+        }}
+        .instructions-content pre code {{
+            background: none;
+            padding: 0;
+        }}
+        .instructions-content ul,
+        .instructions-content ol {{
+            margin: 10px 0;
+            padding-left: 20px;
+        }}
+        .instructions-content li {{
+            margin-bottom: 5px;
+        }}
+        .instructions-content table {{
+            border-collapse: collapse;
+            width: 100%;
+            margin: 15px 0;
+        }}
+        .instructions-content th,
+        .instructions-content td {{
+            border: 1px solid #dee2e6;
+            padding: 8px 12px;
+            text-align: left;
+        }}
+        .instructions-content th {{
+            background: #f8f9fa;
+            font-weight: 600;
+        }}
+        /* Enhanced Form Styles */
+        .help-text {{
+            display: block;
+            margin-top: 5px;
+            font-size: 12px;
+            color: #6c757d;
+            font-style: italic;
+        }}
+        .form-group label {{
+            font-weight: 500;
+            color: #333;
+            margin-bottom: 5px;
+        }}
+        .form-group select {{
+            width: 100%;
+            padding: 8px 12px;
+            border: 1px solid #ddd;
+            border-radius: 4px;
+            font-size: 14px;
+            background-color: white;
+        }}
+        .form-group select:focus {{
+            outline: none;
+            border-color: #007bff;
+            box-shadow: 0 0 0 2px rgba(0, 123, 255, 0.25);
+        }}
+        .form-group textarea {{
+            width: 100%;
+            padding: 8px 12px;
+            border: 1px solid #ddd;
+            border-radius: 4px;
+            font-size: 14px;
+            font-family: inherit;
+            resize: vertical;
+        }}
+        .form-group textarea:focus {{
+            outline: none;
+            border-color: #007bff;
+            box-shadow: 0 0 0 2px rgba(0, 123, 255, 0.25);
+        }}
+        .form-group input[type="number"] {{
+            width: 100%;
+            padding: 8px 12px;
+            border: 1px solid #ddd;
+            border-radius: 4px;
+            font-size: 14px;
+        }}
+        .form-group input[type="number"]:focus {{
+            outline: none;
+            border-color: #007bff;
+            box-shadow: 0 0 0 2px rgba(0, 123, 255, 0.25);
+        }}
+        .form-group input[type="text"]:focus {{
+            outline: none;
+            border-color: #007bff;
+            box-shadow: 0 0 0 2px rgba(0, 123, 255, 0.25);
+        }}
+        .required-indicator {{
+            color: #dc3545;
+            font-weight: bold;
+        }}
+        .form-group .field-description {{
+            font-size: 11px;
+            color: #666;
+            margin-top: 2px;
+            font-style: italic;
+        }}
+    </style>
+</head>
+<body>
+    <div class="container">
+        <!-- Left Pane: HumanAgent Interface -->
+        <div class="left-pane">
+            <div class="pane-header">
+                <span class="status-indicator status-disconnected" id="connection-status"></span>
+                HumanAgent Interface
+            </div>
+            <div class="pane-content">
+                <!-- Instructions Section -->
+                {_generate_instructions_section(metadata)}
+                <!-- Action Form or Chat Interface -->
+                {_generate_action_interface(action_fields, is_chat_env)}
+                <!-- Control Buttons -->
+                <div style="margin-bottom: 20px;">
+                    <button class="btn btn-secondary" id="reset-btn">Reset Environment</button>
+                    <button class="btn btn-secondary" id="state-btn">Get State</button>
+                </div>
+                <!-- Current State Display -->
+                <div class="state-display">
+                    <h3>Current State</h3>
+                    <div id="current-state">
+                        <div class="state-item">
+                            <span class="state-label">Status:</span>
+                            <span class="state-value" id="env-status">Not initialized</span>
+                        </div>
+                        <div class="state-item">
+                            <span class="state-label">Episode ID:</span>
+                            <span class="state-value" id="episode-id">-</span>
+                        </div>
+                        <div class="state-item">
+                            <span class="state-label">Step Count:</span>
+                            <span class="state-value" id="step-count">0</span>
+                        </div>
+                    </div>
+                </div>
+            </div>
+        </div>
+        <!-- Right Pane: State Observer -->
+        <div class="right-pane">
+            <div class="pane-header">
+                State Observer
+            </div>
+            <div class="pane-content">
+                <!-- Current Observation -->
+                <div class="state-display">
+                    <h3>Current Observation</h3>
+                    <div id="current-observation" class="json-display">
+                        No observation yet
+                    </div>
+                </div>
+                <!-- Action Logs -->
+                <div class="logs-container">
+                    <h3>Action History</h3>
+                    <div id="action-logs">
+                        No actions taken yet
+                    </div>
+                </div>
+            </div>
+        </div>
+    </div>
+    <script>
+        class OpenEnvWebInterface {{
+            constructor() {{
+                this.ws = null;
+                this.isConnected = false;
+                this.init();
+            }}
+            init() {{
+                this.connectWebSocket();
+                this.setupEventListeners();
+            }}
+            connectWebSocket() {{
+                const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
+                const wsUrl = `${{protocol}}//${{window.location.host}}/ws`;
+                this.ws = new WebSocket(wsUrl);
+                this.ws.onopen = () => {{
+                    this.isConnected = true;
+                    this.updateConnectionStatus(true);
+                    console.log('WebSocket connected');
+                }};
+                this.ws.onmessage = (event) => {{
+                    const data = JSON.parse(event.data);
+                    if (data.type === 'state_update') {{
+                        this.updateUI(data.episode_state);
+                    }}
+                }};
+                this.ws.onclose = () => {{
+                    this.isConnected = false;
+                    this.updateConnectionStatus(false);
+                    console.log('WebSocket disconnected');
+                    // Attempt to reconnect after 3 seconds
+                    setTimeout(() => this.connectWebSocket(), 3000);
+                }};
+                this.ws.onerror = (error) => {{
+                    console.error('WebSocket error:', error);
+                }};
+            }}
+            setupEventListeners() {{
+                // Instructions toggle
+                const instructionsToggle = document.getElementById('instructions-toggle');
+                const instructionsContent = document.getElementById('instructions-content');
+                if (instructionsToggle && instructionsContent) {{
+                    instructionsToggle.addEventListener('click', () => {{
+                        instructionsContent.classList.toggle('expanded');
+                        instructionsToggle.textContent = instructionsContent.classList.contains('expanded')
+                            ? 'Hide Instructions' : 'Show Instructions';
+                    }});
+                }}
+                // Check if this is a chat environment
+                const isChatEnv = document.getElementById('chat-messages') !== null;
+                if (isChatEnv) {{
+                    // Chat environment event listeners
+                    document.getElementById('send-message-btn').addEventListener('click', () => {{
+                        this.sendMessage();
+                    }});
+                    // Send message on Enter (but allow Shift+Enter for new lines)
+                    document.getElementById('message-input').addEventListener('keydown', (e) => {{
+                        if (e.key === 'Enter' && !e.shiftKey) {{
+                            e.preventDefault();
+                            this.sendMessage();
+                        }}
+                    }});
+                }} else {{
+                    // Traditional action form submission
+                    const actionForm = document.getElementById('action-form');
+                    if (actionForm) {{
+                        actionForm.addEventListener('submit', (e) => {{
+                            e.preventDefault();
+                            this.submitAction();
+                        }});
+                    }}
+                }}
+                // Reset button
+                document.getElementById('reset-btn').addEventListener('click', () => {{
+                    this.resetEnvironment();
+                }});
+                // State button
+                document.getElementById('state-btn').addEventListener('click', () => {{
+                    this.getState();
+                }});
+            }}
+            async sendMessage() {{
+                const messageInput = document.getElementById('message-input');
+                const roleSelect = document.getElementById('message-role');
+                const message = messageInput.value.trim();
+                const role = roleSelect.value;
+                if (!message) {{
+                    return;
+                }}
+                // Add message to chat display immediately
+                this.addMessageToChat(role, message);
+                // Clear input
+                messageInput.value = '';
+                try {{
+                    // Send message to server to convert to action and step
+                    const response = await fetch('/web/step', {{
+                        method: 'POST',
+                        headers: {{ 'Content-Type': 'application/json' }},
+                        body: JSON.stringify({{
+                            message: {{
+                                role: role,
+                                content: message
+                            }}
+                        }})
+                    }});
+                    if (!response.ok) {{
+                        throw new Error(`HTTP error! status: ${{response.status}}`);
+                    }}
+                    const result = await response.json();
+                    console.log('Message sent:', result);
+                }} catch (error) {{
+                    console.error('Error sending message:', error);
+                    alert('Error sending message: ' + error.message);
+                }}
+            }}
+            addMessageToChat(role, content) {{
+                const chatMessages = document.getElementById('chat-messages');
+                const messageDiv = document.createElement('div');
+                messageDiv.className = `chat-message ${{role}}`;
+                messageDiv.innerHTML = `
+                    <div class="message-role">${{role.charAt(0).toUpperCase() + role.slice(1)}}</div>
+                    <div class="message-content">${{content}}</div>
+                `;
+                chatMessages.appendChild(messageDiv);
+                chatMessages.scrollTop = chatMessages.scrollHeight;
+            }}
+            async submitAction() {{
+                const formData = new FormData(document.getElementById('action-form'));
+                const action = {{}};
+                // Collect form data
+                for (const [key, value] of formData.entries()) {{
+                    if (value !== '') {{
+                        // Handle tensor fields (tokens) - convert comma-separated string to array
+                        if (key === 'tokens') {{
+                            try {{
+                                action[key] = value.split(',').map(x => parseInt(x.trim())).filter(x => !isNaN(x));
+                            }} catch (e) {{
+                                console.error('Error parsing tokens:', e);
+                                action[key] = [];
+                            }}
+                        }} else {{
+                            action[key] = value;
+                        }}
+                    }}
+                }}
+                try {{
+                    const response = await fetch('/web/step', {{
+                        method: 'POST',
+                        headers: {{ 'Content-Type': 'application/json' }},
+                        body: JSON.stringify({{ action }})
+                    }});
+                    if (!response.ok) {{
+                        throw new Error(`HTTP error! status: ${{response.status}}`);
+                    }}
+                    const result = await response.json();
+                    console.log('Step result:', result);
+                }} catch (error) {{
+                    console.error('Error submitting action:', error);
+                    alert('Error submitting action: ' + error.message);
+                }}
+            }}
+            async resetEnvironment() {{
+                try {{
+                    const response = await fetch('/web/reset', {{
+                        method: 'POST',
+                        headers: {{ 'Content-Type': 'application/json' }}
+                    }});
+                    if (!response.ok) {{
+                        throw new Error(`HTTP error! status: ${{response.status}}`);
+                    }}
+                    const result = await response.json();
+                    console.log('Reset result:', result);
+                }} catch (error) {{
+                    console.error('Error resetting environment:', error);
+                    alert('Error resetting environment: ' + error.message);
+                }}
+            }}
+            async getState() {{
+                try {{
+                    const response = await fetch('/web/state');
+                    const state = await response.json();
+                    console.log('Current state:', state);
+                    alert('Current state: ' + JSON.stringify(state, null, 2));
+                }} catch (error) {{
+                    console.error('Error getting state:', error);
+                    alert('Error getting state: ' + error.message);
+                }}
+            }}
+            updateConnectionStatus(connected) {{
+                const indicator = document.getElementById('connection-status');
+                if (connected) {{
+                    indicator.className = 'status-indicator status-connected';
+                }} else {{
+                    indicator.className = 'status-indicator status-disconnected';
+                }}
+            }}
+            updateUI(episodeState) {{
+                // Check if this is a chat environment
+                const isChatEnv = document.getElementById('chat-messages') !== null;
+                // Update current state
+                document.getElementById('env-status').textContent =
+                    episodeState.is_reset ? 'Reset' : 'Running';
+                document.getElementById('episode-id').textContent =
+                    episodeState.episode_id || '-';
+                document.getElementById('step-count').textContent =
+                    episodeState.step_count.toString();
+                if (isChatEnv) {{
+                    // Update chat interface
+                    this.updateChatInterface(episodeState);
+                }} else {{
+                    // Update traditional observation display
+                    const observationDiv = document.getElementById('current-observation');
+                    if (episodeState.current_observation) {{
+                        observationDiv.textContent = JSON.stringify(
+                            episodeState.current_observation, null, 2
+                        );
+                    }} else {{
+                        observationDiv.textContent = 'No observation yet';
+                    }}
+                }}
+                // Update action logs
+                const logsDiv = document.getElementById('action-logs');
+                if (episodeState.action_logs.length === 0) {{
+                    logsDiv.innerHTML = 'No actions taken yet';
+                }} else {{
+                    logsDiv.innerHTML = episodeState.action_logs.map(log => `
+                        <div class="log-entry">
+                            <div class="log-timestamp">${{log.timestamp}} (Step ${{log.step_count}})</div>
+                            <div class="log-action">Action: ${{JSON.stringify(log.action, null, 2)}}</div>
+                            <div class="log-observation">Observation: ${{JSON.stringify(log.observation, null, 2)}}</div>
+                            <div>
+                                <span class="log-reward">Reward: ${{log.reward !== null ? log.reward : 'None'}}</span>
+                                ${{log.done ? '<span class="log-done">DONE</span>' : ''}}
+                            </div>
+                        </div>
+                    `).join('');
+                }}
+            }}
+            updateChatInterface(episodeState) {{
+                const chatMessages = document.getElementById('chat-messages');
+                if (!chatMessages) return;
+                // Clear existing messages (except system message)
+                const systemMessage = chatMessages.querySelector('.chat-message.system');
+                chatMessages.innerHTML = '';
+                if (systemMessage) {{
+                    chatMessages.appendChild(systemMessage);
+                }}
+                // Add messages from current observation
+                if (episodeState.current_observation && episodeState.current_observation.messages) {{
+                    episodeState.current_observation.messages.forEach(msg => {{
+                        this.addMessageToChat(msg.role, msg.content);
+                    }});
+                }}
+            }}
+        }}
+        // Initialize the web interface when the page loads
+        document.addEventListener('DOMContentLoaded', () => {{
+            new OpenEnvWebInterface();
+        }});
+    </script>
+</body>
+</html>
+    """.replace('{_generate_action_form_fields(action_fields)}', _generate_action_form_fields(action_fields))
+def _generate_instructions_section(metadata: Optional[EnvironmentMetadata]) -> str:
+    """Generate the instructions section with environment documentation."""
+    if not metadata or not metadata.readme_content:
+        return ''
+    # Convert markdown to HTML (basic conversion)
+    import re
+    html_content = _markdown_to_html(metadata.readme_content)
+    return f'''
+                <!-- Instructions Section -->
+                <div class="instructions-section">
+                    <div class="instructions-header">
+                        <h3 class="instructions-title">{metadata.name}</h3>
+                        <button class="instructions-toggle" id="instructions-toggle">Show Instructions</button>
+                    </div>
+                    <div class="instructions-content" id="instructions-content">
+                        <div class="instructions-readme">
+                            {html_content}
+                        </div>
+                    </div>
+                </div>
+    '''
+def _extract_action_fields(action_cls: Type[Action]) -> List[Dict[str, Any]]:
+    """Extract enhanced field metadata from Action class for form generation."""
+    import typing
+    from typing import get_origin, get_args
+    action_fields = []
+    if not hasattr(action_cls, '__dataclass_fields__'):
+        return action_fields
+    for field_name, field_info in action_cls.__dataclass_fields__.items():
+        if field_name == 'metadata':
+            continue
+        field_type = field_info.type
+        field_metadata = _extract_field_metadata(field_name, field_info)
+        # Determine input type based on field type
+        input_type = _determine_input_type(field_type)
+        # Check if field is required
+        is_required = field_info.default is field_info.default_factory
+        action_fields.append({
+            'name': field_name,
+            'type': input_type,
+            'required': is_required,
+            'description': field_metadata.get('description', ''),
+            'default_value': field_metadata.get('default_value'),
+            'choices': field_metadata.get('choices', []),
+            'min_value': field_metadata.get('min_value'),
+            'max_value': field_metadata.get('max_value'),
+            'placeholder': field_metadata.get('placeholder', ''),
+            'help_text': field_metadata.get('help_text', ''),
+        })
+    return action_fields
+def _extract_field_metadata(field_name: str, field_info) -> Dict[str, Any]:
+    """Extract metadata from dataclass field including docstring and type hints."""
+    import typing
+    from typing import get_origin, get_args, Literal, Union, Optional
+    metadata = {}
+    # Extract description from field docstring or annotation
+    if hasattr(field_info, 'metadata') and field_info.metadata:
+        # Check for custom metadata
+        for meta in field_info.metadata:
+            if isinstance(meta, dict):
+                metadata.update(meta)
+    # Extract type information
+    field_type = field_info.type
+    origin = get_origin(field_type)
+    # Handle Literal types for dropdown choices
+    if origin is Literal:
+        args = get_args(field_type)
+        metadata['choices'] = list(args)
+    # Handle Optional types
+    if origin is Union:
+        args = get_args(field_type)
+        if len(args) == 2 and type(None) in args:
+            # This is Optional[SomeType]
+            non_none_type = args[0] if args[1] is type(None) else args[1]
+            metadata['optional'] = True
+            # Recursively check the non-None type for choices
+            if get_origin(non_none_type) is Literal:
+                metadata['choices'] = list(get_args(non_none_type))
+        else:
+            # Regular Union type
+            metadata['choices'] = [str(arg) for arg in args if arg is not type(None)]
+    # Handle numeric constraints
+    if field_type in (int, float):
+        # Check for common constraint patterns in field name
+        if 'count' in field_name.lower() or 'num' in field_name.lower():
+            metadata['min_value'] = 0
+        if 'id' in field_name.lower():
+            metadata['min_value'] = 0
+    # Generate placeholder text
+    if 'message' in field_name.lower():
+        metadata['placeholder'] = f'Enter {field_name.replace("_", " ")}...'
+    elif 'code' in field_name.lower():
+        metadata['placeholder'] = 'Enter Python code here...'
+    elif 'tokens' in field_name.lower():
+        metadata['placeholder'] = 'Enter comma-separated token IDs (e.g., 1,2,3,4,5)'
+    else:
+        metadata['placeholder'] = f'Enter {field_name.replace("_", " ")}...'
+    # Generate help text based on field name and type
+    if 'action_id' in field_name.lower():
+        metadata['help_text'] = 'The action ID to execute in the environment'
+    elif 'game_name' in field_name.lower():
+        metadata['help_text'] = 'Name of the game or environment'
+    elif 'tokens' in field_name.lower():
+        metadata['help_text'] = 'Token IDs as a comma-separated list of integers'
+    elif 'code' in field_name.lower():
+        metadata['help_text'] = 'Python code to execute in the environment'
+    elif 'message' in field_name.lower():
+        metadata['help_text'] = 'Text message to send'
+    return metadata
+def _determine_input_type(field_type) -> str:
+    """Determine the appropriate HTML input type for a field type."""
+    import typing
+    from typing import get_origin, get_args, Literal, Union
+    # Handle direct types
+    if field_type == str:
+        return "text"
+    elif field_type == int:
+        return "number"
+    elif field_type == float:
+        return "number"
+    elif field_type == bool:
+        return "checkbox"
+    # Handle complex types
+    origin = get_origin(field_type)
+    if origin is Literal:
+        return "select"
+    elif origin is Union:
+        args = get_args(field_type)
+        if len(args) == 2 and type(None) in args:
+            # Optional type - use the non-None type
+            non_none_type = args[0] if args[1] is type(None) else args[1]
+            return _determine_input_type(non_none_type)
+        elif all(isinstance(arg, str) for arg in args if arg is not type(None)):
+            return "select"
+        else:
+            return "text"
+    elif hasattr(field_type, '__name__') and 'Tensor' in field_type.__name__:
+        return "tensor"
+    else:
+        return "text"
+def _markdown_to_html(markdown: str) -> str:
+    """Convert basic markdown to HTML for README display."""
+    import html
+    import re
+    # Escape HTML first
+    html_content = html.escape(markdown)
+    # Convert headers
+    html_content = re.sub(r'^# (.*?)$', r'<h1>\1</h1>', html_content, flags=re.MULTILINE)
+    html_content = re.sub(r'^## (.*?)$', r'<h2>\1</h2>', html_content, flags=re.MULTILINE)
+    html_content = re.sub(r'^### (.*?)$', r'<h3>\1</h3>', html_content, flags=re.MULTILINE)
+    # Convert code blocks
+    html_content = re.sub(r'```(.*?)\n(.*?)\n```', r'<pre><code>\2</code></pre>', html_content, flags=re.DOTALL)
+    html_content = re.sub(r'`([^`]+)`', r'<code>\1</code>', html_content)
+    # Convert bold and italic
+    html_content = re.sub(r'\*\*(.*?)\*\*', r'<strong>\1</strong>', html_content)
+    html_content = re.sub(r'\*(.*?)\*', r'<em>\1</em>', html_content)
+    # Convert lists
+    html_content = re.sub(r'^- (.*?)$', r'<li>\1</li>', html_content, flags=re.MULTILINE)
+    html_content = re.sub(r'(<li>.*</li>)', r'<ul>\1</ul>', html_content, flags=re.DOTALL)
+    # Convert line breaks
+    html_content = html_content.replace('\n', '<br>')
+    return html_content
+def _generate_action_interface(action_fields: List[Dict[str, Any]], is_chat_env: bool) -> str:
+    """Generate either a chat interface or action form based on environment type."""
+    if is_chat_env:
+        return _generate_chat_interface()
+    else:
+        return _generate_action_form(action_fields)
+def _generate_chat_interface() -> str:
+    """Generate a chat-style interface for chat environments."""
+    return '''
+                <!-- Chat Interface -->
+                <div class="chat-interface">
+                    <h3>Chat Interface</h3>
+                    <div class="chat-messages" id="chat-messages">
+                        <div class="chat-message system">
+                            <div class="message-role">System</div>
+                            <div class="message-content">Chat environment ready. Send a message to start the conversation.</div>
+                        </div>
+                    </div>
+                    <div class="chat-input-container">
+                        <div class="role-selector">
+                            <label for="message-role">Role:</label>
+                            <select id="message-role">
+                                <option value="user">User</option>
+                                <option value="assistant">Assistant</option>
+                            </select>
+                        </div>
+                        <div class="message-input">
+                            <textarea id="message-input" placeholder="Type your message here..." rows="3"></textarea>
+                            <button class="btn" id="send-message-btn">Send Message</button>
+                        </div>
+                    </div>
+                </div>
+    '''
+def _generate_action_form(action_fields: List[Dict[str, Any]]) -> str:
+    """Generate a traditional action form for non-chat environments."""
+    return f'''
+                <!-- Action Form -->
+                <div class="action-form">
+                    <h3>Take Action</h3>
+                    <form id="action-form">
+                        {_generate_action_form_fields(action_fields)}
+                        <button type="submit" class="btn" id="step-btn">Step</button>
+                    </form>
+                </div>
+    '''
+def _generate_action_form_fields(action_fields: List[Dict[str, Any]]) -> str:
+    """Generate HTML form fields for action input with enhanced metadata."""
+    if not action_fields:
+        return '<p>No action fields available</p>'
+    fields_html = []
+    for field in action_fields:
+        field_html = _generate_single_field(field)
+        fields_html.append(field_html)
+    return '\n'.join(fields_html)
+def _generate_single_field(field: Dict[str, Any]) -> str:
+    """Generate HTML for a single form field with enhanced metadata."""
+    field_name = field['name']
+    field_type = field['type']
+    required = field['required']
+    placeholder = field.get('placeholder', '')
+    help_text = field.get('help_text', '')
+    choices = field.get('choices', [])
+    min_value = field.get('min_value')
+    max_value = field.get('max_value')
+    default_value = field.get('default_value')
+    # Build label with required indicator
+    label_text = field_name.replace('_', ' ').title()
+    if required:
+        label_text += ' <span style="color: red;">*</span>'
+    # Build input attributes
+    input_attrs = []
+    if required:
+        input_attrs.append('required')
+    if placeholder:
+        input_attrs.append(f'placeholder="{placeholder}"')
+    if min_value is not None:
+        input_attrs.append(f'min="{min_value}"')
+    if max_value is not None:
+        input_attrs.append(f'max="{max_value}"')
+    if default_value is not None:
+        input_attrs.append(f'value="{default_value}"')
+    attrs_str = ' '.join(input_attrs)
+    if field_type == 'checkbox':
+        return f'''
+            <div class="form-group">
+                <label>
+                    <input type="checkbox" name="{field_name}" value="true" {attrs_str}>
+                    {label_text}
+                </label>
+                {f'<small class="help-text">{help_text}</small>' if help_text else ''}
+            </div>
+        '''
+    elif field_type == 'select':
+        options_html = []
+        if not required:
+            options_html.append(f'<option value="">-- Select {label_text} --</option>')
+        for choice in choices:
+            selected = 'selected' if str(choice) == str(default_value) else ''
+            options_html.append(f'<option value="{choice}" {selected}>{choice}</option>')
+        return f'''
+            <div class="form-group">
+                <label for="{field_name}">{label_text}:</label>
+                <select name="{field_name}" id="{field_name}" {attrs_str}>
+                    {''.join(options_html)}
+                </select>
+                {f'<small class="help-text">{help_text}</small>' if help_text else ''}
+            </div>
+        '''
+    elif field_type == 'tensor':
+        return f'''
+            <div class="form-group">
+                <label for="{field_name}">{label_text} (comma-separated integers):</label>
+                <input type="text" name="{field_name}" id="{field_name}" {attrs_str}>
+                <small class="help-text">{help_text or 'Enter token IDs as comma-separated integers (e.g., 1,2,3,4,5)'}</small>
+            </div>
+        '''
+    elif field_type == 'text' and ('message' in field_name.lower() or 'code' in field_name.lower()):
+        return f'''
+            <div class="form-group">
+                <label for="{field_name}">{label_text}:</label>
+                <textarea name="{field_name}" id="{field_name}" rows="3" {attrs_str}></textarea>
+                {f'<small class="help-text">{help_text}</small>' if help_text else ''}
+            </div>
+        '''
+    else:
+        return f'''
+            <div class="form-group">
+                <label for="{field_name}">{label_text}:</label>
+                <input type="{field_type}" name="{field_name}" id="{field_name}" {attrs_str}>
+                {f'<small class="help-text">{help_text}</small>' if help_text else ''}
+            </div>
+        '''

src/core/http_env_client.py ADDED Viewed

	@@ -0,0 +1,207 @@

+"""
+core/runner_env.py
+Minimal HTTP-based environment client.
+- Talks to a single env worker exposing: POST /reset, POST /step
+Future hooks (commented below) for:
+- episode_id, seed on reset
+- request_id on step
+- custom headers (auth/trace)
+"""
+from __future__ import annotations
+from abc import ABC, abstractmethod
+from typing import Any, Dict, Generic, Optional, Type, TYPE_CHECKING, TypeVar
+import requests
+from .client_types import StepResult
+from .containers.runtime import LocalDockerProvider
+if TYPE_CHECKING:
+    from .containers.runtime import ContainerProvider
+ActT = TypeVar("ActT")
+ObsT = TypeVar("ObsT")
+EnvClientT = TypeVar("EnvClientT", bound="HTTPEnvClient")
+class HTTPEnvClient(ABC, Generic[ActT, ObsT]):
+    def __init__(
+        self,
+        base_url: str,
+        request_timeout_s: float = 15.0,
+        default_headers: Optional[Dict[str, str]] = None,
+        provider: Optional["ContainerProvider"] = None,
+    ):
+        self._base = base_url.rstrip("/")
+        self._timeout = float(request_timeout_s)
+        self._http = requests.Session()
+        self._headers = default_headers or {}
+        self._provider = provider
+    @classmethod
+    def from_docker_image(
+        cls: Type[EnvClientT],
+        image: str,
+        provider: Optional["ContainerProvider"] = None,
+        **kwargs: Any,
+    ) -> EnvClientT:
+        """
+        Create an environment client by spinning up a Docker container locally.
+        This is a development utility that:
+        1. Starts a Docker container from the specified image
+        2. Waits for the server to be ready
+        3. Creates and returns a client instance connected to the container
+        Note: The container lifecycle management is left to the user or higher-level
+        orchestration. The container will keep running until manually stopped.
+        Args:
+            image: Docker image name to run (e.g., "echo-env:latest")
+            provider: Container provider to use (defaults to LocalDockerProvider)
+            **kwargs: Additional arguments to pass to provider.start_container()
+                     (e.g., env_vars, port)
+        Returns:
+            An instance of the client class connected to the running container
+        Example:
+            >>> from envs.coding_env.client import CodingEnv
+            >>> from envs.coding_env.models import CodeAction
+            >>>
+            >>> # Create environment from image
+            >>> env = CodingEnv.from_docker_image("coding-env:latest")
+            >>>
+            >>> # Create environment with custom env vars
+            >>> env = CodingEnv.from_docker_image(
+            ...     "coding-env:latest",
+            ...     env_vars={"MY_VAR": "value"}
+            ... )
+            >>>
+            >>> # Use the environment
+            >>> result = env.reset()
+            >>> print(result.observation)
+            >>>
+            >>> step_result = env.step(CodeAction(code="print('hello')"))
+            >>> print(step_result.observation.stdout)
+            >>>
+            >>> # Cleanup (optional)
+            >>> env.close()
+        """
+        # Use default provider if none provided
+        if provider is None:
+            provider = LocalDockerProvider()
+        # Extract timeout_s from kwargs for wait_for_ready, with a default
+        timeout_s = kwargs.pop('timeout_s', 30.0)
+        request_timeout_s = kwargs.pop('request_timeout_s', 15.0)
+        # 1. Start container with optional kwargs (e.g., env_vars, port)
+        base_url = provider.start_container(image, **kwargs)
+        # 2. Wait for server to be ready with the specified timeout
+        provider.wait_for_ready(base_url, timeout_s=timeout_s)
+        # 3. Create and return client instance with provider reference and request timeout
+        return cls(base_url=base_url, request_timeout_s=request_timeout_s, provider=provider)
+    @classmethod
+    def from_hub(cls: Type[EnvClientT], repo_id: str, provider: Optional["ContainerProvider"] = None, **kwargs: Any) -> EnvClientT:
+        """
+        Create an environment client by pulling from a Hugging Face model hub.
+        """
+        if provider is None:
+            provider = LocalDockerProvider()
+        if "tag" in kwargs:
+            tag = kwargs["tag"]
+        else:
+            tag = "latest"
+        base_url = f"registry.hf.space/{repo_id.replace('/', '-')}:{tag}"
+        return cls.from_docker_image(image=base_url, provider=provider)
+    @abstractmethod
+    def _step_payload(self, action: ActT) -> dict:
+        """Convert an Action object to the JSON body expected by the env server."""
+        raise NotImplementedError
+    @abstractmethod
+    def _parse_result(self, payload: dict) -> StepResult[ObsT]:
+        """Convert a JSON response from the env server to StepResult[ObsT]."""
+        raise NotImplementedError
+    @abstractmethod
+    def _parse_state(self, payload: dict) -> Any:
+        """Convert a JSON response from the state endpoint to a State object."""
+        raise NotImplementedError
+    # ---------- Environment Server Interface Methods ----------
+    def reset(self) -> StepResult[ObsT]:
+        body: Dict[str, Any] = {}
+        # TODO: later:
+        # body["seed"] = seed
+        # body["episode_id"] = episode_id
+        r = self._http.post(
+            f"{self._base}/reset",
+            json=body,
+            headers=self._headers,
+            timeout=self._timeout,
+        )
+        r.raise_for_status()
+        return self._parse_result(r.json())
+    def step(self, action: ActT) -> StepResult[ObsT]:
+        body: Dict[str, Any] = {
+            "action": self._step_payload(action),
+            "timeout_s": int(self._timeout),
+        }
+        # TODO: later:
+        # body["request_id"] = str(uuid.uuid4())
+        # body["episode_id"] = current_episode_id
+        r = self._http.post(
+            f"{self._base}/step",
+            json=body,
+            headers=self._headers,
+            timeout=self._timeout,
+        )
+        r.raise_for_status()
+        return self._parse_result(r.json())
+    def state(self) -> Any:
+        """
+        Get the current environment state from the server.
+        Returns:
+            State object with environment state information (e.g., episode_id, step_count)
+        Example:
+            >>> client = EchoEnv.from_docker_image("echo-env:latest")
+            >>> result = client.reset()
+            >>> state = client.state()
+            >>> print(state.episode_id)
+            >>> print(state.step_count)
+        """
+        r = self._http.get(
+            f"{self._base}/state",
+            headers=self._headers,
+            timeout=self._timeout,
+        )
+        r.raise_for_status()
+        return self._parse_state(r.json())
+    def close(self) -> None:
+        """
+        Close the environment and clean up resources.
+        If this client was created via from_docker_image(), this will stop
+        and remove the associated container.
+        """
+        if self._provider is not None:
+            self._provider.stop_container()

src/core/pyproject.toml ADDED Viewed

	@@ -0,0 +1,46 @@

+[build-system]
+requires = ["setuptools>=45", "wheel"]
+build-backend = "setuptools.build_meta"
+[project]
+name = "openenv-core"
+version = "0.1.0"
+description = "Core components for OpenEnv - HTTP-based agentic environments"
+readme = "README.md"
+requires-python = ">=3.8"
+license = {text = "BSD-3-Clause"}
+authors = [
+    {name = "Meta Platforms, Inc.", email = "opensource@meta.com"}
+]
+keywords = ["environment", "agent", "http", "docker", "fastapi"]
+dependencies = [
+    "requests>=2.25.0",
+    "fastapi>=0.104.0",
+    "uvicorn>=0.24.0",
+]
+[project.optional-dependencies]
+dev = [
+    "pytest>=7.0.0",
+    "black>=23.0.0",
+    "ruff>=0.1.0",
+    "mypy>=1.0.0",
+]
+[project.urls]
+Homepage = "https://github.com/facebookresearch/OpenEnv"
+Repository = "https://github.com/facebookresearch/OpenEnv"
+Documentation = "https://github.com/facebookresearch/OpenEnv/blob/main/README.md"
+"Bug Tracker" = "https://github.com/facebookresearch/OpenEnv/issues"
+[tool.setuptools]
+py-modules = ["openenv_core.__init__", "openenv_core.http_env_client", "openenv_core.client_types"]
+packages = [
+    "openenv_core",
+    "openenv_core.containers",
+    "openenv_core.containers.runtime",
+    "openenv_core.env_server",
+    "openenv_core.tools"
+]
+package-dir = {"openenv_core" = "."}

src/core/tools/__init__.py ADDED Viewed

	@@ -0,0 +1,19 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Core tools for code execution and other utilities."""
+from .git_server_client import GitServerClient, RepoInfo
+from .local_python_executor import PyExecutor
+from .local_julia_executor import JuliaExecutor
+__all__ = [
+    "PyExecutor",
+    "JuliaExecutor",
+    "GitServerClient",
+    "RepoInfo",
+]

src/core/tools/git_server_client.py ADDED Viewed

	@@ -0,0 +1,362 @@

+#!/usr/bin/env python3
+"""
+Git Server Client for connecting to external Gitea instance.
+This module provides a lightweight client for interacting with a shared
+Gitea service, optimized for task-based isolation where multiple environment
+instances share the same Gitea server but have isolated workspaces.
+"""
+import json
+import os
+import shutil
+import subprocess
+import time
+from dataclasses import dataclass
+from pathlib import Path
+from urllib.parse import urlparse
+@dataclass
+class RepoInfo:
+    """Information about a repository."""
+    name: str
+    url: str
+    commit: str
+    clone_url: str
+class GitServerClient:
+    """
+    Client for connecting to an external Gitea server.
+    This client is optimized for task-based isolation where:
+    - Multiple tasks share the same Gitea instance
+    - Each task has its own isolated workspace
+    - Fast reset() via git operations (no server restart)
+    - Repos are pre-migrated to Gitea once
+    Args:
+        gitea_url: URL of the Gitea server (e.g., "http://gitea:3000")
+        username: Gitea username for authentication
+        password: Gitea password for authentication
+        workspace_dir: Local workspace directory for cloning repos
+    Example:
+        >>> # Connect to shared Gitea (credentials from environment)
+        >>> import os
+        >>> client = GitServerClient(
+        ...     gitea_url=os.getenv("GITEA_URL"),
+        ...     username=os.getenv("GITEA_USERNAME"),
+        ...     password=os.getenv("GITEA_PASSWORD")
+        ... )
+        >>> client.wait_for_ready()
+        >>> # Clone repo to workspace
+        >>> path = client.clone_to_workspace("my-repo", commit="abc123")
+        >>> # Fast reset to base state
+        >>> client.reset_workspace("my-repo", commit="abc123")
+    """
+    def __init__(
+        self,
+        gitea_url: str,
+        username: str,
+        password: str,
+        workspace_dir: str = "/workspace",
+    ):
+        """Initialize Git Server Client."""
+        self.gitea_url = gitea_url.rstrip("/")
+        self.username = username
+        self.password = password
+        self.workspace_dir = Path(workspace_dir)
+        self.is_ready = False
+        # Parse Gitea URL
+        parsed = urlparse(self.gitea_url)
+        self.domain = parsed.hostname or "localhost"
+        self.port = parsed.port or 3000
+        # Ensure workspace exists
+        os.makedirs(self.workspace_dir, exist_ok=True)
+        # Configure git credentials
+        self._configure_git()
+    def _configure_git(self):
+        """Configure git credentials for automatic authentication."""
+        home_dir = Path.home()
+        # Git config
+        git_config = f"""[user]
+    name = {self.username}
+    email = {self.username}@local.env
+[init]
+    defaultBranch = main
+[credential]
+    helper = store
+"""
+        gitconfig_path = home_dir / ".gitconfig"
+        gitconfig_path.write_text(git_config)
+        # Git credentials
+        git_credentials = f"http://{self.username}:{self.password}@{self.domain}:{self.port}\n"
+        gitcreds_path = home_dir / ".git-credentials"
+        gitcreds_path.write_text(git_credentials)
+        gitcreds_path.chmod(0o600)
+    def wait_for_ready(self, timeout: int = 30) -> bool:
+        """
+        Wait for Gitea server to be ready.
+        Args:
+            timeout: Maximum seconds to wait
+        Returns:
+            True if server is ready, False otherwise
+        """
+        start_time = time.time()
+        while time.time() - start_time < timeout:
+            try:
+                result = subprocess.run(
+                    ["curl", "-sf", f"{self.gitea_url}/"],
+                    capture_output=True,
+                    timeout=5,
+                )
+                if result.returncode == 0:
+                    self.is_ready = True
+                    return True
+            except subprocess.TimeoutExpired:
+                pass
+            except Exception:
+                pass
+            time.sleep(1)
+        return False
+    def list_repositories(self) -> list[dict[str, str]]:
+        """
+        List all repositories in Gitea.
+        Returns:
+            List of repository information dictionaries
+        """
+        if not self.is_ready:
+            raise RuntimeError("Gitea server is not ready")
+        result = subprocess.run(
+            [
+                "curl",
+                "-s",
+                f"{self.gitea_url}/api/v1/user/repos",
+                "-u",
+                f"{self.username}:{self.password}",
+            ],
+            capture_output=True,
+            text=True,
+        )
+        if result.returncode != 0:
+            return []
+        try:
+            repos = json.loads(result.stdout)
+            return [
+                {
+                    "name": repo["name"],
+                    "full_name": repo["full_name"],
+                    "clone_url": repo["clone_url"],
+                    "description": repo.get("description", ""),
+                }
+                for repo in repos
+            ]
+        except (json.JSONDecodeError, KeyError):
+            return []
+    def clone_to_workspace(
+        self, repo_name: str, target_dir: str | None = None, commit: str = "main"
+    ) -> str:
+        """
+        Clone a repository to the workspace at a specific commit.
+        This creates a fresh clone optimized for task isolation.
+        Args:
+            repo_name: Name of repository to clone
+            target_dir: Target directory name (defaults to repo_name)
+            commit: Commit hash or branch to check out
+        Returns:
+            Path to cloned repository
+        Raises:
+            RuntimeError: If clone fails
+        """
+        if not self.is_ready:
+            raise RuntimeError("Gitea server is not ready")
+        target_dir = target_dir or repo_name
+        target_path = self.workspace_dir / target_dir
+        # Remove existing directory if present
+        if target_path.exists():
+            shutil.rmtree(target_path)
+        clone_url = f"{self.gitea_url}/{self.username}/{repo_name}.git"
+        # Clone repository
+        result = subprocess.run(
+            ["git", "clone", clone_url, str(target_path)],
+            capture_output=True,
+            text=True,
+        )
+        if result.returncode != 0:
+            raise RuntimeError(f"Clone failed: {result.stderr}")
+        # Checkout specific commit
+        if commit != "main":
+            result = subprocess.run(
+                ["git", "checkout", commit],
+                cwd=str(target_path),
+                capture_output=True,
+                text=True,
+            )
+            if result.returncode != 0:
+                raise RuntimeError(f"Checkout failed: {result.stderr}")
+        return str(target_path)
+    def reset_workspace(self, repo_name: str, commit: str = "main") -> bool:
+        """
+        Fast reset of workspace to base state (optimized for task resets).
+        This is much faster than re-cloning. It:
+        1. Checks out the target commit
+        2. Resets to that commit (hard)
+        3. Cleans untracked files
+        Args:
+            repo_name: Name of repository (directory in workspace)
+            commit: Commit hash or branch to reset to
+        Returns:
+            True if reset successful
+        Raises:
+            RuntimeError: If reset fails
+        """
+        repo_path = self.workspace_dir / repo_name
+        if not repo_path.exists():
+            raise RuntimeError(f"Repository not found in workspace: {repo_name}")
+        # Fetch latest (in case commit is new)
+        subprocess.run(
+            ["git", "fetch", "--all"],
+            cwd=str(repo_path),
+            capture_output=True,
+        )
+        # Checkout and hard reset to commit
+        result = subprocess.run(
+            ["git", "checkout", commit],
+            cwd=str(repo_path),
+            capture_output=True,
+            text=True,
+        )
+        if result.returncode != 0:
+            raise RuntimeError(f"Checkout failed: {result.stderr}")
+        result = subprocess.run(
+            ["git", "reset", "--hard", f"origin/{commit}" if commit != "main" else commit],
+            cwd=str(repo_path),
+            capture_output=True,
+            text=True,
+        )
+        if result.returncode != 0:
+            # Try without origin/ prefix
+            result = subprocess.run(
+                ["git", "reset", "--hard", commit],
+                cwd=str(repo_path),
+                capture_output=True,
+                text=True,
+            )
+            if result.returncode != 0:
+                raise RuntimeError(f"Reset failed: {result.stderr}")
+        # Clean untracked files and directories
+        subprocess.run(
+            ["git", "clean", "-fdx"],
+            cwd=str(repo_path),
+            capture_output=True,
+        )
+        return True
+    def execute_git_command(
+        self, command: str, working_dir: str = ""
+    ) -> tuple[int, str, str]:
+        """
+        Execute a git command in the workspace.
+        Args:
+            command: Git command to execute (without 'git' prefix)
+            working_dir: Working directory relative to workspace
+        Returns:
+            Tuple of (exit_code, stdout, stderr)
+        """
+        work_path = (
+            self.workspace_dir / working_dir if working_dir else self.workspace_dir
+        )
+        if not work_path.exists():
+            return (1, "", f"Working directory does not exist: {work_path}")
+        # Split command safely
+        cmd_parts = ["git"] + command.split()
+        result = subprocess.run(
+            cmd_parts,
+            cwd=str(work_path),
+            capture_output=True,
+            text=True,
+        )
+        return (result.returncode, result.stdout, result.stderr)
+    def get_current_commit(self, repo_name: str) -> str:
+        """
+        Get current commit hash of a workspace repository.
+        Args:
+            repo_name: Name of repository in workspace
+        Returns:
+            Commit hash
+        """
+        repo_path = self.workspace_dir / repo_name
+        if not repo_path.exists():
+            raise RuntimeError(f"Repository not found: {repo_name}")
+        result = subprocess.run(
+            ["git", "rev-parse", "HEAD"],
+            cwd=str(repo_path),
+            capture_output=True,
+            text=True,
+        )
+        if result.returncode != 0:
+            raise RuntimeError(f"Failed to get commit: {result.stderr}")
+        return result.stdout.strip()
+    def workspace_exists(self, repo_name: str) -> bool:
+        """Check if a repository exists in workspace."""
+        return (self.workspace_dir / repo_name).exists()

src/core/tools/julia_process_pool.py ADDED Viewed

	@@ -0,0 +1,509 @@

+# Copyright (c) Yogesh Singla and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Julia Process Pool for high-performance code execution.
+This module provides a pool of persistent Julia processes that can be reused
+for multiple code executions, eliminating the overhead of spawning new processes.
+Expected speedup: 50-100x for repeated executions compared to spawning new processes.
+Features:
+- Persistent Julia processes (no startup overhead)
+- Thread-safe process allocation
+- Automatic recovery from process failures
+- Proper cleanup on shutdown
+- Timeout handling per execution
+Example:
+    >>> pool = JuliaProcessPool(size=4, timeout=30)
+    >>> result = pool.execute("println('Hello, Julia!')")
+    >>> print(result.stdout)  # "Hello, Julia!\n"
+    >>> pool.shutdown()  # Clean up all processes
+"""
+import atexit
+import logging
+import os
+import subprocess
+import threading
+import time
+from collections import deque
+from pathlib import Path
+from typing import Optional
+from core.env_server.types import CodeExecResult
+# Setup logging
+logger = logging.getLogger(__name__)
+class JuliaWorkerProcess:
+    """
+    Single Julia worker process that can execute code repeatedly.
+    This class manages communication with a persistent Julia REPL process
+    using a delimiter-based protocol.
+    """
+    # Communication protocol delimiters
+    START_OUTPUT = "<<<START_OUTPUT>>>"
+    START_ERROR = "<<<START_ERROR>>>"
+    EXIT_CODE_PREFIX = "<<<EXIT_CODE:"
+    END_EXECUTION = "<<<END_EXECUTION>>>"
+    END_CODE = "<<<END_CODE>>>"
+    def __init__(
+        self,
+        worker_id: int,
+        julia_path: str,
+        worker_script: str,
+        optimization_flags: bool = True,
+    ):
+        """
+        Initialize a Julia worker process.
+        Args:
+            worker_id: Unique identifier for this worker
+            julia_path: Path to Julia executable
+            worker_script: Path to julia_repl_worker.jl script
+            optimization_flags: Enable Julia optimization flags
+        """
+        self.worker_id = worker_id
+        self.julia_path = julia_path
+        self.worker_script = worker_script
+        self.optimization_flags = optimization_flags
+        self.process: Optional[subprocess.Popen] = None
+        self.is_busy = False
+        self.is_healthy = True
+        self.lock = threading.Lock()
+        # Start the worker process
+        self._start_process()
+    def _start_process(self) -> None:
+        """Start the Julia worker process."""
+        cmd = [self.julia_path]
+        if self.optimization_flags:
+            cmd.extend(
+                [
+                    "--compile=min",
+                    "--optimize=2",
+                    "--startup-file=no",
+                    "--history-file=no",
+                ]
+            )
+        cmd.append(self.worker_script)
+        try:
+            self.process = subprocess.Popen(
+                cmd,
+                stdin=subprocess.PIPE,
+                stdout=subprocess.PIPE,
+                stderr=subprocess.PIPE,
+                text=True,
+                bufsize=1,  # Line buffered
+            )
+            # Wait for "Julia worker ready" message on stderr
+            ready_msg = self.process.stderr.readline()
+            if "ready" not in ready_msg.lower():
+                raise RuntimeError(
+                    f"Worker {self.worker_id} did not start properly: {ready_msg}"
+                )
+            self.is_healthy = True
+            logger.info(f"Worker {self.worker_id} started (PID: {self.process.pid})")
+        except Exception as e:
+            self.is_healthy = False
+            logger.error(f"Failed to start worker {self.worker_id}: {e}")
+            raise
+    def execute(self, code: str, timeout: int = 60) -> CodeExecResult:
+        """
+        Execute Julia code in this worker process.
+        Args:
+            code: Julia code to execute
+            timeout: Maximum execution time in seconds
+        Returns:
+            CodeExecResult with stdout, stderr, and exit_code
+        """
+        with self.lock:
+            if not self.is_healthy or self.process is None:
+                raise RuntimeError(f"Worker {self.worker_id} is not healthy")
+            self.is_busy = True
+            try:
+                # Send code to worker
+                self.process.stdin.write(code + "\n")
+                self.process.stdin.write(self.END_CODE + "\n")
+                self.process.stdin.flush()
+                # Read response with timeout
+                start_time = time.time()
+                stdout_lines = []
+                stderr_lines = []
+                exit_code = -1
+                current_section = None  # Track which section we're reading
+                while True:
+                    # Check timeout
+                    if time.time() - start_time > timeout:
+                        logger.error(f"Worker {self.worker_id} execution timed out")
+                        self.is_healthy = False
+                        self._kill_process()
+                        return CodeExecResult(
+                            stdout="",
+                            stderr=f"Execution timed out after {timeout} seconds",
+                            exit_code=-1,
+                        )
+                    # Read line with timeout (use select for non-blocking read on Unix)
+                    try:
+                        line = self.process.stdout.readline()
+                        if not line:
+                            # EOF - process died
+                            logger.error(f"Worker {self.worker_id} died unexpectedly")
+                            self.is_healthy = False
+                            return CodeExecResult(
+                                stdout="".join(stdout_lines),
+                                stderr="Worker process died unexpectedly",
+                                exit_code=-1,
+                            )
+                        line = line.rstrip("\n")
+                        # Check for delimiters
+                        if line == self.START_OUTPUT:
+                            current_section = "stdout"
+                            continue
+                        elif line == self.START_ERROR:
+                            current_section = "stderr"
+                            continue
+                        elif line.startswith(self.EXIT_CODE_PREFIX):
+                            # Parse exit code
+                            exit_code_str = line[
+                                len(self.EXIT_CODE_PREFIX) : -3
+                            ]  # Remove prefix and ">>>"
+                            exit_code = int(exit_code_str)
+                            continue
+                        elif line == self.END_EXECUTION:
+                            # Execution complete
+                            break
+                        # Accumulate output
+                        if current_section == "stdout":
+                            stdout_lines.append(line)
+                        elif current_section == "stderr":
+                            stderr_lines.append(line)
+                    except Exception as e:
+                        logger.error(f"Error reading from worker {self.worker_id}: {e}")
+                        self.is_healthy = False
+                        return CodeExecResult(
+                            stdout="".join(stdout_lines),
+                            stderr=f"Error reading from worker: {str(e)}",
+                            exit_code=-1,
+                        )
+                # Reconstruct output (add newlines back)
+                stdout_str = "\n".join(stdout_lines) + ("\n" if stdout_lines else "")
+                stderr_str = "\n".join(stderr_lines) + ("\n" if stderr_lines else "")
+                return CodeExecResult(
+                    stdout=stdout_str,
+                    stderr=stderr_str,
+                    exit_code=exit_code,
+                )
+            finally:
+                self.is_busy = False
+    def _kill_process(self) -> None:
+        """Kill the worker process."""
+        if self.process is not None:
+            try:
+                self.process.terminate()
+                self.process.wait(timeout=2.0)
+            except:
+                try:
+                    self.process.kill()
+                    self.process.wait(timeout=1.0)
+                except:
+                    pass
+    def shutdown(self) -> None:
+        """Shutdown the worker process gracefully."""
+        with self.lock:
+            if self.process is not None:
+                logger.info(f"Shutting down worker {self.worker_id}")
+                self._kill_process()
+                self.process = None
+                self.is_healthy = False
+class JuliaProcessPool:
+    """
+    Pool of persistent Julia processes for high-performance code execution.
+    This class manages multiple Julia worker processes and distributes
+    code execution among them, providing significant speedup by eliminating
+    process startup overhead.
+    Thread-safe for concurrent access from multiple threads.
+    Example:
+        >>> pool = JuliaProcessPool(size=4)
+        >>>
+        >>> # Execute code
+        >>> result = pool.execute("println('Hello')")
+        >>>
+        >>> # Pool automatically manages workers
+        >>> results = [pool.execute(f"println({i})") for i in range(100)]
+        >>>
+        >>> # Cleanup when done
+        >>> pool.shutdown()
+    """
+    def __init__(
+        self,
+        size: int = 4,
+        timeout: int = 60,
+        julia_path: Optional[str] = None,
+        optimization_flags: bool = True,
+        auto_recover: bool = True,
+    ):
+        """
+        Initialize the Julia process pool.
+        Args:
+            size: Number of worker processes to create (default: 4)
+            timeout: Default timeout for code execution in seconds (default: 60)
+            julia_path: Path to Julia executable (auto-detected if None)
+            optimization_flags: Enable Julia optimization flags (default: True)
+            auto_recover: Automatically restart failed workers (default: True)
+        Raises:
+            RuntimeError: If Julia executable is not found
+        """
+        self.size = size
+        self.timeout = timeout
+        self.optimization_flags = optimization_flags
+        self.auto_recover = auto_recover
+        # Find Julia executable
+        if julia_path is None:
+            julia_path = self._find_julia_executable()
+        self.julia_path = julia_path
+        # Find worker script
+        self.worker_script = self._find_worker_script()
+        # Initialize workers
+        self.workers: list[JuliaWorkerProcess] = []
+        self.available_workers: deque[JuliaWorkerProcess] = deque()
+        self.pool_lock = threading.Lock()
+        self.shutdown_flag = False
+        # Create worker processes
+        logger.info(f"Creating Julia process pool with {size} workers")
+        for i in range(size):
+            try:
+                worker = JuliaWorkerProcess(
+                    worker_id=i,
+                    julia_path=self.julia_path,
+                    worker_script=self.worker_script,
+                    optimization_flags=self.optimization_flags,
+                )
+                self.workers.append(worker)
+                self.available_workers.append(worker)
+            except Exception as e:
+                logger.error(f"Failed to create worker {i}: {e}")
+                # Clean up partially created pool
+                self.shutdown()
+                raise RuntimeError(f"Failed to create worker pool: {e}")
+        logger.info(f"Julia process pool initialized with {len(self.workers)} workers")
+        # Register cleanup on exit
+        atexit.register(self.shutdown)
+    def _find_julia_executable(self) -> str:
+        """Find Julia executable in PATH or common locations."""
+        # Try PATH first
+        julia_path = os.popen("which julia").read().strip()
+        if julia_path:
+            return julia_path
+        # Try common locations
+        common_paths = [
+            os.path.expanduser("~/.juliaup/bin/julia"),
+            os.path.expanduser("~/.julia/bin/julia"),
+            "/usr/local/bin/julia",
+            "/usr/bin/julia",
+        ]
+        for path in common_paths:
+            if os.path.isfile(path) and os.access(path, os.X_OK):
+                return path
+        raise RuntimeError(
+            "Julia executable not found. Please install Julia: "
+            "https://julialang.org/downloads/"
+        )
+    def _find_worker_script(self) -> str:
+        """Find the julia_repl_worker.jl script."""
+        # Try relative to this file
+        this_dir = Path(__file__).parent
+        worker_script = this_dir / "julia_repl_worker.jl"
+        if worker_script.exists():
+            return str(worker_script)
+        raise RuntimeError(
+            f"Worker script not found at {worker_script}. "
+            "Please ensure julia_repl_worker.jl is in the same directory."
+        )
+    def _get_available_worker(
+        self, timeout: float = 30.0
+    ) -> Optional[JuliaWorkerProcess]:
+        """
+        Get an available worker from the pool.
+        Args:
+            timeout: Maximum time to wait for a worker (seconds)
+        Returns:
+            Available worker or None if timeout
+        """
+        start_time = time.time()
+        while time.time() - start_time < timeout:
+            with self.pool_lock:
+                # Try to get healthy worker
+                while self.available_workers:
+                    worker = self.available_workers.popleft()
+                    if worker.is_healthy:
+                        return worker
+                    # Worker is unhealthy, try to recover
+                    if self.auto_recover and not self.shutdown_flag:
+                        logger.warning(
+                            f"Worker {worker.worker_id} is unhealthy, attempting recovery"
+                        )
+                        try:
+                            worker.shutdown()
+                            worker = JuliaWorkerProcess(
+                                worker_id=worker.worker_id,
+                                julia_path=self.julia_path,
+                                worker_script=self.worker_script,
+                                optimization_flags=self.optimization_flags,
+                            )
+                            # Update in workers list
+                            self.workers[worker.worker_id] = worker
+                            return worker
+                        except Exception as e:
+                            logger.error(
+                                f"Failed to recover worker {worker.worker_id}: {e}"
+                            )
+            # No workers available, wait a bit
+            time.sleep(0.1)
+        logger.error("Timeout waiting for available worker")
+        return None
+    def _return_worker(self, worker: JuliaWorkerProcess) -> None:
+        """Return a worker to the available pool."""
+        with self.pool_lock:
+            if worker.is_healthy and not self.shutdown_flag:
+                self.available_workers.append(worker)
+    def execute(self, code: str, timeout: Optional[int] = None) -> CodeExecResult:
+        """
+        Execute Julia code using an available worker from the pool.
+        Args:
+            code: Julia code to execute
+            timeout: Execution timeout in seconds (uses pool default if None)
+        Returns:
+            CodeExecResult with stdout, stderr, and exit_code
+        """
+        if self.shutdown_flag:
+            return CodeExecResult(
+                stdout="",
+                stderr="Process pool has been shut down",
+                exit_code=-1,
+            )
+        if timeout is None:
+            timeout = self.timeout
+        # Get available worker
+        worker = self._get_available_worker()
+        if worker is None:
+            return CodeExecResult(
+                stdout="",
+                stderr="No available worker (timeout waiting for worker)",
+                exit_code=-1,
+            )
+        try:
+            # Execute code in worker
+            result = worker.execute(code, timeout=timeout)
+            return result
+        finally:
+            # Return worker to pool
+            self._return_worker(worker)
+    def shutdown(self) -> None:
+        """
+        Shutdown all worker processes gracefully.
+        This method is automatically called on exit via atexit.
+        """
+        if self.shutdown_flag:
+            return
+        logger.info("Shutting down Julia process pool")
+        self.shutdown_flag = True
+        with self.pool_lock:
+            for worker in self.workers:
+                worker.shutdown()
+            self.workers.clear()
+            self.available_workers.clear()
+        logger.info("Julia process pool shutdown complete")
+    def __enter__(self):
+        """Context manager entry."""
+        return self
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        """Context manager exit."""
+        self.shutdown()
+    def __del__(self):
+        """Ensure cleanup on garbage collection."""
+        self.shutdown()

src/core/tools/julia_repl_worker.jl ADDED Viewed

	@@ -0,0 +1,159 @@

+#!/usr/bin/env julia
+"""
+Julia REPL Worker for Process Pool
+This script runs as a persistent Julia process that accepts code via stdin,
+executes it, and returns results via stdout with delimiters.
+Protocol:
+- Input: Code block followed by "<<<END_CODE>>>"
+- Output: Results with status markers:
+  - "<<<START_OUTPUT>>>" - stdout begins
+  - "<<<START_ERROR>>>" - stderr begins
+  - "<<<EXIT_CODE:N>>>" - exit code (0 = success, 1 = error)
+  - "<<<END_EXECUTION>>>" - execution complete
+"""
+# Delimiters for communication protocol
+const START_OUTPUT = "<<<START_OUTPUT>>>"
+const START_ERROR = "<<<START_ERROR>>>"
+const EXIT_CODE_PREFIX = "<<<EXIT_CODE:"
+const END_EXECUTION = "<<<END_EXECUTION>>>"
+const END_CODE = "<<<END_CODE>>>"
+"""
+Execute code and capture output using pipes
+"""
+function execute_code(code::String)
+    # Initialize return values
+    stdout_str = ""
+    stderr_str = ""
+    exit_code = 0
+    # Create pipes for output capture
+    out_pipe = Pipe()
+    err_pipe = Pipe()
+    try
+        # Execute with output redirected to pipes
+        redirect_stdout(out_pipe) do
+            redirect_stderr(err_pipe) do
+                try
+                    # Execute the code using include_string which properly handles
+                    # multiple statements including 'using' statements
+                    include_string(Main, code)
+                catch e
+                    # Execution error - write to stderr
+                    exit_code = 1
+                    showerror(stderr, e, catch_backtrace())
+                    println(stderr)
+                end
+            end
+        end
+        # Close write ends to signal EOF to readers
+        Base.close(out_pipe.in)
+        Base.close(err_pipe.in)
+        # Read captured output
+        stdout_str = read(out_pipe.out, String)
+        stderr_str = read(err_pipe.out, String)
+        # Close read ends
+        Base.close(out_pipe.out)
+        Base.close(err_pipe.out)
+    catch e
+        # Worker error
+        exit_code = 1
+        # Try to close pipes
+        try
+            Base.close(out_pipe)
+            Base.close(err_pipe)
+        catch
+        end
+        stderr_str = "Worker error: " * sprint(showerror, e)
+    end
+    return (stdout_str, stderr_str, exit_code)
+end
+"""
+Main loop: read code, execute, return results
+"""
+function main()
+    # Signal that worker is ready
+    println(stderr, "Julia worker ready")
+    flush(stderr)
+    while true
+        try
+            # Read code until END_CODE delimiter
+            code_lines = String[]
+            while true
+                if eof(stdin)
+                    println(stderr, "Worker received EOF, shutting down")
+                    return
+                end
+                line = readline(stdin)
+                # Check for end of code
+                if line == END_CODE
+                    break
+                end
+                push!(code_lines, line)
+            end
+            # If no code received, continue
+            if isempty(code_lines)
+                # Send empty response
+                println(START_OUTPUT)
+                println(START_ERROR)
+                println(EXIT_CODE_PREFIX, 0, ">>>")
+                println(END_EXECUTION)
+                flush(stdout)
+                continue
+            end
+            code = join(code_lines, "\n")
+            # Execute code and capture output
+            (stdout_str, stderr_str, exit_code) = execute_code(code)
+            # Send results with delimiters
+            println(START_OUTPUT)
+            print(stdout_str)
+            flush(stdout)
+            println(START_ERROR)
+            print(stderr_str)
+            flush(stdout)
+            println(EXIT_CODE_PREFIX, exit_code, ">>>")
+            println(END_EXECUTION)
+            flush(stdout)
+        catch e
+            # Worker error - report and continue
+            println(stderr, "Worker error: ", e)
+            flush(stderr)
+            # Send error response
+            println(START_OUTPUT)
+            println(START_ERROR)
+            println("Worker internal error: ", e)
+            println(EXIT_CODE_PREFIX, 1, ">>>")
+            println(END_EXECUTION)
+            flush(stdout)
+        end
+    end
+end
+# Run main loop
+main()

src/core/tools/local_julia_executor.py ADDED Viewed

	@@ -0,0 +1,474 @@

+# Copyright (c) Yogesh Singla and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Local Julia Executor.
+This module provides functionality for executing Julia code locally using
+subprocess, similar to PyExecutor.
+Features:
+- Proper process cleanup on timeout (no zombie processes)
+- Robust error handling and logging
+- Process group management for complete cleanup
+- Automatic retry on transient failures
+- Optional process pool for 50-100x speedup on repeated executions
+Performance Modes:
+- Standard mode: Spawn new process for each execution (default for single executions)
+- Pool mode: Reuse persistent Julia processes (recommended for repeated executions)
+"""
+import logging
+import os
+import shutil
+import signal
+import subprocess
+import tempfile
+import threading
+import time
+from pathlib import Path
+from typing import Optional
+from core.env_server.types import CodeExecResult
+# Try to import process pool (optional dependency)
+try:
+    from core.tools.julia_process_pool import JuliaProcessPool
+    POOL_AVAILABLE = True
+except ImportError:
+    POOL_AVAILABLE = False
+    JuliaProcessPool = None
+# Setup logging
+logger = logging.getLogger(__name__)
+class JuliaExecutor:
+    """
+    Executor for running Julia code in a subprocess with robust process management.
+    This class provides a safe interface to execute Julia code in isolation
+    and capture the results including stdout, stderr, and exit code.
+    Features:
+    - Proper timeout handling without zombie processes
+    - Process group cleanup for nested processes
+    - Automatic retry on transient failures
+    - Comprehensive logging for debugging
+    - Optional process pool for 50-100x speedup on repeated executions
+    Example:
+        >>> executor = JuliaExecutor()
+        >>> result = executor.run('println("Hello, Julia!")')
+        >>> print(result.stdout)  # "Hello, Julia!\n"
+        >>> print(result.exit_code)  # 0
+        >>>
+        >>> # With tests
+        >>> code = '''
+        ... function add(a, b)
+        ...     return a + b
+        ... end
+        ...
+        ... using Test
+        ... @test add(2, 3) == 5
+        ... '''
+        >>> result = executor.run(code)
+        >>> print(result.exit_code)  # 0
+        >>>
+        >>> # With process pool (recommended for repeated executions)
+        >>> executor.enable_process_pool(size=4)
+        >>> for i in range(100):
+        ...     result = executor.run(f'println({i})')  # 50-100x faster!
+        >>> executor.shutdown_pool()  # Clean up when done
+    """
+    # Class-level process pool (shared across all instances if enabled)
+    _shared_pool: Optional["JuliaProcessPool"] = None
+    _pool_lock = threading.Lock()
+    def __init__(
+        self,
+        timeout: int = 60,
+        max_retries: int = 1,
+        use_optimization_flags: bool = True,
+        use_process_pool: bool = False,
+        pool_size: int = 4,
+    ):
+        """
+        Initialize the JuliaExecutor.
+        Args:
+            timeout: Maximum execution time in seconds (default: 60)
+            max_retries: Number of retry attempts on transient failures (default: 1)
+            use_optimization_flags: Enable Julia performance flags (default: True)
+            use_process_pool: Enable process pool for better performance (default: False)
+            pool_size: Number of workers in pool if enabled (default: 4)
+        Raises:
+            RuntimeError: If Julia executable is not found in PATH
+        """
+        self.timeout = timeout
+        self.max_retries = max_retries
+        self.use_optimization_flags = use_optimization_flags
+        self.use_process_pool = use_process_pool
+        self.pool_size = pool_size
+        # Find Julia executable in PATH
+        self.julia_path = shutil.which("julia")
+        if not self.julia_path:
+            # Try common installation paths
+            common_paths = [
+                os.path.expanduser("~/.juliaup/bin/julia"),
+                os.path.expanduser("~/.julia/bin/julia"),
+                "/usr/local/bin/julia",
+                "/usr/bin/julia",
+            ]
+            for path in common_paths:
+                if os.path.isfile(path) and os.access(path, os.X_OK):
+                    self.julia_path = path
+                    break
+        if not self.julia_path:
+            raise RuntimeError(
+                "Julia executable not found in PATH or common locations. "
+                "Please install Julia: https://julialang.org/downloads/ "
+                "or ensure it's in your PATH environment variable."
+            )
+        # Build optimized Julia command with performance flags
+        self.base_cmd = [self.julia_path]
+        if self.use_optimization_flags:
+            # Performance optimization flags:
+            # --compile=min: Reduce compilation overhead (faster startup)
+            # --optimize=2: Medium optimization level (good balance)
+            # --startup-file=no: Don't load ~/.julia/config/startup.jl
+            # --history-file=no: Don't save REPL history
+            self.base_cmd.extend(
+                [
+                    "--compile=min",  # Minimize compilation for faster startup
+                    "--optimize=2",  # Good optimization level
+                    "--startup-file=no",  # Skip startup file
+                    "--history-file=no",  # Skip history
+                ]
+            )
+            logger.info("Julia optimization flags enabled for faster execution")
+        logger.info(f"JuliaExecutor initialized with Julia at: {self.julia_path}")
+        logger.info(f"Command: {' '.join(self.base_cmd)}")
+        logger.info(f"Timeout: {self.timeout}s, Max retries: {self.max_retries}")
+        # Initialize process pool if requested
+        if self.use_process_pool:
+            self.enable_process_pool(size=self.pool_size)
+    def _kill_process_tree(
+        self, proc: subprocess.Popen, script_file: Optional[str] = None
+    ) -> None:
+        """
+        Terminate a process and all its children.
+        Args:
+            proc: The subprocess.Popen instance to terminate
+            script_file: Optional script file path to kill if process is stuck
+        """
+        if proc.poll() is None:  # Process is still running
+            try:
+                # Try graceful termination first
+                logger.warning(f"Terminating process {proc.pid} gracefully...")
+                proc.terminate()
+                # Wait up to 2 seconds for graceful termination
+                try:
+                    proc.wait(timeout=2.0)
+                    logger.info(f"Process {proc.pid} terminated gracefully")
+                    return
+                except subprocess.TimeoutExpired:
+                    logger.warning(
+                        f"Process {proc.pid} did not terminate, forcing kill..."
+                    )
+                # Force kill if still running
+                proc.kill()
+                proc.wait(timeout=2.0)
+                logger.info(f"Process {proc.pid} killed forcefully")
+            except Exception as e:
+                logger.error(f"Error killing process {proc.pid}: {e}")
+                # Last resort: try killing via process group
+                try:
+                    if hasattr(os, "killpg"):
+                        os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
+                        logger.info(f"Killed process group for {proc.pid}")
+                except Exception as pg_error:
+                    logger.error(f"Failed to kill process group: {pg_error}")
+    def run(self, code: str) -> CodeExecResult:
+        """
+        Execute Julia code and return the result with robust error handling.
+        This method provides:
+        - Automatic retry on transient failures
+        - Proper timeout handling without zombie processes
+        - Process group cleanup for nested processes
+        - Comprehensive error logging
+        - Optional process pool for 50-100x speedup
+        Args:
+            code: Julia code string to execute
+        Returns:
+            CodeExecResult containing stdout, stderr, and exit_code
+        Example:
+            >>> executor = JuliaExecutor()
+            >>> result = executor.run("x = 5 + 3\\nprintln(x)")
+            >>> print(result.stdout)  # "8\n"
+            >>> print(result.exit_code)  # 0
+            >>>
+            >>> # Error handling
+            >>> result = executor.run("1 / 0")
+            >>> print(result.exit_code)  # 1
+            >>> print(result.stderr)  # Contains error message
+        """
+        # Use process pool if enabled and available
+        if self.use_process_pool and JuliaExecutor._shared_pool is not None:
+            try:
+                return JuliaExecutor._shared_pool.execute(code, timeout=self.timeout)
+            except Exception as e:
+                logger.warning(
+                    f"Process pool execution failed: {e}, falling back to subprocess"
+                )
+                # Fall through to standard execution
+        code_file = None
+        for attempt in range(self.max_retries + 1):
+            proc = None
+            try:
+                # Create temporary file for Julia code
+                with tempfile.NamedTemporaryFile(
+                    mode="w", suffix=".jl", delete=False, encoding="utf-8"
+                ) as f:
+                    f.write(code)
+                    code_file = f.name
+                script_name = Path(code_file).name
+                logger.debug(
+                    f"[Attempt {attempt + 1}/{self.max_retries + 1}] Executing Julia script: {script_name}"
+                )
+                # Start process with Popen for better control
+                # Use process group to ensure we can kill all child processes
+                start_time = time.time()
+                # On Unix systems, use process groups for better cleanup
+                kwargs = {
+                    "stdout": subprocess.PIPE,
+                    "stderr": subprocess.PIPE,
+                    "text": True,
+                }
+                # Create new process group on Unix systems
+                if hasattr(os, "setpgrp"):
+                    kwargs["preexec_fn"] = os.setpgrp
+                proc = subprocess.Popen(self.base_cmd + [code_file], **kwargs)
+                logger.debug(
+                    f"Started Julia process {proc.pid} for script {script_name}"
+                )
+                # Wait for process with timeout
+                try:
+                    stdout, stderr = proc.communicate(timeout=self.timeout)
+                    exit_code = proc.returncode
+                    elapsed = time.time() - start_time
+                    logger.debug(
+                        f"Julia execution completed in {elapsed:.2f}s (exit code: {exit_code})"
+                    )
+                    # Clean up temp file
+                    try:
+                        Path(code_file).unlink()
+                    except Exception as cleanup_error:
+                        logger.debug(
+                            f"Could not delete temp file {code_file}: {cleanup_error}"
+                        )
+                    return CodeExecResult(
+                        stdout=stdout,
+                        stderr=stderr,
+                        exit_code=exit_code,
+                    )
+                except subprocess.TimeoutExpired:
+                    logger.error(
+                        f"Julia execution timed out after {self.timeout}s (attempt {attempt + 1}/{self.max_retries + 1})"
+                    )
+                    # CRITICAL: Kill the process AND all its children to prevent zombies
+                    self._kill_process_tree(proc, code_file)
+                    # If this was our last retry, return timeout error
+                    if attempt >= self.max_retries:
+                        logger.error(
+                            f"Julia execution failed permanently after {self.max_retries + 1} timeout attempts"
+                        )
+                        return CodeExecResult(
+                            stdout="",
+                            stderr=f"Execution timed out after {self.timeout} seconds (tried {self.max_retries + 1} times)",
+                            exit_code=-1,
+                        )
+                    # Wait before retry
+                    logger.info(f"Waiting 1s before retry...")
+                    time.sleep(1.0)
+                    continue
+            except FileNotFoundError:
+                logger.error(f"Julia executable not found at {self.julia_path}")
+                return CodeExecResult(
+                    stdout="",
+                    stderr=f"Julia executable not found: {self.julia_path}",
+                    exit_code=-1,
+                )
+            except Exception as e:
+                logger.error(
+                    f"Error executing Julia code (attempt {attempt + 1}/{self.max_retries + 1}): {e}"
+                )
+                # Try to kill process if it exists
+                if proc is not None and proc.poll() is None:
+                    self._kill_process_tree(proc, code_file)
+                # If this was our last retry, return error
+                if attempt >= self.max_retries:
+                    logger.error(
+                        f"Julia execution failed permanently after {self.max_retries + 1} attempts"
+                    )
+                    return CodeExecResult(
+                        stdout="",
+                        stderr=f"Error executing Julia code: {str(e)}",
+                        exit_code=-1,
+                    )
+                # Wait before retry
+                logger.info(f"Waiting 1s before retry...")
+                time.sleep(1.0)
+                continue
+            finally:
+                # Always ensure temp file is cleaned up
+                if code_file and Path(code_file).exists():
+                    try:
+                        Path(code_file).unlink()
+                        logger.debug(f"Cleaned up temp file: {code_file}")
+                    except Exception as cleanup_error:
+                        logger.debug(
+                            f"Could not delete temp file {code_file}: {cleanup_error}"
+                        )
+        # Should never reach here, but just in case
+        return CodeExecResult(
+            stdout="",
+            stderr="Unexpected error: all retries exhausted",
+            exit_code=-1,
+        )
+    @classmethod
+    def enable_process_pool(cls, size: int = 4, timeout: int = 60) -> bool:
+        """
+        Enable the shared Julia process pool for all JuliaExecutor instances.
+        This provides 50-100x speedup for repeated code executions by reusing
+        persistent Julia processes instead of spawning new ones.
+        Args:
+            size: Number of worker processes to create (default: 4)
+            timeout: Default timeout for code execution in seconds (default: 60)
+        Returns:
+            True if pool was created successfully, False otherwise
+        Example:
+            >>> JuliaExecutor.enable_process_pool(size=8)
+            >>> executor = JuliaExecutor(use_process_pool=True)
+            >>> # All executors with use_process_pool=True will use the shared pool
+        """
+        if not POOL_AVAILABLE:
+            logger.warning(
+                "Process pool not available (julia_process_pool module not found)"
+            )
+            return False
+        with cls._pool_lock:
+            if cls._shared_pool is not None:
+                logger.warning("Process pool already enabled")
+                return True
+            try:
+                logger.info(f"Enabling Julia process pool with {size} workers")
+                cls._shared_pool = JuliaProcessPool(size=size, timeout=timeout)
+                logger.info("Julia process pool enabled successfully")
+                return True
+            except Exception as e:
+                logger.error(f"Failed to enable process pool: {e}")
+                return False
+    @classmethod
+    def shutdown_pool(cls) -> None:
+        """
+        Shutdown the shared Julia process pool.
+        This should be called when you're done with all Julia executions
+        to properly clean up worker processes.
+        Example:
+            >>> JuliaExecutor.enable_process_pool()
+            >>> # ... do work ...
+            >>> JuliaExecutor.shutdown_pool()  # Clean up
+        """
+        with cls._pool_lock:
+            if cls._shared_pool is not None:
+                logger.info("Shutting down Julia process pool")
+                cls._shared_pool.shutdown()
+                cls._shared_pool = None
+                logger.info("Julia process pool shutdown complete")
+    @classmethod
+    def is_pool_enabled(cls) -> bool:
+        """
+        Check if the process pool is currently enabled.
+        Returns:
+            True if pool is enabled, False otherwise
+        """
+        with cls._pool_lock:
+            return cls._shared_pool is not None
+    def __enter__(self):
+        """Context manager entry."""
+        return self
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        """Context manager exit."""
+        # Don't shutdown the shared pool when exiting a single executor
+        pass
+    def __del__(self):
+        """Cleanup on garbage collection."""
+        # Don't shutdown the shared pool when a single executor is deleted
+        pass

src/core/tools/local_python_executor.py ADDED Viewed

	@@ -0,0 +1,105 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Local Python Executor.
+This module provides functionality for executing Python code locally by wrapping
+the smolagents LocalPythonExecutor.
+"""
+from smolagents import LocalPythonExecutor
+from core.env_server.types import CodeExecResult
+class PyExecutor:
+    """
+    Wrapper around smolagents LocalPythonExecutor for executing Python code.
+    This class provides a simple interface to execute Python code in a subprocess
+    and capture the results including stdout, stderr, and exit code.
+    Args:
+        additional_imports: List of additional module imports to authorize.
+                          For example: ["numpy", "pandas", "matplotlib"]
+                          These will be added to the base authorized imports.
+    Example:
+        >>> # Basic usage with default imports
+        >>> executor = PyExecutor()
+        >>> result = executor.run("print('Hello, World!')")
+        >>> print(result.stdout)  # "Hello, World!\n"
+        >>> print(result.exit_code)  # 0
+        >>>
+        >>> # Usage with additional imports
+        >>> executor = PyExecutor(additional_imports=["numpy", "pandas"])
+        >>> result = executor.run("import numpy as np\\nprint(np.array([1, 2, 3]))")
+        >>> print(result.stdout)  # "[1 2 3]\n"
+    """
+    def __init__(self, additional_imports: list[str] | None = None):
+        """
+        Initialize the PyExecutor with a LocalPythonExecutor instance.
+        Args:
+            additional_imports: List of additional module names to authorize for import.
+                              Defaults to an empty list if not provided.
+        """
+        if additional_imports is None:
+            additional_imports = []
+        self._executor = LocalPythonExecutor(
+            additional_authorized_imports=additional_imports
+        )
+        # Initialize tools to make BASE_PYTHON_TOOLS available (including print)
+        self._executor.send_tools({})
+    def run(self, code: str) -> CodeExecResult:
+        """
+        Execute Python code and return the result.
+        Args:
+            code: Python code string to execute
+        Returns:
+            CodeExecResult containing stdout, stderr, and exit_code
+        Example:
+            >>> executor = PyExecutor()
+            >>> result = executor.run("x = 5 + 3\\nprint(x)")
+            >>> print(result.stdout)  # "8\n"
+            >>> print(result.exit_code)  # 0
+            >>>
+            >>> # Error handling
+            >>> result = executor.run("1 / 0")
+            >>> print(result.exit_code)  # 1
+            >>> print(result.stderr)  # Contains error message
+        """
+        try:
+            # Execute the code using LocalPythonExecutor
+            # LocalPythonExecutor returns a CodeOutput object with output, logs, is_final_answer
+            exec_result = self._executor(code)
+            # Extract the logs (which contain print outputs) as stdout
+            # The output field contains the return value of the code
+            stdout = exec_result.logs
+            stderr = ""
+            exit_code = 0  # Success
+            return CodeExecResult(
+                stdout=stdout,
+                stderr=stderr,
+                exit_code=exit_code,
+            )
+        except Exception as e:
+            # LocalPythonExecutor raises InterpreterError for various issues
+            # (syntax errors, forbidden operations, runtime errors, etc.)
+            return CodeExecResult(
+                stdout="",
+                stderr=str(e),
+                exit_code=1,  # Non-zero indicates error
+            )

src/envs/julia_env/__init__.py ADDED Viewed

	@@ -0,0 +1,13 @@

+# Copyright (c) Yogesh Singla and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Julia Environment - Code execution environment for RL training."""
+from .julia_env_client import JuliaEnv
+from .models import JuliaAction, JuliaObservation, JuliaState
+__all__ = ["JuliaAction", "JuliaObservation", "JuliaState", "JuliaEnv"]

src/envs/julia_env/julia_env_client.py ADDED Viewed

	@@ -0,0 +1,117 @@

+# Copyright (c) Yogesh Singla and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Julia Environment HTTP Client.
+This module provides the client for connecting to a Julia Environment server
+over HTTP.
+"""
+from typing import Dict
+from core.client_types import StepResult
+from core.http_env_client import HTTPEnvClient
+from .models import JuliaAction, JuliaObservation, JuliaState
+class JuliaEnv(HTTPEnvClient[JuliaAction, JuliaObservation]):
+    """
+    HTTP client for the Julia Environment.
+    This client connects to a JuliaEnvironment HTTP server and provides
+    methods to interact with it: reset(), step(), and state access.
+    Example:
+        >>> # Connect to a running server
+        >>> client = JuliaEnv(base_url="http://localhost:8000")
+        >>> result = client.reset()
+        >>> print(result.observation.stdout)
+        >>>
+        >>> # Execute Julia code
+        >>> action = JuliaAction(code='''
+        ... function multiply(a, b)
+        ...     return a * b
+        ... end
+        ...
+        ... using Test
+        ... @test multiply(3, 4) == 12
+        ... ''')
+        >>> result = client.step(action)
+        >>> print(result.observation.tests_passed)  # 1
+        >>> print(result.reward)
+    Example with Docker:
+        >>> # Automatically start container and connect
+        >>> client = JuliaEnv.from_docker_image("julia-env:latest")
+        >>> result = client.reset()
+        >>> result = client.step(JuliaAction(code="println(2 + 2)"))
+        >>> print(result.observation.stdout)  # "4\n"
+        >>> client.close()
+    """
+    def _step_payload(self, action: JuliaAction) -> Dict:
+        """
+        Convert JuliaAction to JSON payload for step request.
+        Args:
+            action: JuliaAction instance
+        Returns:
+            Dictionary representation suitable for JSON encoding
+        """
+        return {
+            "core_code": action.core_code,
+            "test_code": action.test_code
+        }
+    def _parse_result(self, payload: Dict) -> StepResult[JuliaObservation]:
+        """
+        Parse server response into StepResult[JuliaObservation].
+        Args:
+            payload: JSON response from server
+        Returns:
+            StepResult with JuliaObservation
+        """
+        obs_data = payload.get("observation", {})
+        observation = JuliaObservation(
+            stdout=obs_data.get("stdout", ""),
+            stderr=obs_data.get("stderr", ""),
+            exit_code=obs_data.get("exit_code", 0),
+            tests_passed=obs_data.get("tests_passed", 0),
+            tests_failed=obs_data.get("tests_failed", 0),
+            code_compiles=obs_data.get("code_compiles", True),
+            metadata=obs_data.get("metadata", {}),
+        )
+        return StepResult[JuliaObservation](
+            observation=observation,
+            reward=payload.get("reward"),
+            done=payload.get("done", False),
+        )
+    def _parse_state(self, payload: Dict) -> JuliaState:
+        """
+        Parse server response into JuliaState object.
+        Args:
+            payload: JSON response from /state endpoint
+        Returns:
+            JuliaState object with episode metadata
+        """
+        return JuliaState(
+            episode_id=payload.get("episode_id"),
+            step_count=payload.get("step_count", 0),
+            last_exit_code=payload.get("last_exit_code", 0),
+            last_code_compiles=payload.get("last_code_compiles", True),
+            total_tests_passed=payload.get("total_tests_passed", 0),
+            total_tests_failed=payload.get("total_tests_failed", 0),
+        )

src/envs/julia_env/models.py ADDED Viewed

	@@ -0,0 +1,70 @@

+# Copyright (c) Yogesh Singla and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+Data models for the Julia Environment.
+The Julia environment executes Julia code and provides feedback through
+compilation and unit test results.
+"""
+from dataclasses import dataclass, field
+from typing import Optional
+from core.env_server.types import Action, Observation, State
+@dataclass(kw_only=True)
+class JuliaAction(Action):
+    """
+    Action for the Julia environment - code to execute.
+    Attributes:
+        core_code: Core Julia code to execute
+        test_code: Test code to execute
+    """
+    core_code: str
+    test_code: str
+@dataclass(kw_only=True)
+class JuliaObservation(Observation):
+    """
+    Observation from the Julia environment - execution results.
+    Attributes:
+        stdout: Standard output from Julia execution
+        stderr: Standard error from Julia execution
+        exit_code: Exit code (0 = success, non-zero = error)
+        execution_time: Time taken to execute in seconds
+        tests_passed: Number of tests passed (if tests were run)
+        tests_failed: Number of tests failed (if tests were run)
+        code_compiles: Whether the core code compiled/executed successfully
+    """
+    stdout: str = ""
+    stderr: str = ""
+    exit_code: int = 0
+    tests_passed: int = 0
+    tests_failed: int = 0
+    code_compiles: bool = True
+@dataclass
+class JuliaState(State):
+    """
+    State for Julia environment.
+    Attributes:
+        episode_id: Unique episode identifier
+        step_count: Number of steps taken in episode
+        last_exit_code: Exit code from last execution
+        total_tests_passed: Cumulative tests passed in episode
+        total_tests_failed: Cumulative tests failed in episode
+    """
+    last_exit_code: int = 0
+    last_code_compiles: bool = True
+    total_tests_passed: int = 0
+    total_tests_failed: int = 0

src/envs/julia_env/server/Dockerfile ADDED Viewed

	@@ -0,0 +1,54 @@

+# Copyright (c) Yogesh Singla, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+# Use the standard openenv base image
+# Built from: docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+# In GitHub Actions, this is overridden to use the GHCR base image
+# Use the standard openenv base image
+ARG BASE_IMAGE=openenv-base:latest
+FROM ${BASE_IMAGE}
+# Install Julia using juliaup (official installer - more reliable in Docker)
+RUN apt-get update && apt-get install -y \
+    curl \
+    ca-certificates \
+    && rm -rf /var/lib/apt/lists/*
+# Install juliaup and Julia
+RUN curl -fsSL https://install.julialang.org | sh -s -- --yes --default-channel 1.10
+# Add Julia to PATH
+ENV PATH="/root/.juliaup/bin:${PATH}"
+# Verify Julia installation
+RUN julia --version
+# Precompile commonly used Julia packages (Test is built-in, but precompile it)
+RUN julia -e 'using Test; println("Julia Test module ready")'
+# Install smolagents for Python code execution utilities
+RUN pip install --no-cache-dir smolagents
+# Environment variable to enable Julia process pool (optional - can be set at runtime)
+# Set to "1" to enable process pool, "0" to use standard execution
+ENV JULIA_USE_PROCESS_POOL=1
+ENV JULIA_POOL_SIZE=32
+# Copy only what's needed for the Julia environment
+COPY src/core/ /app/src/core/
+COPY src/envs/julia_env/ /app/src/envs/julia_env/
+# Environment variables for port and workers with defaults
+ENV PORT=8000
+ENV NUM_WORKER=4
+# Health check
+HEALTHCHECK --interval=30s --timeout=5s --start-period=30s --retries=3 \
+    CMD curl -f http://localhost:${PORT}/health || exit 1
+# Run the FastAPI server
+CMD uvicorn envs.julia_env.server.app:app --host 0.0.0.0 --port ${PORT} --workers ${NUM_WORKER}

src/envs/julia_env/server/README.md ADDED Viewed

	@@ -0,0 +1,436 @@

+# Julia Environment Server
+HTTP server for executing Julia code with test result tracking and reward calculation.
+## Overview
+This server provides a Julia code execution environment through OpenEnv's HTTP interface. It executes Julia code, parses test results from the `Test` module, and calculates rewards based on execution success and test outcomes.
+## Features
+- ✅ Execute Julia code in isolated subprocess
+- ✅ Parse `Test` module output (tests passed/failed)
+- ✅ Calculate rewards based on execution results
+- ✅ Safety transforms for output truncation
+- ✅ Docker support for reproducible execution
+- ✅ Compatible with GRPO training
+## Docker Setup
+### Prerequisites
+First, build the OpenEnv base image (one-time setup):
+```bash
+# From OpenEnv root directory
+docker build -t openenv-base:latest -f src/core/containers/images/Dockerfile .
+```
+### Build Julia Environment Image
+```bash
+# From OpenEnv root directory
+docker build -t julia-env:latest -f src/envs/julia_env/server/Dockerfile .
+```
+### Run the Server
+```bash
+# Run in background with default settings (port 8000, 4 workers)
+docker run -d -p 8000:8000 --name julia-env-server julia-env:latest
+# OR run in foreground (to see logs)
+docker run -p 8000:8000 --name julia-env-server julia-env:latest
+# Run with custom port
+docker run -d -p 9000:9000 -e PORT=9000 --name julia-env-server julia-env:latest
+# Run with custom number of workers (uvicorn workers)
+docker run -d -p 8000:8000 -e NUM_WORKER=8 --name julia-env-server julia-env:latest
+# Run with custom Julia max workers (for process pool)
+docker run -d -p 8000:8000 -e JULIA_MAX_WORKERS=32 --name julia-env-server julia-env:latest
+# Run with all custom configurations
+docker run -d -p 9000:9000 \
+  -e PORT=9000 \
+  -e NUM_WORKER=8 \
+  -e JULIA_MAX_WORKERS=32 \
+  --name julia-env-server julia-env:latest
+```
+### Test the Server
+```bash
+# Health check
+curl http://localhost:8000/health
+# Expected: {"status":"healthy"}
+# Check Julia version inside container
+docker exec julia-env-server julia --version
+# Expected: julia version 1.10.0
+```
+### Docker Management Commands
+```bash
+# View logs
+docker logs julia-env-server
+docker logs -f julia-env-server  # Follow logs
+# Stop/start container
+docker stop julia-env-server
+docker start julia-env-server
+# Remove container
+docker rm -f julia-env-server
+# Rebuild after code changes
+docker build -t julia-env:latest -f src/envs/julia_env/server/Dockerfile .
+docker rm -f julia-env-server
+docker run -d -p 8000:8000 --name julia-env-server julia-env:latest
+# Interactive debugging
+docker exec -it julia-env-server /bin/bash
+```
+## Local Development (Without Docker)
+### Prerequisites
+- Python 3.10+
+- Julia 1.10.0+ installed and in PATH
+- FastAPI and dependencies
+### Install Julia
+**Using juliaup (recommended):**
+```bash
+curl -fsSL https://install.julialang.org | sh
+```
+**Or download from:** https://julialang.org/downloads/
+### Install Python Dependencies
+```bash
+pip install fastapi uvicorn
+```
+### Run Server Locally
+```bash
+# From OpenEnv root directory
+export PYTHONPATH="${PWD}/src:${PYTHONPATH}"
+python -m envs.julia_env.server.app
+```
+Server will start at: http://localhost:8000
+## API Endpoints
+### Health Check
+```
+GET /health
+Response: {"status": "healthy"}
+```
+### Reset Environment
+```
+POST /reset
+Response: {
+  "observation": {
+    "stdout": "",
+    "stderr": "",
+    "exit_code": 0,
+    "tests_passed": 0,
+    "tests_failed": 0,
+    "reward": 0.0,
+    "execution_time": 0.0
+  }
+}
+```
+### Execute Code (Step)
+```
+POST /step
+Body: {"code": "function add(a,b)\n  a+b\nend\nusing Test\n@test add(2,3)==5"}
+Response: {
+  "observation": {
+    "stdout": "Test Passed",
+    "stderr": "",
+    "exit_code": 0,
+    "tests_passed": 1,
+    "tests_failed": 0,
+    "reward": 1.0,
+    "execution_time": 0.15
+  },
+  "reward": 1.0,
+  "done": false
+}
+```
+### Get State
+```
+GET /state
+Response: {
+  "episode_id": "uuid",
+  "step_count": 5,
+  "last_exit_code": 0,
+  "total_tests_passed": 10,
+  "total_tests_failed": 2
+}
+```
+## Reward Structure
+The environment calculates rewards based on:
+- **Failed execution** (exit_code != 0): `-0.5`
+- **Clean execution** (exit_code == 0): `+0.2`
+- **Tests passed**: `+0.3 × (passed/total)`
+- **Tests failed**: `-0.2 × (failed/total)`
+- **All tests passed bonus**: `+0.5`
+Example:
+```julia
+# 3 tests pass, 1 fails → exit_code 1
+reward = -0.5  # Failed execution
+# Total: -0.5
+# 3 tests pass, 0 fail → exit_code 0
+reward = 0.2 + 0.3 × 1.0 + 0.5 = 1.0
+# Total: 1.0 (perfect score!)
+```
+## Test Parsing
+The environment parses Julia's `Test` module output:
+### Method 1: Error Message Pattern
+```
+Some tests did not pass: 3 passed, 1 failed, 0 errored, 0 broken.
+→ tests_passed=3, tests_failed=1
+```
+### Method 2: Test Summary Table
+```
+Test Summary:      | Pass  Fail  Total  Time
+Add function Tests |    3     1      4  0.5s
+→ tests_passed=3, tests_failed=1
+```
+## Example Usage
+### From Python Client
+```python
+from envs.julia_env import JuliaEnv, JuliaAction
+# Connect to server
+env = JuliaEnv(base_url="http://localhost:8000")
+# Reset
+result = env.reset()
+# Execute Julia code with tests
+code = """
+function fibonacci(n)
+    if n <= 1
+        return n
+    end
+    return fibonacci(n-1) + fibonacci(n-2)
+end
+using Test
+@test fibonacci(0) == 0
+@test fibonacci(1) == 1
+@test fibonacci(5) == 5
+@test fibonacci(10) == 55
+"""
+result = env.step(JuliaAction(code=code))
+print(f"Exit code: {result.observation.exit_code}")
+print(f"Tests passed: {result.observation.tests_passed}")
+print(f"Tests failed: {result.observation.tests_failed}")
+print(f"Reward: {result.reward}")
+# Close connection
+env.close()
+```
+### Example Script
+```bash
+# From OpenEnv root
+python examples/julia_simple.py
+```
+## GRPO Training Integration
+This environment is designed for GRPO (Group Relative Policy Optimization) training:
+```python
+# In your GRPO training loop
+async def play_julia_game(game_idx, game_id, server_url, policy, tokenizer):
+    env = JuliaEnv(base_url=server_url)
+    # Generate code with LLM
+    prompt = format_julia_prompt(task)
+    responses = await policy.generate.route(prompt)
+    code = extract_julia_code(responses[0].text)
+    # Execute in environment
+    result = env.step(JuliaAction(code=code))
+    # Get reward
+    reward = result.observation.reward
+    return {
+        "prompt": prompt,
+        "response": responses[0],
+        "reward": reward,
+        "tests_passed": result.observation.tests_passed,
+        "tests_failed": result.observation.tests_failed
+    }
+```
+See `examples/grpo_blackjack/` for a complete GRPO training example that can be adapted for Julia.
+## Configuration
+### Docker Environment Variables
+The Docker container accepts the following environment variables:
+- **`PORT`**: HTTP server port (default: `8000`)
+  - Controls which port the FastAPI server listens on
+  - Must match the port mapping in `-p` flag (e.g., `-p 9000:9000 -e PORT=9000`)
+- **`NUM_WORKER`**: Number of uvicorn worker processes (default: `4`)
+  - Controls parallel request handling capacity
+  - More workers = more concurrent requests but higher memory usage
+  - Recommended: 2-8 workers for typical workloads
+- **`JULIA_MAX_WORKERS`**: Maximum Julia process pool size (default: `16`)
+  - Controls maximum concurrent Julia code executions
+  - Higher values allow more parallel Julia executions
+  - Each worker consumes memory; tune based on available resources
+  - Recommended: 8-32 workers depending on your workload
+### Runtime Environment Variables
+These can be set when running locally (non-Docker):
+- `HOST`: Server host (default: 0.0.0.0)
+- `JULIA_TIMEOUT`: Julia execution timeout in seconds (default: 60)
+### Dockerfile Customization
+To use a different Julia version:
+```dockerfile
+# In Dockerfile, change the version
+RUN curl -fsSL https://install.julialang.org | sh -s -- --yes --default-channel 1.11
+```
+## Troubleshooting
+### Julia not found
+```bash
+# Verify Julia is in PATH
+julia --version
+# In Docker, check installation
+docker exec julia-env-server julia --version
+```
+### Port already in use
+```bash
+# Use different port
+docker run -p 8001:8000 --name julia-env-server julia-env:latest
+# Update client base_url
+env = JuliaEnv(base_url="http://localhost:8001")
+```
+### Container exits immediately
+```bash
+# Check logs
+docker logs julia-env-server
+# Run in foreground to see errors
+docker run -p 8000:8000 julia-env:latest
+```
+### Build failures
+```bash
+# Clean build with no cache
+docker build --no-cache -t julia-env:latest -f src/envs/julia_env/server/Dockerfile .
+# Verbose output
+docker build --progress=plain -t julia-env:latest -f src/envs/julia_env/server/Dockerfile .
+```
+## Architecture
+```
+┌─────────────────────────────────────┐
+│   Python Client (HTTP)              │
+│   JuliaEnv                          │
+└────────────┬────────────────────────┘
+             │ HTTP POST /step
+             │ {"code": "..."}
+             ▼
+┌─────────────────────────────────────┐
+│   FastAPI Server                    │
+│   app.py                            │
+└────────────┬────────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────────┐
+│   JuliaCodeActEnv                   │
+│   - Execute code via JuliaExecutor  │
+│   - Parse test results              │
+│   - Calculate rewards               │
+│   - Apply transforms                │
+└────────────┬────────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────────┐
+│   JuliaExecutor (subprocess)        │
+│   - Write code to temp file         │
+│   - Run: julia temp_file.jl         │
+│   - Capture stdout/stderr           │
+│   - Return results                  │
+└─────────────────────────────────────┘
+```
+## Development
+### Running Tests
+```bash
+# Unit tests
+pytest tests/envs/julia_env/
+# Integration test
+python examples/julia_simple.py
+```
+### Code Structure
+```
+server/
+├── Dockerfile              # Docker build instructions
+├── README.md              # This file
+├── __init__.py            # Package initialization
+├── app.py                 # FastAPI server entry point
+├── julia_codeact_env.py   # Environment implementation
+└── julia_transforms.py    # Output transforms
+```
+## License
+BSD-style license. See LICENSE file in repository root.

src/envs/julia_env/server/__init__.py ADDED Viewed

	@@ -0,0 +1,8 @@

+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""Julia Environment Server."""

src/envs/julia_env/server/app.py ADDED Viewed

	@@ -0,0 +1,455 @@

+# Copyright (c) Yogesh Singla and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+"""
+FastAPI application for the Julia Environment with concurrent execution support.
+This module creates an HTTP server that exposes the JuliaCodeActEnv
+over HTTP endpoints with optimized async execution for handling multiple
+concurrent requests efficiently.
+Features:
+- Async Julia code execution to avoid blocking
+- Environment pool for concurrent request handling
+- Thread pool executor for CPU-bound Julia tasks
+- Automatic error recovery and retry logic
+- Comprehensive logging to file and console
+- Worker health monitoring and auto-restart
+- 10x+ performance improvement over single-threaded version
+Usage:
+    # Development (with auto-reload):
+    uvicorn envs.julia_env.server.app:app --reload --host 0.0.0.0 --port 8000
+    # Production (with multiple workers for even better concurrency):
+    uvicorn envs.julia_env.server.app:app --host 0.0.0.0 --port 8000 --workers 4
+    # Or run directly:
+    python -m envs.julia_env.server.app
+"""
+import asyncio
+import logging
+import os
+import sys
+import traceback
+from concurrent.futures import ThreadPoolExecutor
+from contextlib import asynccontextmanager
+from dataclasses import asdict
+from datetime import datetime
+from logging.handlers import RotatingFileHandler
+from typing import Any, Dict
+from fastapi import Body, FastAPI, HTTPException, Request
+from fastapi.responses import JSONResponse
+from ..models import JuliaAction, JuliaObservation
+from .julia_codeact_env import JuliaCodeActEnv
+# Configuration
+MAX_WORKERS = int(
+    os.getenv("JULIA_MAX_WORKERS", "8")
+)  # Number of concurrent Julia executions
+ENABLE_WEB = os.getenv("ENABLE_WEB_INTERFACE", "false").lower() in ("true", "1", "yes")
+EXECUTION_TIMEOUT = int(os.getenv("JULIA_EXECUTION_TIMEOUT", "120"))  # seconds
+LOG_FILE = os.getenv("JULIA_LOG_FILE", "/tmp/run.log")
+LOG_LEVEL = os.getenv("JULIA_LOG_LEVEL", "INFO")
+# Global thread pool executor for CPU-bound Julia tasks
+executor = None
+# Setup comprehensive logging
+def setup_logging():
+    """Configure logging to both file and console with rotation."""
+    logger = logging.getLogger("julia_env")
+    logger.setLevel(getattr(logging, LOG_LEVEL))
+    # Prevent duplicate handlers
+    if logger.handlers:
+        return logger
+    # Create formatters
+    detailed_formatter = logging.Formatter(
+        "%(asctime)s - %(name)s - [%(process)d:%(thread)d] - %(levelname)s - %(message)s",
+        datefmt="%Y-%m-%d %H:%M:%S",
+    )
+    # File handler with rotation (10MB max, keep 5 backup files)
+    try:
+        os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True)
+        file_handler = RotatingFileHandler(
+            LOG_FILE, maxBytes=10 * 1024 * 1024, backupCount=5, encoding="utf-8"  # 10MB
+        )
+        file_handler.setLevel(logging.DEBUG)
+        file_handler.setFormatter(detailed_formatter)
+        logger.addHandler(file_handler)
+    except Exception as e:
+        print(f"Warning: Could not create log file {LOG_FILE}: {e}")
+    # Console handler
+    console_handler = logging.StreamHandler(sys.stdout)
+    console_handler.setLevel(logging.INFO)
+    console_handler.setFormatter(detailed_formatter)
+    logger.addHandler(console_handler)
+    return logger
+logger = setup_logging()
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    """Lifespan context manager for startup/shutdown with health monitoring"""
+    global executor
+    logger.info("=" * 80)
+    logger.info("Starting Julia Environment Server")
+    logger.info(f"Max Workers: {MAX_WORKERS}")
+    logger.info(f"Execution Timeout: {EXECUTION_TIMEOUT}s")
+    logger.info(f"Log File: {LOG_FILE}")
+    logger.info(f"Log Level: {LOG_LEVEL}")
+    logger.info("=" * 80)
+    # Startup: Create thread pool with error handling
+    try:
+        executor = ThreadPoolExecutor(
+            max_workers=MAX_WORKERS, thread_name_prefix="julia_worker"
+        )
+        logger.info(f"✅ Thread pool created with {MAX_WORKERS} workers")
+        logger.info(f"✅ Julia Environment Server started successfully")
+        print(
+            f"✅ Julia Environment Server started with {MAX_WORKERS} concurrent workers"
+        )
+    except Exception as e:
+        logger.error(f"❌ Failed to start server: {e}")
+        logger.error(traceback.format_exc())
+        raise
+    yield
+    # Shutdown: Cleanup with grace period
+    logger.info("Shutting down Julia Environment Server...")
+    try:
+        executor.shutdown(wait=True, cancel_futures=False)
+        logger.info("✅ All workers completed gracefully")
+    except Exception as e:
+        logger.error(f"Error during shutdown: {e}")
+    logger.info("✅ Julia Environment Server shutdown complete")
+    print("✅ Julia Environment Server shutdown complete")
+# Create FastAPI app with lifespan management
+app = FastAPI(
+    title="Julia Environment Server",
+    description="Async Julia code execution environment with concurrent request support and auto-recovery",
+    version="2.1.0",
+    lifespan=lifespan,
+)
+# Global exception handler for uncaught errors
+@app.exception_handler(Exception)
+async def global_exception_handler(request: Request, exc: Exception):
+    """Handle all uncaught exceptions to prevent worker crashes"""
+    error_id = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
+    logger.error(f"[ERROR-{error_id}] Uncaught exception in {request.url.path}")
+    logger.error(f"[ERROR-{error_id}] Request: {request.method} {request.url}")
+    logger.error(f"[ERROR-{error_id}] Exception: {type(exc).__name__}: {exc}")
+    logger.error(f"[ERROR-{error_id}] Traceback:\n{traceback.format_exc()}")
+    return JSONResponse(
+        status_code=500,
+        content={
+            "error": "Internal server error",
+            "type": type(exc).__name__,
+            "message": str(exc),
+            "error_id": error_id,
+            "timestamp": datetime.now().isoformat(),
+        },
+    )
+async def execute_julia_async(
+    action: JuliaAction, request_id: str = None
+) -> JuliaObservation:
+    """
+    Execute Julia code asynchronously in thread pool with timeout and error recovery.
+    This runs the CPU-bound Julia execution in a separate thread to avoid
+    blocking the event loop, allowing the server to handle multiple requests
+    concurrently.
+    Features:
+    - Timeout protection
+    - Automatic retry on transient failures
+    - Comprehensive error logging
+    - Resource cleanup
+    """
+    if request_id is None:
+        request_id = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
+    loop = asyncio.get_event_loop()
+    max_retries = 2
+    retry_count = 0
+    logger.debug(
+        f"[{request_id}] Starting Julia execution (timeout: {EXECUTION_TIMEOUT}s)"
+    )
+    while retry_count <= max_retries:
+        env = None
+        try:
+            # Create a fresh environment instance for this request
+            # This ensures thread safety and allows concurrent execution
+            env = JuliaCodeActEnv()
+            # Run the blocking step() call in thread pool with timeout
+            observation = await asyncio.wait_for(
+                loop.run_in_executor(executor, env.step, action),
+                timeout=EXECUTION_TIMEOUT,
+            )
+            logger.debug(f"[{request_id}] Julia execution completed successfully")
+            logger.debug(
+                f"[{request_id}] Result: tests_passed={observation.tests_passed}, "
+                f"tests_failed={observation.tests_failed}, reward={observation.reward}"
+            )
+            return observation
+        except asyncio.TimeoutError:
+            retry_count += 1
+            logger.warning(
+                f"[{request_id}] Julia execution timeout (attempt {retry_count}/{max_retries + 1})"
+            )
+            if retry_count > max_retries:
+                logger.error(
+                    f"[{request_id}] Julia execution failed after {max_retries + 1} attempts"
+                )
+                # Return a failure observation
+                return JuliaObservation(
+                    stdout="",
+                    stderr=f"Execution timeout after {EXECUTION_TIMEOUT}s",
+                    exit_code=-1,
+                    tests_passed=0,
+                    tests_failed=1,
+                    code_compiles=False,
+                    reward=0.0,
+                    done=True,
+                )
+            # Wait a bit before retry
+            await asyncio.sleep(0.5)
+        except Exception as e:
+            retry_count += 1
+            logger.error(
+                f"[{request_id}] Julia execution error (attempt {retry_count}/{max_retries + 1}): {e}"
+            )
+            logger.error(f"[{request_id}] Traceback:\n{traceback.format_exc()}")
+            if retry_count > max_retries:
+                logger.error(
+                    f"[{request_id}] Julia execution failed permanently after {max_retries + 1} attempts"
+                )
+                # Return a failure observation
+                return JuliaObservation(
+                    stdout="",
+                    stderr=f"Execution error: {str(e)}",
+                    exit_code=-1,
+                    tests_passed=0,
+                    tests_failed=1,
+                    code_compiles=False,
+                    reward=0.0,
+                    done=True,
+                )
+            # Wait a bit before retry
+            await asyncio.sleep(0.5)
+        finally:
+            # Clean up environment resources if possible
+            if env is not None:
+                try:
+                    # Add any cleanup needed here
+                    del env
+                except Exception as cleanup_error:
+                    logger.debug(f"[{request_id}] Cleanup warning: {cleanup_error}")
+@app.post("/reset")
+async def reset(request: Dict[str, Any] = Body(default={})) -> Dict[str, Any]:
+    """
+    Reset endpoint - returns initial observation.
+    Creates a fresh environment instance for the new episode.
+    """
+    request_id = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
+    logger.info(f"[{request_id}] Reset request received")
+    try:
+        # Run reset in thread pool to avoid blocking
+        loop = asyncio.get_event_loop()
+        env = JuliaCodeActEnv()
+        observation = await asyncio.wait_for(
+            loop.run_in_executor(executor, env.reset),
+            timeout=30.0,  # Reset should be quick
+        )
+        # Serialize observation
+        obs_dict = asdict(observation)
+        reward = obs_dict.pop("reward", None)
+        done = obs_dict.pop("done", False)
+        obs_dict.pop("metadata", None)
+        logger.info(f"[{request_id}] Reset completed successfully")
+        return {
+            "observation": obs_dict,
+            "reward": reward,
+            "done": done,
+        }
+    except asyncio.TimeoutError:
+        logger.error(f"[{request_id}] Reset timeout")
+        raise HTTPException(status_code=504, detail="Reset operation timed out")
+    except Exception as e:
+        logger.error(f"[{request_id}] Reset error: {e}")
+        logger.error(traceback.format_exc())
+        raise HTTPException(status_code=500, detail=f"Reset failed: {str(e)}")
+@app.post("/step")
+async def step(request: Dict[str, Any]) -> Dict[str, Any]:
+    """
+    Step endpoint - executes Julia code and returns observation.
+    Runs Julia code execution asynchronously to handle multiple concurrent requests.
+    Each request gets its own environment instance for thread safety.
+    """
+    request_id = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
+    try:
+        action_data = request.get("action", {})
+        if not action_data:
+            logger.warning(f"[{request_id}] Step request with empty action")
+            raise HTTPException(status_code=400, detail="Action data is required")
+        # Deserialize action
+        metadata = action_data.pop("metadata", {})
+        action = JuliaAction(**action_data)
+        action.metadata = metadata
+        logger.info(f"[{request_id}] Step request received")
+        logger.debug(
+            f"[{request_id}] Action: core_code_length={len(action.core_code) if action.core_code else 0}, "
+            f"test_code_length={len(action.test_code) if action.test_code else 0}"
+        )
+        # Execute Julia code asynchronously with timeout and retry
+        observation = await execute_julia_async(action, request_id)
+        # Serialize observation
+        obs_dict = asdict(observation)
+        reward = obs_dict.pop("reward", None)
+        done = obs_dict.pop("done", False)
+        obs_dict.pop("metadata", None)
+        logger.info(
+            f"[{request_id}] Step completed - reward={reward}, "
+            f"tests_passed={observation.tests_passed}, tests_failed={observation.tests_failed}"
+        )
+        return {
+            "observation": obs_dict,
+            "reward": reward,
+            "done": done,
+        }
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"[{request_id}] Step endpoint error: {e}")
+        logger.error(f"[{request_id}] Traceback:\n{traceback.format_exc()}")
+        raise HTTPException(status_code=500, detail=f"Step execution failed: {str(e)}")
+@app.get("/state")
+async def get_state() -> Dict[str, Any]:
+    """
+    State endpoint - returns environment metadata and server health.
+    Note: Since each request creates a fresh environment, this returns
+    general server state rather than specific episode state.
+    """
+    try:
+        import psutil
+        process = psutil.Process()
+        memory_info = process.memory_info()
+        return {
+            "max_workers": MAX_WORKERS,
+            "executor_type": "ThreadPoolExecutor",
+            "status": "ready",
+            "timeout": EXECUTION_TIMEOUT,
+            "log_file": LOG_FILE,
+            "memory_mb": memory_info.rss / 1024 / 1024,
+            "threads": len(process.threads()),
+        }
+    except ImportError:
+        # psutil not available, return basic info
+        return {
+            "max_workers": MAX_WORKERS,
+            "executor_type": "ThreadPoolExecutor",
+            "status": "ready",
+            "timeout": EXECUTION_TIMEOUT,
+            "log_file": LOG_FILE,
+        }
+    except Exception as e:
+        logger.warning(f"Could not get full state info: {e}")
+        return {
+            "max_workers": MAX_WORKERS,
+            "executor_type": "ThreadPoolExecutor",
+            "status": "ready",
+        }
+@app.get("/health")
+async def health() -> Dict[str, str]:
+    """
+    Health check endpoint.
+    Returns healthy status if the server is operational and can accept requests.
+    """
+    try:
+        # Quick health check - verify executor is available
+        if executor is None:
+            logger.error("Health check failed: executor not initialized")
+            raise HTTPException(status_code=503, detail="Service not ready")
+        return {
+            "status": "healthy",
+            "workers": str(MAX_WORKERS),
+            "timeout": str(EXECUTION_TIMEOUT),
+            "timestamp": datetime.now().isoformat(),
+        }
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Health check error: {e}")
+        raise HTTPException(status_code=503, detail="Health check failed")
+if __name__ == "__main__":
+    import uvicorn
+    # Run with uvicorn
+    # Use multiple workers for even better concurrency
+    uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")

src/envs/julia_env/server/julia_codeact_env.py ADDED Viewed

	@@ -0,0 +1,276 @@

+"""
+Julia Code Action Environment.
+This environment mirrors the PythonCodeActEnv but runs Julia code instead.
+It executes Julia code using JuliaExecutor, captures output,
+tracks the last exit code, and returns a JuliaObservation.
+"""
+import re
+import uuid
+from core.env_server import Environment
+from core.tools import JuliaExecutor
+from ..models import JuliaAction, JuliaObservation, JuliaState
+from .julia_transforms import create_safe_julia_transform
+class JuliaCodeActEnv(Environment):
+    """
+    Julia Code Action Environment for executing code and tracking state.
+    This environment executes Julia code submitted as CodeAction during step,
+    maintains the last exit code in its state, and returns results wrapped
+    in CodeObservation.
+    Example:
+        >>> env = JuliaCodeActEnv()
+        >>> obs = env.reset()
+        >>> action = CodeAction(code='println("Hello, Julia!")')
+        >>> obs = env.step(action)
+        >>> print(obs.stdout)  # "Hello, Julia!\n"
+        >>> print(obs.exit_code)  # 0
+        >>> print(env.state.last_exit_code)  # 0
+    """
+    def __init__(self):
+        """Initialize the Julia Code Act Environment."""
+        self._executor = JuliaExecutor()
+        self._state = JuliaState()
+        self.transform = create_safe_julia_transform()
+    def reset(self) -> JuliaObservation:
+        """
+        Reset environment for a fresh Julia execution session.
+        Returns an empty JuliaObservation with exit_code=0.
+        """
+        self._state = JuliaState(episode_id=str(uuid.uuid4()), step_count=0)
+        self._state.last_exit_code = 0
+        self._state.last_code_compiles = True
+        self._executor = JuliaExecutor()
+        observation = JuliaObservation(
+            stdout="",
+            stderr="",
+            exit_code=0,
+            reward=0.0,
+            metadata={"core_code": "", "test_code": ""},
+            tests_passed=0,
+            tests_failed=0,
+            code_compiles=True,
+        )
+        observation = self._apply_transform(observation)
+        return observation
+    def step(self, action: JuliaAction) -> JuliaObservation:
+        """
+        Execute Julia code and return the result as JuliaObservation.
+        Optimized single-pass execution:
+        - Runs core_code + test_code together
+        - Infers compilation status from combined execution
+        - 2x faster than double execution
+        """
+        if not isinstance(action, JuliaAction):
+            raise ValueError(f"Expected JuliaAction, got {type(action)}")
+        # Single execution: Run core_code + test_code together
+        combined_code = action.core_code + "\n\n" + action.test_code
+        full_result = self._executor.run(combined_code)
+        # Parse test results from execution output
+        tests_passed, tests_failed = self._parse_test_results(
+            full_result.stdout, full_result.stderr
+        )
+        # Infer compilation status from execution
+        # If tests ran, code compiled successfully
+        # If exit_code != 0 and no tests ran, code didn't compile
+        code_compiles = (
+            full_result.exit_code == 0  # Clean execution
+            or tests_passed > 0  # Some tests passed (code must have compiled)
+            or tests_failed > 0  # Some tests failed (code compiled but tests failed)
+        )
+        # If no tests detected and non-zero exit, check for compilation errors
+        if not code_compiles and tests_passed == 0 and tests_failed == 0:
+            # Check stderr for compilation errors
+            stderr_lower = full_result.stderr.lower()
+            if any(
+                err in stderr_lower
+                for err in ["error", "syntax", "undefined", "loadError"]
+            ):
+                code_compiles = False
+            else:
+                # If no clear compilation error, assume it compiled
+                code_compiles = True
+        # Calculate reward based on compilation and test results
+        reward = self._calculate_reward(code_compiles, tests_passed, tests_failed)
+        # Update environment state
+        self._state.step_count += 1
+        self._state.last_exit_code = full_result.exit_code
+        self._state.last_code_compiles = code_compiles
+        self._state.total_tests_passed = tests_passed
+        self._state.total_tests_failed = tests_failed
+        # Build observation
+        observation = JuliaObservation(
+            stdout=full_result.stdout,
+            stderr=full_result.stderr,
+            exit_code=full_result.exit_code,
+            reward=reward,
+            metadata={"core_code": action.core_code, "test_code": action.test_code},
+            tests_passed=tests_passed,
+            tests_failed=tests_failed,
+            code_compiles=code_compiles,
+        )
+        # Apply safety and quality transforms
+        observation = self._apply_transform(observation)
+        return observation
+    def _parse_test_results(self, stdout: str, stderr: str) -> tuple[int, int]:
+        """
+        Parse Julia test output to count passed/failed tests.
+        Julia's Test module outputs results like:
+        "Test Summary:      | Pass  Fail  Total  Time"
+        "Add function Tests |    1     1      2  1.5s"
+        Also checks error messages:
+        "Some tests did not pass: 1 passed, 1 failed, 0 errored, 0 broken."
+        Args:
+            stdout: Standard output from Julia execution
+            stderr: Standard error from Julia execution
+        Returns:
+            Tuple of (tests_passed, tests_failed)
+        """
+        # Combine stdout and stderr for analysis
+        passed = 0
+        failed = 0
+        output = stdout + "\n" + stderr
+        # Method 1: Look for "Some tests did not pass" error message
+        # Pattern: "Some tests did not pass: X passed, Y failed, Z errored, W broken."
+        error_pattern = r"Some tests did not pass:\s*(\d+)\s+passed,\s*(\d+)\s+failed,\s*(\d+)\s+errored"
+        match = re.search(error_pattern, output)
+        if match:
+            passed = int(match.group(1))
+            failed = int(match.group(2))
+            errored = int(match.group(3))
+            return passed, failed + errored  # Treat errors as failures
+        # Method 2: Look for Test Summary table
+        # Multiple possible formats:
+        # All pass:     "Test Summary: | Pass  Total  Time"
+        #               "My Tests     |    3      3  0.5s"
+        # Some fail:    "Test Summary: | Pass  Fail  Total  Time"
+        #               "My Tests     |    2     1      3  0.5s"
+        # All error:    "Test Summary: | Error  Total  Time"
+        #               "My Tests     |     3      3  0.9s"
+        # Mixed:        "Test Summary: | Pass  Fail  Error  Total  Time"
+        #               "My Tests     |    1     1      1      3  0.5s"
+        summary_lines = output.split("\n")
+        for i, line in enumerate(summary_lines):
+            if "Test Summary:" in line and i + 1 < len(summary_lines):
+                header_line = line
+                next_line = summary_lines[i + 1]
+                # Determine which columns are present
+                has_pass = "Pass" in header_line
+                has_fail = "Fail" in header_line
+                has_error = "Error" in header_line
+                # Extract all numbers from the line
+                all_numbers = re.findall(r"\d+", next_line)
+                if not all_numbers:
+                    continue
+                # Last number is always Total, second to last is Time (skip it)
+                # Extract based on which columns exist
+                if has_pass and has_fail and has_error:
+                    # Pass  Fail  Error  Total  Time
+                    if len(all_numbers) >= 5:
+                        passed = int(all_numbers[0])
+                        failed = int(all_numbers[1]) + int(
+                            all_numbers[2]
+                        )  # Fail + Error
+                        return passed, failed
+                elif has_pass and has_fail:
+                    # Pass  Fail  Total  Time
+                    if len(all_numbers) >= 4:
+                        passed = int(all_numbers[0])
+                        failed = int(all_numbers[1])
+                        return passed, failed
+                elif has_pass and has_error:
+                    # Pass  Error  Total  Time
+                    if len(all_numbers) >= 4:
+                        passed = int(all_numbers[0])
+                        failed = int(all_numbers[1])  # Treat errors as failures
+                        return passed, failed
+                elif has_fail and has_error:
+                    # Fail  Error  Total  Time (no passes)
+                    if len(all_numbers) >= 4:
+                        passed = 0
+                        failed = int(all_numbers[0]) + int(all_numbers[1])
+                        return passed, failed
+                elif has_pass:
+                    # Pass  Total  Time (no failures/errors)
+                    if len(all_numbers) >= 3:
+                        passed = int(all_numbers[0])
+                        failed = 0
+                        return passed, failed
+                elif has_error:
+                    # Error  Total  Time (all errors, no passes)
+                    if len(all_numbers) >= 3:
+                        passed = 0
+                        failed = int(all_numbers[0])  # Treat all errors as failures
+                        return passed, failed
+                elif has_fail:
+                    # Fail  Total  Time (all failures, no passes)
+                    if len(all_numbers) >= 3:
+                        passed = 0
+                        failed = int(all_numbers[0])
+                        return passed, failed
+        return passed, failed
+    def _calculate_reward(
+        self, code_compiles: bool, tests_passed: int, tests_failed: int
+    ) -> int:
+        """
+        Optimized integer reward for Julia GRPO.
+        Strong signal shaping: rewards correctness, penalizes instability,
+        and gives higher incentive for near-perfect results.
+        """
+        # Code doesn't compile — immediate strong penalty
+        if not code_compiles:
+            return -3
+        reward = 1
+        reward += 3 * tests_passed - 1 * tests_failed
+        if tests_failed == 0 and tests_passed > 0:
+            reward += 2
+        return reward
+    def _apply_transform(self, observation: JuliaObservation) -> JuliaObservation:
+        """Apply safety and quality transforms to observation."""
+        if self.transform:
+            observation = self.transform(observation)
+        return observation
+    @property
+    def state(self) -> JuliaState:
+        """Return current environment state."""
+        return self._state

src/envs/julia_env/server/julia_transforms.py ADDED Viewed

	@@ -0,0 +1,87 @@

+"""
+envs/julia_env/julia_transforms.py
+--------------------------------
+Safety and quality transforms for Julia code.
+"""
+import re
+from core.env_server.base_transforms import CompositeTransform
+from core.env_server.interfaces import Transform
+from ..models import JuliaObservation
+# -------------------------
+# Safety Transform
+# -------------------------
+class JuliaSafetyTransform(Transform):
+    """Detects dangerous Julia operations and penalizes them with a negative reward."""
+    def __init__(self, penalty: float = -3.0):
+        self.penalty = penalty
+        self.dangerous_patterns = [
+            r"run\(",
+            r"read\(",
+            r"write\(",
+            r"unsafe_",
+            r"ccall\(",
+            r"Base\.exit",
+            r"Base\.kill",
+            r"rm\(",      # file deletion
+            r"download\(" # downloading
+        ]
+    def __call__(self, observation):
+        # Only act on JuliaObservation objects
+        if not isinstance(observation, JuliaObservation):
+            return observation
+        # Extract last executed code from metadata
+        code = observation.metadata.get("last_code", "") if observation.metadata else ""
+        for pattern in self.dangerous_patterns:
+            if re.search(pattern, code):
+                # Apply penalty and record violation
+                observation.reward = (observation.reward or 0.0) + self.penalty
+                observation.metadata = observation.metadata or {}
+                observation.metadata["safety_violation"] = pattern
+                return observation
+        # Safe code gets neutral reward
+        observation.reward = observation.reward or 0.0
+        return observation
+# -------------------------
+# Quality Transform
+# -------------------------
+class JuliaQualityTransform(Transform):
+    """Evaluates and rewards Julia code quality."""
+    def __init__(self, concise_bonus=1, max_length_threshold=120):
+        self.concise_bonus = concise_bonus
+        self.max_length_threshold = max_length_threshold
+    def __call__(self, observation):
+        # Only act on JuliaObservation objects
+        if not isinstance(observation, JuliaObservation):
+            return observation
+        code = observation.metadata.get("last_code", "") if observation.metadata else ""
+        reward = observation.reward or 0.0
+        # Reward concise code
+        if len(code.strip()) <= self.max_length_threshold:
+            reward += self.concise_bonus
+        else:
+            reward -= 0.1  # slight penalty for verbosity
+        observation.reward = reward
+        return observation
+# -------------------------
+# Composite Transform
+# -------------------------
+def create_safe_julia_transform():
+    """Combines safety and quality transforms into one pipeline."""
+    return CompositeTransform([JuliaSafetyTransform(), JuliaQualityTransform()])