| Metadata-Version: 2.4 | |
| Name: inference-server | |
| Version: 0.1.0 | |
| Summary: ACT Model Inference Server for Real-time Robot Control | |
| Requires-Python: >=3.12 | |
| Description-Content-Type: text/markdown | |
| Requires-Dist: aiofiles>=24.1.0 | |
| Requires-Dist: aiortc>=1.13.0 | |
| Requires-Dist: av>=14.4.0 | |
| Requires-Dist: einops>=0.7.0 | |
| Requires-Dist: fastapi>=0.115.12 | |
| Requires-Dist: gradio>=5.34.2 | |
| Requires-Dist: httpx>=0.28.1 | |
| Requires-Dist: huggingface-hub>=0.32.4 | |
| Requires-Dist: imageio[ffmpeg]>=2.37.0 | |
| Requires-Dist: lerobot | |
| Requires-Dist: robothub-transport-server-client | |
| Requires-Dist: numpy>=1.26.4 | |
| Requires-Dist: opencv-python>=4.11.0.86 | |
| Requires-Dist: opencv-python-headless>=4.11.0.86 | |
| Requires-Dist: psutil>=7.0.0 | |
| Requires-Dist: pydantic>=2.11.5 | |
| Requires-Dist: python-multipart>=0.0.20 | |
| Requires-Dist: torch>=2.2.2 | |
| Requires-Dist: torchvision>=0.17.2 | |
| Requires-Dist: tqdm>=4.67.1 | |
| Requires-Dist: transformers>=4.52.4 | |
| Requires-Dist: uvicorn[standard]>=0.34.3 | |
| Requires-Dist: websockets>=15.0.1 | |
| title: LeRobot Arena - AI Inference Server | |
| emoji: 🤖 | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: docker | |
| app_port: 7860 | |
| suggested_hardware: t4-small | |
| suggested_storage: medium | |
| short_description: Real-time ACT model inference server for robot control | |
| tags: | |
| - robotics | |
| - ai | |
| - inference | |
| - control | |
| - act-model | |
| - transformer | |
| - real-time | |
| - gradio | |
| - fastapi | |
| - computer-vision | |
| pinned: false | |
| fullWidth: true | |
| # Inference Server | |
| 🤖 **Real-time ACT Model Inference Server for Robot Control** | |
| This server provides ACT (Action Chunking Transformer) model inference for robotics applications using the transport server communication system. It includes a user-friendly Gradio web interface for easy setup and management. | |
| ## ✨ Features | |
| - **Real-time AI Inference**: Run ACT models for robot control at 20Hz control frequency | |
| - **Multi-Camera Support**: Handle multiple camera streams with different names | |
| - **Web Interface**: User-friendly Gradio UI for setup and monitoring | |
| - **Session Management**: Create, start, stop, and monitor inference sessions | |
| - **Automatic Timeout**: Sessions are automatically cleaned up after 10 minutes of inactivity | |
| - **Debug Tools**: Built-in debugging and monitoring endpoints | |
| - **Flexible Configuration**: Support for custom model paths, camera configurations | |
| - **No External Processes**: Runs directly in Python without subprocess calls | |
| ## Quick Start | |
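The automatic timeout in the feature list can be pictured with a small tracker. This is a minimal sketch and not the server's actual implementation: the `SessionActivity` class and its method names are hypothetical, and only the 10-minute figure comes from the list above.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field


@dataclass
class SessionActivity:
    """Remembers when each session was last active (hypothetical helper)."""

    timeout_s: float = 600.0  # 10 minutes, as stated in the feature list
    last_seen: dict[str, float] = field(default_factory=dict)

    def touch(self, session_id: str, now: float | None = None) -> None:
        """Record activity for a session (call on every image or joint update)."""
        self.last_seen[session_id] = time.monotonic() if now is None else now

    def expired(self, now: float | None = None) -> list[str]:
        """Return the ids of sessions that have been idle longer than the timeout."""
        now = time.monotonic() if now is None else now
        return [sid for sid, seen in self.last_seen.items() if now - seen > self.timeout_s]

    def cleanup(self, now: float | None = None) -> list[str]:
        """Forget expired sessions and return their ids so the caller can free resources."""
        stale = self.expired(now)
        for sid in stale:
            del self.last_seen[sid]
        return stale
```

A periodic background task could call `cleanup()` and delete every session it returns. The explicit `now` argument exists only to make the behaviour easy to test.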
| ### Prerequisites | |
| - Python 3.12+ | |
| - UV package manager (recommended) | |
| - Trained ACT model | |
| - Transport server running | |
| ### 1. Installation | |
| ```bash | |
| cd backend/ai-server | |
| # Install dependencies using uv (recommended) | |
| uv sync | |
| # Or using pip | |
| pip install -e . | |
| ``` | |
| ### 2. Launch the Application | |
| #### **Simple Integrated Mode (Recommended)** | |
| ```bash | |
| # Everything runs in one process - no subprocess issues! | |
| python launch_simple.py | |
| # Or using the CLI | |
| python -m inference_server.cli --simple | |
| ``` | |
| In this mode: | |
| - Everything runs on `http://localhost:7860` | |
| - Sessions are managed directly in-process (no HTTP API calls) | |
| - There are no external subprocess dependencies | |
| - Deployment is the simpler and more robust of the two modes | |
| #### **🔧 Development Mode (Separate Processes)** | |
| ```bash | |
| # Traditional approach with separate server and UI | |
| python -m inference_server.cli | |
| ``` | |
| This will: | |
| - Start the AI server on `http://localhost:8001` | |
| - Launch the Gradio UI on `http://localhost:7860` | |
| - Better for development and debugging | |
| ### 3. Using the Web Interface | |
| 1. **Check Server Status**: The interface will automatically check if the AI server is running | |
| 2. **Configure Your Robot**: Enter your model path and camera setup | |
| 3. **Create & Start Session**: Click the button to set up AI control | |
| 4. **Monitor Performance**: Use the status panel to monitor inference | |
| ## 🎯 Workflow Guide | |
| ### Step 1: AI Server | |
| - The server status will be displayed at the top | |
| - Click "Start Server" if it's not already running | |
| - Use "Check Status" to verify connectivity | |
| ### Step 2: Set Up Robot AI | |
| - **Session Name**: Give your session a unique name (e.g., "my-robot-01") | |
| - **AI Model Path**: Path to your trained ACT model (e.g., "./checkpoints/act_so101_beyond") | |
| - **Camera Names**: Comma-separated list of camera names (e.g., "front,wrist,overhead") | |
| - Click "Create & Start AI Control" to begin | |
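The comma-separated camera field above is easy to get wrong (stray spaces, empty entries, duplicates). The helper below is a hypothetical sketch of how such input could be normalized before it is sent to the server; `parse_camera_names` is not part of the project's API.

```python
def parse_camera_names(raw: str) -> list[str]:
    """Split a comma-separated camera list into clean, unique names."""
    # Trim whitespace around each entry and drop empty ones ("front,,wrist").
    names = [part.strip() for part in raw.split(",")]
    names = [name for name in names if name]
    if not names:
        raise ValueError("at least one camera name is required")
    # Each camera must have a unique name, so reject repeated entries.
    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        raise ValueError(f"duplicate camera names: {duplicates}")
    return names
```

For example, `parse_camera_names("front, wrist ,overhead")` yields the three names without the surrounding spaces.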
| ### Step 3: Control Session | |
| - The session ID will be auto-filled after creation | |
| - Use Start/Stop buttons to control inference | |
| - Click "Status" to see detailed performance metrics | |
| ## 🛠️ Advanced Usage | |
| ### CLI Options | |
| ```bash | |
| # Simple integrated mode (recommended) | |
| python -m inference_server.cli --simple | |
| # Development mode (separate processes) | |
| python -m inference_server.cli | |
| # Launch only the server | |
| python -m inference_server.cli --server-only | |
| # Launch only the UI (server must be running separately) | |
| python -m inference_server.cli --ui-only | |
| # Custom ports | |
| python -m inference_server.cli --server-port 8002 --ui-port 7861 | |
| # Enable public sharing | |
| python -m inference_server.cli --share | |
| # For deployment (recommended) | |
| python -m inference_server.cli --simple --host 0.0.0.0 --share | |
| ``` | |
| ### API Endpoints | |
| The server provides a REST API for programmatic access: | |
| - `GET /health` - Server health check | |
| - `POST /sessions` - Create new session | |
| - `GET /sessions` - List all sessions | |
| - `GET /sessions/{id}` - Get session details | |
| - `POST /sessions/{id}/start` - Start inference | |
| - `POST /sessions/{id}/stop` - Stop inference | |
| - `POST /sessions/{id}/restart` - Restart inference | |
| - `DELETE /sessions/{id}` - Delete session | |
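A minimal client sketch for the endpoints above, using only the Python standard library. The paths and the port come from this README. The JSON field names in `create_and_start` (`session_id`, `policy_path`, `camera_names`) are assumptions for illustration and should be checked against the server's request schema.

```python
from __future__ import annotations

import json
import urllib.request

# Development-mode server address from this README.
BASE_URL = "http://localhost:8001"


def build_request(method, path, payload=None, base_url=BASE_URL):
    """Build (but do not send) a JSON request for one of the endpoints above."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )


def send(request, timeout=5.0):
    """Send a request and decode the JSON response body, if there is one."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read()
    return json.loads(body) if body else None


def create_and_start(session_id, model_path, camera_names):
    """Create a session, then start inference on it."""
    # ASSUMPTION: these payload field names are illustrative only.
    payload = {
        "session_id": session_id,
        "policy_path": model_path,
        "camera_names": camera_names,
    }
    send(build_request("POST", "/sessions", payload))
    return send(build_request("POST", f"/sessions/{session_id}/start"))
```

With the server running, `send(build_request("GET", "/health"))` is a quick connectivity check. In integrated mode the architecture diagram places the API under `/api/*` on port 7860, so `base_url` would likely need to change accordingly.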
| #### Debug Endpoints | |
| - `GET /debug/system` - System information (CPU, memory, GPU) | |
| - `GET /debug/sessions/{id}/queue` - Action queue details | |
| - `POST /debug/sessions/{id}/reset` - Reset session state | |
| ### Configuration | |
| #### Joint Value Convention | |
| - All joint inputs/outputs use **NORMALIZED VALUES** | |
| - Most joints: -100 to +100 (RANGE_M100_100) | |
| - Gripper: 0 to 100 (RANGE_0_100) | |
| - This matches the training data format exactly | |
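To make the convention concrete, here is a small sketch that clamps a command to the stated ranges before it is sent. The joint names and the `clamp_joint_command` helper are hypothetical; only the two ranges come from the list above.

```python
JOINT_RANGE = (-100.0, 100.0)  # RANGE_M100_100, used by most joints
GRIPPER_RANGE = (0.0, 100.0)   # RANGE_0_100, used by the gripper


def clamp_joint_command(values: dict, gripper_name: str = "gripper") -> dict:
    """Clamp normalized joint values to the range that applies to each joint."""
    clamped = {}
    for name, value in values.items():
        # ASSUMPTION: the gripper is identified by its joint name.
        low, high = GRIPPER_RANGE if name == gripper_name else JOINT_RANGE
        clamped[name] = min(max(float(value), low), high)
    return clamped
```

Values already inside their range pass through unchanged, so the helper is safe to apply to every outgoing command.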
| #### Camera Support | |
| - Supports arbitrary number of camera streams | |
| - Each camera has a unique name (e.g., "front", "wrist", "overhead") | |
| - All camera streams are synchronized for inference | |
| - Images expected in RGB format, uint8 [0-255] | |
| ## Monitoring | |
| ### Session Status Indicators | |
| - 🟢 **Running**: Inference active and processing | |
| - 🟡 **Ready**: Session created but inference not started | |
| - 🔴 **Stopped**: Inference stopped | |
| - 🟠 **Initializing**: Session being set up | |
| ### Smart Session Control | |
| The UI now provides intelligent feedback: | |
| - ℹ️ **Already Running**: When trying to start a running session | |
| - ℹ️ **Already Stopped**: When trying to stop a stopped session | |
| - 💡 **Smart Suggestions**: Context-aware tips based on current status | |
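The status indicators and the "already running / already stopped" feedback can be read as a small state machine. The sketch below is one hedged interpretation: the `SessionStatus` enum, the function names and the message strings are hypothetical and not taken from the project's code.

```python
from enum import Enum


class SessionStatus(Enum):
    INITIALIZING = "initializing"
    READY = "ready"
    RUNNING = "running"
    STOPPED = "stopped"


def request_start(status: SessionStatus) -> tuple:
    """Return (new_status, message) for a start request."""
    if status is SessionStatus.RUNNING:
        return status, "Already running"
    if status is SessionStatus.INITIALIZING:
        return status, "Session is still initializing"
    # READY and STOPPED sessions can both be (re)started.
    return SessionStatus.RUNNING, "Inference started"


def request_stop(status: SessionStatus) -> tuple:
    """Return (new_status, message) for a stop request."""
    if status is SessionStatus.STOPPED:
        return status, "Already stopped"
    if status is not SessionStatus.RUNNING:
        return status, "Session has not been started"
    return SessionStatus.STOPPED, "Inference stopped"
```

Both functions leave the status unchanged when the request makes no sense in the current state, which is what lets the UI answer with a hint instead of an error.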
| ### Performance Metrics | |
| - **Inferences**: Total number of model inferences performed | |
| - **Commands Sent**: Joint commands sent to robot | |
| - **Queue Length**: Actions waiting in the queue | |
| - **Errors**: Number of errors encountered | |
| - **Data Flow**: Images and joint states received | |
| ## 🐳 Docker Usage | |
| ### Build the Image | |
| ```bash | |
| cd services/inference-server | |
| docker build -t inference-server . | |
| ``` | |
| ### Run the Container | |
| ```bash | |
| # Basic usage | |
| docker run -p 7860:7860 inference-server | |
| # With environment variables | |
| docker run -p 7860:7860 \ | |
| -e DEFAULT_ARENA_SERVER_URL=http://your-server.com \ | |
| -e DEFAULT_MODEL_PATH=./checkpoints/your-model \ | |
| inference-server | |
| # With GPU support | |
| docker run --gpus all -p 7860:7860 inference-server | |
| ``` | |
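On the Python side, the two environment variables passed with `-e` above could be read as follows. This is a sketch under assumptions: the `Defaults` class and `load_defaults` function are hypothetical, and the fallback values are borrowed from examples elsewhere in this README (the Arena port in the architecture diagram and the sample model path), not from the server's real defaults.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Defaults:
    arena_server_url: str
    model_path: str


def load_defaults(env=None) -> Defaults:
    """Read the container's default settings from environment variables."""
    env = os.environ if env is None else env
    return Defaults(
        # ASSUMPTION: fallback values are illustrative, not the server's own.
        arena_server_url=env.get("DEFAULT_ARENA_SERVER_URL", "http://localhost:8000"),
        model_path=env.get("DEFAULT_MODEL_PATH", "./checkpoints/act_so101_beyond"),
    )
```

Passing a plain dictionary as `env` makes the function easy to exercise without touching the real environment.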
| ## 🔧 Troubleshooting | |
| ### Common Issues | |
| 1. **Server Won't Start** | |
| - Check if port 8001 is available (port 7860 in simple integrated mode) | |
| - Verify model path exists and is accessible | |
| - Check dependencies are installed correctly | |
| 2. **Session Creation Fails** | |
| - Verify model path is correct | |
| - Check Arena server is running on specified URL | |
| - Ensure camera names match your robot configuration | |
| 3. **Poor Performance** | |
| - Monitor system resources in the debug panel | |
| - Check if GPU is being used for inference | |
| - Verify control/inference frequency settings | |
| 4. **Connection Issues** | |
| - Verify Arena server URL is correct | |
| - Check network connectivity | |
| - Ensure workspace/room IDs are valid | |
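For the first item above, a quick way to check whether a port is free is to try binding to it. This is a generic standard-library sketch, not a tool shipped with the server; `port_is_free` is a hypothetical helper name.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Binding fails with OSError when another process holds the port.
            sock.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    # Ports used in this README: AI server (8001) and Gradio UI (7860).
    for port in (8001, 7860):
        state = "free" if port_is_free(port) else "in use"
        print(f"port {port}: {state}")
```

The check only covers the given host interface, so a server bound to a different address may not be detected.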
| ### Debug Mode | |
| Enable debug mode for detailed logging: | |
| ```bash | |
| uv run python -m inference_server.cli --debug | |
| ``` | |
| ### System Requirements | |
| - **CPU**: Multi-core recommended for 30Hz control | |
| - **Memory**: 8GB+ RAM recommended | |
| - **GPU**: CUDA-compatible GPU for fast inference (optional but recommended) | |
| - **Network**: Stable connection to Arena server | |
| ## Architecture | |
| ### Integrated Mode (Recommended) | |
| ``` | |
| ┌─────────────────────────────────────┐    ┌─────────────────┐ | |
| │         Single Application          │    │  LeRobot Arena  │ | |
| │  ┌─────────────┐  ┌─────────────┐   │◄──►│   (Port 8000)   │ | |
| │  │ Gradio UI   │  │ AI Server   │   │    └─────────────────┘ | |
| │  │    (/)      │  │  (/api/*)   │   │             │ | |
| │  └─────────────┘  └─────────────┘   │             │ | |
| │             (Port 7860)             │       Robot/Cameras | |
| └─────────────────────────────────────┘ | |
|                    │ | |
|               Web Browser | |
| ``` | |
| ### Development Mode | |
| ``` | |
| ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐ | |
| │   Gradio UI     │    │   AI Server     │    │  LeRobot Arena  │ | |
| │  (Port 7860)    │◄──►│  (Port 8001)    │◄──►│   (Port 8000)   │ | |
| └─────────────────┘    └─────────────────┘    └─────────────────┘ | |
|          │                      │                      │ | |
|          │                      │                      │ | |
|     Web Browser             ACT Model            Robot/Cameras | |
|                             Inference | |
| ``` | |
| ### Data Flow | |
| 1. **Camera Data**: Robot cameras → Arena → AI Server | |
| 2. **Joint State**: Robot joints → Arena → AI Server | |
| 3. **AI Inference**: Images + Joint State → ACT Model → Actions | |
| 4. **Control Commands**: Actions → Arena → Robot | |
| ### Session Lifecycle | |
| 1. **Create**: Set up rooms in Arena, load ACT model | |
| 2. **Start**: Begin inference loop (3Hz) and control loop (30Hz) | |
| 3. **Running**: Process camera/joint data, generate actions | |
| 4. **Stop**: Pause inference, maintain connections | |
| 5. **Delete**: Clean up resources, disconnect from Arena | |
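The two loop rates in step 2 imply that each inference has to supply several actions: at the stated 3 Hz inference and 30 Hz control, one model call must cover at least 10 control ticks. The simulation below is a hedged sketch of that relationship, not the server's implementation; `ActionQueue` and `simulate` are hypothetical names, and new chunks are simply appended to the queue.

```python
from collections import deque

INFERENCE_HZ = 3   # inference loop rate from the lifecycle above
CONTROL_HZ = 30    # control loop rate from the lifecycle above
MIN_CHUNK = -(-CONTROL_HZ // INFERENCE_HZ)  # ceiling division: 10 actions


class ActionQueue:
    """FIFO buffer between the slow inference loop and the fast control loop."""

    def __init__(self):
        self._actions = deque()

    def push_chunk(self, chunk):
        """Append one chunk of predicted actions (ASSUMPTION: append, not replace)."""
        self._actions.extend(chunk)

    def pop_next(self):
        """Return the next action, or None if the queue has run dry."""
        return self._actions.popleft() if self._actions else None

    def __len__(self):
        return len(self._actions)


def simulate(seconds=1, chunk_size=MIN_CHUNK):
    """Step the control loop tick by tick and count sent versus starved ticks."""
    queue = ActionQueue()
    sent = starved = 0
    ticks_per_inference = CONTROL_HZ // INFERENCE_HZ
    for tick in range(seconds * CONTROL_HZ):
        if tick % ticks_per_inference == 0:
            # Stand-in for one ACT model call producing a chunk of actions.
            queue.push_chunk([f"action-{tick}-{i}" for i in range(chunk_size)])
        if queue.pop_next() is None:
            starved += 1
        else:
            sent += 1
    return sent, starved
```

With chunks of 10 actions the control loop never starves, while chunks of 5 leave half of the ticks without a command. Chunks larger than 10 make the queue grow, which is one reason to watch the **Queue Length** metric.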
| ## 🤝 Contributing | |
| 1. Follow the existing code style | |
| 2. Add tests for new features | |
| 3. Update documentation | |
| 4. Submit pull requests | |
| ## License | |
| This project follows the same license as the parent LeRobot Arena project. | |
| For more information, see the [LeRobot Arena documentation](../../README.md). | |