
# 🤖 Integrated Inference Server

This is an integrated ACT model inference server that combines FastAPI and Gradio on a single port, making it well suited to both deployment and development.

## 🚀 Quick Start

```bash
# Install dependencies
uv sync

# Run the integrated server
uv run python launch_simple.py --host 0.0.0.0 --port 7860
```

## 📡 Access Points

Once running, you can access:

- **Web UI (Gradio)**: http://localhost:7860/
- **REST API**: http://localhost:7860/api
- **API docs**: http://localhost:7860/api/docs
- **Health check**: http://localhost:7860/api/health

πŸ—οΈ Architecture

Integration Approach

  • Single Process: Everything runs in one Python process
  • Single Port: Both API and UI on the same port (7860)
  • FastAPI at /api: Full REST API with automatic documentation
  • Gradio at /: User-friendly web interface
  • Direct Session Management: UI communicates directly with session manager (no HTTP overhead)

### Key Components

1. **`simple_integrated.py`**: Main integration logic
   - Creates the FastAPI app and mounts it at `/api`
   - Creates the Gradio interface and mounts it at `/`
   - Provides `SimpleServerManager` for direct session access
2. **`launch_simple.py`**: Entry-point script
   - Handles command-line arguments
   - Starts the integrated application
3. **`main.py`**: Core FastAPI application
   - Session management endpoints
   - Policy loading and inference
   - OpenAPI documentation

## 🔧 Features

### For UI Users

- ✅ **Simple Interface**: Create and manage AI sessions through the web UI
- ✅ **Real-time Status**: Live session monitoring and control
- ✅ **Direct Performance**: No HTTP overhead for UI operations

### For API Users

- ✅ **Full REST API**: Complete programmatic access
- ✅ **Interactive Docs**: Automatic Swagger/OpenAPI documentation
- ✅ **Standard Endpoints**: `/sessions`, `/health`, etc.
- ✅ **CORS Enabled**: Ready for frontend integration

### For Deployment

- ✅ **Single Port**: Easy to deploy behind a reverse proxy
- ✅ **Docker Ready**: Dockerfile included
- ✅ **Health Checks**: Built-in monitoring endpoints
- ✅ **HuggingFace Spaces**: Well suited to cloud deployment

## 📋 API Usage Examples

### Health Check

```bash
curl http://localhost:7860/api/health
```

### Create Session

```bash
curl -X POST http://localhost:7860/api/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "my-robot",
    "policy_path": "./checkpoints/act_so101_beyond",
    "camera_names": ["front"],
    "arena_server_url": "http://localhost:8000"
  }'
```

### Start Inference

```bash
curl -X POST http://localhost:7860/api/sessions/my-robot/start
```

### Get Session Status

```bash
curl http://localhost:7860/api/sessions/my-robot
```
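The same "Create Session" request can be built from Python with only the standard library. This sketch mirrors the curl example; nothing is sent over the network until `urlopen` is called:

```python
import json
import urllib.request

BASE = "http://localhost:7860/api"

# Same payload as the curl example above
session_config = {
    "session_id": "my-robot",
    "policy_path": "./checkpoints/act_so101_beyond",
    "camera_names": ["front"],
    "arena_server_url": "http://localhost:8000",
}

req = urllib.request.Request(
    f"{BASE}/sessions",
    data=json.dumps(session_config).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, send it with:
#   urllib.request.urlopen(req)
```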

## 🐳 Docker Usage

```bash
# Build
docker build -t inference-server .

# Run
docker run -p 7860:7860 inference-server
```

πŸ” Testing

Run the integration test to verify everything works:

uv run python test_integration.py

## 💡 Development Tips

### Use Both Interfaces

- **Development**: Use the Gradio UI for quick testing and setup
- **Production**: Use the REST API for automated systems
- **Integration**: Both can run simultaneously

### Session Management

- The UI uses direct session-manager access (faster)
- The API uses HTTP endpoints (standard REST)
- Both share the same underlying session data

### Debugging

- Check the logs for startup issues
- Use `/api/health` to verify the API is working
- Visit `/api/docs` for interactive API testing

## 🚀 Benefits of This Approach

1. **Flexibility**: Use the UI or API as needed
2. **Performance**: Direct access for the UI, standard REST for the API
3. **Deployment**: Single port, single process
4. **Documentation**: Auto-generated API docs
5. **Development**: Fast iteration with an integrated setup