
# 🤖 Integrated Inference Server

This is an integrated ACT model inference server that combines FastAPI and Gradio on a single port, making it well suited to both deployment and development.

## 🚀 Quick Start

```bash
# Install dependencies
uv sync

# Run the integrated server
uv run python launch_simple.py --host 0.0.0.0 --port 7860
```

## 📡 Access Points

Once running, you can access:

- **Web UI (Gradio)**: http://localhost:7860/
- **REST API**: http://localhost:7860/api
- **API docs**: http://localhost:7860/api/docs
- **Health check**: http://localhost:7860/api/health

πŸ—οΈ Architecture

Integration Approach

  • Single Process: Everything runs in one Python process
  • Single Port: Both API and UI on the same port (7860)
  • FastAPI at /api: Full REST API with automatic documentation
  • Gradio at /: User-friendly web interface
  • Direct Session Management: UI communicates directly with session manager (no HTTP overhead)

### Key Components

1. **`simple_integrated.py`**: Main integration logic
   - Creates the FastAPI app and mounts it at `/api`
   - Creates the Gradio interface and mounts it at `/`
   - Provides `SimpleServerManager` for direct session access
2. **`launch_simple.py`**: Entry-point script
   - Handles command-line arguments
   - Starts the integrated application
3. **`main.py`**: Core FastAPI application
   - Session management endpoints
   - Policy loading and inference
   - OpenAPI documentation

## 🔧 Features

### For UI Users

- ✅ **Simple Interface**: Create and manage AI sessions through the web UI
- ✅ **Real-time Status**: Live session monitoring and control
- ✅ **Direct Performance**: No HTTP overhead for UI operations

### For API Users

- ✅ **Full REST API**: Complete programmatic access
- ✅ **Interactive Docs**: Automatic Swagger/OpenAPI documentation
- ✅ **Standard Endpoints**: `/sessions`, `/health`, etc.
- ✅ **CORS Enabled**: Ready for frontend integration

### For Deployment

- ✅ **Single Port**: Easy to deploy behind a reverse proxy
- ✅ **Docker Ready**: Dockerfile included
- ✅ **Health Checks**: Built-in monitoring endpoints
- ✅ **HuggingFace Spaces**: Well suited to cloud deployment

## 📋 API Usage Examples

### Health Check

```bash
curl http://localhost:7860/api/health
```

### Create Session

```bash
curl -X POST http://localhost:7860/api/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "my-robot",
    "policy_path": "./checkpoints/act_so101_beyond",
    "camera_names": ["front"],
    "arena_server_url": "http://localhost:8000"
  }'
```

### Start Inference

```bash
curl -X POST http://localhost:7860/api/sessions/my-robot/start
```

### Get Session Status

```bash
curl http://localhost:7860/api/sessions/my-robot
```
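The same "Create Session" request can be built from Python with only the standard library. This sketch mirrors the curl example; nothing is sent over the network until `urlopen` is called:

```python
import json
import urllib.request

BASE = "http://localhost:7860/api"

# Same payload as the curl example above
session_config = {
    "session_id": "my-robot",
    "policy_path": "./checkpoints/act_so101_beyond",
    "camera_names": ["front"],
    "arena_server_url": "http://localhost:8000",
}

req = urllib.request.Request(
    f"{BASE}/sessions",
    data=json.dumps(session_config).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, send it with:
#   urllib.request.urlopen(req)
```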

## 🐳 Docker Usage

```bash
# Build
docker build -t inference-server .

# Run
docker run -p 7860:7860 inference-server
```

πŸ” Testing

Run the integration test to verify everything works:

uv run python test_integration.py

## 💡 Development Tips

### Use Both Interfaces

- **Development**: Use the Gradio UI for quick testing and setup
- **Production**: Use the REST API for automated systems
- **Integration**: Both can run simultaneously

### Session Management

- The UI uses direct session-manager access (faster)
- The API uses HTTP endpoints (standard REST)
- Both share the same underlying session data

### Debugging

- Check the logs for startup issues
- Use `/api/health` to verify the API is working
- Visit `/api/docs` for interactive API testing

## 🚀 Benefits of This Approach

1. **Flexibility**: Use the UI or API as needed
2. **Performance**: Direct access for the UI, standard REST for the API
3. **Deployment**: Single port, single process
4. **Documentation**: Auto-generated API docs
5. **Development**: Fast iteration with an integrated setup