# 🤖 Integrated Inference Server

This is an integrated ACT Model Inference Server that combines FastAPI and Gradio on a single port, making it well suited for both deployment and development.
## 🚀 Quick Start

```bash
# Install dependencies
uv sync

# Run the integrated server
uv run python launch_simple.py --host 0.0.0.0 --port 7860
```
## 📡 Access Points

Once running, you can access:

- 🎨 **Gradio UI**: http://localhost:7860/
- 📖 **API Documentation**: http://localhost:7860/api/docs
- 🔍 **Health Check**: http://localhost:7860/api/health
- 📋 **OpenAPI Schema**: http://localhost:7860/api/openapi.json
## 🏗️ Architecture

### Integration Approach

- **Single Process**: Everything runs in one Python process
- **Single Port**: Both the API and the UI are served on the same port (7860); see the sketch below
- **FastAPI at `/api`**: Full REST API with automatic documentation
- **Gradio at `/`**: User-friendly web interface
- **Direct Session Management**: The UI communicates directly with the session manager (no HTTP overhead)
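A minimal sketch of this single-port wiring is shown below. It relies on FastAPI's standard sub-application mounting and Gradio's `mount_gradio_app` helper; the `create_gradio_ui` helper and the endpoint bodies are illustrative assumptions, while the actual wiring lives in `simple_integrated.py`.

```python
# Sketch of mounting the REST API at /api and the Gradio UI at /
# (names other than FastAPI and gr.mount_gradio_app are assumptions).
from fastapi import FastAPI
import gradio as gr

# Core REST API (session endpoints, health check, OpenAPI docs)
api = FastAPI(title="ACT Inference API", docs_url="/docs", openapi_url="/openapi.json")

@api.get("/health")
def health():
    return {"status": "ok"}

# Top-level app: REST API under /api, Gradio UI at /
app = FastAPI()
app.mount("/api", api)

def create_gradio_ui() -> gr.Blocks:  # hypothetical helper
    with gr.Blocks() as demo:
        gr.Markdown("# ACT Model Inference Server")
    return demo

app = gr.mount_gradio_app(app, create_gradio_ui(), path="/")
# The combined app is then served by launch_simple.py (see below).
```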
### Key Components

**`simple_integrated.py`** — main integration logic:
- Creates the FastAPI app and mounts it at `/api`
- Creates the Gradio interface and mounts it at `/`
- Provides `SimpleServerManager` for direct session access

**`launch_simple.py`** — entry point script (see the sketch after this list):
- Handles command-line arguments
- Starts the integrated application

**`main.py`** — core FastAPI application:
- Session management endpoints
- Policy loading and inference
- OpenAPI documentation
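A minimal entry point could look like the following. The `--host`/`--port` flags mirror the Quick Start command above; the `create_app` factory imported from `simple_integrated` is an assumed name for illustration.

```python
# launch_simple.py -- minimal entry-point sketch (create_app is an assumed name).
import argparse

import uvicorn

from simple_integrated import create_app  # assumed factory returning the combined app

def main() -> None:
    parser = argparse.ArgumentParser(description="Launch the integrated inference server")
    parser.add_argument("--host", default="0.0.0.0", help="Interface to bind")
    parser.add_argument("--port", type=int, default=7860, help="Port shared by the API and the UI")
    args = parser.parse_args()

    # Serve the FastAPI + Gradio app on a single port
    uvicorn.run(create_app(), host=args.host, port=args.port)

if __name__ == "__main__":
    main()
```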
## 🔧 Features

### For UI Users

- ✅ **Simple Interface**: Create and manage AI sessions through the web UI
- ✅ **Real-time Status**: Live session monitoring and control
- ✅ **Direct Performance**: No HTTP overhead for UI operations
### For API Users

- ✅ **Full REST API**: Complete programmatic access
- ✅ **Interactive Docs**: Automatic Swagger/OpenAPI documentation
- ✅ **Standard Endpoints**: `/sessions`, `/health`, etc.
- ✅ **CORS Enabled**: Ready for frontend integration (see the sketch after this list)
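CORS is typically enabled with FastAPI's standard middleware. A minimal sketch follows; the permissive `allow_origins=["*"]` is an assumption and should be tightened for production.

```python
# Enable CORS on the API so browser frontends can call it from another origin.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

api = FastAPI()
api.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],   # assumption: restrict to known origins in production
    allow_methods=["*"],
    allow_headers=["*"],
)
```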
### For Deployment

- ✅ **Single Port**: Easy to deploy behind a reverse proxy
- ✅ **Docker Ready**: Dockerfile included
- ✅ **Health Checks**: Built-in monitoring endpoints
- ✅ **HuggingFace Spaces**: Well suited for cloud deployment
## 📚 API Usage Examples

### Health Check

```bash
curl http://localhost:7860/api/health
```

### Create Session

```bash
curl -X POST http://localhost:7860/api/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "my-robot",
    "policy_path": "./checkpoints/act_so101_beyond",
    "camera_names": ["front"],
    "arena_server_url": "http://localhost:8000"
  }'
```

### Start Inference

```bash
curl -X POST http://localhost:7860/api/sessions/my-robot/start
```

### Get Session Status

```bash
curl http://localhost:7860/api/sessions/my-robot
```
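The same create → start → poll flow can be scripted from Python. A minimal sketch using the `requests` library, with payload values mirroring the curl examples above:

```python
# Create a session, start inference, and poll its status via the REST API.
import requests

BASE = "http://localhost:7860/api"

session = {
    "session_id": "my-robot",
    "policy_path": "./checkpoints/act_so101_beyond",
    "camera_names": ["front"],
    "arena_server_url": "http://localhost:8000",
}

requests.post(f"{BASE}/sessions", json=session).raise_for_status()
requests.post(f"{BASE}/sessions/my-robot/start").raise_for_status()

status = requests.get(f"{BASE}/sessions/my-robot").json()
print(status)
```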
## 🐳 Docker Usage

```bash
# Build
docker build -t inference-server .

# Run
docker run -p 7860:7860 inference-server
```
## 🧪 Testing

Run the integration test to verify everything works:

```bash
uv run python test_integration.py
```
## 💡 Development Tips

### Use Both Interfaces

- **Development**: Use the Gradio UI for quick testing and setup
- **Production**: Use the REST API for automated systems
- **Integration**: Both can run simultaneously
### Session Management

- The UI uses direct session manager access (faster)
- The API uses HTTP endpoints (standard REST)
- Both share the same underlying session data (see the sketch below)
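Conceptually, both access paths end at the same in-process session store. The sketch below is illustrative only: `SimpleServerManager` is named above, but its fields, methods, and the endpoint shown are assumptions.

```python
# Both access paths share one session store (details are assumptions,
# only the SimpleServerManager name comes from this README).
from fastapi import FastAPI

class SimpleServerManager:
    """Holds session state shared by the Gradio UI and the REST API."""
    def __init__(self) -> None:
        self.sessions: dict[str, dict] = {}

    def get_session(self, session_id: str) -> dict | None:
        return self.sessions.get(session_id)

manager = SimpleServerManager()
api = FastAPI()

# API path: a standard REST endpoint wrapping the manager
@api.get("/sessions/{session_id}")
def session_status(session_id: str):
    return manager.get_session(session_id) or {"error": "not found"}

# UI path: Gradio callbacks call the manager directly, no HTTP round trip
def ui_session_status(session_id: str) -> dict:
    return manager.get_session(session_id) or {"error": "not found"}
```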
### Debugging

- Check the logs for startup issues
- Use `/api/health` to verify the API is working
- Visit `/api/docs` for interactive API testing
## 🌟 Benefits of This Approach

- **Flexibility**: Use the UI or the API as needed
- **Performance**: Direct access for the UI, standard REST for the API
- **Deployment**: Single port, single process
- **Documentation**: Auto-generated API docs
- **Development**: Fast iteration with the integrated setup