Quick Start Guide
Unified API Architecture
The project now uses a unified architecture where every interface goes through the REST API.
┌─────────────────────────────────────────────┐
│                                             │
│       Gradio UI (app.py / app_ui.py)        │
│                                             │
└──────────────────┬──────────────────────────┘
                   │
                   │ HTTP/REST
                   │
┌──────────────────▼──────────────────────────┐
│                                             │
│         FastAPI Server (app_api.py)         │
│                                             │
├─────────────────────────────────────────────┤
│  Detection Service                          │
│  ├─ RF-DETR (detection)                     │
│  ├─ CLIP (classification)                   │
│  ├─ OCR (text extraction)                   │
│  └─ BLIP (visual description)               │
└─────────────────────────────────────────────┘
3 Ways to Launch
Option 1: Automatic Launch (Recommended for testing)
One command starts everything:
python app.py
What happens:
- Starts the API in the background (port 8000)
- Waits until the API is ready
- Launches the Gradio interface (port 7860)
- Handles clean shutdown with Ctrl+C
Access:
- Gradio Interface: http://localhost:7860
- API Docs: http://localhost:8000/docs
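The four steps above can be sketched in Python as follows. This is only an illustration of the start, wait, launch, shut down pattern, not the actual contents of app.py: the function names `api_is_up`, `wait_until_ready`, and `launch` are hypothetical, and readiness is assumed to mean that `/health` answers with HTTP 200.

```python
import subprocess
import time
import urllib.request


def api_is_up(health_url: str) -> bool:
    """Return True if a GET on health_url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url, timeout=2) as resp:
            return resp.status == 200
    except OSError:  # URLError, connection refused and timeouts are OSErrors
        return False


def wait_until_ready(probe, timeout_s: float = 60.0, interval_s: float = 0.5) -> bool:
    """Call probe() repeatedly until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval_s)
    return False


def launch(api_cmd, ui_cmd, health_url="http://localhost:8000/health"):
    """Start the API, wait for it, run the UI in the foreground, then stop the API."""
    api = subprocess.Popen(api_cmd)  # API runs in the background
    try:
        if not wait_until_ready(lambda: api_is_up(health_url)):
            raise RuntimeError("API did not become ready in time")
        subprocess.run(ui_cmd, check=False)  # blocks until the UI exits
    except KeyboardInterrupt:
        pass  # Ctrl+C: fall through to the cleanup below
    finally:
        api.terminate()
        try:
            api.wait(timeout=10)
        except subprocess.TimeoutExpired:
            api.kill()  # last resort if the API ignores terminate()
```

Under these assumptions, `launch([sys.executable, "app_api.py"], [sys.executable, "app_ui.py"])` (with `import sys`) would reproduce the behavior described for Option 1.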
Option 2: Manual Launch (2 terminals)
For more control and debugging:
Terminal 1 - API Server:
python app_api.py
Terminal 2 - Gradio UI:
python app_ui.py
Access:
- Gradio Interface: http://localhost:7860
- API Docs: http://localhost:8000/docs
Option 3: API Only
To use only the API (integration, scripts, etc.):
python app_api.py
Test the API:
# Health check
curl http://localhost:8000/health
# Detect elements
curl -X POST "http://localhost:8000/detect" \
-F "image=@screenshot.png" \
-F "confidence_threshold=0.35" \
-F "enable_clip=true" \
-F "enable_ocr=true"
Interactive documentation:
- OpenAPI Docs: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
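For scripts, the same `/detect` call can be made from Python. The sketch below mirrors the form fields of the curl example and assumes the third-party `requests` package is installed; the helper names `build_detect_form` and `detect` are hypothetical, and the shape of the JSON response is not assumed.

```python
import mimetypes
from pathlib import Path


def build_detect_form(confidence_threshold=0.35, enable_clip=True, enable_ocr=True):
    """Form fields as in the curl example; booleans are sent as 'true'/'false'."""
    return {
        "confidence_threshold": str(confidence_threshold),
        "enable_clip": str(enable_clip).lower(),
        "enable_ocr": str(enable_ocr).lower(),
    }


def detect(image_path, base_url="http://localhost:8000", **options):
    """POST an image to /detect and return the decoded JSON response."""
    import requests  # third-party dependency: pip install requests

    path = Path(image_path)
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    with path.open("rb") as fh:
        resp = requests.post(
            f"{base_url}/detect",
            files={"image": (path.name, fh, content_type)},
            data=build_detect_form(**options),
            timeout=120,  # generous: the first call may include model loading
        )
    resp.raise_for_status()  # raise on 4xx/5xx instead of returning an error body
    return resp.json()
```

A call such as `detect("screenshot.png", enable_clip=False)` would then correspond to the curl command with `enable_clip=false`.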
Configuration
Environment Variables
API Server:
export UVICORN_HOST="0.0.0.0" # Default: 0.0.0.0
export UVICORN_PORT="8000" # Default: 8000
Gradio UI:
export GRADIO_SERVER_NAME="0.0.0.0" # Default: 0.0.0.0
export GRADIO_SERVER_PORT="7860" # Default: 7860
export CU1_API_URL="http://localhost:8000" # API URL
Example with custom ports:
# API on port 9000, UI on port 9001
export UVICORN_PORT="9000"
export GRADIO_SERVER_PORT="9001"
export CU1_API_URL="http://localhost:9000"
python app.py
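A minimal sketch of how these variables might be read on the Python side is shown below. The `Settings` class and `load_settings` function are hypothetical, and deriving the default `CU1_API_URL` from the API port is a choice made for this sketch; the project itself may use a fixed default.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_host: str
    api_port: int
    ui_host: str
    ui_port: int
    api_url: str


def load_settings(env=None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    api_port = int(env.get("UVICORN_PORT", "8000"))
    return Settings(
        api_host=env.get("UVICORN_HOST", "0.0.0.0"),
        api_port=api_port,
        ui_host=env.get("GRADIO_SERVER_NAME", "0.0.0.0"),
        ui_port=int(env.get("GRADIO_SERVER_PORT", "7860")),
        # Assumption of this sketch: default the URL to the local API port.
        api_url=env.get("CU1_API_URL", f"http://localhost:{api_port}"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise without touching the real environment.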
Quick Tests
Test 1: Make sure the API works
# In one terminal
python app_api.py
# In another terminal
curl http://localhost:8000/health
Expected result:
{
"status": "healthy",
"cuda_available": false,
"device": "cpu"
}
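The same check can be scripted. The sketch below validates only the three fields shown in the expected result; the function names `parse_health` and `check_health` are hypothetical.

```python
import json
import urllib.request

REQUIRED_FIELDS = {"status", "cuda_available", "device"}


def parse_health(body: str) -> dict:
    """Decode a /health response body and check the fields shown above."""
    payload = json.loads(body)
    missing = REQUIRED_FIELDS - set(payload)
    if missing:
        raise ValueError(f"health response is missing fields: {sorted(missing)}")
    if payload["status"] != "healthy":
        raise RuntimeError(f"API reports status {payload['status']!r}")
    return payload


def check_health(base_url: str = "http://localhost:8000") -> dict:
    """Fetch /health from a running API and validate it."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

On a machine without a GPU, `check_health()["device"]` would be expected to match the `"cpu"` value in the sample response.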
Test 2: Test detection via the interface
python app.py
- Open http://localhost:7860
- Upload an image
- Click "Detect Elements"
- Check the results
Test 3: Test detection through the API
# Start the API
python app_api.py
# In another terminal, test with curl
curl -X POST "http://localhost:8000/detect" \
-F "image=@your_image.png" \
-F "confidence_threshold=0.35" \
-F "enable_ocr=true" \
| jq .
Troubleshooting
Issue: "Connection Error - Cannot connect to API"
Solution:
- Make sure the API is running:
  curl http://localhost:8000/health
- Check the ports: no conflict with other apps
- Check the API logs for errors
Issue: "Port already in use"
Solution:
# Find the process that uses the port
lsof -i :8000 # or :7860
# Kill the process
kill -9 <PID>
# Or use a different port
export UVICORN_PORT="9000"
export GRADIO_SERVER_PORT="9001"
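Where `lsof` is not available, a port can also be probed from Python before launching. This is a rough sketch with hypothetical helper names; it treats a port as busy when something accepts TCP connections on it, which is an approximation of what the server's bind would encounter.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success, i.e. when a listener is present.
        return sock.connect_ex((host, port)) != 0


def first_free_port(start: int, attempts: int = 20) -> int:
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, min(start + attempts, 65536)):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port found starting at {start}")
```

For example, `first_free_port(8000)` could supply a value for `UVICORN_PORT` when the default is taken.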
Issue: "Module not found"
Solution:
# Reinstall dependencies
pip install -r requirements.txt
Issue: Models slow to load
Reason: The first startup downloads the models.
Solution: Wait for the downloads to finish; the models are cached after the first download. Approximate sizes:
- RF-DETR model (a few MB)
- CLIP model (~600 MB)
- BLIP model (~1 GB)
- EasyOCR models (~100 MB)
Monitoring
API logs
The logs appear in the terminal where you launched app_api.py
UI logs
The logs appear in the terminal where you launched app.py or app_ui.py
Metrics
Visit http://localhost:8000/docs to explore the API endpoints interactively, and query http://localhost:8000/health for the service status and active device.
Benefits of the Unified Architecture
- Single code path → Easier to maintain
- Consistent behavior → Same results everywhere
- Easy to test → Only one API to test
- Scalable → Can separate API and UI on different servers
- Simplified debugging → Logs centralized in the API
For Developers
Code Architecture
.
├── app.py                      # Unified launcher (API + UI)
├── app_api.py                  # FastAPI server
├── app_ui.py                   # Gradio UI client (manual)
│
├── api/
│   └── endpoints.py            # FastAPI endpoints
│
├── detection/
│   ├── service.py              # Detection service
│   ├── service_factory.py      # Singleton pattern
│   ├── image_utils.py          # Image utilities
│   ├── ocr_handler.py          # OCR-only processing
│   └── response_builder.py     # Response formatting
│
└── ui/
    ├── detection_wrapper.py    # Detection wrappers
    ├── gradio_interface.py     # Gradio interface (API client)
    └── shared_interface.py     # Shared UI components
Request Flow
1. User uploads image in Gradio
   ↓
2. `detect_with_api()` sends an HTTP POST to `/detect`
   ↓
3. API endpoint validates the request
   ↓
4. `DetectionService.analyze()` processes the image
   ↓
5. Response formatted with `response_builder`
   ↓
6. JSON returned to Gradio UI
   ↓
7. UI displays annotated image + results
Notes
- Thread Safety: The service uses a singleton but passes parameters directly to analyze() to avoid race conditions
- Performance: The first call is slow (model loading), then fast
- Memory: Models use ~2-3 GB of RAM
- GPU: Automatic CUDA/MPS detection if available
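The thread-safety note can be illustrated with a small stand-in. The class below is not the project's `DetectionService`; it is a hypothetical stub whose `analyze()` filters a list of scores, used only to show why passing per-request parameters as arguments, rather than storing them on the shared instance, keeps concurrent requests independent.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DetectionServiceSketch:
    """Stand-in service: holds only read-only state after construction."""

    def __init__(self):
        self.model_name = "stub-model"  # loaded once, never mutated afterwards

    def analyze(self, scores, confidence_threshold):
        # The threshold is an argument, not an attribute, so concurrent
        # calls cannot overwrite each other's setting.
        return [s for s in scores if s >= confidence_threshold]


_instance = None
_instance_lock = threading.Lock()


def get_service():
    """Create the shared instance on first use (double-checked locking)."""
    global _instance
    if _instance is None:
        with _instance_lock:
            if _instance is None:
                _instance = DetectionServiceSketch()
    return _instance


def run_concurrently(scores, thresholds, workers=8):
    """Call analyze() from several threads, pairing each threshold with its result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda t: (t, get_service().analyze(scores, t)), thresholds))
```

Had the threshold been stored as `self.confidence_threshold` before each call, two simultaneous requests could observe each other's value; with arguments, every result depends only on its own inputs.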
Next Steps
- Test locally: python app.py
- Explore the API: http://localhost:8000/docs
- Customize: Adjust parameters in the interface
- Deploy: See DEPLOYMENT.md for production
Happy testing!