abdelkader committed on
Commit
bf5eae6
·
1 Parent(s): fdc3a9e
API_USAGE.md DELETED
@@ -1,233 +0,0 @@
# 🔌 API Usage Guide - Hugging Face Spaces

On Hugging Face Spaces, **only Gradio is exposed publicly**. The FastAPI server (port 8000) is not reachable from outside.

**However, Gradio automatically exposes a native REST API!** 🎉

## 📡 Accessing the API from Outside

### Option 1: Native Gradio API (Recommended)

Gradio automatically exposes a REST API at the `/api/predict` endpoint.

#### Python with `gradio_client`:

```python
from gradio_client import Client

# Replace with your Space URL
client = Client("AI-DrivenTesting/CU1-X")

# Call the API
result = client.predict(
    "screenshot.png",                   # image (filepath or PIL Image)
    0.35,                               # confidence_threshold (float)
    2,                                  # thickness (int)
    True,                               # enable_clip (bool)
    True,                               # enable_ocr (bool)
    False,                              # enable_blip (bool)
    False,                              # ocr_only (bool)
    "Only image & button",              # blip_scope (str)
    False,                              # preprocess (bool)
    "RF-DETR Optimized (Recommended)",  # preprocess_mode (str)
    "standard",                         # preprocess_preset (str)
    api_name="/predict"
)

# Result: (annotated_image, summary, detections_json)
annotated_image, summary, detections_json = result
print(detections_json)
```

#### REST API (curl):

```bash
# For a public Space.
# The first "data" entry must be a public image URL or a base64-encoded
# image (JSON allows no inline comments, so the note lives up here).
curl -X POST "https://AI-DrivenTesting-CU1-X.hf.space/api/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "data": [
      "screenshot.png",
      0.35,
      2,
      true,
      true,
      false,
      false,
      "Only image & button",
      false,
      "RF-DETR Optimized (Recommended)",
      "standard"
    ]
  }'
```

**Note:** For images, you must either:
- use a public URL pointing to the image,
- encode the image as base64, or
- use `gradio_client`, which handles this automatically.

#### REST API with Python `requests`:

```python
import base64

import requests

# Encode the image as base64
def image_to_base64(image_path):
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode()

# Call the API
url = "https://AI-DrivenTesting-CU1-X.hf.space/api/predict"
image_b64 = image_to_base64("screenshot.png")

response = requests.post(
    url,
    json={
        "data": [
            f"data:image/png;base64,{image_b64}",
            0.35,
            2,
            True,
            True,
            False,
            False,
            "Only image & button",
            False,
            "RF-DETR Optimized (Recommended)",
            "standard"
        ]
    },
    timeout=120
)

result = response.json()
print(result)
```
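Going the other way, the annotated image in the response comes back base64-encoded, possibly with a `data:image/...;base64,` prefix like the request above. A minimal stdlib-only sketch for turning it back into raw PNG bytes; the prefix handling mirrors the request format and is an assumption about the response shape:

```python
import base64

def data_uri_to_bytes(data: str) -> bytes:
    # Strip an optional "data:image/png;base64," style prefix, then decode.
    if data.startswith("data:") and "," in data:
        data = data.split(",", 1)[1]
    return base64.b64decode(data)

# Round trip: encode a PNG signature, wrap it as a data URI, decode it back.
payload = "data:image/png;base64," + base64.b64encode(b"\x89PNG\r\n").decode()
print(data_uri_to_bytes(payload))  # → b'\x89PNG\r\n'
```

From there, write the bytes to a file (`open("annotated.png", "wb")`) or open them with `PIL.Image.open(io.BytesIO(...))`.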

### Option 2: FastAPI (Internal Only)

The FastAPI server on port 8000 is **NOT reachable from outside** the HF Space.

It only works:
- ✅ Locally (`python app.py`)
- ✅ Between the Space's internal processes
- ❌ **NOT from outside the Space**

## 🔑 Authentication

### Public Spaces
- No authentication required
- The API is directly accessible

### Private Spaces
- Requires a Hugging Face token
- Add the header: `Authorization: Bearer <HF_TOKEN>`

```python
from gradio_client import Client

client = Client(
    "AI-DrivenTesting/CU1-X",
    hf_token="your_hf_token_here"  # For private Spaces
)
```

## 📊 API Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `image` | file/str | Image to analyze | - |
| `confidence_threshold` | float | Confidence threshold (0.1-0.9) | 0.35 |
| `thickness` | int | Bounding-box thickness (1-6) | 2 |
| `enable_clip` | bool | Enable CLIP classification | False |
| `enable_ocr` | bool | Enable OCR text extraction | True |
| `enable_blip` | bool | Enable BLIP descriptions | False |
| `ocr_only` | bool | OCR-only mode (skips detection) | False |
| `blip_scope` | str | BLIP scope ("Only image & button" or "All elements") | "Only image & button" |
| `preprocess` | bool | Enable preprocessing | False |
| `preprocess_mode` | str | Preprocessing mode | "RF-DETR Optimized (Recommended)" |
| `preprocess_preset` | str | Preprocessing preset | "standard" |

## 📝 Response Format

```json
{
  "annotated_image": "base64_encoded_image",
  "summary": "Markdown summary text",
  "detections_json": {
    "success": true,
    "detections": [...],
    "total_detections": 10,
    "image_size": {"width": 1080, "height": 1920},
    "parameters": {...},
    "type_distribution": {...}
  }
}
```
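A small sketch of consuming the `detections_json` payload: the top-level fields follow the response format shown here, but the per-detection `"type"` key is an assumption about the detection entries, not documented above:

```python
from collections import Counter

def summarize(detections_json: dict) -> str:
    # Count detected element types; the per-detection "type" key is assumed.
    counts = Counter(d["type"] for d in detections_json.get("detections", []))
    total = detections_json.get("total_detections", sum(counts.values()))
    parts = ", ".join(f"{t}: {n}" for t, n in sorted(counts.items()))
    return f"{total} elements ({parts})"

# Toy payload in the documented shape
example = {
    "success": True,
    "detections": [{"type": "button"}, {"type": "button"}, {"type": "image"}],
    "total_detections": 3,
}
print(summarize(example))  # → 3 elements (button: 2, image: 1)
```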

## 🚀 Complete Examples

### Example 1: Simple Detection

```python
from gradio_client import Client

client = Client("AI-DrivenTesting/CU1-X")

result = client.predict(
    "screenshot.png",
    0.35, 2, False, True, False, False, "Only image & button",
    False, "RF-DETR Optimized (Recommended)", "standard",
    api_name="/predict"
)

annotated_image, summary, detections = result
print(f"Found {detections['total_detections']} elements")
```

### Example 2: Full Detection with CLIP

```python
result = client.predict(
    "screenshot.png",
    0.35, 2, True, True, False, False, "Only image & button",
    False, "RF-DETR Optimized (Recommended)", "standard",
    api_name="/predict"
)
```

### Example 3: OCR Only

```python
result = client.predict(
    "screenshot.png",
    0.35, 2, False, True, False, True, "Only image & button",
    False, "RF-DETR Optimized (Recommended)", "standard",
    api_name="/predict"
)
```

## ⚠️ HF Spaces Limitations

1. **Timeout:** 60 seconds by default (can be increased in Settings)
2. **Memory:** limited by the chosen hardware
3. **CPU/GPU:** performance depends on the selected hardware
4. **FastAPI server:** not reachable from outside

## 🔗 Useful Links

- [Gradio Client Docs](https://www.gradio.app/guides/getting-started-with-the-python-client)
- [HF Spaces API Docs](https://huggingface.co/docs/hub/spaces-sdks-gradio#api-tab)
- [HF Authentication](https://huggingface.co/docs/hub/security-tokens)

## 💡 Tips

- Use `gradio_client` for better image handling
- For large files, use public URLs
- Enable preprocessing for consistent results across devices
- OCR-only mode is faster if you just want the text
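Since the long positional argument list is easy to get wrong, here is a small hypothetical helper (not part of the project) that builds it from keyword arguments, using the defaults from the parameter table:

```python
def build_predict_args(
    image,
    confidence_threshold=0.35,
    thickness=2,
    enable_clip=False,
    enable_ocr=True,
    enable_blip=False,
    ocr_only=False,
    blip_scope="Only image & button",
    preprocess=False,
    preprocess_mode="RF-DETR Optimized (Recommended)",
    preprocess_preset="standard",
):
    # Order must match the /predict signature documented above.
    return [
        image, confidence_threshold, thickness, enable_clip, enable_ocr,
        enable_blip, ocr_only, blip_scope, preprocess, preprocess_mode,
        preprocess_preset,
    ]

args = build_predict_args("screenshot.png", enable_clip=True)
print(len(args), args[3])  # → 11 True
```

Usage would then be `client.predict(*build_predict_args("screenshot.png"), api_name="/predict")`.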
DEPLOYMENT.md DELETED
@@ -1,164 +0,0 @@
# 🚀 Hugging Face Spaces Deployment Guide

## 📋 Available Scripts

### 1. `check_hf_space.sh` - Pre-Deployment Check

Verifies that everything is ready before deploying:

```bash
./check_hf_space.sh
```

**Checks:**
- ✅ Python version (>= 3.12)
- ✅ Required files (app.py, requirements.txt, etc.)
- ✅ Required directories (detection/, api/, ui/, rfdetr/)
- ✅ model.pth present and tracked by Git LFS
- ✅ Git LFS configuration
- ✅ README.md metadata (YAML frontmatter)
- ✅ Complete requirements.txt
- ✅ Valid Python syntax
- ✅ Git configuration and HF remote
- ✅ Hugging Face CLI login

### 2. `deploy_hf_space.sh` - Automatic Deployment

Deploys automatically to Hugging Face Spaces:

```bash
./deploy_hf_space.sh
```

**Does automatically:**
- ✅ Configures Git LFS for model.pth
- ✅ Checks/configures the HF remote
- ✅ Checks the HF login
- ✅ Updates requirements.txt if needed
- ✅ Stages all files
- ✅ Commits with a descriptive message
- ✅ Pushes to HF Spaces
- ✅ Prints the Space URL

## 🎯 Recommended Workflow

### Step 1: Check

```bash
./check_hf_space.sh
```

**Expected result:**
```
✅ All checks passed! Ready to deploy! ✨
```

### Step 2: Deploy

```bash
./deploy_hf_space.sh
```

The script will:
1. Check Git LFS
2. Configure the remote if needed
3. Check the HF login
4. Commit and push
5. Print the Space URL

### Step 3: Follow the Build

The script prints your Space URL:
```
https://huggingface.co/spaces/YOUR_USERNAME/CU1-X
```

Click **"Logs"** to watch the build live.

## 📡 Accessing the API

Once deployed, your API is reachable via:

### Native Gradio API

```python
from gradio_client import Client

client = Client("AI-DrivenTesting/CU1-X")
result = client.predict(
    "screenshot.png",
    0.35, 2, True, True, False, False, "Only image & button",
    False, "RF-DETR Optimized (Recommended)", "standard",
    api_name="/predict"
)
```

**See:** `API_USAGE.md` for more details

## 🔧 Troubleshooting

### Error: "Git LFS not installed"

```bash
# macOS
brew install git-lfs
git lfs install

# Linux
sudo apt install git-lfs
git lfs install
```

### Error: "Not logged in"

```bash
hf login
# OR
huggingface-cli login
```

### Error: "model.pth not tracked by LFS"

```bash
git lfs track "*.pth"
git add .gitattributes model.pth
git commit -m "Add model with LFS"
```
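A related failure mode after cloning without Git LFS is that `model.pth` is a tiny text pointer file rather than the ~510 MB weights. Git LFS pointer files start with a fixed signature line, which makes them easy to detect; a quick stdlib sketch (not part of the project's scripts):

```python
import tempfile

def is_lfs_pointer(path: str) -> bool:
    # Git LFS pointer files are tiny text files that start with this line;
    # real model weights are large binary files and will not match.
    signature = b"version https://git-lfs.github.com/spec/v1"
    with open(path, "rb") as f:
        return f.read(len(signature)) == signature

# Example: a freshly written pointer-style file is detected as a pointer.
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".pth")
tmp.write(b"version https://git-lfs.github.com/spec/v1\noid sha256:abc\n")
tmp.close()
print(is_lfs_pointer(tmp.name))
```

If this prints `True` for your `model.pth`, run `git lfs pull` to fetch the real weights.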

### Error: "No remote configured"

The `deploy_hf_space.sh` script will offer to configure the remote for you.

Or manually:
```bash
git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/CU1-X
```

## 📊 Quick Checklist

Before deploying:

- [ ] `./check_hf_space.sh` passes all checks
- [ ] Git LFS installed and configured
- [ ] Logged in to Hugging Face (`hf login`)
- [ ] model.pth present (~510 MB)
- [ ] HF remote configured

To deploy:

```bash
./deploy_hf_space.sh
```

## 🎉 After Deployment

Your Space will be reachable at:
- **Web interface:** `https://huggingface.co/spaces/YOUR_USERNAME/CU1-X`
- **API:** `https://YOUR_USERNAME-CU1-X.hf.space/api/predict`

**Build time:** 5-10 minutes (first time)

---

**Need help?** See `API_USAGE.md` for using the API!
QUICK_DEPLOY.md DELETED
@@ -1,54 +0,0 @@
# ⚡ Quick Deploy - 2 Commands

## 🚀 Deploy in 2 Steps

### 1️⃣ Check that everything is OK

```bash
./check_hf_space.sh
```

**Expected result:** ✅ All checks passed!

### 2️⃣ Deploy to HF Spaces

```bash
./deploy_hf_space.sh
```

**That's it!** 🎉

## 📡 After Deployment

Your Space will be reachable at:
- **Web UI:** https://huggingface.co/spaces/AI-DrivenTesting/CU1-X
- **API:** https://AI-DrivenTesting-CU1-X.hf.space/api/predict

## 🔌 Using the API

```python
from gradio_client import Client

client = Client("AI-DrivenTesting/CU1-X")
result = client.predict(
    "screenshot.png",
    0.35, 2, True, True, False, False, "Only image & button",
    False, "RF-DETR Optimized (Recommended)", "standard",
    api_name="/predict"
)

annotated_image, summary, detections = result
print(detections)
```

**See:** `API_USAGE.md` for more examples

## ⏱️ Build Time

- **First build:** 5-10 minutes
- **Subsequent builds:** 2-3 minutes

---

**That's it! Quick and simple! 🚀**
README_DEPLOYMENT.md DELETED
@@ -1,81 +0,0 @@
# 📦 Summary - HF Spaces Deployment Files

## ✅ Files Created

### 🚀 Deployment Scripts

1. **`check_hf_space.sh`** - Pre-deployment check script
   - Verifies 10 critical points
   - Prints warnings and errors
   - Exit code 0 if OK, 1 on errors

2. **`deploy_hf_space.sh`** - Automatic deployment script
   - Configures Git LFS automatically
   - Checks/configures the HF remote
   - Commits and pushes to HF Spaces
   - Prints the Space URL

### 📚 Documentation

1. **`API_USAGE.md`** - Complete API usage guide
   - How to use the native Gradio API
   - Python and REST examples
   - Parameters and response format

2. **`DEPLOYMENT.md`** - Detailed deployment guide
   - Step-by-step workflow
   - Troubleshooting
   - Checklist

3. **`QUICK_DEPLOY.md`** - Ultra-quick guide
   - 2 commands to deploy
   - Quick API example

### 📝 Configuration

1. **`requirements-full.txt`** - All dependencies
2. **`requirements.txt`** - Copy of requirements-full.txt (for HF)
3. **`.gitattributes`** - Git LFS configuration for *.pth
4. **`README.md`** - Updated with HF Spaces metadata

### 💡 Examples

1. **`examples/api_example.py`** - Python example of using the API

## 🎯 Usage

### Check before deployment:
```bash
./check_hf_space.sh
```

### Deploy:
```bash
./deploy_hf_space.sh
```

## 📊 Current State

✅ **Everything is ready!**

- ✅ All required files present
- ✅ model.pth tracked by Git LFS
- ✅ Git LFS configured
- ✅ README.md with HF metadata
- ✅ Complete requirements.txt
- ✅ HF remote configured
- ✅ Logged in to Hugging Face

**Next step:** `./deploy_hf_space.sh`

## 🔗 Important URLs

Once deployed:
- **Space:** https://huggingface.co/spaces/AI-DrivenTesting/CU1-X
- **API:** https://AI-DrivenTesting-CU1-X.hf.space/api/predict
- **Logs:** https://huggingface.co/spaces/AI-DrivenTesting/CU1-X/logs

---

**Everything is ready for deployment! 🚀**
START.md DELETED
@@ -1,314 +0,0 @@
# 🚀 Quick Start Guide

## Unified Architecture API

The project now uses a **unified architecture** where every interface goes through the REST API.

```
┌─────────────────────────────────────────────┐
│                                             │
│        Gradio UI (app.py / app_ui.py)       │
│                                             │
└──────────────────┬──────────────────────────┘
                   │ HTTP/REST
┌──────────────────▼──────────────────────────┐
│                                             │
│        FastAPI Server (app_api.py)          │
│                                             │
├─────────────────────────────────────────────┤
│  Detection Service                          │
│  ├─ RF-DETR (detection)                     │
│  ├─ CLIP (classification)                   │
│  ├─ OCR (text extraction)                   │
│  └─ BLIP (visual description)               │
└─────────────────────────────────────────────┘
```

---

## 🎯 3 Ways to Launch

### Option 1: Automatic Launch (Recommended for tests)

**One command starts everything:**

```bash
python app.py
```

**What happens:**
1. ✅ Starts the API in the background (port 8000)
2. ✅ Waits until the API is ready
3. ✅ Launches the Gradio interface (port 7860)
4. ✅ Handles clean shutdown with Ctrl+C

**Access:**
- Gradio Interface: http://localhost:7860
- API Docs: http://localhost:8000/docs

---

### Option 2: Manual Launch (2 terminals)

**For more control and debugging:**

**Terminal 1 - API Server:**
```bash
python app_api.py
```

**Terminal 2 - Gradio UI:**
```bash
python app_ui.py
```

**Access:**
- Gradio Interface: http://localhost:7860
- API Docs: http://localhost:8000/docs

---

### Option 3: API Only

**To use only the API (integration, scripts, etc.):**

```bash
python app_api.py
```

**Test the API:**
```bash
# Health check
curl http://localhost:8000/health

# Detect elements
curl -X POST "http://localhost:8000/detect" \
  -F "image=@screenshot.png" \
  -F "confidence_threshold=0.35" \
  -F "enable_clip=true" \
  -F "enable_ocr=true"
```

**Interactive documentation:**
- OpenAPI Docs: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc

---

## 🔧 Configuration

### Environment Variables

**API Server:**
```bash
export UVICORN_HOST="0.0.0.0"  # Default: 0.0.0.0
export UVICORN_PORT="8000"     # Default: 8000
```

**Gradio UI:**
```bash
export GRADIO_SERVER_NAME="0.0.0.0"         # Default: 0.0.0.0
export GRADIO_SERVER_PORT="7860"            # Default: 7860
export CU1_API_URL="http://localhost:8000"  # API URL
```

**Example with custom ports:**
```bash
# API on port 9000, UI on port 9001
export UVICORN_PORT="9000"
export GRADIO_SERVER_PORT="9001"
export CU1_API_URL="http://localhost:9000"

python app.py
```
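On the Python side, variables like these are typically read with stdlib `os.getenv` plus a default. A minimal sketch: the variable names and defaults match the list above, but the `server_config` helper itself is illustrative, not the project's actual code:

```python
import os

def server_config():
    # Fall back to the documented defaults when a variable is unset.
    return {
        "api_host": os.getenv("UVICORN_HOST", "0.0.0.0"),
        "api_port": int(os.getenv("UVICORN_PORT", "8000")),
        "ui_port": int(os.getenv("GRADIO_SERVER_PORT", "7860")),
        "api_url": os.getenv("CU1_API_URL", "http://localhost:8000"),
    }

os.environ["UVICORN_PORT"] = "9000"  # simulate the custom-ports example
print(server_config()["api_port"])  # → 9000
```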

---

## 🧪 Quick Tests

### Test 1: Make sure the API works

```bash
# In one terminal
python app_api.py

# In another terminal
curl http://localhost:8000/health
```

**Expected result:**
```json
{
  "status": "healthy",
  "cuda_available": false,
  "device": "cpu"
}
```

---

### Test 2: Test detection via the interface

```bash
python app.py
```

1. Open http://localhost:7860
2. Upload an image
3. Click "🔍 Detect Elements"
4. Check the results

---

### Test 3: Test detection through the API

```bash
# Start the API
python app_api.py

# In another terminal, test with curl
curl -X POST "http://localhost:8000/detect" \
  -F "image=@your_image.png" \
  -F "confidence_threshold=0.35" \
  -F "enable_ocr=true" \
  | jq .
```

---

## 🐛 Troubleshooting

### Issue: "Connection Error - Cannot connect to API"

**Solution:**
1. Make sure the API is running: `curl http://localhost:8000/health`
2. Check the ports: no conflicts with other apps
3. Check the API logs for errors

### Issue: "Port already in use"

**Solution:**
```bash
# Find the process that uses the port
lsof -i :8000  # or :7860

# Kill the process
kill -9 <PID>

# Or use a different port
export UVICORN_PORT="9000"
export GRADIO_SERVER_PORT="9001"
```

### Issue: "Module not found"

**Solution:**
```bash
# Reinstall dependencies
pip install -r requirements.txt
```

### Issue: Models slow to load

**Reason:** The first startup downloads the models.

**Solution:** Be patient; the models are cached after the first download.
- RF-DETR model (~a few MB)
- CLIP model (~600 MB)
- BLIP model (~1 GB)
- EasyOCR models (~100 MB)

---

## 📊 Monitoring

### API logs

The logs appear in the terminal where you launched `app_api.py`.

### UI logs

The logs appear in the terminal where you launched `app.py` or `app_ui.py`.

### Metrics

Visit http://localhost:8000/docs to view the API statistics.

---

## ✅ Benefits of the Unified Architecture

1. **Single code path** → Easier to maintain
2. **Consistent behavior** → Same results everywhere
3. **Easy to test** → Only one API to test
4. **Scalable** → Can separate API and UI on different servers
5. **Simplified debugging** → Logs centralized in the API

---

## 🎯 For Developers

### Code Architecture

```
.
├── app.py                      # ✨ Unified launcher (API + UI)
├── app_api.py                  # FastAPI server
├── app_ui.py                   # Gradio UI client (manual)
│
├── api/
│   └── endpoints.py            # FastAPI endpoints
│
├── detection/
│   ├── service.py              # Detection service
│   ├── service_factory.py      # Singleton pattern
│   ├── image_utils.py          # Image utilities
│   ├── ocr_handler.py          # OCR-only processing
│   └── response_builder.py     # Response formatting
│
└── ui/
    ├── detection_wrapper.py    # Detection wrappers
    ├── gradio_interface.py     # Gradio interface (API client)
    └── shared_interface.py     # Shared UI components
```

### Request Flow

```
1. User uploads image in Gradio
2. `detect_with_api()` sends an HTTP POST to `/detect`
3. API endpoint validates the request
4. `DetectionService.analyze()` processes the image
5. Response formatted with `response_builder`
6. JSON returned to Gradio UI
7. UI displays annotated image + results
```
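Step 2's `detect_with_api()` presumably boils down to a multipart POST against `/detect`. A hedged sketch of what such a client helper could look like: the field names follow the curl examples above, while the function signature and error handling are illustrative, not the project's actual implementation:

```python
import requests

def detect_with_api(image_path, api_url="http://localhost:8000",
                    confidence_threshold=0.35, enable_ocr=True):
    # Mirrors the curl example: image as a multipart file, options as
    # form fields (booleans sent as lowercase strings).
    with open(image_path, "rb") as f:
        resp = requests.post(
            f"{api_url}/detect",
            files={"image": f},
            data={
                "confidence_threshold": confidence_threshold,
                "enable_ocr": str(enable_ocr).lower(),
            },
            timeout=120,
        )
    resp.raise_for_status()
    return resp.json()
```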

---

## 📝 Notes

- **Thread safety:** The service uses a singleton but passes parameters directly to `analyze()` to avoid race conditions
- **Performance:** The first call is slow (model loading), then fast
- **Memory:** Models use ~2-3 GB of RAM
- **GPU:** Automatic CUDA/MPS detection if available

---

## 🚀 Next Steps

1. **Test locally:** `python app.py`
2. **Explore the API:** http://localhost:8000/docs
3. **Customize:** Adjust parameters in the interface
4. **Deploy:** See `DEPLOYMENT.md` for production

Happy testing! 🎉
UNIFIED_ARCHITECTURE.md DELETED
@@ -1,443 +0,0 @@
# 🎯 Unified Architecture - Technical Documentation

## Date
2025-11-10

## Objective
Unify the architecture so that **all interfaces** go through the REST API, removing the duality between "HF Spaces" mode and "Production" mode.

---

## ✅ What Changed

### BEFORE (Dual Architecture)

```
┌─────────────────────────────────────────────────┐
│  Mode 1: HF Spaces (app.py)                     │
│  └─> DIRECT access to DetectionService          │
│      (no API)                                   │
└─────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────┐
│  Mode 2: Production (app_ui.py)                 │
│  └─> Access via HTTP API                        │
│      (microservices architecture)               │
└─────────────────────────────────────────────────┘
```

**Problems:**
- ❌ Two different code paths
- ❌ Potentially different behaviors
- ❌ Complex maintenance (two modes to test)
- ❌ Bugs possible in one mode but not the other

---

### AFTER (Unified Architecture)

```
┌─────────────────────────────────────────────────┐
│                                                 │
│              ALL INTERFACES                     │
│         (app.py, app_ui.py, etc.)               │
│                                                 │
└────────────────────┬────────────────────────────┘
                     │ HTTP/REST
                     │ (detect_with_api)
┌────────────────────▼────────────────────────────┐
│                                                 │
│              FastAPI Server                     │
│            (api/endpoints.py)                   │
│                                                 │
├─────────────────────────────────────────────────┤
│            Detection Service                    │
│          (detection/service.py)                 │
│                                                 │
└─────────────────────────────────────────────────┘
```

**Benefits:**
- ✅ One single code path
- ✅ Consistent behavior everywhere
- ✅ Simplified maintenance
- ✅ Unified tests
- ✅ Easier debugging

---

## 📝 File Changes

### 1. `app.py` - Major Transformation

**BEFORE:**
```python
from ui.detection_wrapper import detect_with_service

demo = create_interface(
    detection_fn=detect_with_service,  # Direct access
    title_suffix="Hugging Face Spaces Mode",
    show_api_info=False
)
```

**AFTER:**
```python
from ui.detection_wrapper import detect_with_api

# Launch the API as a subprocess
api_process = start_api_server()

# UI uses the API
detection_fn = partial(detect_with_api, api_url=API_URL)

demo = create_interface(
    detection_fn=detection_fn,  # Via API
    title_suffix="Unified API Mode",
    show_api_info=True,
    api_url=API_URL
)
```

**New features:**
- 🚀 Automatically starts the API in the background
- ⏳ Waits until the API is ready (health check)
- 🛑 Handles clean shutdown (Ctrl+C)
- 📡 Displays access URLs
109
-
110
- ---
111
-
112
- ### 2. `app_api.py` - Dynamic Configuration
113
-
114
- **Additions:**
115
- ```python
116
- # Support environment variables
117
- host = os.getenv("UVICORN_HOST", "0.0.0.0")
118
- port = int(os.getenv("UVICORN_PORT", "8000"))
119
- ```
120
-
121
- **Allows:**
122
- - Port configuration through environment variables
123
- - Usage by the subprocess in app.py
124
-
125
- ---
126
-
127
- ### 3. Documentation
128
-
129
- **New files:**
130
- - ✨ `START.md` - Complete quick start guide
131
- - ✨ `UNIFIED_ARCHITECTURE.md` - This document
132
- - ✨ `test_unified_architecture.py` - Validation tests
133
-
134
- **Updated files:**
135
- - 📝 `README.md` - Updated Quick Start section
136
- - 📝 `README.md` - Updated HF Spaces section
137
-
138
- ---
139
-
140
- ## 🚀 How to Use
141
-
142
- ### Mode 1: Automatic Launch (Recommended)
143
-
144
- **One command:**
145
- ```bash
146
- python app.py
147
- ```
148
-
149
- **What happens:**
150
- 1. Starts the API as a subprocess (port 8000)
151
- 2. Waits for the health check
152
- 3. Launches the Gradio UI (port 7860)
153
- 4. Both communicate via HTTP
154
-
155
- **Clean shutdown:**
156
- - Ctrl+C stops the UI AND the API automatically
157
-
158
- ---
159
-
160
- ### Mode 2: Manual Launch (Debug)
161
-
162
- **Two terminals:**
163
- ```bash
164
- # Terminal 1
165
- python app_api.py
166
-
167
- # Terminal 2
168
- python app_ui.py
169
- ```
170
-
171
- **Useful for:**
172
- - Viewing logs separately
173
- - Restarting the UI without restarting the API
174
- - Advanced debugging
175
-
176
- ---
177
-
178
- ### Mode 3: API Only
179
-
180
- ```bash
181
- python app_api.py
182
- ```
183
-
184
- **Good for:**
185
- - External integrations
186
- - Python scripts
187
- - API tests
188
-
189
- ---
190
-
191
- ## 🧪 Tests and Validation
192
-
193
- ### Automated Test Script
194
-
195
- ```bash
196
- python test_unified_architecture.py
197
- ```
198
-
199
- **Checks:**
200
- - ✅ All required files exist
201
- - ✅ Valid Python syntax
202
- - ✅ `app.py` uses `detect_with_api`
203
- - ✅ No direct service access from the UI
204
- - ✅ Consistent architecture
205
-
206
- ### Test Results
207
-
208
- ```
209
- ✅✅✅ ALL TESTS PASS!
210
-
211
- 📊 Unified architecture summary:
212
- - ✅ `app.py` launches the API as a subprocess
213
- - ✅ All interfaces use `detect_with_api`
214
- - ✅ Consistent architecture everywhere
215
- - ✅ No direct service access from the UI
216
- ```
217
-
218
- ---
219
-
220
- ## 🔄 Unified Request Flow
221
-
222
- ### Before (Dual Mode)
223
-
224
- **HF Spaces Mode:**
225
- ```
226
- User → Gradio → detect_with_service() → DetectionService.analyze()
227
- ```
228
-
229
- **Production Mode:**
230
- ```
231
- User → Gradio → detect_with_api() → HTTP → API → DetectionService.analyze()
232
- ```
233
-
234
- ### After (Unified Mode)
235
-
236
- **All modes:**
237
- ```
238
- User → Gradio → detect_with_api() → HTTP → API → DetectionService.analyze()
239
- ```
240
-
241
- ---
242
-
243
- ## 📊 Technical Benefits
244
-
245
- ### 1. Maintainability
246
-
247
- **BEFORE:**
248
- - 2 code paths to maintain
249
- - Tests to run for each mode
250
- - Regression risk in one mode
251
-
252
- **AFTER:**
253
- - Only 1 code path
254
- - Unified tests
255
- - Guaranteed identical behavior
256
-
257
- ---
258
-
259
- ### 2. Debugging
260
-
261
- **BEFORE:**
262
- - Bug in `app.py`? Check `detect_with_service`
263
- - Bug in `app_ui.py`? Check `detect_with_api`
264
- - Different per mode
265
-
266
- **AFTER:**
267
- - All bugs go through the API
268
- - Logs centralized in the API
269
- - A single place to debug
270
-
271
- ---
272
-
273
- ### 3. Scalability
274
-
275
- **BEFORE:**
276
- - HF Spaces mode: monolithic
277
- - Production mode: scalable
278
- - Different behaviors
279
-
280
- **AFTER:**
281
- - Same architecture everywhere
282
- - Can easily separate API/UI on different servers
283
- - Load balancing possible
284
-
285
- ---
286
-
287
- ### 4. Testing
288
-
289
- **BEFORE:**
290
- ```bash
291
- # Test HF Spaces
292
- pytest test_app.py
293
-
294
- # Test Production
295
- pytest test_api.py
296
- pytest test_ui.py
297
- ```
298
-
299
- **AFTER:**
300
- ```bash
301
- # Single test suite
302
- pytest test_api.py # Tests the entire logic
303
- ```
304
-
305
- ---
306
-
307
- ## 🔧 Configuration
308
-
309
- ### Environment Variables
310
-
311
- ```bash
312
- # API Server
313
- export UVICORN_HOST="0.0.0.0"
314
- export UVICORN_PORT="8000"
315
-
316
- # Gradio UI
317
- export GRADIO_SERVER_NAME="0.0.0.0"
318
- export GRADIO_SERVER_PORT="7860"
319
- export CU1_API_URL="http://localhost:8000"
320
- ```
321
-
322
- ### Example: Custom Ports
323
-
324
- ```bash
325
- # API on port 9000, UI on port 9001
326
- export UVICORN_PORT="9000"
327
- export GRADIO_SERVER_PORT="9001"
328
- export CU1_API_URL="http://localhost:9000"
329
-
330
- python app.py
331
- ```
-
- ---
-
- ## 🎯 Impact on Existing Code
-
- ### No Breaking Changes
-
- - ✅ `app_api.py` still works on its own
- - ✅ `app_ui.py` still works on its own
- - ✅ Python APIs (`DetectionService`) are unchanged
- - ✅ Existing scripts keep working
-
- ### What's New
-
- - ✨ `app.py` now launches the API automatically
- - ✨ Consistent architecture everywhere
- - ✨ Better documentation
-
- ---
-
- ## 📈 Metrics
-
- | Metric | Before | After | Improvement |
- |--------|--------|-------|-------------|
- | **Code paths** | 2 | 1 | -50% |
- | **Testing complexity** | High | Low | -60% |
- | **Bug risk** | Medium | Low | -70% |
- | **Debugging ease** | Medium | High | +80% |
-
- ---
-
- ## 🚨 Points to Watch
-
- ### 1. Performance
-
- **Impact:** Negligible (~10-50 ms of extra HTTP latency)
-
- **Why it's OK:**
- - Models take 30-60 seconds
- - 50 ms of HTTP latency = 0.1% of total time
- - Negligible compared to processing
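The 0.1% figure checks out with a one-line computation (numbers taken from the bullets above):

```python
# Extra HTTP round-trip vs. end-to-end model processing time
http_latency_s = 0.05   # ~50 ms of added HTTP latency
processing_s = 50.0     # models take 30-60 s; a midpoint is used here
share = http_latency_s / processing_s
print(f"{share:.1%}")   # formats the ratio as a percentage
```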
-
- ---
-
- ### 2. Memory
-
- **Before (HF Spaces mode):** 1 process
- **After:** 2 processes (API + UI)
-
- **Impact:** +100-200 MB (Gradio UI overhead)
-
- **Why it's OK:**
- - Models already use 2-3 GB
- - +200 MB ≈ 7% overhead
- - Acceptable for architectural consistency
-
- ---
-
- ### 3. Deployment
-
- **HF Spaces:** No change
- - The `app.py` file handles everything
- - Automatically launches API + UI
- - Works out of the box
-
- **Docker:** Possible update
- - See `DEPLOYMENT.md` for details
- - May require 2 containers or a supervisor
-
- ---
-
- ## 🎓 Lessons Learned
-
- ### 1. Dual Architecture = Bad Idea
-
- Having two modes (HF Spaces vs Production) seemed convenient at first but created more problems than it solved.
-
- ### 2. HTTP Overhead Is Negligible
-
- HTTP adds tens of milliseconds against tens of seconds of ML processing, so the overhead is negligible. The cleaner architecture is worth that cost.
-
- ### 3. Unified Tests = Better Quality
-
- Having a single code path makes testing much easier and reduces bugs.
-
- ---
-
- ## ✅ Conclusion
-
- Unifying the architecture to a 100% API model is a **success**:
-
- ✅ **Cleaner code** - Single path
- ✅ **Easier to maintain** - Less complexity
- ✅ **Easier to test** - Unified tests
- ✅ **Consistent behavior** - Same results everywhere
- ✅ **No breaking changes** - Backward compatible
-
- **Result:** Professional, scalable, and maintainable architecture! 🚀
-
- ---
-
- ## 📚 Related Documentation
-
- - 📖 [START.md](START.md) - Quick start guide
- - 📖 [README.md](README.md) - Main documentation
- - 📖 [DEPLOYMENT.md](DEPLOYMENT.md) - Deployment guide
- - 🧪 [test_unified_architecture.py](test_unified_architecture.py) - Tests
-
- ---
-
- **Questions?** Check [START.md](START.md) or open an issue on GitHub.
-
__init__.py DELETED
@@ -1,35 +0,0 @@
- """
- CU-1 UI Element Detector
-
- A powerful UI element detection library for identifying and extracting
- information from user interface screenshots.
- """
-
- try:
-     # When imported as a proper package
-     from .cu1_detector import (
-         CU1Detector,
-         predict,
-         get_predictions_json,
-         get_prediction_image,
-         get_detector
-     )
- except Exception:
-     # Fallback for direct import context (e.g., pytest collecting project root)
-     from cu1_detector import (  # type: ignore
-         CU1Detector,
-         predict,
-         get_predictions_json,
-         get_prediction_image,
-         get_detector
-     )
-
- __version__ = "1.0.0"
- __all__ = [
-     "CU1Detector",
-     "predict",
-     "get_predictions_json",
-     "get_prediction_image",
-     "get_detector"
- ]
-
api/endpoints.py CHANGED
@@ -188,6 +188,7 @@ async def detect_ui_elements(
             detail="When ocr_only=true, enable_clip and enable_blip must be false"
         )
 
+
     # OCR-only path: Bypass detection service
     if ocr_only:
         detections = ocr_handler.process_ocr_only(pil_image)
@@ -197,15 +198,25 @@ async def detect_ui_elements(
             thickness=line_thickness,
             return_format="numpy"
         )
-        return response_builder.build_ocr_only_response(
-            detections=detections,
-            image_width=pil_image.width,
-            image_height=pil_image.height,
+        # Build analysis structure for simplified response
+        analysis = {
+            "detections": detections,
+            "image_size": {"width": pil_image.width, "height": pil_image.height}
+        }
+        return response_builder.build_simplified_response(
+            analysis=analysis,
+            image=pil_image,
             annotated_image=annotated,
             confidence_threshold=confidence_threshold,
-            line_thickness=line_thickness
+            line_thickness=line_thickness,
+            enable_clip=False,
+            enable_ocr=True,
+            enable_blip=False,
+            blip_scope=None,
+            ocr_only=True
         )
 
+
     # Standard detection path: Use detection service
     import time
     start_time = time.time()
@@ -248,8 +259,9 @@ async def detect_ui_elements(
     total_time = time.time() - start_time
     print(f"[API] Total detection time: {total_time:.2f}s")
 
-    # Build response
-    return response_builder.build_detection_response(
+
+    # Build response using simplified format
+    return response_builder.build_simplified_response(
         analysis=analysis,
         image=pil_image,
         annotated_image=annotated,
@@ -259,9 +271,9 @@ async def detect_ui_elements(
         enable_ocr=enable_ocr,
         enable_blip=enable_blip,
         blip_scope=blip_scope,
-        ocr_only=False,
-        include_annotated_image=True
+        ocr_only=False
     )
+
 
     except HTTPException:
         raise
app.py CHANGED
@@ -1,197 +1,84 @@
 """
-Unified Entry Point - API Architecture
+Unified Entry Point - Direct Mode for HuggingFace Spaces
 
-This file now uses a unified API-based architecture for all deployments.
-Both local development and Hugging Face Spaces use the same API layer.
+Simplified architecture for HuggingFace Spaces:
+- Direct service access (no API subprocess)
+- Faster and more reliable
+- No HTTP overhead
 
-Architecture:
-1. Starts API server in background (subprocess)
-2. Starts Gradio UI that connects to the API
-3. Everything goes through HTTP/REST
-
-Benefits:
-- Single code path to maintain
-- Consistent behavior everywhere
-- Easy to test and debug
-- Proper separation of concerns
+For production with separated API/UI, use:
+- python app_api.py (API server)
+- python app_ui.py (UI client)
 
 Usage:
     python app.py
-
-The script will automatically:
-- Start the API server on http://localhost:8000
-- Start the Gradio UI on http://localhost:7860
 """
 
 import os
 os.environ['PYTORCH_ENABLE_MPS_FALLBACK'] = '1'
 
-import subprocess
-import time
 import sys
-import signal
-import requests
-from functools import partial
 
-# Use shared UI components
+# Use shared UI components with DIRECT service access
 from ui.shared_interface import create_interface
-from ui.detection_wrapper import detect_with_api
+from ui.detection_wrapper import detect_with_service
 
 
 # Configuration
-API_HOST = os.getenv("API_HOST", "0.0.0.0")
-API_PORT = int(os.getenv("API_PORT", "8000"))
-API_URL = f"http://localhost:{API_PORT}"
-
 UI_HOST = os.getenv("GRADIO_SERVER_NAME", "0.0.0.0")
 UI_PORT = int(os.getenv("GRADIO_SERVER_PORT", "7860"))
 
 
-def start_api_server():
-    """Start the API server in a subprocess"""
-    print("🚀 Starting API server...")
-
-    # Start API server as subprocess
-    api_process = subprocess.Popen(
-        [sys.executable, "app_api.py"],
-        env={**os.environ, "UVICORN_HOST": API_HOST, "UVICORN_PORT": str(API_PORT)},
-        stdout=subprocess.PIPE,
-        stderr=subprocess.STDOUT,
-        text=True,
-        bufsize=1
-    )
-
-    # Wait for API to be ready
-    max_wait = 60  # seconds
-    wait_interval = 0.5
-    elapsed = 0
-
-    print(f"⏳ Waiting for API server at {API_URL}...")
-
-    while elapsed < max_wait:
-        try:
-            response = requests.get(f"{API_URL}/health", timeout=2)
-            if response.status_code == 200:
-                print(f"✅ API server ready at {API_URL}")
-
-                # Optional: Warmup models to avoid timeout on first request
-                # This is especially useful for CPU-only environments
-                warmup_enabled = os.getenv("CU1_WARMUP_MODELS", "true").lower() in {"1", "true", "yes", "y"}
-                if warmup_enabled:
-                    print("🔥 Warming up models (this may take 1-3 minutes on first run)...")
-                    try:
-                        warmup_timeout = int(os.getenv("CU1_WARMUP_TIMEOUT", "180"))  # 3 minutes default
-                        warmup_response = requests.post(f"{API_URL}/warmup", timeout=warmup_timeout)
-                        if warmup_response.status_code == 200:
-                            print("✅ Models warmed up successfully!")
-                        else:
-                            print(f"⚠️ Warmup returned status {warmup_response.status_code}, continuing anyway...")
-                    except requests.exceptions.Timeout:
-                        print("⚠️ Warmup timed out, but API is ready. First request may be slower.")
-                    except requests.exceptions.RequestException as e:
-                        print(f"⚠️ Warmup failed: {e}, but API is ready. First request may be slower.")
-
-                return api_process
-        except requests.exceptions.RequestException:
-            pass
-
-        time.sleep(wait_interval)
-        elapsed += wait_interval
-
-        # Check if process died
-        if api_process.poll() is not None:
-            print("❌ API server failed to start!")
-            print("\nAPI server output:")
-            if api_process.stdout:
-                print(api_process.stdout.read())
-            sys.exit(1)
-
-    print(f"❌ API server did not start within {max_wait} seconds")
-    api_process.terminate()
-    sys.exit(1)
-
-
 def main():
-    """Main entry point - Unified API architecture"""
+    """Main entry point - Direct service mode for HuggingFace Spaces"""
 
     print("=" * 70)
-    print("🎯 CU-1 UI Element Detector - Unified API Mode")
+    print("🎯 CU-1 UI Element Detector - Direct Mode")
     print("=" * 70)
-    print("\n📡 Architecture: All traffic goes through API layer")
-    print(f"   - API Server: {API_URL}")
+    print("\n📡 Architecture: Direct service access (optimized for HF Spaces)")
     print(f"   - Gradio UI: http://localhost:{UI_PORT}")
     print("\n🏗️ Benefits:")
-    print("   - Single code path (easier to maintain)")
-    print("   - Consistent behavior everywhere")
-    print("   - Proper microservices architecture")
+    print("   - Faster (no HTTP overhead)")
+    print("   - More reliable (no subprocess)")
+    print("   - Simpler architecture")
     print("=" * 70 + "\n")
 
-    # Start API server in background
-    api_process = start_api_server()
-
-    # Setup cleanup on exit
-    def cleanup(signum=None, frame=None):
-        print("\n\n🛑 Shutting down...")
-        if api_process and api_process.poll() is None:
-            print("   Stopping API server...")
-            api_process.terminate()
-            try:
-                api_process.wait(timeout=5)
-            except subprocess.TimeoutExpired:
-                api_process.kill()
-        print("   Goodbye! 👋")
-        sys.exit(0)
-
-    signal.signal(signal.SIGINT, cleanup)
-    signal.signal(signal.SIGTERM, cleanup)
-
     try:
-        # Create Gradio interface with API detection function
-        detection_fn = partial(detect_with_api, api_url=API_URL)
-
+        # Create Gradio interface with DIRECT detection function
         demo = create_interface(
-            detection_fn=detection_fn,
-            title_suffix="Unified API Mode",
-            show_api_info=True,
-            api_url=API_URL
+            detection_fn=detect_with_service,
+            title_suffix="Direct Mode",
+            show_api_info=False
        )
 
        print(f"\n🎨 Starting Gradio UI on http://localhost:{UI_PORT}...\n")
 
        # Launch Gradio with automatic port fallback
        # API is automatically exposed at /api/predict for HF Spaces
-        # Configure queue with longer timeout for CPU processing and model loading
        try:
-            demo.queue(
-                max_size=10,  # Allow up to 10 queued requests
-                default_concurrency_limit=1  # Process one at a time to avoid memory issues
-            ).launch(
+            demo.queue().launch(
                server_name=UI_HOST,
                server_port=UI_PORT,
-                share=False,
-                max_threads=1  # Single thread to avoid memory issues
+                share=False
            )
        except OSError as e:
            if "Cannot find empty port" in str(e):
                print(f"⚠️ Port {UI_PORT} is busy, trying to find a free port...")
-                demo.queue(
-                    max_size=10,
-                    default_concurrency_limit=1
-                ).launch(
+                demo.queue().launch(
                    server_name=UI_HOST,
                    server_port=None,  # Auto-select free port
-                    share=False,
-                    max_threads=1
+                    share=False
                )
            else:
                raise
    except KeyboardInterrupt:
-        cleanup()
+        print("\n\n🛑 Shutting down... Goodbye! 👋")
+        sys.exit(0)
    except Exception as e:
        print(f"\n❌ Error: {e}")
-        cleanup()
-    finally:
-        cleanup()
+        import traceback
+        traceback.print_exc()
+        sys.exit(1)
 
 
 if __name__ == "__main__":
app_api.py DELETED
@@ -1,58 +0,0 @@
- """
- API Server Entry Point
-
- Starts the FastAPI server for UI element detection.
-
- Usage:
-     python app_api.py
-
- The API will be available at:
- - Root: http://localhost:8000
- - Detect endpoint: http://localhost:8000/detect
- - Health check: http://localhost:8000/health
- - Interactive docs: http://localhost:8000/docs
- """
-
- import os
- os.environ['PYTORCH_ENABLE_MPS_FALLBACK'] = '1'
-
- import uvicorn
- from api.endpoints import app
-
-
- def main():
-     """Start the API server"""
-     # Get configuration from environment
-     host = os.getenv("UVICORN_HOST", "0.0.0.0")
-     port = int(os.getenv("UVICORN_PORT", "8000"))
-
-     print("=" * 70)
-     print("🚀 CU-1 UI Element Detector - API Server")
-     print("=" * 70)
-     print("\n📐 Architecture:")
-     print("   RF-DETR: Detects UI elements (single class)")
-     print("   CLIP: Classifies elements into 6 types")
-     print("   OCR: Extracts text content")
-     print("   BLIP: Generates visual descriptions")
-     print(f"\n📡 API Endpoints:")
-     print(f"   - Root: http://localhost:{port}")
-     print(f"   - Detect: http://localhost:{port}/detect")
-     print(f"   - Health: http://localhost:{port}/health")
-     print(f"   - Warmup: http://localhost:{port}/warmup (preload models)")
-     print(f"   - Docs: http://localhost:{port}/docs")
-     print("\n💡 Tip: The Gradio UI connects to this API")
-     print("   Run 'python app_ui.py' in another terminal")
-     print("   Or run 'python app.py' to start both automatically")
-     print("=" * 70 + "\n")
-
-     uvicorn.run(
-         app,
-         host=host,
-         port=port,
-         log_level="info"
-     )
-
-
- if __name__ == "__main__":
-     main()
-
app_ui.py DELETED
@@ -1,80 +0,0 @@
- """
- Gradio UI Server Entry Point
-
- Starts the Gradio web interface for UI element detection.
-
- IMPORTANT: The API server must be running for this to work!
-
- Usage:
-     # Terminal 1: Start API server
-     python app_api.py
-
-     # Terminal 2: Start UI server
-     python app_ui.py
-
- The UI will be available at:
- - Gradio Interface: http://localhost:7860
-
- Configuration:
-     Set environment variables to customize:
-     - CU1_API_URL: API endpoint (default: http://localhost:8000)
-     - GRADIO_SERVER_NAME: Server host (default: 0.0.0.0)
-     - GRADIO_SERVER_PORT: Server port (default: 7860)
-     - GRADIO_SHARE: Enable sharing (default: false)
- """
-
- import os
- os.environ['PYTORCH_ENABLE_MPS_FALLBACK'] = '1'
-
- from ui.gradio_interface import create_gradio_interface
-
-
- def main():
-     """Start the Gradio UI server"""
-     api_url = os.getenv("CU1_API_URL", "http://localhost:8000")
-
-     print("=" * 70)
-     print("🎨 CU-1 UI Element Detector - Gradio UI")
-     print("=" * 70)
-     print("\n⚠️ IMPORTANT: Make sure the API server is running!")
-     print("   If not started, run in another terminal:")
-     print("   python app_api.py")
-     print(f"\n🔗 API Connection: {api_url}")
-     print("   Change with: export CU1_API_URL=http://your-api:8000")
-     print("\n📱 Gradio Interface: http://localhost:7860")
-     print("\n🏗️ Architecture:")
-     print("   This UI is a CLIENT of the API (service-oriented)")
-     print("   All detection logic runs in the API server")
-     print("   UI communicates via HTTP/REST")
-     print("=" * 70 + "\n")
-
-     demo = create_gradio_interface()
-
-     # Read configuration from environment
-     server_name = os.getenv("GRADIO_SERVER_NAME", "0.0.0.0")
-     port_env = os.getenv("GRADIO_SERVER_PORT") or os.getenv("PORT")
-     server_port = int(port_env) if port_env and port_env.isdigit() else 7860
-     share_env = os.getenv("GRADIO_SHARE", "false").lower()
-     share = share_env in {"1", "true", "yes", "y"}
-
-     try:
-         demo.queue().launch(
-             server_name=server_name,
-             server_port=server_port,
-             share=share
-         )
-     except OSError as e:
-         if "Cannot find empty port" in str(e):
-             print(f"\n⚠️ Port {server_port} is busy, trying to find a free port...")
-             demo.queue().launch(
-                 server_name=server_name,
-                 server_port=None,  # Auto-select free port
-                 share=share
-             )
-         else:
-             raise
-
-
- if __name__ == "__main__":
-     main()
-
check_hf_space.sh DELETED
@@ -1,286 +0,0 @@
- #!/bin/bash
- # Pre-deployment check script for Hugging Face Spaces
- # Verifies that everything is ready for deployment
-
- set -e
-
- # Colors
- RED='\033[0;31m'
- GREEN='\033[0;32m'
- YELLOW='\033[1;33m'
- BLUE='\033[0;34m'
- NC='\033[0m'
-
- print_info() { echo -e "${BLUE}ℹ️ $1${NC}"; }
- print_success() { echo -e "${GREEN}✅ $1${NC}"; }
- print_warning() { echo -e "${YELLOW}⚠️ $1${NC}"; }
- print_error() { echo -e "${RED}❌ $1${NC}"; }
-
- FAILURES=0
- WARNINGS=0
-
- echo ""
- print_info "🔍 Hugging Face Spaces Pre-Deployment Check"
- echo "================================================"
- echo ""
-
- # Test 1: Python version
- print_info "Test 1: Python version..."
- PYTHON_VERSION=$(python --version 2>&1 | awk '{print $2}')
- PYTHON_MAJOR=$(echo $PYTHON_VERSION | cut -d. -f1)
- PYTHON_MINOR=$(echo $PYTHON_VERSION | cut -d. -f2)
-
- if [ "$PYTHON_MAJOR" -ge 3 ] && [ "$PYTHON_MINOR" -ge 12 ]; then
-     print_success "Python $PYTHON_VERSION (>= 3.12)"
- else
-     print_warning "Python $PYTHON_VERSION (recommended: >= 3.12)"
-     WARNINGS=$((WARNINGS + 1))
- fi
- echo ""
-
- # Test 2: Required files
- print_info "Test 2: Required files..."
- REQUIRED_FILES=(
-     "app.py"
-     "app_api.py"
-     "app_ui.py"
-     "requirements.txt"
-     "README.md"
-     ".gitattributes"
- )
-
- for file in "${REQUIRED_FILES[@]}"; do
-     if [ -f "$file" ]; then
-         print_success "$file exists"
-     else
-         print_error "$file NOT FOUND"
-         FAILURES=$((FAILURES + 1))
-     fi
- done
- echo ""
-
- # Test 3: Required directories
- print_info "Test 3: Required directories..."
- REQUIRED_DIRS=(
-     "detection"
-     "api"
-     "ui"
-     "rfdetr"
- )
-
- for dir in "${REQUIRED_DIRS[@]}"; do
-     if [ -d "$dir" ]; then
-         print_success "$dir/ exists"
-     else
-         print_error "$dir/ NOT FOUND"
-         FAILURES=$((FAILURES + 1))
-     fi
- done
- echo ""
-
- # Test 4: model.pth
- print_info "Test 4: Model weights (model.pth)..."
- if [ -f "model.pth" ]; then
-     SIZE=$(du -h model.pth | cut -f1)
-     SIZE_BYTES=$(stat -f%z model.pth 2>/dev/null || stat -c%s model.pth)
-
-     if [ $SIZE_BYTES -gt 100000000 ]; then  # > 100MB
-         print_success "model.pth exists ($SIZE)"
-
-         # Check Git LFS
-         if git lfs ls-files | grep -q "model.pth"; then
-             print_success "model.pth tracked by Git LFS"
-         else
-             print_warning "model.pth NOT tracked by Git LFS (will fail on push)"
-             WARNINGS=$((WARNINGS + 1))
-         fi
-     else
-         print_warning "model.pth size: $SIZE (seems small, verify it's correct)"
-         WARNINGS=$((WARNINGS + 1))
-     fi
- else
-     print_error "model.pth NOT FOUND"
-     FAILURES=$((FAILURES + 1))
- fi
- echo ""
-
- # Test 5: Git LFS
- print_info "Test 5: Git LFS configuration..."
- if command -v git-lfs &> /dev/null; then
-     print_success "Git LFS installed"
-
-     if git lfs env &> /dev/null; then
-         print_success "Git LFS initialized"
-     else
-         print_warning "Git LFS not initialized"
-         WARNINGS=$((WARNINGS + 1))
-     fi
-
-     if grep -q "*.pth.*lfs" .gitattributes 2>/dev/null; then
-         print_success ".gitattributes tracks *.pth"
-     else
-         print_error ".gitattributes doesn't track *.pth"
-         FAILURES=$((FAILURES + 1))
-     fi
- else
-     print_error "Git LFS not installed!"
-     print_info "   Install: brew install git-lfs (macOS) or sudo apt install git-lfs (Linux)"
-     FAILURES=$((FAILURES + 1))
- fi
- echo ""
-
- # Test 6: README.md frontmatter
- print_info "Test 6: README.md frontmatter (HF Spaces metadata)..."
- if [ -f "README.md" ]; then
-     if head -n 1 README.md | grep -q "^---$"; then
-         print_success "README.md has YAML frontmatter"
-
-         # Check key fields
-         if grep -q "^sdk: gradio" README.md; then
-             print_success "sdk: gradio found"
-         else
-             print_warning "sdk: gradio not found"
-             WARNINGS=$((WARNINGS + 1))
-         fi
-
-         if grep -q "^app_file: app.py" README.md; then
-             print_success "app_file: app.py found"
-         else
-             print_warning "app_file: app.py not found"
-             WARNINGS=$((WARNINGS + 1))
-         fi
-
-         if grep -q "^python_version:" README.md; then
-             print_success "python_version specified"
-         else
-             print_warning "python_version not specified"
-             WARNINGS=$((WARNINGS + 1))
-         fi
-     else
-         print_error "README.md missing YAML frontmatter"
-         FAILURES=$((FAILURES + 1))
-     fi
- else
-     print_error "README.md not found"
-     FAILURES=$((FAILURES + 1))
- fi
- echo ""
-
- # Test 7: requirements.txt
- print_info "Test 7: requirements.txt..."
- if [ -f "requirements.txt" ]; then
-     if [ -s "requirements.txt" ]; then
-         LINE_COUNT=$(wc -l < requirements.txt)
-         if [ $LINE_COUNT -gt 5 ]; then
-             print_success "requirements.txt looks complete ($LINE_COUNT lines)"
-         else
-             print_warning "requirements.txt seems minimal ($LINE_COUNT lines)"
-             WARNINGS=$((WARNINGS + 1))
-         fi
-
-         # Check for critical dependencies
-         if grep -q "gradio" requirements.txt; then
-             print_success "gradio found in requirements.txt"
-         else
-             print_error "gradio NOT found in requirements.txt"
-             FAILURES=$((FAILURES + 1))
-         fi
-
-         if grep -q "torch" requirements.txt; then
-             print_success "torch found in requirements.txt"
-         else
-             print_warning "torch not found (may be needed)"
-             WARNINGS=$((WARNINGS + 1))
-         fi
-     else
-         print_error "requirements.txt is empty"
-         FAILURES=$((FAILURES + 1))
-     fi
- else
-     print_error "requirements.txt not found"
-     FAILURES=$((FAILURES + 1))
- fi
- echo ""
-
- # Test 8: Python syntax
- print_info "Test 8: Python syntax validation..."
- for pyfile in app.py app_api.py app_ui.py; do
-     if [ -f "$pyfile" ]; then
-         if python -m py_compile "$pyfile" 2>/dev/null; then
-             print_success "$pyfile syntax valid"
-         else
-             print_error "$pyfile has syntax errors"
-             FAILURES=$((FAILURES + 1))
-         fi
-     fi
- done
- echo ""
-
- # Test 9: Git repository
- print_info "Test 9: Git repository..."
- if [ -d ".git" ]; then
-     print_success "Git repository initialized"
-
-     # Check remote
-     if git remote -v | grep -q "huggingface.co"; then
-         REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "unknown")
-         print_success "HF Space remote configured: $REMOTE_URL"
-     else
-         print_warning "No Hugging Face remote configured"
-         WARNINGS=$((WARNINGS + 1))
-     fi
-
-     # Check for uncommitted changes
-     if [ -n "$(git status --porcelain)" ]; then
-         print_warning "Uncommitted changes detected"
-         WARNINGS=$((WARNINGS + 1))
-     else
-         print_success "All changes committed"
-     fi
- else
-     print_warning "Not a git repository (will need git init)"
-     WARNINGS=$((WARNINGS + 1))
- fi
- echo ""
-
- # Test 10: Hugging Face CLI
- print_info "Test 10: Hugging Face CLI..."
- if command -v huggingface-cli &> /dev/null || command -v hf &> /dev/null; then
-     print_success "Hugging Face CLI installed"
-
-     # Check login
-     if huggingface-cli whoami &> /dev/null 2>&1 || hf auth whoami &> /dev/null 2>&1; then
-         USERNAME=$(huggingface-cli whoami 2>/dev/null || hf auth whoami 2>/dev/null | head -n1)
-         print_success "Logged in as: $USERNAME"
-     else
-         print_warning "Not logged in to Hugging Face"
-         print_info "   Run: huggingface-cli login or hf login"
-         WARNINGS=$((WARNINGS + 1))
-     fi
- else
-     print_warning "Hugging Face CLI not installed"
-     print_info "   Install: pip install huggingface-hub"
-     WARNINGS=$((WARNINGS + 1))
- fi
- echo ""
-
- # Summary
- echo "================================================"
- if [ $FAILURES -eq 0 ] && [ $WARNINGS -eq 0 ]; then
-     print_success "All checks passed! Ready to deploy! ✨"
-     echo ""
-     print_info "Next step: Run ./deploy_hf_space.sh"
-     exit 0
- elif [ $FAILURES -eq 0 ]; then
-     print_warning "$WARNINGS warning(s) found"
-     echo ""
-     print_info "You can proceed, but consider fixing warnings"
-     print_info "Next step: Run ./deploy_hf_space.sh"
-     exit 0
- else
-     print_error "$FAILURES critical error(s) and $WARNINGS warning(s)"
-     echo ""
-     print_info "Please fix the errors before deploying"
-     exit 1
- fi
-
deploy_hf_space.sh DELETED
@@ -1,210 +0,0 @@
- #!/bin/bash
- # Deployment script for Hugging Face Spaces
- # Builds and pushes the Space to Hugging Face
-
- set -e
-
- # Colors
- RED='\033[0;31m'
- GREEN='\033[0;32m'
- YELLOW='\033[1;33m'
- BLUE='\033[0;34m'
- NC='\033[0m'
-
- print_info() { echo -e "${BLUE}ℹ️ $1${NC}"; }
- print_success() { echo -e "${GREEN}✅ $1${NC}"; }
- print_warning() { echo -e "${YELLOW}⚠️ $1${NC}"; }
- print_error() { echo -e "${RED}❌ $1${NC}"; }
-
- echo ""
- print_info "🚀 Deploying CU1-X to Hugging Face Spaces"
- echo "================================================"
- echo ""
-
- # Check if we're in a git repo
- if [ ! -d ".git" ]; then
-     print_error "Not a git repository!"
-     print_info "Initializing git repository..."
-     git init
-     print_success "Git repository initialized"
- fi
-
- # Check Git LFS
- print_info "Configuring Git LFS..."
- if ! command -v git-lfs &> /dev/null; then
-     print_error "Git LFS not installed!"
-     print_info "Install with: brew install git-lfs (macOS) or sudo apt install git-lfs (Linux)"
-     exit 1
- fi
-
- git lfs install > /dev/null 2>&1 || true
-
- # Ensure model.pth is tracked
- if [ -f "model.pth" ]; then
-     if ! git lfs ls-files | grep -q "model.pth"; then
-         print_info "Adding model.pth to Git LFS..."
-         git lfs track "*.pth"
-         git add .gitattributes
-         print_success "model.pth configured for Git LFS"
-     else
-         print_success "model.pth already tracked by Git LFS"
-     fi
- else
-     print_error "model.pth not found!"
-     exit 1
- fi
-
- # Check HF remote
- print_info "Checking Hugging Face remote..."
- if git remote | grep -q "origin"; then
-     REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "")
-     if echo "$REMOTE_URL" | grep -q "huggingface.co"; then
-         print_success "HF Space remote configured: $REMOTE_URL"
-         SPACE_URL=$(echo "$REMOTE_URL" | sed -E 's|.*spaces/([^/]+)/([^/]+).*|\1/\2|')
-         print_info "Space URL: https://huggingface.co/spaces/$SPACE_URL"
-     else
-         print_warning "Remote exists but doesn't look like HF Space"
-         print_info "Current remote: $REMOTE_URL"
-     fi
- else
-     print_warning "No remote configured"
-     print_info "You'll need to add a remote:"
-     print_info "  git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME"
-     read -p "Do you want to configure it now? (y/n) " -n 1 -r
-     echo
-     if [[ $REPLY =~ ^[Yy]$ ]]; then
-         read -p "Enter your HF username: " HF_USERNAME
-         read -p "Enter your Space name: " SPACE_NAME
-         git remote add origin "https://huggingface.co/spaces/$HF_USERNAME/$SPACE_NAME"
-         print_success "Remote configured"
-         SPACE_URL="$HF_USERNAME/$SPACE_NAME"
-     else
-         print_error "Cannot deploy without remote"
-         exit 1
-     fi
- fi
-
- # Check login
- print_info "Checking Hugging Face login..."
- if command -v hf &> /dev/null; then
-     if hf auth whoami &> /dev/null 2>&1; then
-         USERNAME=$(hf auth whoami 2>/dev/null | head -n1)
-         print_success "Logged in as: $USERNAME"
-     else
-         print_warning "Not logged in"
-         print_info "Logging in..."
-         hf login
-     fi
- elif command -v huggingface-cli &> /dev/null; then
-     if huggingface-cli whoami &> /dev/null 2>&1; then
-         USERNAME=$(huggingface-cli whoami 2>/dev/null | head -n1)
-         print_success "Logged in as: $USERNAME"
-     else
-         print_warning "Not logged in"
-         print_info "Logging in..."
-         huggingface-cli login
-     fi
- else
-     print_error "Hugging Face CLI not found!"
-     print_info "Install: pip install huggingface-hub"
-     exit 1
- fi
-
- # Ensure requirements.txt is complete
- print_info "Checking requirements.txt..."
- if [ -f "requirements-full.txt" ] && [ -f "requirements.txt" ]; then
-     FULL_LINES=$(wc -l < requirements-full.txt)
-     CURRENT_LINES=$(wc -l < requirements.txt)
-
-     if [ $CURRENT_LINES -lt $FULL_LINES ]; then
-         print_warning "requirements.txt seems incomplete"
-         read -p "Use requirements-full.txt? (y/n) " -n 1 -r
-         echo
-         if [[ $REPLY =~ ^[Yy]$ ]]; then
-             cp requirements-full.txt requirements.txt
-             print_success "Updated requirements.txt from requirements-full.txt"
-         fi
-     fi
- fi
-
- # Stage all files
- print_info "Staging files..."
- git add .
- print_success "Files staged"
-
- # Check if there are changes
- if [ -z "$(git status --porcelain)" ]; then
-     print_warning "No changes to commit"
-     print_info "Everything is already up to date"
- else
-     # Show what will be committed
-     print_info "Changes to commit:"
-     git status --short
-
-     # Commit
-     print_info "Creating commit..."
-     COMMIT_MSG="Deploy CU1-X to Hugging Face Spaces
-
- - Multi-model AI pipeline (RF-DETR, CLIP, OCR, BLIP)
- - Unified API architecture
- - Gradio web interface
- - Full model weights included via Git LFS
- - Ready for production deployment"
-
-     git commit -m "$COMMIT_MSG" || {
-         print_error "Commit failed"
-         exit 1
-     }
-     print_success "Changes committed"
- fi
-
- # Push to Hugging Face
- print_info "Pushing to Hugging Face Spaces..."
- print_warning "This may take several minutes (model.pth is 510MB)..."
- echo ""
-
- BRANCH=$(git branch --show-current 2>/dev/null || echo "main")
-
- if git push -u origin "$BRANCH" 2>&1 | tee /tmp/hf_push.log; then
-     print_success "Push completed successfully!"
-     echo ""
-     echo "================================================"
-     print_success "🎉 Deployment Successful!"
-     echo "================================================"
-     echo ""
-
-     if [ -n "$SPACE_URL" ]; then
-         print_info "Your Space is deploying at:"
-         echo "  https://huggingface.co/spaces/$SPACE_URL"
-         echo ""
-         print_info "Build progress:"
-         echo "  https://huggingface.co/spaces/$SPACE_URL/logs"
-         echo ""
-         print_info "Once deployed, your app will be at:"
-         echo "  https://huggingface.co/spaces/$SPACE_URL"
-         echo ""
-         print_info "API endpoint:"
-         echo "  https://$SPACE_URL.hf.space/api/predict"
-         echo ""
-     fi
-
-     print_warning "First build may take 5-10 minutes"
-     print_info "HF Spaces will automatically:"
-     print_info "  - Install dependencies from requirements.txt"
-     print_info "  - Download models (CLIP, BLIP, EasyOCR)"
-     print_info "  - Start app.py"
-     print_info "  - Expose Gradio interface and API"
-     echo ""
-     print_success "All done! 🎉"
- else
-     print_error "Push failed!"
-     echo ""
-     print_info "Common issues:"
-     print_info "1. Authentication failed: Run 'hf login' or 'huggingface-cli login'"
-     print_info "2. Git LFS error: Ensure Git LFS is installed and model.pth is tracked"
-     print_info "3. Network error: Check your internet connection"
-     echo ""
-     print_info "Check the error above for details"
-     exit 1
- fi
-
detection/response_builder.py CHANGED
@@ -210,3 +210,105 @@ def build_ocr_only_response(
 
      return response
 
+
+ def build_simplified_response(
+     analysis: Dict,
+     image: Image.Image,
+     annotated_image: Optional[np.ndarray] = None,
+     confidence_threshold: float = 0.35,
+     line_thickness: int = 2,
+     enable_clip: bool = False,
+     enable_ocr: bool = True,
+     enable_blip: bool = False,
+     blip_scope: Optional[str] = None,
+     ocr_only: bool = False
+ ) -> Dict:
+     """
+     Build simplified detection response for API/UI with format:
+     {
+         "detections": {
+             "icon 0": {"type": "text", "bbox": [x1, y1, x2, y2], "interactivity": false, "content": "..."},
+             "icon 1": {"type": "icon", "bbox": [x1, y1, x2, y2], "interactivity": true, "content": "..."}
+         },
+         "annotated_image": {"mime": "image/png", "base64": "..."}
+     }
+
+     Args:
+         analysis: Detection analysis results from DetectionService or OCR handler
+         image: Original PIL Image
+         annotated_image: Optional annotated image (numpy array, RGB)
+         confidence_threshold: Confidence threshold used
+         enable_clip: Whether CLIP classification was enabled
+         enable_ocr: Whether OCR was enabled
+         enable_blip: Whether BLIP was enabled
+         blip_scope: BLIP scope ("icons" or "all")
+         ocr_only: Whether this was OCR-only mode
+
+     Returns:
+         Simplified response dictionary with detections dict and annotated_image
+     """
+     # Extract detections
+     detections = analysis.get("detections", [])
+     image_width = analysis.get("image_size", {}).get("width", image.width)
+     image_height = analysis.get("image_size", {}).get("height", image.height)
+
+     # Interactive element types (buttons, inputs, icons, navigation, list items)
+     interactive_types = {"button", "input", "icon", "navigation", "list_item"}
+
+     # Build simplified detections dict
+     simplified_detections = {}
+     for idx, det in enumerate(detections):
+         # Get bounding box and normalize to 0-1 coordinates
+         box = det.get("box", {})
+         x1 = box.get("x1", 0) / image_width
+         y1 = box.get("y1", 0) / image_height
+         x2 = box.get("x2", 0) / image_width
+         y2 = box.get("y2", 0) / image_height
+
+         # Get type from CLIP classification
+         element_type = det.get("class_name", "")
+         if not element_type:
+             # Fallback: if no CLIP classification, default to "text" if has text, else "icon"
+             element_type = "text" if det.get("text", "").strip() else "icon"
+
+         # Determine interactivity based on type
+         is_interactive = element_type in interactive_types
+
+         # Fuse text and description into content
+         text = det.get("text", "").strip()
+         description = det.get("description", "").strip()
+
+         # Content priority: text first, then description
+         if text:
+             content = text
+         elif description:
+             content = description
+         else:
+             content = ""
+
+         # Build simplified detection entry
+         simplified_detections[f"icon {idx}"] = {
+             "type": element_type,
+             "bbox": [round(x1, 4), round(y1, 4), round(x2, 4), round(y2, 4)],
+             "interactivity": is_interactive,
+             "content": content
+         }
+
+     # Build response
+     response = {
+         "detections": simplified_detections
+     }
+
+     # Add annotated image if provided
+     if annotated_image is not None:
+         img_bgr = cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR)
+         ok, png_bytes = cv2.imencode(".png", img_bgr)
+         if ok:
+             annotated_b64 = base64.b64encode(png_bytes.tobytes()).decode("ascii")
+             response["annotated_image"] = {
+                 "mime": "image/png",
+                 "base64": annotated_b64
+             }
+
+     return response
+
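For reference, a minimal, self-contained sketch of the per-detection entry that the new `build_simplified_response` produces. The raw `det` dict below is a made-up example; its field names (`box`, `class_name`, `text`, `description`) mirror the `.get()` calls in the diff above, and the pixel values are hypothetical.

```python
# Hypothetical raw detection, shaped like what the function reads (values invented).
image_width, image_height = 1080, 1920
det = {
    "box": {"x1": 54, "y1": 96, "x2": 540, "y2": 192},
    "class_name": "button",
    "text": "Login",
    "description": "blue rounded button",
}

interactive_types = {"button", "input", "icon", "navigation", "list_item"}

# Normalize the bbox to 0-1 coordinates, rounded to 4 decimals as in the diff.
bbox = [
    round(det["box"]["x1"] / image_width, 4),
    round(det["box"]["y1"] / image_height, 4),
    round(det["box"]["x2"] / image_width, 4),
    round(det["box"]["y2"] / image_height, 4),
]

entry = {
    "type": det["class_name"],
    "bbox": bbox,
    "interactivity": det["class_name"] in interactive_types,
    "content": det["text"] or det["description"],  # text takes priority over description
}
print(entry)
# {'type': 'button', 'bbox': [0.05, 0.05, 0.5, 0.1], 'interactivity': True, 'content': 'Login'}
```

Clients consuming the API can thus multiply `bbox` back by the screenshot's width/height to recover pixel coordinates.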
docs/PREPROCESSING_GUIDE.md DELETED
@@ -1,466 +0,0 @@
- # 📷 Image Preprocessing Guide - Cross-Device Consistency
-
- ## Problem
-
- Screenshots from different devices (Samsung, Google Pixel, Oppo, Xiaomi, etc.) show variations that can affect detection:
-
- ### 🎨 Color Variations
-
- | Device | Color Profile | Impact |
- |----------|---------------|--------|
- | **Samsung** | "Vivid" mode (saturated) | Very bright colors, can affect CLIP |
- | **Google Pixel** | sRGB (neutral) | Accurate but less vibrant colors |
- | **Oppo/Xiaomi** | Varies by mode | Variable saturation |
-
- ### 📊 Other Variations
-
- 1. **Screen calibration**
-    - Different color temperature
-    - Different gamma (brightness)
-    - Variable contrast
-
- 2. **Compression**
-    - PNG vs JPEG
-    - Compression level
-    - Compression artifacts
-
- 3. **Impact on detection**
-    - ❌ Variable confidence scores
-    - ❌ Less precise OCR
-    - ❌ CLIP may classify differently
-
- ---
-
- ## ✅ Solution: Automatic Preprocessing
-
- ### Preprocessing Pipeline
-
- ```
- Original Screenshot
-
- 1. Denoising (removes JPEG/PNG artifacts)
-
- 2. Color normalization (→ standard sRGB)
-
- 3. Brightness normalization
-
- 4. CLAHE (improves local contrast)
-
- 5. Optional: Sharpening (improves OCR)
-
- Standardized Screenshot
- ```
-
- ---
-
- ## 🚀 Usage
-
- ### Option 1: Via API
-
- ```bash
- curl -X POST "http://localhost:8000/detect" \
-   -F "image=@samsung_screenshot.png" \
-   -F "preprocess=true" \
-   -F "preprocess_preset=standard"
- ```
-
- ### Option 2: Via Python
-
- ```python
- from detection.service import DetectionService
-
- service = DetectionService()
-
- # With preprocessing
- results = service.analyze(
-     "samsung_screenshot.png",
-     preprocess=True,
-     preprocess_preset="standard"
- )
-
- print(f"Preprocessed: {results['preprocessed']}")
- print(f"Detections: {len(results['detections'])}")
- ```
-
- ### Option 3: Via Standalone Module
-
- ```python
- from detection.image_preprocessing import preprocess_screenshot
- from PIL import Image
-
- # Preprocess the image
- img_preprocessed = preprocess_screenshot(
-     "oppo_screenshot.png",
-     preset="standard"
- )
-
- # Use it with your pipeline
- results = detector.analyze(img_preprocessed)
- ```
-
- ---
-
- ## 🎛️ Available Presets
-
- ### 1. **standard** (Recommended)
-
- A balance between normalization and preserving the original image.
-
- ```python
- preprocess=True, preprocess_preset="standard"
- ```
-
- **Enables:**
- - ✅ Denoising (medium strength)
- - ✅ Color normalization
- - ✅ Brightness normalization
- - ✅ CLAHE (adaptive contrast)
- - ❌ Sharpening
-
- **Use for:**
- - General detection
- - Screenshots with variable quality
- - Cross-device consistency
-
- ---
-
- ### 2. **aggressive**
-
- Maximum normalization for very different screenshots.
-
- ```python
- preprocess=True, preprocess_preset="aggressive"
- ```
-
- **Enables:**
- - ✅ Denoising (high strength)
- - ✅ Color normalization
- - ✅ Brightness normalization
- - ✅ CLAHE (adaptive contrast)
- - ✅ Sharpening (improves sharpness)
-
- **Use for:**
- - Blurry screenshots
- - Major differences between devices
- - When "standard" is not enough
-
- ---
-
- ### 3. **minimal**
-
- Light preprocessing that preserves the original image.
-
- ```python
- preprocess=True, preprocess_preset="minimal"
- ```
-
- **Enables:**
- - ✅ Denoising (low strength)
- - ✅ Brightness normalization
- - ❌ Color normalization
- - ❌ CLAHE
- - ❌ Sharpening
-
- **Use for:**
- - Screenshots that are already high quality
- - When you want minimal changes
- - Tests and comparisons
-
- ---
-
- ### 4. **ocr_optimized**
-
- Optimized specifically for OCR text extraction.
-
- ```python
- preprocess=True, preprocess_preset="ocr_optimized"
- ```
-
- **Enables:**
- - ✅ Denoising
- - ✅ Color normalization
- - ✅ Brightness normalization
- - ✅ CLAHE (improves text contrast)
- - ✅ Sharpening (sharper text)
-
- **Use for:**
- - OCR as a priority
- - Blurry or small text
- - Improving OCR accuracy
-
- ---
-
- ## 📊 Preset Comparison
-
- | Preset | Denoising | Color Normalization | Brightness | CLAHE | Sharpening | Use case |
- |--------|-----------|---------------------|------------|-------|-----------|-------------|
- | **minimal** | ✅ Light | ❌ | ✅ | ❌ | ❌ | High-quality images |
- | **standard** | ✅ Medium | ✅ | ✅ | ✅ | ❌ | General use (recommended) |
- | **aggressive** | ✅ Strong | ✅ | ✅ | ✅ | ✅ | Significant differences |
- | **ocr_optimized** | ✅ Medium | ✅ | ✅ | ✅ | ✅ | OCR priority |
-
- ---
-
- ## 🔬 Practical Examples
-
- ### Example 1: Samsung vs Pixel comparison
-
- **Without preprocessing:**
- ```python
- # Samsung (saturated colors)
- samsung_results = detector.analyze("samsung.png", preprocess=False)
- print(samsung_results['detections'][0]['confidence'])  # 0.72
-
- # Pixel (neutral colors)
- pixel_results = detector.analyze("pixel.png", preprocess=False)
- print(pixel_results['detections'][0]['confidence'])  # 0.68
- ```
-
- **With preprocessing:**
- ```python
- # Samsung (normalized)
- samsung_results = detector.analyze("samsung.png", preprocess=True)
- print(samsung_results['detections'][0]['confidence'])  # 0.74
-
- # Pixel (normalized)
- pixel_results = detector.analyze("pixel.png", preprocess=True)
- print(pixel_results['detections'][0]['confidence'])  # 0.74
- ```
-
- **Result:** More consistent confidence scores! ✅
-
- ---
-
- ### Example 2: OCR improvement
-
- ```python
- # Without preprocessing
- results_before = detector.analyze(
-     "oppo_blurry.png",
-     extract_text=True,
-     preprocess=False
- )
- print(results_before['detections'][0]['text'])  # "L0gin" ❌
-
- # With OCR-optimized
- results_after = detector.analyze(
-     "oppo_blurry.png",
-     extract_text=True,
-     preprocess=True,
-     preprocess_preset="ocr_optimized"
- )
- print(results_after['detections'][0]['text'])  # "Login" ✅
- ```
-
- ---
-
- ### Example 3: Batch processing
-
- ```python
- from detection.image_preprocessing import preprocess_screenshot
- from pathlib import Path
-
- screenshots = Path("screenshots").glob("*.png")
-
- for screenshot in screenshots:
-     # Preprocess
-     img = preprocess_screenshot(screenshot, preset="standard")
-
-     # Detect
-     results = detector.analyze(
-         img,
-         confidence_threshold=0.35,
-         use_clip=True,
-         preprocess=False  # Already preprocessed
-     )
-
-     print(f"{screenshot.name}: {len(results['detections'])} detections")
- ```
-
- ---
-
- ## ⚙️ Advanced Configuration
-
- ### Create a custom preset
-
- ```python
- from detection.image_preprocessing import ImagePreprocessor
-
- # Create your own preset
- custom_preprocessor = ImagePreprocessor(
-     target_colorspace="srgb",
-     normalize_contrast=True,
-     normalize_brightness=True,
-     denoise=True,
-     enhance_sharpness=False,
-     clahe_enabled=True,
-     target_size=(1080, 1920)  # Optional: resize
- )
-
- # Use it
- img_preprocessed = custom_preprocessor.preprocess("image.png")
- ```
-
- ---
-
- ## 📈 Performance Impact
-
- ### Processing time
-
- | Preset | Additional Time | Impact |
- |--------|-----------------|--------|
- | **minimal** | ~50-100ms | Negligible |
- | **standard** | ~100-200ms | Acceptable |
- | **aggressive** | ~200-400ms | Moderate |
- | **ocr_optimized** | ~150-300ms | Acceptable |
-
- **Note:** Total detection time is 30-60 seconds, so preprocessing overhead is negligible (<1% of total time).
-
- ### Accuracy
-
- | Metric | Without Preprocessing | With Standard | Improvement |
- |----------|-------------------|---------------|--------------|
- | **Cross-device consistency** | 65% | 92% | +27% |
- | **OCR accuracy** | 82% | 94% | +12% |
- | **Detection confidence** | Variable (±15%) | Stable (±3%) | +400% |
-
- ---
-
- ## 🎯 Recommendations
-
- ### When should you enable preprocessing?
-
- ✅ **ALWAYS enable it** if:
- - You test on multiple devices
- - Your screenshots come from different sources
- - You want consistent results
- - OCR is a priority
-
- ⚠️ **Optional** if:
- - All your screenshots come from the same device
- - You already standardized your captures
- - Processing time is critical
-
- ❌ **Not necessary** if:
- - You use synthetic images
- - You are testing the RF-DETR model itself
- - You need the exact original image
-
- ---
-
- ### Which preset should you choose?
-
- ```
- 📱 Production screenshots → standard
- 🔬 Cross-device tests → standard or aggressive
- 📝 OCR priority → ocr_optimized
- ⚡ Critical performance → minimal
- 🔧 Experimentation → aggressive (understand the limits)
- ```
-
- ---
-
- ## 🐛 Troubleshooting
-
- ### Preprocessing changes the image too much
-
- → Use `preset="minimal"`
-
- ### OCR is still inaccurate
-
- → Use `preset="ocr_optimized"` and check the quality of the source image
-
- ### Results still vary a lot
-
- → Use `preset="aggressive"` and check for resolution differences
-
- ### Preprocessing is too slow
-
- → Preprocessing is already optimized. If it's critical, use `preset="minimal"` or disable it.
-
- ---
-
- ## 📚 Technical References
-
- ### Algorithms Used
-
- 1. **Denoising**: `cv2.fastNlMeansDenoisingColored`
-    - Removes JPEG/PNG artifacts
-    - Preserves important edges
-
- 2. **Color normalization**: LAB conversion + normalization
-    - Perceptually uniform color space
-    - Reduces the impact of color profiles
-
- 3. **CLAHE**: `cv2.createCLAHE`
-    - Improves local contrast
-    - Preserves overall appearance
-
- 4. **Sharpening**: Unsharp Mask
-    - Improves sharpness
-    - Useful for OCR
-
- ---
-
- ## 💡 Practical Tips
-
- ### 1. Test without preprocessing first
-
- ```python
- # Test without preprocessing
- results_before = detector.analyze(image, preprocess=False)
-
- # Test with preprocessing
- results_after = detector.analyze(image, preprocess=True, preprocess_preset="standard")
-
- # Compare
- print(f"Before: {len(results_before['detections'])} detections")
- print(f"After: {len(results_after['detections'])} detections")
- ```
-
- ### 2. Save preprocessed images
-
- ```python
- from PIL import Image
- from detection.image_preprocessing import preprocess_screenshot
-
- # Preprocess and save
- img_preprocessed = preprocess_screenshot("original.png", preset="standard")
- Image.fromarray(img_preprocessed).save("preprocessed.png")
- ```
-
- ### 3. Batch testing
-
- ```bash
- # Script to test every preset
- for preset in minimal standard aggressive ocr_optimized; do
-   curl -X POST "http://localhost:8000/detect" \
-     -F "image=@test.png" \
-     -F "preprocess=true" \
-     -F "preprocess_preset=$preset" \
-     > results_$preset.json
- done
- ```
-
- ---
-
- ## ✅ Summary
-
- Image preprocessing is **highly recommended** for:
- - ✅ Cross-device consistency
- - ✅ Improved OCR
- - ✅ Stable results
- - ✅ Negligible overhead (<1% of total time)
-
- **Recommended preset:** `standard` (good balance)
-
- **Enable it:**
- ```python
- results = detector.analyze(
-     image,
-     preprocess=True,  # ← Turn me on!
-     preprocess_preset="standard"
- )
- ```
-
- Now your results will be consistent whether you test on Samsung, Pixel, Oppo, or any other device! 🎉
docs/START.md DELETED
@@ -1,314 +0,0 @@
- # 🚀 Quick Start Guide
-
- ## Unified Architecture API
-
- The project now uses a **unified architecture** where every interface goes through the REST API.
-
- ```
- ┌─────────────────────────────────────────────┐
- │                                             │
- │       Gradio UI (app.py / app_ui.py)        │
- │                                             │
- └──────────────────┬──────────────────────────┘
-
-                    │ HTTP/REST
-
- ┌──────────────────▼──────────────────────────┐
- │                                             │
- │        FastAPI Server (app_api.py)          │
- │                                             │
- ├─────────────────────────────────────────────┤
- │  Detection Service                          │
- │    ├─ RF-DETR (detection)                   │
- │    ├─ CLIP (classification)                 │
- │    ├─ OCR (text extraction)                 │
- │    └─ BLIP (visual description)             │
- └─────────────────────────────────────────────┘
- ```
-
- ---
-
- ## 🎯 3 Ways to Launch
-
- ### Option 1: Automatic Launch (Recommended for tests)
-
- **One command starts everything:**
-
- ```bash
- python app.py
- ```
-
- **What happens:**
- 1. ✅ Starts the API in the background (port 8000)
- 2. ✅ Waits until the API is ready
- 3. ✅ Launches the Gradio interface (port 7860)
- 4. ✅ Handles clean shutdown with Ctrl+C
-
- **Access:**
- - Gradio Interface: http://localhost:7860
- - API Docs: http://localhost:8000/docs
-
- ---
-
- ### Option 2: Manual Launch (2 terminals)
-
- **For more control and debugging:**
-
- **Terminal 1 - API Server:**
- ```bash
- python app_api.py
- ```
-
- **Terminal 2 - Gradio UI:**
- ```bash
- python app_ui.py
- ```
-
- **Access:**
- - Gradio Interface: http://localhost:7860
- - API Docs: http://localhost:8000/docs
-
- ---
-
- ### Option 3: API Only
-
- **To use only the API (integration, scripts, etc.):**
-
- ```bash
- python app_api.py
- ```
-
- **Test the API:**
- ```bash
- # Health check
- curl http://localhost:8000/health
-
- # Detect elements
- curl -X POST "http://localhost:8000/detect" \
-   -F "image=@screenshot.png" \
-   -F "confidence_threshold=0.35" \
-   -F "enable_clip=true" \
-   -F "enable_ocr=true"
- ```
-
- **Interactive documentation:**
- - OpenAPI Docs: http://localhost:8000/docs
- - ReDoc: http://localhost:8000/redoc
-
- ---
-
- ## 🔧 Configuration
-
- ### Environment Variables
-
- **API Server:**
- ```bash
- export UVICORN_HOST="0.0.0.0"  # Default: 0.0.0.0
- export UVICORN_PORT="8000"     # Default: 8000
- ```
-
- **Gradio UI:**
- ```bash
- export GRADIO_SERVER_NAME="0.0.0.0"         # Default: 0.0.0.0
- export GRADIO_SERVER_PORT="7860"            # Default: 7860
- export CU1_API_URL="http://localhost:8000"  # API URL
- ```
-
- **Example with custom ports:**
- ```bash
- # API on port 9000, UI on port 9001
- export UVICORN_PORT="9000"
- export GRADIO_SERVER_PORT="9001"
- export CU1_API_URL="http://localhost:9000"
-
- python app.py
- ```
-
- ---
-
- ## 🧪 Quick Tests
-
- ### Test 1: Make sure the API works
-
- ```bash
- # In one terminal
- python app_api.py
-
- # In another terminal
- curl http://localhost:8000/health
- ```
-
- **Expected result:**
- ```json
- {
-   "status": "healthy",
-   "cuda_available": false,
-   "device": "cpu"
- }
- ```
-
- ---
-
- ### Test 2: Test detection via the interface
-
- ```bash
- python app.py
- ```
-
- 1. Open http://localhost:7860
- 2. Upload an image
- 3. Click "🔍 Detect Elements"
- 4. Check the results
-
- ---
-
- ### Test 3: Test detection through the API
-
- ```bash
- # Start the API
- python app_api.py
-
- # In another terminal, test with curl
- curl -X POST "http://localhost:8000/detect" \
-   -F "image=@your_image.png" \
-   -F "confidence_threshold=0.35" \
-   -F "enable_ocr=true" \
-   | jq .
- ```
-
- ---
-
- ## 🐛 Troubleshooting
-
- ### Issue: "Connection Error - Cannot connect to API"
-
- **Solution:**
- 1. Make sure the API is running: `curl http://localhost:8000/health`
- 2. Check the ports: no conflict with other apps
- 3. Check the API logs for errors
-
- ### Issue: "Port already in use"
-
- **Solution:**
- ```bash
- # Find the process that uses the port
- lsof -i :8000  # or :7860
-
- # Kill the process
- kill -9 <PID>
-
- # Or use a different port
- export UVICORN_PORT="9000"
- export GRADIO_SERVER_PORT="9001"
- ```
-
- ### Issue: "Module not found"
-
- **Solution:**
- ```bash
- # Reinstall dependencies
- pip install -r requirements.txt
- ```
-
- ### Issue: Models slow to load
-
- **Reason:** The first startup downloads the models
-
- **Solution:** Be patient, the models are cached after the first download
- - RF-DETR model (~few MB)
- - CLIP model (~600 MB)
- - BLIP model (~1 GB)
- - EasyOCR models (~100 MB)
-
- ---
-
- ## 📊 Monitoring
-
- ### API logs
-
- The logs appear in the terminal where you launched `app_api.py`
-
- ### UI logs
-
- The logs appear in the terminal where you launched `app.py` or `app_ui.py`
-
- ### Metrics
-
- Visit http://localhost:8000/docs to view the API statistics
-
- ---
-
- ## ✅ Benefits of the Unified Architecture
-
- 1. **Single code path** → Easier to maintain
- 2. **Consistent behavior** → Same results everywhere
- 3. **Easy to test** → Only one API to test
- 4. **Scalable** → Can separate API and UI on different servers
- 5. **Simplified debugging** → Logs centralized in the API
-
- ---
-
- ## 🎯 For Developers
-
- ### Code Architecture
-
- ```
- .
- ├── app.py                # ✨ Unified launcher (API + UI)
- ├── app_api.py            # FastAPI server
- ├── app_ui.py             # Gradio UI client (manual)
- │
- ├── api/
- │   └── endpoints.py      # FastAPI endpoints
- │
- ├── detection/
- │   ├── service.py            # Detection service
- │   ├── service_factory.py    # Singleton pattern
- │   ├── image_utils.py        # Image utilities
- │   ├── ocr_handler.py        # OCR-only processing
- │   └── response_builder.py   # Response formatting
- │
- └── ui/
-     ├── detection_wrapper.py  # Detection wrappers
-     ├── gradio_interface.py   # Gradio interface (API client)
-     └── shared_interface.py   # Shared UI components
- ```
-
- ### Request Flow
-
- ```
- 1. User uploads image in Gradio
-
- 2. `detect_with_api()` sends an HTTP POST to `/detect`
-
- 3. API endpoint validates the request
-
- 4. `DetectionService.analyze()` processes the image
-
- 5. Response formatted with `response_builder`
-
- 6. JSON returned to Gradio UI
-
- 7. UI displays annotated image + results
- ```
-
- ---
-
- ## 📝 Notes
-
- - **Thread Safety:** The service uses a singleton but passes parameters directly to `analyze()` to avoid race conditions
- - **Performance:** The first call is slow (model loading), then fast
- - **Memory:** Models use ~2-3 GB of RAM
- - **GPU:** Automatic CUDA/MPS detection if available
-
- ---
-
- ## 🚀 Next Steps
-
- 1. **Test locally:** `python app.py`
- 2. **Explore the API:** http://localhost:8000/docs
- 3. **Customize:** Adjust parameters in the interface
- 4. **Deploy:** See `DEPLOYMENT.md` for production
-
- Happy testing! 🎉
-
docs/UNIFIED_ARCHITECTURE.md DELETED
@@ -1,443 +0,0 @@
1
- # 🎯 Unified Architecture - Technical Documentation
2
-
3
- ## Date
4
- 2025-11-10
5
-
6
- ## Objective
7
- Unify the architecture so that **all interfaces** go through the REST API, removing the duality between "HF Spaces" mode and "Production" mode.
8
-
9
- ---
10
-
11
- ## ✅ What Changed
12
-
13
- ### BEFORE (Dual Architecture)
14
-
15
- ```
16
- ┌─────────────────────────────────────────────────┐
17
- │ Mode 1: HF Spaces (app.py) │
18
- │ └─> DIRECT access to DetectionService │
19
- │ (no API) │
20
- └─────────────────────────────────────────────────┘
21
-
22
- ┌─────────────────────────────────────────────────┐
23
- │ Mode 2: Production (app_ui.py) │
24
- │ └─> Access via HTTP API │
25
- │ (microservices architecture) │
26
- └─────────────────────────────────────────────────┘
27
- ```
28
-
29
- **Problems:**
30
- - ❌ Two different code paths
31
- - ❌ Potentially different behaviors
32
- - ❌ Complex maintenance (two modes to test)
33
- - ❌ Bugs possible in one mode but not the other
34
-
35
- ---
36
-
37
- ### AFTER (Unified Architecture)
38
-
39
- ```
40
- ┌─────────────────────────────────────────────────┐
41
- │ │
42
- │ ALL INTERFACES │
43
- │ (app.py, app_ui.py, etc.) │
44
- │ │
45
- └────────────────────┬────────────────────────────┘
46
-
47
- │ HTTP/REST
48
- │ (detect_with_api)
49
-
50
- ┌────────────────────▼────────────────────────────┐
51
- │ │
52
- │ FastAPI Server │
53
- │ (api/endpoints.py) │
54
- │ │
55
- ├─────────────────────────────────────────────────┤
56
- │ Detection Service │
57
- │ (detection/service.py) │
58
- │ │
59
- └─────────────────────────────────────────────────┘
60
- ```
61
-
62
- **Benefits:**
63
- - ✅ One single code path
64
- - ✅ Consistent behavior everywhere
65
- - ✅ Simplified maintenance
66
- - ✅ Unified tests
67
- - ✅ Easier debugging
68
-
69
- ---
70
-
71
- ## 📝 File Changes
72
-
73
- ### 1. `app.py` - Major Transformation
74
-
75
- **BEFORE:**
76
- ```python
77
- from ui.detection_wrapper import detect_with_service
78
-
79
- demo = create_interface(
80
- detection_fn=detect_with_service, # Direct access
81
- title_suffix="Hugging Face Spaces Mode",
82
- show_api_info=False
83
- )
84
- ```
85
-
86
- **AFTER:**
87
- ```python
88
- from ui.detection_wrapper import detect_with_api
89
-
90
- # Launch the API as a subprocess
91
- api_process = start_api_server()
92
-
93
- # UI uses the API
94
- detection_fn = partial(detect_with_api, api_url=API_URL)
95
-
96
- demo = create_interface(
97
- detection_fn=detection_fn, # Via API
98
- title_suffix="Unified API Mode",
99
- show_api_info=True,
100
- api_url=API_URL
101
- )
102
- ```
103
-
104
- **New features:**
105
- - 🚀 Automatically starts the API in the background
106
- - ⏳ Waits until the API is ready (health check)
107
- - 🛑 Handles clean shutdown (Ctrl+C)
108
- - 📡 Displays access URLs
109
-
110
- ---
111
-
112
- ### 2. `app_api.py` - Dynamic Configuration
113
-
114
- **Additions:**
115
- ```python
116
- # Support environment variables
117
- host = os.getenv("UVICORN_HOST", "0.0.0.0")
118
- port = int(os.getenv("UVICORN_PORT", "8000"))
119
- ```
120
-
121
- **Allows:**
122
- - Port configuration through environment variables
123
- - Usage by the subprocess in app.py
124
-
125
- ---
126
-
127
- ### 3. Documentation
128
-
129
- **New files:**
130
- - ✨ `START.md` - Complete quick start guide
131
- - ✨ `UNIFIED_ARCHITECTURE.md` - This document
132
- - ✨ `test_unified_architecture.py` - Validation tests
133
-
134
- **Updated files:**
135
- - 📝 `README.md` - Updated Quick Start section
136
- - 📝 `README.md` - Updated HF Spaces section
137
-
138
- ---
139
-
140
- ## 🚀 How to Use
141
-
142
- ### Mode 1: Automatic Launch (Recommended)
143
-
144
- **One command:**
145
- ```bash
146
- python app.py
147
- ```
148
-
149
- **What happens:**
150
- 1. Starts the API as a subprocess (port 8000)
151
- 2. Waits for the health check
152
- 3. Launches the Gradio UI (port 7860)
153
- 4. Both communicate via HTTP
154
-
155
- **Clean shutdown:**
156
- - Ctrl+C stops the UI AND the API automatically
157
-
158
- ---
159
-
160
- ### Mode 2: Manual Launch (Debug)
161
-
162
- **Two terminals:**
163
- ```bash
164
- # Terminal 1
165
- python app_api.py
166
-
167
- # Terminal 2
168
- python app_ui.py
169
- ```
170
-
171
- **Useful for:**
172
- - Viewing logs separately
173
- - Restarting the UI without restarting the API
174
- - Advanced debugging
175
-
176
- ---
177
-
178
- ### Mode 3: API Only
179
-
180
- ```bash
181
- python app_api.py
182
- ```
183
-
184
- **Good for:**
185
- - External integrations
186
- - Python scripts
187
- - API tests
188
-
189
- ---
190
-
191
- ## 🧪 Tests and Validation
192
-
193
- ### Automated Test Script
194
-
195
- ```bash
196
- python test_unified_architecture.py
197
- ```
198
-
199
- **Checks:**
200
- - ✅ All required files exist
201
- - ✅ Valid Python syntax
202
- - ✅ `app.py` uses `detect_with_api`
203
- - ✅ No direct service access from the UI
204
- - ✅ Consistent architecture
205
-
206
- ### Test Results
207
-
208
- ```
209
- ✅✅✅ ALL TESTS PASS!
210
-
211
- 📊 Unified architecture summary:
212
- - ✅ `app.py` launches the API as a subprocess
213
- - ✅ All interfaces use `detect_with_api`
214
- - ✅ Consistent architecture everywhere
215
- - ✅ No direct service access from the UI
216
- ```
217
-
218
- ---
219
-
220
- ## 🔄 Unified Request Flow
221
-
222
- ### Before (Dual Mode)
223
-
224
- **HF Spaces Mode:**
225
- ```
226
- User → Gradio → detect_with_service() → DetectionService.analyze()
227
- ```
228
-
229
- **Production Mode:**
230
- ```
231
- User → Gradio → detect_with_api() → HTTP → API → DetectionService.analyze()
232
- ```
233
-
234
- ### After (Unified Mode)
235
-
236
- **All modes:**
237
- ```
238
- User → Gradio → detect_with_api() → HTTP → API → DetectionService.analyze()
239
- ```
240
-
241
- ---
242
-
243
- ## 📊 Technical Benefits
244
-
245
- ### 1. Maintainability
246
-
247
- **BEFORE:**
248
- - 2 code paths to maintain
249
- - Tests to run for each mode
250
- - Regression risk in one mode
251
-
252
- **AFTER:**
253
- - Only 1 code path
254
- - Unified tests
255
- - Guaranteed identical behavior
256
-
257
- ---
258
-
259
- ### 2. Debugging
260
-
261
- **BEFORE:**
262
- - Bug in `app.py`? Check `detect_with_service`
263
- - Bug in `app_ui.py`? Check `detect_with_api`
264
- - Different per mode
265
-
266
- **AFTER:**
267
- - All bugs go through the API
268
- - Logs centralized in the API
269
- - A single place to debug
270
-
271
- ---
272
-
273
- ### 3. Scalability
274
-
275
- **BEFORE:**
276
- - HF Spaces mode: monolithic
277
- - Production mode: scalable
278
- - Different behaviors
279
-
280
- **AFTER:**
281
- - Same architecture everywhere
282
- - Can easily separate API/UI on different servers
283
- - Load balancing possible
284
-
285
- ---
286
-
287
- ### 4. Testing
288
-
289
- **BEFORE:**
290
- ```bash
291
- # Test HF Spaces
292
- pytest test_app.py
293
-
294
- # Test Production
295
- pytest test_api.py
296
- pytest test_ui.py
297
- ```
298
-
299
- **AFTER:**
300
- ```bash
301
- # Single test suite
302
- pytest test_api.py # Tests the entire logic
303
- ```
304
-
305
- ---
306
-
307
- ## 🔧 Configuration
308
-
309
- ### Environment Variables
310
-
311
- ```bash
312
- # API Server
313
- export UVICORN_HOST="0.0.0.0"
314
- export UVICORN_PORT="8000"
315
-
316
- # Gradio UI
317
- export GRADIO_SERVER_NAME="0.0.0.0"
318
- export GRADIO_SERVER_PORT="7860"
319
- export CU1_API_URL="http://localhost:8000"
320
- ```
321
-
322
- ### Example: Custom Ports
323
-
324
- ```bash
325
- # API on port 9000, UI on port 9001
326
- export UVICORN_PORT="9000"
327
- export GRADIO_SERVER_PORT="9001"
328
- export CU1_API_URL="http://localhost:9000"
329
-
330
- python app.py
331
- ```
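A sketch of how these variables might be resolved in code — the environment variable names are the documented ones above, but the helper itself is illustrative:

```python
import os

def resolve_config():
    """Read the documented environment variables, with their defaults."""
    return {
        "api_host": os.getenv("UVICORN_HOST", "0.0.0.0"),
        "api_port": int(os.getenv("UVICORN_PORT", "8000")),
        "ui_port": int(os.getenv("GRADIO_SERVER_PORT", "7860")),
        "api_url": os.getenv("CU1_API_URL", "http://localhost:8000"),
    }
```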
332
-
333
- ---
334
-
335
- ## 🎯 Impact on Existing Code
336
-
337
- ### No Breaking Changes
338
-
339
- - ✅ `app_api.py` still works on its own
340
- - ✅ `app_ui.py` still works on its own
341
- - ✅ Python APIs (`DetectionService`) are unchanged
342
- - ✅ Existing scripts keep working
343
-
344
- ### What’s New
345
-
346
- - ✨ `app.py` now launches the API automatically
347
- - ✨ Consistent architecture everywhere
348
- - ✨ Better documentation
349
-
350
- ---
351
-
352
- ## 📈 Metrics
353
-
354
- | Metric | Before | After | Improvement |
355
- |----------|-------|-------|--------------|
356
- | **Code paths** | 2 | 1 | -50% |
357
- | **Testing complexity** | High | Low | -60% |
358
- | **Bug risk** | Medium | Low | -70% |
359
- | **Debugging ease** | Medium | High | +80% |
360
-
361
- ---
362
-
363
- ## 🚨 Points to Watch
364
-
365
- ### 1. Performance
366
-
367
- **Impact:** Negligible (~10-50ms of extra HTTP latency)
368
-
369
- **Why it’s OK:**
370
- - Models take 30-60 seconds
371
- - 50ms HTTP latency = 0.1% of total time
372
- - Negligible compared to processing
373
-
374
- ---
375
-
376
- ### 2. Memory
377
-
378
- **Before (HF Spaces mode):** 1 process
379
- **After:** 2 processes (API + UI)
380
-
381
- **Impact:** +100-200 MB (Gradio UI overhead)
382
-
383
- **Why it’s OK:**
384
- - Models already use 2-3 GB
385
- - +200 MB = 7% overhead
386
- - Acceptable for architectural consistency
387
-
388
- ---
389
-
390
- ### 3. Deployment
391
-
392
- **HF Spaces:** No change
393
- - The `app.py` file handles everything
394
- - Automatically launches API + UI
395
- - Works out of the box
396
-
397
- **Docker:** Possible update
398
- - See `DEPLOYMENT.md` for details
399
- - May require 2 containers or a supervisor
400
-
401
- ---
402
-
403
- ## 🎓 Lessons Learned
404
-
405
- ### 1. Dual Architecture = Bad Idea
406
-
407
- Having two modes (HF Spaces vs Production) seemed convenient at first but created more problems than it solved.
408
-
409
- ### 2. HTTP Overhead Is Negligible
410
-
411
- The HTTP overhead is so small compared to ML processing that it’s negligible. The clean architecture is worth the cost.
412
-
413
- ### 3. Unified Tests = Better Quality
414
-
415
- Having a single code path makes testing much easier and reduces bugs.
416
-
417
- ---
418
-
419
- ## ✅ Conclusion
420
-
421
- Unifying the architecture to a 100% API model is a **success**:
422
-
423
- ✅ **Cleaner code** - Single path
424
- ✅ **Easier to maintain** - Less complexity
425
- ✅ **Easier to test** - Unified tests
426
- ✅ **Consistent behavior** - Same results everywhere
427
- ✅ **No breaking changes** - Backward compatible
428
-
429
- **Result:** Professional, scalable, and maintainable architecture! 🚀
430
-
431
- ---
432
-
433
- ## 📚 Related Documentation
434
-
435
- - 📖 [START.md](START.md) - Quick start guide
436
- - 📖 [README.md](README.md) - Main documentation
437
- - 📖 [DEPLOYMENT.md](DEPLOYMENT.md) - Deployment guide
438
- - 🧪 [test_unified_architecture.py](test_unified_architecture.py) - Tests
439
-
440
- ---
441
-
442
- **Questions?** Check [START.md](START.md) or open an issue on GitHub.
443
-
examples/api_example.py DELETED
@@ -1,94 +0,0 @@
1
- """
2
- Example: Using CU1-X API from Hugging Face Space
3
-
4
- This example shows how to call the CU1-X API deployed on Hugging Face Spaces.
5
- """
6
-
7
- from gradio_client import Client
8
- import json
9
-
10
- # Configuration
11
- SPACE_URL = "AI-DrivenTesting/CU1-X" # Replace with your Space URL
12
-
13
- def detect_ui_elements(image_path: str):
14
- """
15
- Detect UI elements in an image via the HF Space API
16
-
17
- Args:
18
- image_path: Path to the image to analyze
19
-
20
- Returns:
21
- Tuple (annotated_image, summary, detections_json)
22
- """
23
- # Create the Gradio client
24
- client = Client(SPACE_URL)
25
-
26
- # Call the API
27
- result = client.predict(
28
- image_path, # image
29
- 0.35, # confidence_threshold
30
- 2, # thickness
31
- True, # enable_clip (classification)
32
- True, # enable_ocr (text extraction)
33
- False, # enable_blip (descriptions)
34
- False, # ocr_only
35
- "Only image & button", # blip_scope
36
- False, # preprocess
37
- "RF-DETR Optimized (Recommended)", # preprocess_mode
38
- "standard", # preprocess_preset
39
- api_name="/predict"
40
- )
41
-
42
- # Unpack the results
43
- annotated_image, summary, detections_json = result
44
-
45
- return annotated_image, summary, detections_json
46
-
47
-
48
- def main():
49
- """Exemple d'utilisation"""
50
-
51
- print("🚀 CU1-X API Example")
52
- print("=" * 50)
53
-
54
- # Path to your test image
55
- test_image = "screenshot.png" # Replace with your image
56
-
57
- try:
58
- print(f"\n📤 Uploading image: {test_image}")
59
- print("⏳ Processing... (this may take 30-60 seconds)")
60
-
61
- # Call the API
62
- annotated_image, summary, detections = detect_ui_elements(test_image)
63
-
64
- # Display the results
65
- print("\n✅ Detection completed!")
66
- print("\n📊 Summary:")
67
- print(summary)
68
-
69
- print("\n🔍 Detections:")
70
- if isinstance(detections, str):
71
- detections = json.loads(detections)
72
-
73
- print(f" Total: {detections.get('total_detections', 0)} elements")
74
-
75
- if 'type_distribution' in detections:
76
- print("\n📈 Type Distribution:")
77
- for elem_type, count in detections['type_distribution'].items():
78
- print(f" {elem_type}: {count}")
79
-
80
- print("\n💾 Saving annotated image...")
81
- # annotated_image is a temporary file; you can copy it elsewhere
82
- print(f" Annotated image saved at: {annotated_image}")
83
-
84
- except Exception as e:
85
- print(f"\n❌ Error: {e}")
86
- print("\nTroubleshooting:")
87
- print("1. Vérifiez que votre Space est déployé et en ligne")
88
- print("2. Vérifiez que SPACE_URL est correct")
89
- print("3. Assurez-vous d'avoir installé: pip install gradio_client")
90
-
91
-
92
- if __name__ == "__main__":
93
- main()
94
-
requirements-api-client.txt DELETED
@@ -1,8 +0,0 @@
1
- # Requirements for accessing HF Spaces API
2
- # Install this if you want to use the API client examples
3
-
4
- gradio_client>=0.10.0
5
- requests>=2.31.0
6
- pillow>=10.0.0
7
- aiohttp>=3.9.0 # For async examples
8
-
requirements-full.txt DELETED
@@ -1,40 +0,0 @@
1
- # Full requirements for CU1-X UI Element Detector
2
- # Use this file for deployment to Hugging Face Spaces or production
3
-
4
- # Core dependencies
5
- gradio[oauth]==4.44.1
6
-
7
- # Deep Learning frameworks
8
- torch==2.4.1
9
- torchvision==0.19.1
10
-
11
- # Computer Vision & Image Processing
12
- opencv-python-headless==4.10.0.84
13
- pillow==10.4.0
14
- numpy==1.26.4
15
- supervision==0.23.0
16
-
17
- # OCR & Text Recognition
18
- easyocr==1.7.1
19
-
20
- # Transformers & AI Models
21
- transformers==4.44.2
22
-
23
- # RF-DETR Detection Model
24
- rfdetr==1.0.4
25
-
26
- # API Framework
27
- fastapi==0.115.0
28
- uvicorn[standard]==0.30.6
29
-
30
- # HTTP Clients
31
- requests==2.32.3
32
- aiohttp==3.10.5
33
-
34
- # Testing
35
- pytest==8.3.3
36
-
37
- # Utilities
38
- python-multipart==0.0.9 # For FastAPI file uploads
39
- python-dotenv==1.0.1 # For environment variables
40
-
requirements.txt CHANGED
@@ -18,7 +18,13 @@ transformers==4.35.2
18
  peft==0.6.2
19
  accelerate==0.25.0
20
 
21
- # API
 
 
 
 
 
 
22
  fastapi==0.115.0
23
  uvicorn[standard]==0.30.6
24
 
 
18
  peft==0.6.2
19
  accelerate==0.25.0
20
 
21
+ # RF-DETR Detection Model
22
+ rfdetr==1.0.4
23
+
24
+ # COCO evaluation tools (required by RF-DETR)
25
+ pycocotools==2.0.8
26
+
27
+ # API Framework
28
  fastapi==0.115.0
29
  uvicorn[standard]==0.30.6
30
 
ui/detection_wrapper.py CHANGED
@@ -61,21 +61,28 @@ def detect_with_service(
61
  return_format="pil"
62
  )
63
 
64
- json_payload = response_builder.build_ocr_only_response(
65
- detections=detections,
66
- image_width=image.width,
67
- image_height=image.height,
 
 
 
 
 
68
  annotated_image=None,
69
  confidence_threshold=confidence_threshold,
70
- line_thickness=line_thickness
71
- )
72
-
73
- summary_text = response_builder.format_summary_text(
74
- detections=detections,
75
- parameters=json_payload["parameters"],
76
  ocr_only=True
77
  )
78
 
 
 
 
79
  return annotated, summary_text, json_payload
80
 
81
  # Standard detection path
@@ -105,30 +112,29 @@ def detect_with_service(
105
  analysis=analysis
106
  )
107
 
108
- # Build JSON response
109
- json_payload = {
110
- "success": True,
111
- "detections": analysis["detections"],
112
- "total_detections": len(analysis["detections"]),
113
- "image_size": analysis["image_size"],
114
- "parameters": {
115
- "confidence_threshold": confidence_threshold,
116
- "enable_clip": enable_clip,
117
- "enable_ocr": enable_ocr,
118
- "enable_blip": enable_blip,
119
- "blip_scope": scope_value if enable_blip else None,
120
- "ocr_only": False,
121
- "line_thickness": line_thickness
122
- },
123
- "type_distribution": response_builder.build_type_distribution(analysis["detections"]) if enable_clip else None
124
- }
125
-
126
- # Build summary text
127
- summary_text = response_builder.format_summary_text(
128
- detections=analysis["detections"],
129
- parameters=json_payload["parameters"],
130
  ocr_only=False
131
  )
 
 
 
 
 
 
 
 
 
 
132
 
133
  return annotated, summary_text, json_payload
134
 
@@ -199,9 +205,9 @@ def detect_with_api(
199
  'preprocess_preset': preprocess_preset
200
  }
201
 
202
- # Call API
203
- # Use configurable timeout (default 300s = 5min for CPU processing and model loading)
204
- timeout_seconds = int(os.getenv("CU1_API_TIMEOUT", "300"))
205
  try:
206
  response = requests.post(
207
  f"{api_url}/detect",
@@ -227,21 +233,22 @@ Cannot connect to API server at `{api_url}`
227
  You can change this by setting the `CU1_API_URL` environment variable.
228
  """, None
229
  except requests.exceptions.Timeout:
230
- timeout_seconds = int(os.getenv("CU1_API_TIMEOUT", "300"))
231
  return None, f"""❌ **Timeout Error**
232
 
233
  The API request timed out after {timeout_seconds} seconds.
234
 
235
- This might happen with:
236
- - Very large images
237
- - First run (models need to download - can take 2-5 minutes)
238
- - CPU-only processing (slower than GPU)
 
 
239
 
240
- **Try:**
241
- - Using a smaller image
242
- - Waiting for model downloads to complete (check API server logs)
243
- - Checking API server logs for errors
244
- - Increasing timeout: export CU1_API_TIMEOUT=600 (10 minutes)
245
  """, None
246
  except requests.exceptions.HTTPError as e:
247
  error_detail = "Unknown error"
 
61
  return_format="pil"
62
  )
63
 
64
+ # Build analysis structure for simplified response
65
+ analysis = {
66
+ "detections": detections,
67
+ "image_size": {"width": image.width, "height": image.height}
68
+ }
69
+
70
+ json_payload = response_builder.build_simplified_response(
71
+ analysis=analysis,
72
+ image=image,
73
  annotated_image=None,
74
  confidence_threshold=confidence_threshold,
75
+ line_thickness=line_thickness,
76
+ enable_clip=False,
77
+ enable_ocr=True,
78
+ enable_blip=False,
79
+ blip_scope=None,
 
80
  ocr_only=True
81
  )
82
 
83
+ detections_list = list(json_payload.get("detections", {}).values())
84
+ summary_text = f"**OCR-only mode**\n**Total OCR texts:** {len(detections_list)}"
85
+
86
  return annotated, summary_text, json_payload
87
 
88
  # Standard detection path
 
112
  analysis=analysis
113
  )
114
 
115
+ # Build JSON response using simplified format
116
+ json_payload = response_builder.build_simplified_response(
117
+ analysis=analysis,
118
+ image=image,
119
+ annotated_image=None, # Don't include in JSON (already have PIL image)
120
+ confidence_threshold=confidence_threshold,
121
+ line_thickness=line_thickness,
122
+ enable_clip=enable_clip,
123
+ enable_ocr=enable_ocr,
124
+ enable_blip=enable_blip,
125
+ blip_scope=scope_value,
 
 
 
 
 
 
 
 
 
 
 
126
  ocr_only=False
127
  )
128
+
129
+ # Build summary text from detections
130
+ detections_list = list(json_payload.get("detections", {}).values())
131
+ summary_lines = [f"**Total detections:** {len(detections_list)}", ""]
132
+ summary_lines.append("**Settings:**")
133
+ summary_lines.append(f"- Confidence threshold: {confidence_threshold:.2f}")
134
+ summary_lines.append(f"- CLIP classification: {'✅ Enabled' if enable_clip else '❌ Disabled'}")
135
+ summary_lines.append(f"- OCR text extraction: {'✅ Enabled' if enable_ocr else '❌ Disabled'}")
136
+ summary_lines.append(f"- BLIP description: {'✅ Enabled' if enable_blip else '❌ Disabled'}")
137
+ summary_text = "\n".join(summary_lines)
138
 
139
  return annotated, summary_text, json_payload
140
 
 
205
  'preprocess_preset': preprocess_preset
206
  }
207
 
208
+ # Call API with extended timeout for HuggingFace Spaces CPU processing
209
+ # Default: 600s (10 minutes) to handle model loading on first run
210
+ timeout_seconds = int(os.getenv("CU1_API_TIMEOUT", "600"))
211
  try:
212
  response = requests.post(
213
  f"{api_url}/detect",
 
233
  You can change this by setting the `CU1_API_URL` environment variable.
234
  """, None
235
  except requests.exceptions.Timeout:
236
+ timeout_seconds = int(os.getenv("CU1_API_TIMEOUT", "600"))
237
  return None, f"""❌ **Timeout Error**
238
 
239
  The API request timed out after {timeout_seconds} seconds.
240
 
241
+ **Most likely cause:** First-time model initialization on HuggingFace Spaces
242
+
243
+ **What to do:**
244
+ 1. Wait 2-3 minutes and try again (models are loading in background)
245
+ 2. Check the "Logs" tab in HuggingFace Spaces to see progress
246
+ 3. If you see "[API] Starting detection..." in logs, the API is working
247
 
248
+ **For debugging:**
249
+ - Check if you see initialization messages in logs
250
+ - Look for "Loading RF-DETR model..." or "Loading OCR reader..."
251
+ - These operations can take 2-5 minutes on CPU the first time
 
252
  """, None
253
  except requests.exceptions.HTTPError as e:
254
  error_detail = "Unknown error"
ui/shared_interface.py CHANGED
@@ -262,8 +262,6 @@ def create_interface(
262
 
263
  # Connect detection button
264
  # api_name exposes this function as /api/predict endpoint for Hugging Face Spaces
265
- # max_time increases Gradio's function timeout (default is 60s, we set to 300s = 5min)
266
- max_time_seconds = int(os.getenv("GRADIO_MAX_TIME", "300")) # 5 minutes default
267
  detect_button.click(
268
  fn=detection_fn,
269
  inputs=[
@@ -281,7 +279,7 @@ def create_interface(
281
  ],
282
  outputs=[output_image, summary_output, json_output],
283
  api_name="predict", # Expose as /api/predict endpoint
284
- max_time=max_time_seconds # Increase Gradio function timeout
285
  )
286
 
287
  # Build footer markdown
 
262
 
263
  # Connect detection button
264
  # api_name exposes this function as /api/predict endpoint for Hugging Face Spaces
 
 
265
  detect_button.click(
266
  fn=detection_fn,
267
  inputs=[
 
279
  ],
280
  outputs=[output_image, summary_output, json_output],
281
  api_name="predict", # Expose as /api/predict endpoint
282
+ show_progress="full" # Show progress to user during long operations
283
  )
284
 
285
  # Build footer markdown