# AI Image Processing

## Overview

An AI-powered image processing API with multiple features:

- Image enhancement/upscaling using Real-ESRGAN
- Background removal using BiRefNet via rembg
- Noise reduction using OpenCV Non-Local Means Denoising
- Document scanning with auto-crop, alignment, and HD enhancement
- FastAPI backend with automatic Swagger API documentation
- Simple web frontend for testing

## Current State

- **Local Preview**: Running with simple processing (no heavy AI models due to size constraints)
- **Full AI Mode**: Available when deployed to Hugging Face Spaces

## Project Structure

```
/
├── app.py                  # Full FastAPI app for Hugging Face deployment
├── app_local.py            # Lightweight local preview server
├── enhancer.py             # Real-ESRGAN model wrapper (for HF deployment)
├── document_scanner.py     # Document scanning with OpenCV (auto-crop, align, enhance)
├── templates/
│   └── index.html          # Frontend interface
├── requirements.txt        # Dependencies for Hugging Face Spaces
├── Dockerfile              # Docker configuration for HF Spaces
├── README.md               # Hugging Face Spaces configuration
├── uploads/                # Temporary upload storage
└── outputs/                # Processed image outputs
```

## API Endpoints

- `GET /` - Web frontend
- `GET /docs` - Swagger API documentation
- `GET /health` - Health check
- `GET /model-info` - Model information
- `GET /progress/{job_id}` - Get async job progress
- `GET /result/{job_id}` - Get completed job result
- `POST /enhance` - Enhance/upscale image (Real-ESRGAN) - supports `async_mode=true`
- `POST /enhance/async` - Start async enhancement with progress tracking
- `POST /enhance/base64` - Enhance image (returns base64)
- `POST /remove-background` - Remove image background (BiRefNet) - supports `async_mode=true`
- `POST /remove-background/async` - Start async background removal with progress
- `POST /remove-background/base64` - Remove background (returns base64)
- `POST /denoise` - Reduce image noise (OpenCV NLM) - supports `async_mode=true`
- `POST /denoise/async` - Start async denoising with progress
- `POST /denoise/base64` - Denoise image (returns base64)
- `POST /docscan` - Scan document (auto-crop, align, HD enhance) - supports `async_mode=true`
- `POST /docscan/async` - Start async document scan with progress
- `POST /docscan/base64` - Scan document (returns base64)

## Document Scanner Features

The `/docscan` endpoint provides:

- **Auto-detection**: Edge detection using Canny algorithm
- **Auto-crop**: Contour detection and perspective correction
- **Alignment**: Four-point perspective transform
- **Contrast**: CLAHE (Contrast Limited Adaptive Histogram Equalization)
- **Denoising**: Bilateral filter (preserves edges while reducing noise)
- **Sharpening**: Unsharp masking for crisp text
- **HD Upscaling**: Optional Real-ESRGAN enhancement (1-4x scale)

## Deploying to Hugging Face Spaces

1. Create a new Space on Hugging Face
2. Select "Docker" as the SDK
3. Upload all files: `app.py`, `enhancer.py`, `document_scanner.py`, `templates/`, `requirements.txt`, `Dockerfile`, `README.md`
4. The Space will auto-build the container and download AI models

## Progress Tracking API

All image processing endpoints support async mode with progress tracking:

1. Start a job with `async_mode=true` or use the `/async` endpoints
2. Receive a `job_id` in the response
3. Poll `/progress/{job_id}` to get current progress (0-100%)
4. When complete, fetch result from `/result/{job_id}`

Example response from progress endpoint:

```json
{
  "job_id": "abc-123",
  "status": "processing",
  "progress": 45.0,
  "message": "Enhancing image (4 tiles)...",
  "current_step": 2,
  "total_steps": 5
}
```

## Performance Optimizations

- **Tile processing**: Images processed in 256px tiles for memory efficiency
- **Max input size**: 512x512 for enhance (auto-resized), 2048x2048 for docscan
- **Default scale**: 2x (faster than 4x, still good quality)
- **Background threading**: Server stays responsive during processing

## Recent Changes

- 2025-11-28: Fixed frontend progress bar and image return issues
  - Added visual progress bar with percentage display to frontend
  - Updated frontend to use async endpoints for all features
  - Added polling logic with proper error handling and timeouts
  - Fixed image result fetching with 202 status handling
  - All endpoints now show real-time progress during processing
- 2025-11-28: Added progress tracking and async processing
  - New `progress_tracker.py` module with thread-safe job tracking
  - `/progress/{job_id}` and `/result/{job_id}` endpoints
  - `async_mode` parameter on all image processing endpoints
  - Dedicated `/async` endpoints for each feature
  - Automatic cleanup of old jobs and result files
  - Performance optimizations: tile=256, max_size=512, default scale=2
- 2025-11-28: Added document scanning feature
  - Auto-crop with edge detection and contour finding
  - Perspective correction for skewed documents
  - CLAHE contrast enhancement
  - Bilateral filter denoising (preserves details)
  - Unsharp mask sharpening
  - Optional HD upscaling with Real-ESRGAN
- 2025-11-28: Added background removal and noise reduction features
  - BiRefNet integration via rembg for background removal
  - OpenCV Non-Local Means Denoising
  - Updated frontend with feature tabs
  - Updated API documentation
- 2025-11-28: Initial creation of AI Image Enhancer API
  - FastAPI backend with Swagger docs
  - Real-ESRGAN integration for Hugging Face
  - Simple frontend for testing
  - Lightweight local preview mode
  - Docker configuration for HF Spaces deployment
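
## Example: Polling an Async Job

The polling flow described under Progress Tracking API (start a job, poll `/progress/{job_id}`, stop when it finishes) could look roughly like the sketch below. It is not taken from the project's code. The base URL, the helper names `fetch_progress` and `wait_for_job`, and the terminal status names `"completed"` and `"failed"` are all assumptions made for illustration; only the `/progress/{job_id}` path and the `status`, `progress`, and `message` fields come from the example response above.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

# Assumption: the server is reachable here; adjust to your deployment.
BASE_URL = "http://localhost:7860"


def fetch_progress(job_id: str, base_url: str = BASE_URL) -> Dict[str, Any]:
    """GET /progress/{job_id} and decode the JSON body."""
    url = f"{base_url}/progress/{job_id}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_for_job(
    job_id: str,
    fetch: Callable[[str], Dict[str, Any]] = fetch_progress,
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reports a terminal state or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        info = fetch(job_id)
        status = info.get("status")
        # Assumption: "completed" and "failed" are the terminal status names.
        # Progress reaching 100 is also treated as done.
        if status == "completed" or info.get("progress", 0) >= 100:
            return info
        if status == "failed":
            raise RuntimeError(info.get("message", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout}s")
        sleep(interval)
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without a running server. In real use, `wait_for_job(job_id)` would be followed by a request to `/result/{job_id}` to download the processed image.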