AI Image Processing
Overview
An AI-powered image processing API with multiple features:
- Image enhancement/upscaling using Real-ESRGAN
- Background removal using BiRefNet via rembg
- Noise reduction using OpenCV Non-Local Means Denoising
- Document scanning with auto-crop, alignment, and HD enhancement
- FastAPI backend with automatic Swagger API documentation
- Simple web frontend for testing
Current State
- Local Preview: Runs with lightweight processing only (the heavy AI models are left out because of size constraints)
- Full AI Mode: Available when deployed to Hugging Face Spaces
Project Structure
/
├── app.py                # Full FastAPI app for Hugging Face deployment
├── app_local.py          # Lightweight local preview server
├── enhancer.py           # Real-ESRGAN model wrapper (for HF deployment)
├── document_scanner.py   # Document scanning with OpenCV (auto-crop, align, enhance)
├── templates/
│   └── index.html        # Frontend interface
├── requirements.txt      # Dependencies for Hugging Face Spaces
├── Dockerfile            # Docker configuration for HF Spaces
├── README.md             # Hugging Face Spaces configuration
├── uploads/              # Temporary upload storage
└── outputs/              # Processed image outputs
API Endpoints
- GET / - Web frontend
- GET /docs - Swagger API documentation
- GET /health - Health check
- GET /model-info - Model information
- GET /progress/{job_id} - Get async job progress
- GET /result/{job_id} - Get completed job result
- POST /enhance - Enhance/upscale image (Real-ESRGAN) - supports async_mode=true
- POST /enhance/async - Start async enhancement with progress tracking
- POST /enhance/base64 - Enhance image (returns base64)
- POST /remove-background - Remove image background (BiRefNet) - supports async_mode=true
- POST /remove-background/async - Start async background removal with progress
- POST /remove-background/base64 - Remove background (returns base64)
- POST /denoise - Reduce image noise (OpenCV NLM) - supports async_mode=true
- POST /denoise/async - Start async denoising with progress
- POST /denoise/base64 - Denoise image (returns base64)
- POST /docscan - Scan document (auto-crop, align, HD enhance) - supports async_mode=true
- POST /docscan/async - Start async document scan with progress
- POST /docscan/base64 - Scan document (returns base64)
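As a rough illustration of calling one of the POST endpoints, the sketch below builds a multipart upload request with the standard library only. Several details are assumptions rather than documented behavior: the base URL and port, the form field name `file`, and passing `async_mode` as a query parameter. The helper name `build_upload_request` is hypothetical. Check GET /docs for the actual request schema.

```python
import uuid
from urllib.request import Request

BASE_URL = "http://localhost:7860"  # assumption: adjust to your Space or local server


def build_upload_request(endpoint: str, image_bytes: bytes, filename: str,
                         async_mode: bool = False, field_name: str = "file") -> Request:
    """Build (but do not send) a multipart/form-data POST for one endpoint."""
    boundary = uuid.uuid4().hex
    # One form part carrying the image; the field name "file" is an assumption.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    url = f"{BASE_URL}{endpoint}"
    if async_mode:
        # Assumption: async_mode is read from the query string.
        url += "?async_mode=true"

    return Request(
        url,
        data=head + image_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


req = build_upload_request("/denoise", b"<image bytes>", "photo.png", async_mode=True)
print(req.get_method(), req.full_url)
```

Sending the request is then a matter of passing `req` to `urllib.request.urlopen`; the response body is either the processed image or, in async mode, a payload containing the job_id.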
Document Scanner Features
The /docscan endpoint provides:
- Auto-detection: Edge detection using Canny algorithm
- Auto-crop: Contour detection and perspective correction
- Alignment: Four-point perspective transform
- Contrast: CLAHE (Contrast Limited Adaptive Histogram Equalization)
- Denoising: Bilateral filter (preserves edges while reducing noise)
- Sharpening: Unsharp masking for crisp text
- HD Upscaling: Optional Real-ESRGAN enhancement (1-4x scale)
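The four-point perspective transform needs the detected corners in a consistent order and a target size for the flattened page. The sketch below shows one common way to compute both in plain Python. It is not taken from document_scanner.py; the function names `order_corners` and `output_size` are hypothetical.

```python
import math

Point = tuple[float, float]


def order_corners(points: list[Point]) -> list[Point]:
    """Return corners as [top-left, top-right, bottom-right, bottom-left]."""
    if len(points) != 4:
        raise ValueError("expected exactly four corner points")
    # In image coordinates (y grows downward), x + y is smallest at the
    # top-left corner and largest at the bottom-right corner.
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    # y - x is smallest at the top-right corner and largest at the bottom-left.
    top_right = min(points, key=lambda p: p[1] - p[0])
    bottom_left = max(points, key=lambda p: p[1] - p[0])
    return [top_left, top_right, bottom_right, bottom_left]


def output_size(corners: list[Point]) -> tuple[int, int]:
    """Width and height of the flattened page, from the longer opposing edges."""
    tl, tr, br, bl = corners
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)


# Corners of a slightly skewed page, in arbitrary order.
corners = order_corners([(90, 15), (10, 10), (5, 120), (100, 110)])
print(corners, output_size(corners))
```

With OpenCV, the ordered corners and the rectangle `(0, 0)` to `(width, height)` would typically be passed to `cv2.getPerspectiveTransform`, and the resulting matrix to `cv2.warpPerspective`.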
Deploying to Hugging Face Spaces
- Create a new Space on Hugging Face
- Select "Docker" as the SDK
- Upload all files: app.py, enhancer.py, document_scanner.py, templates/, requirements.txt, Dockerfile, README.md
- The Space will auto-build the container and download AI models
Progress Tracking API
All image processing endpoints support async mode with progress tracking:
- Start a job with async_mode=true or use the /async endpoints
- Receive a job_id in the response
- Poll /progress/{job_id} to get current progress (0-100%)
- When complete, fetch result from /result/{job_id}
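The polling steps above can be sketched as a small client loop. This is a minimal sketch under assumptions: only the "processing" status appears in this README, so the terminal status names "completed" and "failed" are guesses, and `wait_for_job` and `http_fetcher` are hypothetical helper names.

```python
import json
import time
from typing import Callable
from urllib.request import urlopen


def http_fetcher(base_url: str, job_id: str) -> Callable[[], dict]:
    """Return a function that fetches /progress/{job_id} as a dict."""
    def fetch() -> dict:
        with urlopen(f"{base_url}/progress/{job_id}") as response:
            return json.load(response)
    return fetch


def wait_for_job(fetch_progress: Callable[[], dict], interval: float = 1.0,
                 timeout: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_progress()
        status = state.get("status")
        if status == "completed":  # assumed terminal status name
            return state
        if status == "failed":     # assumed terminal status name
            raise RuntimeError(state.get("message", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)
```

A caller would run `wait_for_job(http_fetcher(base_url, job_id))` and then request /result/{job_id}. Passing the fetch function in, instead of hard-coding the HTTP call, keeps the loop easy to exercise without a running server.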
Example response from progress endpoint:
{
  "job_id": "abc-123",
  "status": "processing",
  "progress": 45.0,
  "message": "Enhancing image (4 tiles)...",
  "current_step": 2,
  "total_steps": 5
}
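On the server side, a response of this shape can come from a lock-protected job registry that a background worker updates as it goes. The sketch below illustrates the idea; it is not the project's progress_tracker.py, whose API is not shown here. The class name `ProgressTracker`, the helper `run_in_background`, and the "completed" and "failed" status values are assumptions.

```python
import threading
import uuid
from typing import Callable, Optional


class ProgressTracker:
    """Hypothetical in-memory job registry guarded by a single lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: dict[str, dict] = {}

    def create(self, total_steps: int) -> str:
        job_id = str(uuid.uuid4())
        with self._lock:
            self._jobs[job_id] = {
                "job_id": job_id, "status": "processing", "progress": 0.0,
                "message": "", "current_step": 0, "total_steps": total_steps,
            }
        return job_id

    def update(self, job_id: str, **fields) -> None:
        with self._lock:
            self._jobs[job_id].update(fields)

    def get(self, job_id: str) -> Optional[dict]:
        with self._lock:
            job = self._jobs.get(job_id)
            # Return a copy so callers never see a half-updated dict.
            return dict(job) if job is not None else None


def run_in_background(tracker: ProgressTracker, job_id: str,
                      steps: list[tuple[str, Callable[[], None]]]) -> threading.Thread:
    """Run the steps on a worker thread, reporting progress before each one."""
    def worker() -> None:
        try:
            for index, (message, work) in enumerate(steps, start=1):
                tracker.update(job_id, message=message, current_step=index,
                               progress=100.0 * (index - 1) / len(steps))
                work()
            tracker.update(job_id, status="completed", progress=100.0, message="Done")
        except Exception as exc:
            tracker.update(job_id, status="failed", message=str(exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

A /progress/{job_id} handler would then return `tracker.get(job_id)` directly. The request thread only takes the lock for a dictionary lookup, so it stays responsive while the worker runs.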
Performance Optimizations
- Tile processing: Images are processed in 256px tiles for memory efficiency
- Max input size: 512x512 for enhance (larger images are resized automatically), 2048x2048 for docscan
- Default scale: 2x (faster than 4x, still good quality)
- Background threading: Server stays responsive during processing
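To make the size limits concrete, the sketch below shows how an input could be shrunk to fit the 512x512 limit and then split into 256px tiles. It assumes the resize preserves aspect ratio and that tiles do not overlap; real tiled upscalers usually pad or overlap tiles to hide seams, which is omitted here. The names `fit_within` and `tile_boxes` are hypothetical.

```python
def fit_within(width: int, height: int, max_side: int = 512) -> tuple[int, int]:
    """Shrink (width, height) to fit a max_side square, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def tile_boxes(width: int, height: int, tile: int = 256) -> list[tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes covering the image, row by row."""
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Edge tiles are clipped to the image bounds.
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


width, height = fit_within(1024, 768)
print((width, height), len(tile_boxes(width, height)))  # (512, 384) 4
```

Under these assumptions a full 512x512 input also yields four tiles, which caps peak memory at roughly one tile's worth of model activations at a time.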
Recent Changes
- 2025-11-28: Fixed frontend progress bar and image return issues
  - Added visual progress bar with percentage display to frontend
  - Updated frontend to use async endpoints for all features
  - Added polling logic with proper error handling and timeouts
  - Fixed image result fetching with 202 status handling
  - All endpoints now show real-time progress during processing
- 2025-11-28: Added progress tracking and async processing
  - New progress_tracker.py module with thread-safe job tracking
  - /progress/{job_id} and /result/{job_id} endpoints
  - async_mode parameter on all image processing endpoints
  - Dedicated /async endpoints for each feature
  - Automatic cleanup of old jobs and result files
  - Performance optimizations: tile=256, max_size=512, default scale=2
- 2025-11-28: Added document scanning feature
  - Auto-crop with edge detection and contour finding
  - Perspective correction for skewed documents
  - CLAHE contrast enhancement
  - Bilateral filter denoising (preserves details)
  - Unsharp mask sharpening
  - Optional HD upscaling with Real-ESRGAN
- 2025-11-28: Added background removal and noise reduction features
  - BiRefNet integration via rembg for background removal
  - OpenCV Non-Local Means Denoising
  - Updated frontend with feature tabs
  - Updated API documentation
- 2025-11-28: Initial creation of AI Image Enhancer API
  - FastAPI backend with Swagger docs
  - Real-ESRGAN integration for Hugging Face
  - Simple frontend for testing
  - Lightweight local preview mode
  - Docker configuration for HF Spaces deployment