# AI Image Processing
## Overview
An AI-powered image processing API with multiple features:
- Image enhancement/upscaling using Real-ESRGAN
- Background removal using BiRefNet via rembg
- Noise reduction using OpenCV Non-Local Means Denoising
- Document scanning with auto-crop, alignment, and HD enhancement
- FastAPI backend with automatic Swagger API documentation
- Simple web frontend for testing
## Current State
- **Local Preview**: Running with simple processing (no heavy AI models due to size constraints)
- **Full AI Mode**: Available when deployed to Hugging Face Spaces
## Project Structure
```
/
├── app.py                  # Full FastAPI app for Hugging Face deployment
├── app_local.py            # Lightweight local preview server
├── enhancer.py             # Real-ESRGAN model wrapper (for HF deployment)
├── document_scanner.py     # Document scanning with OpenCV (auto-crop, align, enhance)
├── templates/
│   └── index.html          # Frontend interface
├── requirements.txt        # Dependencies for Hugging Face Spaces
├── Dockerfile              # Docker configuration for HF Spaces
├── README.md               # Hugging Face Spaces configuration
├── uploads/                # Temporary upload storage
└── outputs/                # Processed image outputs
```
## API Endpoints
- `GET /` - Web frontend
- `GET /docs` - Swagger API documentation
- `GET /health` - Health check
- `GET /model-info` - Model information
- `GET /progress/{job_id}` - Get async job progress
- `GET /result/{job_id}` - Get completed job result
- `POST /enhance` - Enhance/upscale image (Real-ESRGAN) - supports `async_mode=true`
- `POST /enhance/async` - Start async enhancement with progress tracking
- `POST /enhance/base64` - Enhance image (returns base64)
- `POST /remove-background` - Remove image background (BiRefNet) - supports `async_mode=true`
- `POST /remove-background/async` - Start async background removal with progress
- `POST /remove-background/base64` - Remove background (returns base64)
- `POST /denoise` - Reduce image noise (OpenCV NLM) - supports `async_mode=true`
- `POST /denoise/async` - Start async denoising with progress
- `POST /denoise/base64` - Denoise image (returns base64)
- `POST /docscan` - Scan document (auto-crop, align, HD enhance) - supports `async_mode=true`
- `POST /docscan/async` - Start async document scan with progress
- `POST /docscan/base64` - Scan document (returns base64)
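
As an illustration of calling the upload endpoints, the sketch below builds a multipart `POST` request using only the Python standard library. The form field name `file`, the `image/png` content type, the default filename, and the base URL `http://localhost:7860` are assumptions made for illustration; the actual parameter names are listed in the Swagger docs at `/docs`.

```python
import urllib.request
import uuid


def build_multipart(field_name: str, filename: str, data: bytes,
                    content_type: str = "image/png"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(base_url: str, path: str, image_bytes: bytes,
                         filename: str = "input.png") -> urllib.request.Request:
    """Build (but do not send) a POST request for an endpoint such as /enhance.

    The field name "file" is an assumption; check /docs for the real one.
    """
    body, content_type = build_multipart("file", filename, image_bytes)
    return urllib.request.Request(
        f"{base_url.rstrip('/')}{path}",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical local server address; adjust to your deployment.
    req = build_upload_request("http://localhost:7860", "/enhance", b"...image bytes...")
    print(req.get_method(), req.full_url)
```

Passing the returned request to `urllib.request.urlopen` would send it. The same helper should work for `/remove-background`, `/denoise`, and `/docscan` by changing the path, assuming those endpoints accept the same upload field.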
## Document Scanner Features
The `/docscan` endpoint provides:
- **Auto-detection**: Edge detection using Canny algorithm
- **Auto-crop**: Contour detection and perspective correction
- **Alignment**: Four-point perspective transform
- **Contrast**: CLAHE (Contrast Limited Adaptive Histogram Equalization)
- **Denoising**: Bilateral filter (preserves edges while reducing noise)
- **Sharpening**: Unsharp masking for crisp text
- **HD Upscaling**: Optional Real-ESRGAN enhancement (1-4x scale)
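
The alignment step relies on knowing which detected corner is which. The following is a minimal sketch of one common way to order four corner points and derive the output size for a four-point perspective transform. It is not taken from `document_scanner.py`; the function names `order_corners` and `target_size` are hypothetical, and the sketch assumes the document is not rotated by more than about 45 degrees.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def order_corners(pts: Sequence[Point]) -> List[Point]:
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left.

    Uses the sum/difference heuristic: the top-left corner has the smallest
    x + y, the bottom-right the largest; the top-right has the smallest y - x,
    the bottom-left the largest.
    """
    if len(pts) != 4:
        raise ValueError("expected exactly four corner points")
    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    tr = min(pts, key=lambda p: p[1] - p[0])
    bl = max(pts, key=lambda p: p[1] - p[0])
    return [tl, tr, br, bl]


def target_size(corners: Sequence[Point]) -> Tuple[int, int]:
    """Return (width, height) of the flattened page from ordered corners.

    Takes the longer of each pair of opposite edges so no content is squeezed.
    """
    tl, tr, br, bl = corners
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)


if __name__ == "__main__":
    detected = [(210, 310), (20, 30), (200, 40), (10, 300)]
    ordered = order_corners(detected)
    print(ordered, target_size(ordered))
```

With OpenCV, the ordered corners and the rectangle `(0, 0)` to `(width - 1, height - 1)` would typically be passed to `cv2.getPerspectiveTransform`, and the resulting matrix to `cv2.warpPerspective`.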
## Deploying to Hugging Face Spaces
1. Create a new Space on Hugging Face
2. Select "Docker" as the SDK
3. Upload all files: `app.py`, `enhancer.py`, `document_scanner.py`, `templates/`, `requirements.txt`, `Dockerfile`, `README.md`
4. The Space will auto-build the container and download AI models
## Progress Tracking API
All image processing endpoints support async mode with progress tracking:
1. Start a job with `async_mode=true` or use the `/async` endpoints
2. Receive a `job_id` in the response
3. Poll `/progress/{job_id}` to get current progress (0-100%)
4. When the job is complete, fetch the result from `/result/{job_id}`

Example response from the progress endpoint:
```json
{
"job_id": "abc-123",
"status": "processing",
"progress": 45.0,
"message": "Enhancing image (4 tiles)...",
"current_step": 2,
"total_steps": 5
}
```
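
The polling steps above can be sketched as a small client loop. This is an illustrative sketch, not the project's frontend code: the terminal status names `"completed"` and `"failed"` are assumptions (the source only shows `"processing"`), as are the helper names `wait_for_job` and `http_fetcher` and the default interval and timeout.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Iterable


def wait_for_job(
    fetch_progress: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 1.0,
    timeout: float = 300.0,
    terminal: Iterable[str] = ("completed", "failed"),  # assumed status names
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_progress(job_id) until a terminal status or the timeout.

    Returns the last progress payload; raises TimeoutError if the job is
    still running once `timeout` seconds have elapsed.
    """
    terminal_set = set(terminal)
    deadline = time.monotonic() + timeout
    while True:
        info = fetch_progress(job_id)
        if info.get("status") in terminal_set:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)


def http_fetcher(base_url: str) -> Callable[[str], Dict[str, Any]]:
    """Return a fetch function that reads GET {base_url}/progress/{job_id}."""
    def fetch(job_id: str) -> Dict[str, Any]:
        url = f"{base_url.rstrip('/')}/progress/{job_id}"
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)
    return fetch
```

A call such as `wait_for_job(http_fetcher(base_url), job_id)` would block until the job finishes, after which the result can be requested from `/result/{job_id}`. Injecting the fetch function keeps the loop testable without a running server.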
## Performance Optimizations
- **Tile processing**: Images processed in 256px tiles for memory efficiency
- **Max input size**: 512x512 for enhance (auto-resized), 2048x2048 for docscan
- **Default scale**: 2x (faster than 4x, still good quality)
- **Background threading**: Server stays responsive during processing
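
To make the size limits concrete, here is a minimal sketch of the arithmetic behind auto-resizing and tiling. It assumes aspect-ratio-preserving downscaling and non-overlapping tiles; the helper names `fit_within` and `plan_tiles` are hypothetical, and the real enhancer may pad or overlap tiles to hide seams.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def fit_within(width: int, height: int, max_side: int = 512) -> Tuple[int, int]:
    """Scale (width, height) down to fit a max_side x max_side box.

    Keeps the aspect ratio and never upscales.
    """
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def plan_tiles(width: int, height: int, tile: int = 256) -> List[Box]:
    """Cover the image with tile-sized boxes, row by row.

    Boxes on the right and bottom edges are clipped to the image bounds.
    """
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


if __name__ == "__main__":
    w, h = fit_within(2048, 1024)   # a large landscape input
    tiles = plan_tiles(w, h)
    print((w, h), len(tiles))
```

Under these assumptions, a 512x512 input splits into four 256px tiles, which is consistent with the "Enhancing image (4 tiles)..." message in the progress example.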
## Recent Changes
- 2025-11-28: Fixed frontend progress bar and image return issues
  - Added visual progress bar with percentage display to frontend
  - Updated frontend to use async endpoints for all features
  - Added polling logic with proper error handling and timeouts
  - Fixed image result fetching with 202 status handling
  - All endpoints now show real-time progress during processing
- 2025-11-28: Added progress tracking and async processing
  - New `progress_tracker.py` module with thread-safe job tracking
  - `/progress/{job_id}` and `/result/{job_id}` endpoints
  - `async_mode` parameter on all image processing endpoints
  - Dedicated `/async` endpoints for each feature
  - Automatic cleanup of old jobs and result files
  - Performance optimizations: tile=256, max_size=512, default scale=2
- 2025-11-28: Added document scanning feature
  - Auto-crop with edge detection and contour finding
  - Perspective correction for skewed documents
  - CLAHE contrast enhancement
  - Bilateral filter denoising (preserves details)
  - Unsharp mask sharpening
  - Optional HD upscaling with Real-ESRGAN
- 2025-11-28: Added background removal and noise reduction features
  - BiRefNet integration via rembg for background removal
  - OpenCV Non-Local Means Denoising
  - Updated frontend with feature tabs
  - Updated API documentation
- 2025-11-28: Initial creation of AI Image Enhancer API
  - FastAPI backend with Swagger docs
  - Real-ESRGAN integration for Hugging Face
  - Simple frontend for testing
  - Lightweight local preview mode
  - Docker configuration for HF Spaces deployment