# AI Image Processing
## Overview
An AI-powered image processing API with multiple features:
- Image enhancement/upscaling using Real-ESRGAN
- Background removal using BiRefNet via rembg
- Noise reduction using OpenCV Non-Local Means Denoising
- Document scanning with auto-crop, alignment, and HD enhancement
- FastAPI backend with automatic Swagger API documentation
- Simple web frontend for testing
## Current State
- **Local Preview**: Runs with simplified processing (the heavy AI models are not loaded because of size constraints)
- **Full AI Mode**: Available when deployed to Hugging Face Spaces
## Project Structure
```
/
├── app.py                 # Full FastAPI app for Hugging Face deployment
├── app_local.py           # Lightweight local preview server
├── enhancer.py            # Real-ESRGAN model wrapper (for HF deployment)
├── document_scanner.py    # Document scanning with OpenCV (auto-crop, align, enhance)
├── templates/
│   └── index.html         # Frontend interface
├── requirements.txt       # Dependencies for Hugging Face Spaces
├── Dockerfile             # Docker configuration for HF Spaces
├── README.md              # Hugging Face Spaces configuration
├── uploads/               # Temporary upload storage
└── outputs/               # Processed image outputs
```
## API Endpoints
- `GET /` - Web frontend
- `GET /docs` - Swagger API documentation
- `GET /health` - Health check
- `GET /model-info` - Model information
- `GET /progress/{job_id}` - Get async job progress
- `GET /result/{job_id}` - Get completed job result
- `POST /enhance` - Enhance/upscale image (Real-ESRGAN) - supports `async_mode=true`
- `POST /enhance/async` - Start async enhancement with progress tracking
- `POST /enhance/base64` - Enhance image (returns base64)
- `POST /remove-background` - Remove image background (BiRefNet) - supports `async_mode=true`
- `POST /remove-background/async` - Start async background removal with progress
- `POST /remove-background/base64` - Remove background (returns base64)
- `POST /denoise` - Reduce image noise (OpenCV NLM) - supports `async_mode=true`
- `POST /denoise/async` - Start async denoising with progress
- `POST /denoise/base64` - Denoise image (returns base64)
- `POST /docscan` - Scan document (auto-crop, align, HD enhance) - supports `async_mode=true`
- `POST /docscan/async` - Start async document scan with progress
- `POST /docscan/base64` - Scan document (returns base64)
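
The four processing features above share one naming pattern: `/{feature}`, `/{feature}/async`, and `/{feature}/base64`. A minimal sketch of a client-side helper that builds these URLs follows. The function name `endpoint_url` and the base URL `http://localhost:8000` are illustrative assumptions, not part of the project.

```python
# Features taken from the endpoint list above.
FEATURES = ("enhance", "remove-background", "denoise", "docscan")
VARIANTS = ("sync", "async", "base64")


def endpoint_url(base_url: str, feature: str, variant: str = "sync") -> str:
    """Build the URL of one processing endpoint (hypothetical helper).

    variant "sync"   -> /{feature}
    variant "async"  -> /{feature}/async
    variant "base64" -> /{feature}/base64
    """
    if feature not in FEATURES:
        raise ValueError(f"unknown feature: {feature!r}")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    path = feature if variant == "sync" else f"{feature}/{variant}"
    return f"{base_url.rstrip('/')}/{path}"


if __name__ == "__main__":
    # The base URL is an assumption; use the address your server listens on.
    for feature in FEATURES:
        print(endpoint_url("http://localhost:8000", feature, "async"))
```
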
## Document Scanner Features
The `/docscan` endpoint provides:
- **Auto-detection**: Edge detection using Canny algorithm
- **Auto-crop**: Contour detection and perspective correction
- **Alignment**: Four-point perspective transform
- **Contrast**: CLAHE (Contrast Limited Adaptive Histogram Equalization)
- **Denoising**: Bilateral filter (preserves edges while reducing noise)
- **Sharpening**: Unsharp masking for crisp text
- **HD Upscaling**: Optional Real-ESRGAN enhancement (1-4x scale)
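
A four-point perspective transform needs the detected corners in a consistent order and a target size for the flattened page. The sketch below shows one common way to compute both in plain Python. It is not taken from `document_scanner.py`, and the function names are hypothetical. In an OpenCV pipeline, the ordered corners and the output size would typically be passed on to `cv2.getPerspectiveTransform` and `cv2.warpPerspective`.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def order_corners(pts: Sequence[Point]) -> Tuple[Point, Point, Point, Point]:
    """Return corners as (top-left, top-right, bottom-right, bottom-left).

    Uses image coordinates (y grows downward): the top-left corner has the
    smallest x + y, the bottom-right the largest; the top-right has the
    smallest y - x, the bottom-left the largest.
    """
    if len(pts) != 4:
        raise ValueError("expected exactly four corner points")
    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    tr = min(pts, key=lambda p: p[1] - p[0])
    bl = max(pts, key=lambda p: p[1] - p[0])
    return tl, tr, br, bl


def output_size(corners: Tuple[Point, Point, Point, Point]) -> Tuple[int, int]:
    """Pick the warped page size from the longer of each pair of opposite edges."""
    tl, tr, br, bl = corners
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)


if __name__ == "__main__":
    detected = [(90, 10), (10, 12), (12, 110), (95, 105)]  # unordered contour corners
    ordered = order_corners(detected)
    print(ordered, output_size(ordered))
```
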
## Deploying to Hugging Face Spaces
1. Create a new Space on Hugging Face
2. Select "Docker" as the SDK
3. Upload all files: `app.py`, `enhancer.py`, `document_scanner.py`, `templates/`, `requirements.txt`, `Dockerfile`, `README.md`
4. The Space will auto-build the container and download AI models
## Progress Tracking API
All image processing endpoints support async mode with progress tracking:
1. Start a job with `async_mode=true` or use the `/async` endpoints
2. Receive a `job_id` in the response
3. Poll `/progress/{job_id}` to get current progress (0-100%)
4. When complete, fetch result from `/result/{job_id}`
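
The polling steps above can be sketched as a small client loop using only the standard library. The status values `"completed"` and `"failed"` are assumptions (the source only shows `"processing"`), as are the function names and the default interval and timeout. Check the Swagger docs at `/docs` for the actual values.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumed terminal states; only "processing" appears in the documentation.
DONE_STATES = {"completed"}
FAILED_STATES = {"failed"}


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_job(
    base_url: str,
    job_id: str,
    fetch: Callable[[str], dict] = http_get_json,
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll /progress/{job_id} until the job finishes, fails, or times out."""
    url = f"{base_url.rstrip('/')}/progress/{job_id}"
    deadline = time.monotonic() + timeout
    while True:
        info = fetch(url)
        status = info.get("status")
        if status in DONE_STATES:
            return info  # the caller can now fetch /result/{job_id}
        if status in FAILED_STATES:
            raise RuntimeError(info.get("message", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running server.
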
Example response from the progress endpoint:
```json
{
"job_id": "abc-123",
"status": "processing",
"progress": 45.0,
"message": "Enhancing image (4 tiles)...",
"current_step": 2,
"total_steps": 5
}
```
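
On the server side, a payload like the one above could come from a lock-protected job table that worker threads update and request handlers read. The following is a minimal sketch of that idea, not the project's actual `progress_tracker.py`. The class name `JobTracker`, its methods, and the initial `"queued"` status are assumptions.

```python
import threading
import uuid


class JobTracker:
    """Hypothetical thread-safe store for job progress records."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs = {}

    def create(self, total_steps: int) -> str:
        """Register a new job and return its id."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self._jobs[job_id] = {
                "job_id": job_id,
                "status": "queued",  # assumed initial state
                "progress": 0.0,
                "message": "",
                "current_step": 0,
                "total_steps": total_steps,
            }
        return job_id

    def update(self, job_id: str, progress: float, message: str,
               current_step: int, status: str = "processing") -> None:
        """Record progress from a worker thread, clamped to 0-100."""
        with self._lock:
            job = self._jobs[job_id]
            job["progress"] = max(0.0, min(100.0, float(progress)))
            job["message"] = message
            job["current_step"] = current_step
            job["status"] = status

    def snapshot(self, job_id: str) -> dict:
        """Return a copy so callers never see a half-updated record."""
        with self._lock:
            return dict(self._jobs[job_id])
```

Returning a copy from `snapshot` keeps the handler from mutating shared state after the lock is released.
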
## Performance Optimizations
- **Tile processing**: Images processed in 256px tiles for memory efficiency
- **Max input size**: 512x512 for enhance (larger images are auto-resized), 2048x2048 for docscan
- **Default scale**: 2x (faster than 4x, still good quality)
- **Background threading**: Server stays responsive during processing
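
To make the size limit and tiling concrete, the sketch below downsizes an image's dimensions to fit the 512px limit and then splits the result into 256px tiles. It only computes geometry, and the function names are hypothetical. It also omits the overlap padding that tiled upscalers commonly add around each tile, so treat it as an illustration rather than the project's exact behavior.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def fit_within(width: int, height: int, max_side: int = 512) -> Tuple[int, int]:
    """Scale dimensions down to fit max_side x max_side, keeping aspect ratio.

    Images that already fit are returned unchanged (never upscaled here).
    """
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


def tile_boxes(width: int, height: int, tile: int = 256) -> List[Box]:
    """List the tile rectangles covering the image, row by row.

    Tiles on the right and bottom edges are clipped to the image bounds.
    """
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


if __name__ == "__main__":
    w, h = fit_within(1024, 768)
    print((w, h), len(tile_boxes(w, h)))
```

Under these assumptions a 512x512 input yields four tiles, which is consistent with the "4 tiles" message in the progress example.
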
## Recent Changes
- 2025-11-28: Fixed frontend progress bar and image return issues
  - Added visual progress bar with percentage display to frontend
  - Updated frontend to use async endpoints for all features
  - Added polling logic with proper error handling and timeouts
  - Fixed image result fetching with 202 status handling
  - All endpoints now show real-time progress during processing
- 2025-11-28: Added progress tracking and async processing
  - New `progress_tracker.py` module with thread-safe job tracking
  - `/progress/{job_id}` and `/result/{job_id}` endpoints
  - `async_mode` parameter on all image processing endpoints
  - Dedicated `/async` endpoints for each feature
  - Automatic cleanup of old jobs and result files
  - Performance optimizations: tile=256, max_size=512, default scale=2
- 2025-11-28: Added document scanning feature
  - Auto-crop with edge detection and contour finding
  - Perspective correction for skewed documents
  - CLAHE contrast enhancement
  - Bilateral filter denoising (preserves details)
  - Unsharp mask sharpening
  - Optional HD upscaling with Real-ESRGAN
- 2025-11-28: Added background removal and noise reduction features
  - BiRefNet integration via rembg for background removal
  - OpenCV Non-Local Means Denoising
  - Updated frontend with feature tabs
  - Updated API documentation
- 2025-11-28: Initial creation of AI Image Enhancer API
  - FastAPI backend with Swagger docs
  - Real-ESRGAN integration for Hugging Face
  - Simple frontend for testing
  - Lightweight local preview mode
  - Docker configuration for HF Spaces deployment