# AI Image Processing

## Overview
An AI-powered image processing API with multiple features:
- Image enhancement/upscaling using Real-ESRGAN
- Background removal using BiRefNet via rembg
- Noise reduction using OpenCV Non-Local Means Denoising
- Document scanning with auto-crop, alignment, and HD enhancement
- FastAPI backend with automatic Swagger API documentation
- Simple web frontend for testing

## Current State
- **Local Preview**: Runs `app_local.py` with simplified processing; the heavy AI models are not loaded because of size constraints
- **Full AI Mode**: Available when deployed to Hugging Face Spaces

## Project Structure
```
/
├── app.py              # Full FastAPI app for Hugging Face deployment
├── app_local.py        # Lightweight local preview server
├── enhancer.py         # Real-ESRGAN model wrapper (for HF deployment)
├── document_scanner.py # Document scanning with OpenCV (auto-crop, align, enhance)
├── templates/
│   └── index.html      # Frontend interface
├── requirements.txt    # Dependencies for Hugging Face Spaces
├── Dockerfile          # Docker configuration for HF Spaces
├── README.md           # Hugging Face Spaces configuration
├── uploads/            # Temporary upload storage
└── outputs/            # Processed image outputs
```

## API Endpoints
- `GET /` - Web frontend
- `GET /docs` - Swagger API documentation
- `GET /health` - Health check
- `GET /model-info` - Model information
- `GET /progress/{job_id}` - Get async job progress
- `GET /result/{job_id}` - Get completed job result
- `POST /enhance` - Enhance/upscale image (Real-ESRGAN); supports `async_mode=true`
- `POST /enhance/async` - Start async enhancement with progress tracking
- `POST /enhance/base64` - Enhance image (returns base64)
- `POST /remove-background` - Remove image background (BiRefNet); supports `async_mode=true`
- `POST /remove-background/async` - Start async background removal with progress tracking
- `POST /remove-background/base64` - Remove background (returns base64)
- `POST /denoise` - Reduce image noise (OpenCV NLM); supports `async_mode=true`
- `POST /denoise/async` - Start async denoising with progress tracking
- `POST /denoise/base64` - Denoise image (returns base64)
- `POST /docscan` - Scan document (auto-crop, align, HD enhance); supports `async_mode=true`
- `POST /docscan/async` - Start async document scan with progress tracking
- `POST /docscan/base64` - Scan document (returns base64)
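
The four processing features share one naming pattern (plain, `/async`, `/base64`), so a client can build URLs from a feature name and a variant. The sketch below is only an illustration: `endpoint_url` and `start_async_job` are hypothetical helper names, the base URL is a placeholder, and the multipart field name `file` is an assumption that should be checked against `/docs`.

```python
from urllib.parse import urljoin

# Feature paths and variants taken from the endpoint list above.
FEATURES = ("enhance", "remove-background", "denoise", "docscan")
VARIANTS = {"sync": "", "async": "/async", "base64": "/base64"}


def endpoint_url(base_url, feature, variant="sync"):
    """Return the full URL for one feature/variant pair."""
    if feature not in FEATURES:
        raise ValueError(f"unknown feature: {feature!r}")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    return urljoin(base_url.rstrip("/") + "/", feature + VARIANTS[variant])


def start_async_job(base_url, feature, image_path, field_name="file"):
    """Upload an image to the /async variant and return the parsed JSON reply.

    ASSUMPTION: the upload is a multipart form field called `file`;
    the real field name may differ, so confirm it in the Swagger docs.
    """
    import requests  # third-party, imported lazily so endpoint_url stays stdlib-only

    with open(image_path, "rb") as fh:
        response = requests.post(
            endpoint_url(base_url, feature, "async"),
            files={field_name: fh},
            timeout=60,
        )
    response.raise_for_status()
    return response.json()  # expected to contain a job_id


if __name__ == "__main__":
    # Placeholder base URL; replace with the address of the running server.
    print(endpoint_url("http://localhost:8000", "docscan", "base64"))
```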

## Document Scanner Features
The `/docscan` endpoint provides:
- **Auto-detection**: Edge detection using the Canny algorithm
- **Auto-crop**: Contour detection and perspective correction
- **Alignment**: Four-point perspective transform
- **Contrast**: CLAHE (Contrast Limited Adaptive Histogram Equalization)
- **Denoising**: Bilateral filter (preserves edges while reducing noise)
- **Sharpening**: Unsharp masking for crisp text
- **HD Upscaling**: Optional Real-ESRGAN enhancement (1-4x scale)
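
The alignment step depends on knowing which detected corner is which. The following is a minimal, dependency-free sketch of a common way to order four corners and size the output page; it is not taken from `document_scanner.py`, and `order_corners` and `output_size` are hypothetical names. In an OpenCV pipeline the ordered corners would typically be passed to `cv2.getPerspectiveTransform` and `cv2.warpPerspective`.

```python
import math


def order_corners(points):
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left.

    Uses image coordinates (y grows downward): the top-left corner has the
    smallest x + y, the bottom-right the largest; the top-right corner has the
    largest x - y, the bottom-left the smallest.
    """
    if len(points) != 4:
        raise ValueError("exactly four corner points are required")
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = max(points, key=lambda p: p[0] - p[1])
    bottom_left = min(points, key=lambda p: p[0] - p[1])
    return [top_left, top_right, bottom_right, bottom_left]


def output_size(ordered):
    """Return (width, height) of the flattened page from ordered corners."""
    tl, tr, br, bl = ordered
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)


if __name__ == "__main__":
    # Corners of a slightly skewed page, in arbitrary order.
    detected = [(210, 40), (20, 30), (30, 300), (200, 310)]
    ordered = order_corners(detected)
    print(ordered, output_size(ordered))
```

The sum/difference heuristic assumes the page is not rotated by much more than about 45 degrees; a heavily rotated contour would need a different ordering rule.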

## Deploying to Hugging Face Spaces
1. Create a new Space on Hugging Face
2. Select "Docker" as the SDK
3. Upload all files: `app.py`, `enhancer.py`, `document_scanner.py`, `templates/`, `requirements.txt`, `Dockerfile`, `README.md`
4. The Space will auto-build the container and download AI models

## Progress Tracking API
All image processing endpoints support async mode with progress tracking:

1. Start a job with `async_mode=true` or use the `/async` endpoints
2. Receive a `job_id` in the response
3. Poll `/progress/{job_id}` to get current progress (0-100%)
4. When the job completes, fetch the result from `/result/{job_id}`

Example response from the progress endpoint:
```json
{
  "job_id": "abc-123",
  "status": "processing",
  "progress": 45.0,
  "message": "Enhancing image (4 tiles)...",
  "current_step": 2,
  "total_steps": 5
}
```
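
The polling loop in steps 3 and 4 can be sketched as below. This is a client-side illustration under stated assumptions: only the `"processing"` status appears in the example response, so the terminal values `"completed"` and `"failed"` are guesses exposed as parameters, and `wait_for_job` is a hypothetical helper. The HTTP call is injected as `fetch_progress`, a callable returning the parsed JSON from `/progress/{job_id}`.

```python
import time


def wait_for_job(
    fetch_progress,
    poll_interval=1.0,
    timeout=120.0,
    done_statuses=("completed",),   # ASSUMPTION: not confirmed by the API docs
    failed_statuses=("failed",),    # ASSUMPTION: not confirmed by the API docs
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll until the job reaches a terminal status and return the last state.

    Raises RuntimeError if the job reports failure and TimeoutError if it
    does not finish within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        state = fetch_progress()
        status = state.get("status")
        if status in done_statuses:
            return state
        if status in failed_statuses:
            raise RuntimeError(state.get("message", "job failed"))
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(poll_interval)


if __name__ == "__main__":
    # Simulated server replies so the sketch runs without a network.
    replies = iter([
        {"job_id": "abc-123", "status": "processing", "progress": 45.0},
        {"job_id": "abc-123", "status": "completed", "progress": 100.0},
    ])
    final = wait_for_job(lambda: next(replies), poll_interval=0.0)
    print(final["status"], final["progress"])
```

Injecting `fetch_progress`, `sleep`, and `clock` keeps the loop testable; a real client would pass a function that performs the GET request and decodes the JSON body.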

## Performance Optimizations
- **Tile processing**: Images are processed in 256px tiles for memory efficiency
- **Max input size**: 512x512 for enhance (larger images are auto-resized), 2048x2048 for docscan
- **Default scale**: 2x (faster than 4x, still good quality)
- **Background threading**: Server stays responsive during processing
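
To make the size limits concrete, here is a rough sketch of how an input could be fitted to the 512px limit and then split into 256px tiles. It is an illustration only, not the code in `enhancer.py`: `fit_within` and `tile_boxes` are hypothetical names, and the overlap padding that tiled upscalers usually add to hide seams is left out.

```python
def fit_within(width, height, max_side=512):
    """Scale (width, height) down to fit max_side, keeping the aspect ratio.

    Sizes already within the limit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def tile_boxes(width, height, tile=256):
    """Return (left, top, right, bottom) boxes that cover the image.

    Tiles on the right and bottom edges are clipped to the image bounds.
    """
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


if __name__ == "__main__":
    w, h = fit_within(2048, 2048)
    print((w, h), len(tile_boxes(w, h)))
```

Under these assumptions a 512x512 input yields four tiles, which is consistent with the "4 tiles" message in the progress example.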

## Recent Changes
- 2025-11-28: Fixed frontend progress bar and image return issues
  - Added visual progress bar with percentage display to frontend
  - Updated frontend to use async endpoints for all features
  - Added polling logic with proper error handling and timeouts
  - Fixed image result fetching with 202 status handling
  - All endpoints now show real-time progress during processing
- 2025-11-28: Added progress tracking and async processing
  - New progress_tracker.py module with thread-safe job tracking
  - /progress/{job_id} and /result/{job_id} endpoints
  - async_mode parameter on all image processing endpoints
  - Dedicated /async endpoints for each feature
  - Automatic cleanup of old jobs and result files
  - Performance optimizations: tile=256, max_size=512, default scale=2
- 2025-11-28: Added document scanning feature
  - Auto-crop with edge detection and contour finding
  - Perspective correction for skewed documents
  - CLAHE contrast enhancement
  - Bilateral filter denoising (preserves details)
  - Unsharp mask sharpening
  - Optional HD upscaling with Real-ESRGAN
- 2025-11-28: Added background removal and noise reduction features
  - BiRefNet integration via rembg for background removal
  - OpenCV Non-Local Means Denoising
  - Updated frontend with feature tabs
  - Updated API documentation
- 2025-11-28: Initial creation of AI Image Enhancer API
  - FastAPI backend with Swagger docs
  - Real-ESRGAN integration for Hugging Face
  - Simple frontend for testing
  - Lightweight local preview mode
  - Docker configuration for HF Spaces deployment