AI Image Processing

Overview

An AI-powered image processing API with multiple features:

  • Image enhancement/upscaling using Real-ESRGAN
  • Background removal using BiRefNet via rembg
  • Noise reduction using OpenCV Non-Local Means Denoising
  • Document scanning with auto-crop, alignment, and HD enhancement
  • FastAPI backend with automatic Swagger API documentation
  • Simple web frontend for testing

Current State

  • Local Preview: Runs app_local.py with simplified processing; the heavy AI models are not loaded because of size constraints
  • Full AI Mode: Available when deployed to Hugging Face Spaces

Project Structure

/
├── app.py              # Full FastAPI app for Hugging Face deployment
├── app_local.py        # Lightweight local preview server
├── enhancer.py         # Real-ESRGAN model wrapper (for HF deployment)
├── document_scanner.py # Document scanning with OpenCV (auto-crop, align, enhance)
├── templates/
│   └── index.html      # Frontend interface
├── requirements.txt    # Dependencies for Hugging Face Spaces
├── Dockerfile          # Docker configuration for HF Spaces
├── README.md           # Hugging Face Spaces configuration
├── uploads/            # Temporary upload storage
└── outputs/            # Processed image outputs

API Endpoints

  • GET / - Web frontend
  • GET /docs - Swagger API documentation
  • GET /health - Health check
  • GET /model-info - Model information
  • GET /progress/{job_id} - Get async job progress
  • GET /result/{job_id} - Get completed job result
  • POST /enhance - Enhance/upscale image (Real-ESRGAN) - supports async_mode=true
  • POST /enhance/async - Start async enhancement with progress tracking
  • POST /enhance/base64 - Enhance image (returns base64)
  • POST /remove-background - Remove image background (BiRefNet) - supports async_mode=true
  • POST /remove-background/async - Start async background removal with progress
  • POST /remove-background/base64 - Remove background (returns base64)
  • POST /denoise - Reduce image noise (OpenCV NLM) - supports async_mode=true
  • POST /denoise/async - Start async denoising with progress
  • POST /denoise/base64 - Denoise image (returns base64)
  • POST /docscan - Scan document (auto-crop, align, HD enhance) - supports async_mode=true
  • POST /docscan/async - Start async document scan with progress
  • POST /docscan/base64 - Scan document (returns base64)
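
The /base64 variants return the image as text rather than as a binary body. Below is a minimal client-side sketch of decoding such a payload. It assumes the response is JSON with the encoded image under a key named `image`; that key name is a guess, not taken from the API, so check /docs for the actual schema.

```python
import base64
import json


def decode_image_payload(body: str, field: str = "image") -> bytes:
    """Extract and decode a base64 image from a JSON response body.

    `field` is a hypothetical key name; the real schema is listed at /docs.
    """
    payload = json.loads(body)
    encoded = payload[field]
    # Tolerate an optional data-URL prefix such as "data:image/png;base64,".
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(encoded, validate=True)
```

The returned bytes can be written straight to a file opened in binary mode.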

Document Scanner Features

The /docscan endpoint provides:

  • Auto-detection: Edge detection using Canny algorithm
  • Auto-crop: Contour detection and perspective correction
  • Alignment: Four-point perspective transform
  • Contrast: CLAHE (Contrast Limited Adaptive Histogram Equalization)
  • Denoising: Bilateral filter (preserves edges while reducing noise)
  • Sharpening: Unsharp masking for crisp text
  • HD Upscaling: Optional Real-ESRGAN enhancement (1-4x scale)
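
The alignment step depends on knowing which detected corner is which. The sketch below shows one common way to order four corner points and derive the size of the straightened page. It is illustrative only and is not taken from document_scanner.py; the function names are hypothetical.

```python
import math

Point = tuple[float, float]


def order_corners(pts: list[Point]) -> list[Point]:
    """Return corners as [top-left, top-right, bottom-right, bottom-left]."""
    if len(pts) != 4:
        raise ValueError("expected exactly four corner points")
    # In image coordinates (y grows downward), x + y is smallest at the
    # top-left and largest at the bottom-right; y - x is smallest at the
    # top-right and largest at the bottom-left.
    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    tr = min(pts, key=lambda p: p[1] - p[0])
    bl = max(pts, key=lambda p: p[1] - p[0])
    return [tl, tr, br, bl]


def output_size(corners: list[Point]) -> tuple[int, int]:
    """Pick the output width/height from the longer of each pair of edges."""
    tl, tr, br, bl = corners
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)
```

In an OpenCV pipeline, the ordered corners and a destination rectangle of this size would typically be passed to cv2.getPerspectiveTransform and cv2.warpPerspective.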

Deploying to Hugging Face Spaces

  1. Create a new Space on Hugging Face
  2. Select "Docker" as the SDK
  3. Upload all files: app.py, enhancer.py, document_scanner.py, templates/, requirements.txt, Dockerfile, README.md
  4. The Space will auto-build the container and download AI models

Progress Tracking API

All image processing endpoints support async mode with progress tracking:

  1. Start a job with async_mode=true or use the /async endpoints
  2. Receive a job_id in the response
  3. Poll /progress/{job_id} to get current progress (0-100%)
  4. When complete, fetch result from /result/{job_id}
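
The polling in steps 3 and 4 can be sketched as a small client helper. The HTTP call is left to a caller-supplied function (`fetch_progress`, a hypothetical name) that would GET /progress/{job_id} and return the parsed JSON. The terminal status names "completed" and "failed" are assumptions; only "processing" appears in this document.

```python
import time
from typing import Callable


def wait_for_job(
    job_id: str,
    fetch_progress: Callable[[str], dict],
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job finishes, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        info = fetch_progress(job_id)
        status = info.get("status")
        if status == "completed":  # assumed terminal status name
            return info
        if status == "failed":  # assumed terminal status name
            raise RuntimeError(info.get("message", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout}s")
        sleep(interval)
```

Once this returns, the caller would fetch the image from /result/{job_id}. Passing `sleep` as a parameter keeps the helper easy to test without real delays.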

Example response from progress endpoint:

{
  "job_id": "abc-123",
  "status": "processing",
  "progress": 45.0,
  "message": "Enhancing image (4 tiles)...",
  "current_step": 2,
  "total_steps": 5
}
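
On the server side, a record of this shape has to be updated by a worker thread while request handlers read it. The sketch below shows one way to do that safely with a lock. It is not the project's progress_tracker.py; the class and method names are hypothetical, and the "pending" initial status is an assumption.

```python
import threading
import uuid


class JobTracker:
    """In-memory job registry guarded by a single lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: dict[str, dict] = {}

    def create(self, total_steps: int) -> str:
        job_id = str(uuid.uuid4())
        with self._lock:
            self._jobs[job_id] = {
                "job_id": job_id,
                "status": "pending",  # assumed initial status name
                "progress": 0.0,
                "message": "",
                "current_step": 0,
                "total_steps": total_steps,
            }
        return job_id

    def update(self, job_id: str, *, progress: float, message: str,
               current_step: int, status: str = "processing") -> None:
        with self._lock:
            job = self._jobs[job_id]
            # Clamp so a worker bug cannot report more than 100%.
            job["progress"] = max(0.0, min(100.0, float(progress)))
            job["message"] = message
            job["current_step"] = current_step
            job["status"] = status

    def snapshot(self, job_id: str) -> dict:
        # Return a copy so callers never hold a reference to shared state.
        with self._lock:
            return dict(self._jobs[job_id])
```

A handler for /progress/{job_id} could return `snapshot(job_id)` directly, since the copy is safe to serialize outside the lock.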

Performance Optimizations

  • Tile processing: Images processed in 256px tiles for memory efficiency
  • Max input size: 512x512 for enhance (auto-resized), 2048x2048 for docscan
  • Default scale: 2x (faster than 4x, still good quality)
  • Background threading: Server stays responsive during processing
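
The size limit and tiling above come down to simple arithmetic, sketched here. The function names are hypothetical, and the sketch ignores the overlap or padding that tiled upscalers usually add between tiles to hide seams.

```python
def fit_within(width: int, height: int, max_side: int = 512) -> tuple[int, int]:
    """Scale (width, height) down to fit max_side, keeping the aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale here
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def tile_boxes(width: int, height: int, tile: int = 256) -> list[tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes that cover the image."""
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Edge tiles are clipped to the image bounds.
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes
```

With these defaults, a 512x512 input yields four 256px tiles, which matches the "4 tiles" message in the progress example.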

Recent Changes

  • 2025-11-28: Fixed frontend progress bar and image return issues
    • Added visual progress bar with percentage display to frontend
    • Updated frontend to use async endpoints for all features
    • Added polling logic with proper error handling and timeouts
    • Fixed image result fetching with 202 status handling
    • All endpoints now show real-time progress during processing
  • 2025-11-28: Added progress tracking and async processing
    • New progress_tracker.py module with thread-safe job tracking
    • /progress/{job_id} and /result/{job_id} endpoints
    • async_mode parameter on all image processing endpoints
    • Dedicated /async endpoints for each feature
    • Automatic cleanup of old jobs and result files
    • Performance optimizations: tile=256, max_size=512, default scale=2
  • 2025-11-28: Added document scanning feature
    • Auto-crop with edge detection and contour finding
    • Perspective correction for skewed documents
    • CLAHE contrast enhancement
    • Bilateral filter denoising (preserves details)
    • Unsharp mask sharpening
    • Optional HD upscaling with Real-ESRGAN
  • 2025-11-28: Added background removal and noise reduction features
    • BiRefNet integration via rembg for background removal
    • OpenCV Non-Local Means Denoising
    • Updated frontend with feature tabs
    • Updated API documentation
  • 2025-11-28: Initial creation of AI Image Enhancer API
    • FastAPI backend with Swagger docs
    • Real-ESRGAN integration for Hugging Face
    • Simple frontend for testing
    • Lightweight local preview mode
    • Docker configuration for HF Spaces deployment