metadata
title: Image Selector Backend
emoji: 🖼️
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
Image Selector Backend API
A FastAPI backend for intelligent image deduplication and selection. Upload images, group similar ones using ResNet50 embeddings, and keep only the best image from each group based on aesthetics scoring.
Features
- Smart Deduplication: Groups similar images using deep learning embeddings
- Aesthetics Scoring: Ranks images within each group using a CLIP-based aesthetics model
- REST API: Simple endpoints for upload, process, progress tracking, and download
- Per-User Sessions: Each user gets isolated processing and temporary storage
- Auto Cleanup: Temporary files are removed after download
API Endpoints
Health Check
GET /
Returns: {"status": "ok"}
Upload Images
POST /upload
Form fields:
- user_id (text): Your unique session ID
- file (file): Image file to upload
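As a minimal sketch of what a client might send, the snippet below builds the multipart request using only the Python standard library. The base URL `http://localhost:7860`, the helper name `build_upload_request`, and the `image/jpeg` content type are illustrative assumptions; only the `/upload` path and the `user_id` and `file` field names come from the endpoint description above.

```python
import urllib.request
import uuid


def build_upload_request(base_url, user_id, filename, image_bytes,
                         content_type="image/jpeg"):
    """Build (but do not send) a multipart/form-data POST for /upload.

    Hypothetical helper: the backend's exact validation rules are not
    documented here, so treat this as a starting point.
    """
    boundary = uuid.uuid4().hex
    # Text part for user_id, followed by the headers of the file part.
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="user_id"\r\n\r\n'
        f"{user_id}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    # Closing boundary that terminates the multipart body.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/upload",
        data=head + image_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_upload_request(
        "http://localhost:7860", "demo-user", "cat.jpg", b"\xff\xd8\xff"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the sketch easy to inspect; call the helper once per image, reusing the same `user_id` so all files land in one session.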
Start Processing
POST /process
Form fields:
- user_id (text): Your session ID
- similarity (float, optional): Similarity threshold (default: 0.87)
- use_aesthetics (bool, optional): Enable aesthetics scoring (default: true)
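The form fields and their defaults could be sent roughly as follows. This is a sketch, not the project's own client: the helper name `build_process_request` and the base URL are hypothetical, and it assumes the backend accepts a URL-encoded form body (FastAPI form fields generally do) and parses `true`/`false` as booleans.

```python
import urllib.parse
import urllib.request


def build_process_request(base_url, user_id, similarity=0.87,
                          use_aesthetics=True):
    """Build (but do not send) a form-encoded POST for /process."""
    form = {
        "user_id": user_id,
        # Defaults mirror the documented ones: 0.87 and true.
        "similarity": str(similarity),
        "use_aesthetics": "true" if use_aesthetics else "false",
    }
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/process",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_process_request("http://localhost:7860", "demo-user")
    print(req.full_url, req.data.decode("ascii"))
    # To actually send it: urllib.request.urlopen(req)
```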
Check Progress
GET /progress/{user_id}
Returns processing status, percentage, and ETA.
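A client would typically poll this endpoint until processing finishes. The sketch below shows one way to do that with the standard library. The JSON field names are not documented here, so the completion check is passed in by the caller; the `status == "done"` predicate in the usage comment is purely an assumption, as are the helper names and the base URL.

```python
import json
import time
import urllib.request


def fetch_progress(base_url, user_id):
    """GET /progress/{user_id} and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/progress/{user_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_until_done(fetch, is_done, interval=2.0, max_polls=300,
                    sleep=time.sleep):
    """Call fetch() repeatedly until is_done(result) is true.

    Returns the final progress payload, or raises TimeoutError after
    max_polls unsuccessful attempts.
    """
    for _ in range(max_polls):
        progress = fetch()
        if is_done(progress):
            return progress
        sleep(interval)
    raise TimeoutError("processing did not finish within the polling budget")


# Example usage (field name and value are assumptions, not documented):
# final = wait_until_done(
#     fetch=lambda: fetch_progress("http://localhost:7860", "demo-user"),
#     is_done=lambda p: p.get("status") == "done",
# )
```

Injecting `fetch`, `is_done`, and `sleep` keeps the loop independent of the real response shape and makes it easy to exercise without a running server.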
Download Results
GET /download/{user_id}
Returns a ZIP file containing the best images from each group.
How It Works
- Upload: Send images with a unique user_id
- Process: Trigger processing. The backend will:
  - Generate embeddings for all images
  - Group similar images (similarity threshold configurable)
  - Score each image using an aesthetics model
  - Keep the best image from each group
- Download: Retrieve a ZIP of selected images
- Auto-cleanup: All temporary files are deleted after download
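The grouping and selection steps above can be illustrated with a small pure-Python sketch. It assumes cosine similarity between embedding vectors and a simple greedy grouping against each group's first member; the backend's actual grouping strategy is not described here and may differ. The toy vectors and scores stand in for ResNet50 embeddings and aesthetics scores.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_similar(embeddings, threshold=0.87):
    """Greedy grouping (an assumed strategy): each image joins the first
    group whose representative (its first member) is at least `threshold`
    similar; otherwise it starts a new group. Returns lists of indices."""
    groups = []
    for i, emb in enumerate(embeddings):
        for group in groups:
            if cosine_similarity(embeddings[group[0]], emb) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def select_best(groups, scores):
    """Keep the index with the highest score from each group."""
    return [max(group, key=lambda i: scores[i]) for group in groups]


if __name__ == "__main__":
    embeddings = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]  # toy vectors
    scores = [0.4, 0.9, 0.5]                             # toy aesthetics scores
    groups = group_similar(embeddings)
    print(groups, select_best(groups, scores))
```

Raising the threshold makes groups stricter (fewer images treated as duplicates); lowering it merges more images into each group.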
Local Development
git clone <this-repo>
cd <repo-folder>
pip install -r requirements.txt
uvicorn main:app --host 0.0.0.0 --port 7860
Frontend
Use with the web UI:
Tech Stack
- FastAPI: Modern async web framework
- PyTorch: Deep learning for embeddings
- ResNet50: Feature extraction model
- CLIP: Aesthetics prediction model
- SQLite: Temporary embeddings storage
Notes for HuggingFace Spaces
- Uses /tmp for ephemeral storage (free tier)
- For persistent storage, upgrade to a paid tier; data is then stored under /data
- Model weights download on first run (~500MB for ResNet50 + aesthetics model)
- GPU recommended for faster processing (upgrade to T4 or better)
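One plausible way to switch between the two storage locations is to check whether the persistent directory exists and is writable. This is only a sketch under that assumption; the helper name `pick_storage_root` is hypothetical and the backend may choose its storage path differently.

```python
import os


def pick_storage_root(persistent="/data", ephemeral="/tmp"):
    """Return the persistent directory when it is present and writable,
    otherwise fall back to the ephemeral one."""
    if os.path.isdir(persistent) and os.access(persistent, os.W_OK):
        return persistent
    return ephemeral


if __name__ == "__main__":
    print("storing session files under", pick_storage_root())
```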
License
MIT License - see LICENSE file
Project Origin
Based on Image-Selecter