# EL HELAL Studio: Technical Documentation

A headless FastAPI backend for AI-powered ID card photo processing.

---
## Architecture

- **`main.py`**: FastAPI application entry point. All routes are REST-only (JSON responses).
- **`core/`**: Pure Python image processing logic, UI-agnostic.
- **`newcolor/`**: AI color correction model (`ColorUNet`) and in-memory inference code.
- **`config/`**: Global settings (`settings.json`) for retouch and layout defaults.
- **`assets/`**: Branding assets (logo) and frame overlays.

**No GUI, no desktop wrapper.** Clients interact exclusively via the REST API; FastAPI serves the interactive documentation at `/docs`.

---
## The 5-Step AI Pipeline

Every photo processed by the studio follows a strictly sequenced pipeline:

### 1. Auto-Crop & Face Detection (`core/crop.py`)

- **Technology:** OpenCV Haar Cascades.
- **Logic:** Detects the largest face, centers it, and calculates a 5:7 (4x6cm) aspect ratio crop.
- **Fallback:** Centers the crop if no face is detected to ensure the pipeline never breaks.
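
The crop geometry described above can be sketched in plain Python. This is a minimal illustration, not the code in `core/crop.py`: the function names and the `face_fraction` parameter (how much of the crop width the face should occupy) are assumptions, and face boxes are taken to be `(x, y, w, h)` tuples such as a Haar cascade detector returns.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def largest_face(faces: Sequence[Box]) -> Optional[Box]:
    """Pick the detection with the largest area, or None if there are none."""
    return max(faces, key=lambda f: f[2] * f[3]) if faces else None


def crop_box_5x7(img_w: int, img_h: int, face: Optional[Box] = None,
                 face_fraction: float = 0.5) -> Box:
    """Return a 5:7 (width:height) crop box that stays inside the image.

    With a face, the box is centered on it and sized so the face spans
    roughly `face_fraction` of the crop width (hypothetical parameter).
    Without a face, the largest centered 5:7 box is used as the fallback.
    """
    # Widest 5:7 box that still fits inside the image.
    max_w = min(img_w, int(img_h * 5 / 7))
    if face is None:
        crop_w = max_w
        cx, cy = img_w / 2, img_h / 2
    else:
        fx, fy, fw, fh = face
        crop_w = min(max_w, max(1, round(fw / face_fraction)))
        cx, cy = fx + fw / 2, fy + fh / 2
    crop_h = min(img_h, round(crop_w * 7 / 5))
    # Center on (cx, cy), then clamp so the box never leaves the image.
    left = min(max(round(cx - crop_w / 2), 0), img_w - crop_w)
    top = min(max(round(cy - crop_h / 2), 0), img_h - crop_h)
    return left, top, crop_w, crop_h
```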
### 2. AI Background Removal (`core/process_images.py`)

- **Model:** **BiRefNet (RMBG-2.0)** via the `transformers` library.
- **Optimization:** Automatically detects and utilizes CUDA/GPU. Falls back to CPU with dynamic quantization.
### 3. AI Color Correction (`newcolor/inference.py`)

- **Model:** **ColorUNet**, implemented as a custom PyTorch model with its own weights.
- **Mechanism:** Predicts corrected colors at model resolution (1024x1024), fits a quadratic polynomial color transform (10 parameters) on subject pixels using the alpha mask, and applies it to the full-resolution image.
- **Optimization:** Device-aware PyTorch execution that reuses the RMBG execution device (e.g., CUDA or optimized CPU).
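
The fit-then-apply mechanism can be illustrated with NumPy. This sketch assumes the 10 parameters are the 10 monomials of a full quadratic in R, G, B (1, r, g, b, r², g², b², rg, rb, gb), fitted by least squares once per output channel; the actual feature set in `newcolor/inference.py` may differ. Function names, the `[0, 1]` float image convention, and the alpha threshold are all hypothetical.

```python
import numpy as np


def quad_features(rgb: np.ndarray) -> np.ndarray:
    """Map (N, 3) RGB rows to (N, 10) quadratic monomials."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.stack(
        [np.ones_like(r), r, g, b, r * r, g * g, b * b, r * g, r * b, g * b],
        axis=1,
    )


def fit_color_transform(src_small, pred_small, alpha_small, alpha_threshold=0.5):
    """Least-squares fit from source colors to model-predicted colors.

    Only pixels whose alpha exceeds the threshold (the subject) are used.
    Returns a (10, 3) coefficient matrix, one column per output channel.
    """
    mask = alpha_small > alpha_threshold
    X = quad_features(src_small[mask].astype(np.float64))
    Y = pred_small[mask].astype(np.float64)
    coeffs, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coeffs


def apply_color_transform(image, coeffs):
    """Apply the fitted transform to an image of any resolution."""
    h, w, _ = image.shape
    out = quad_features(image.reshape(-1, 3).astype(np.float64)) @ coeffs
    return np.clip(out, 0.0, 1.0).reshape(h, w, 3)
```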
### 4. Surgical Retouching (`core/retouch.py`)

- **Landmarking:** Uses **MediaPipe Face Mesh** (468 points) to generate a precise skin mask, excluding eyes, lips, and hair.
- **Frequency Separation:** Splits the image into **High Frequency** (texture/pores) and **Low Frequency** (tone/color).
- **Blemish Removal:** Detects anomalies on the High-Freq layer and inpaints them using surrounding texture.
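
Frequency separation itself is simple arithmetic: blur the image to get the low-frequency layer and subtract it to get the high-frequency layer, so the two always sum back to the original. The sketch below uses a NumPy box blur as a stand-in for whatever filter `core/retouch.py` actually uses, and a basic standard-deviation threshold as a hypothetical anomaly rule. Inpainting is left out, and the skin mask is assumed to be a boolean array.

```python
import numpy as np


def box_blur(img: np.ndarray, radius: int) -> np.ndarray:
    """Separable box blur on a 2D float array, with edge padding."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    # Blur along rows, then along columns; "valid" undoes the padding.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)


def split_frequencies(gray: np.ndarray, radius: int = 4):
    """Return (low, high) layers such that low + high == gray."""
    low = box_blur(gray, radius)
    return low, gray - low


def flag_blemishes(high: np.ndarray, skin_mask: np.ndarray, k: float = 3.0):
    """Mark skin pixels whose high-frequency value is an outlier."""
    values = high[skin_mask]
    threshold = k * values.std()
    return skin_mask & (np.abs(high - values.mean()) > threshold)
```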
### 5. Layout Composition (`core/layout_engine.py`)

- **Rendering:** Composes a 300 DPI canvas for printing.
- **Localization:** Uses `arabic_reshaper` and `python-bidi` for correct Arabic script rendering.
- **Dynamic Assets:** Overlays IDs with specific offsets and studio branding (logos).
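
Rendering at 300 DPI comes down to converting physical sizes to pixels (1 inch = 2.54 cm) and placing photos on a grid. The sketch below shows that arithmetic only. The canvas size, margin, and function names are illustrative assumptions; `gap` plays the role of the `grid_gap` setting.

```python
def cm_to_px(cm: float, dpi: int = 300) -> int:
    """Convert centimeters to pixels at the given print resolution."""
    return round(cm / 2.54 * dpi)


def grid_positions(canvas_w, canvas_h, photo_w, photo_h, gap, margin):
    """Top-left corners of every photo that fits on the canvas.

    Photos are laid out row by row, separated by `gap` pixels and kept
    `margin` pixels away from the canvas edges.
    """
    cols = max(0, (canvas_w - 2 * margin + gap) // (photo_w + gap))
    rows = max(0, (canvas_h - 2 * margin + gap) // (photo_h + gap))
    return [
        (margin + c * (photo_w + gap), margin + r * (photo_h + gap))
        for r in range(rows)
        for c in range(cols)
    ]


# Example: 4x6 cm photos on a hypothetical 10x15 cm sheet.
photo = (cm_to_px(4), cm_to_px(6))     # (472, 709)
canvas = (cm_to_px(10), cm_to_px(15))  # (1181, 1772)
slots = grid_positions(*canvas, *photo, gap=20, margin=30)
```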

---

## Configuration

The system is controlled by `config/settings.json`. The layout engine hot-reloads this file on every request, so you can adjust `id_font_size`, `grid_gap`, or `retouch_sensitivity` and see the change in the next processed photo without restarting.
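
One simple way to get this behavior is to re-read the JSON file at the start of each request and merge it over built-in defaults. The sketch below assumes that approach; the default values and the `load_settings` name are hypothetical, and only the three keys named above come from this document. A missing or half-written file falls back to the defaults rather than failing the request.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real values live in config/settings.json.
DEFAULTS = {"id_font_size": 48, "grid_gap": 20, "retouch_sensitivity": 0.5}


def load_settings(path="config/settings.json"):
    """Read settings fresh from disk, layered over the defaults."""
    settings = dict(DEFAULTS)
    try:
        settings.update(json.loads(Path(path).read_text(encoding="utf-8")))
    except (FileNotFoundError, json.JSONDecodeError):
        # Keep serving with defaults if the file is absent or mid-edit.
        pass
    return settings
```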

---

## Known Dependency Conflicts

- **TensorFlow vs. Transformers:** Standard `tensorflow` (especially nightly) conflicts with `transformers` and `numpy >= 2.0`.
- **Resolution:** Uninstall TensorFlow. The pipeline is 100% PyTorch-based.
- **Pinned Versions:**
  - `numpy < 2.0.0`: Compatibility with `basicsr` and older `torchvision`.
  - `protobuf <= 3.20.3`: Prevents "Double Registration" errors in multi-model environments.
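
A small startup check can surface these conflicts before any model loads. The sketch below is not part of the project: it encodes the two pins above, uses a deliberately simple version parser that ignores pre-release tags, and all function names are hypothetical. Only `importlib.metadata` from the standard library is used to read installed versions.

```python
from importlib import metadata
from itertools import takewhile

# Pins from the list above: package -> (operator, version limit).
PINS = {"numpy": ("<", (2, 0, 0)), "protobuf": ("<=", (3, 20, 3))}


def version_tuple(version):
    """Parse '1.26.4' or '2.0.0rc1' into a 3-int tuple."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts + [0] * (3 - len(parts)))


def check_pins(installed):
    """Return a list of problems for a {package: version} mapping."""
    problems = []
    for name, (op, limit) in PINS.items():
        if name not in installed:
            continue
        found = version_tuple(installed[name])
        ok = found < limit if op == "<" else found <= limit
        if not ok:
            wanted = ".".join(map(str, limit))
            problems.append(f"{name} {installed[name]} violates {op} {wanted}")
    if "tensorflow" in installed:
        problems.append("tensorflow is installed; uninstall it")
    return problems


def installed_versions(names=("numpy", "protobuf", "tensorflow")):
    """Look up installed versions, skipping packages that are absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found
```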

---

## Environment Setup

```bash
conda create -n idmaker python=3.10
conda activate idmaker
pip install -r requirements.txt
pip uninstall tensorflow tb-nightly tensorboard  # Remove conflicts if present
```

---

## Docker

```bash
docker-compose up --build
```

The API will be available at `http://localhost:8000` (or the port defined by `$PORT`).
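
The port fallback can be expressed in a few lines. This is a sketch under the assumption that the server reads `PORT` from the environment and defaults to 8000, as the sentence above suggests. `resolve_port` and `port_is_free` are hypothetical helpers, not functions from `main.py`; the second one tries to bind the port to check whether another process already holds it.

```python
import os
import socket


def resolve_port(env=None, default=8000):
    """Return the port from $PORT, or the default if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "")
    return int(raw) if raw.isdigit() else default


def port_is_free(port, host="127.0.0.1"):
    """Try to bind the port; a bind failure means it is already taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True
```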

---

## Troubleshooting (Common Pitfalls)

| Issue | Root Cause | Solution |
| ----------------------------- | ------------------------------------------------------- | ---------------------------------------------------------------------------- |
| **"Tofu" Boxes in Text** | Missing or corrupted fonts. | Ensure `assets/arialbd.ttf` is the real font file (larger than 300 KB), not a Git LFS pointer. |
| **NumPy AttributeError** | Conflict between NumPy 2.x and TensorFlow/Transformers. | Uninstall `tensorflow` and ensure `numpy < 2.0.0` is installed. |
| **[Errno 10048] Socket Bind** | The server port (e.g., 7860) is already in use by another server process. | Close the previous server instance or set a different `PORT` environment variable. |
| **Meta-Tensor Error** | Transformers 4.50+ CPU bug. | Handled by the `torch.linspace` monkeypatch in `process_images.py`. |
| **Slow Processing** | CPU bottleneck. | Ensure `torch` is using multiple threads or enable CUDA. |
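
For the font issue in the first row, a Git LFS pointer is a tiny text file that begins with `version https://git-lfs.github.com/spec/v1`, so it can be told apart from a real font by its header and size. The helper names below are hypothetical, and the 300 KB threshold is taken from the table above.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_HEADER)) == LFS_HEADER


def font_looks_valid(path, min_bytes=300 * 1024):
    """True if the file exists, is large enough, and is not a pointer."""
    font = Path(path)
    return (
        font.is_file()
        and font.stat().st_size > min_bytes
        and not is_lfs_pointer(font)
    )
```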

---

## Testing Framework

The codebase includes two layers of tests: lightweight, mock-based unit tests and full integration tests.

### Unit & Mocked API Tests

Located in the `tests/` directory:

- **`test_layout_engine.py`**: Validates canvas scaling, grid composition, margins, Arabic text shaping, and bidi rendering.
- **`test_crop.py`**: Validates OpenCV face-detection coordinates and the 5:7 ratio auto-crop fallback mechanism.
- **`test_white_bg.py`**: Verifies transparency compositing onto white canvas and 300 DPI preservation.
- **`test_color_steal.py`**: Validates red/green/blue 1D LUT extraction, `.npz` caching, and `.cube` file export.
- **`test_api_mocked.py`**: Validates FastAPI endpoints (`/settings`, `/status`, `/frames`, `/upload`, `/process`) using FastAPI's `TestClient`, with `unittest.mock` standing in for ML processing. This avoids GPU/VRAM requirements and large weight downloads.

Run them with (Windows virtualenv path shown):

```bash
venv\Scripts\python.exe -m unittest discover -s tests -p "test_*.py"
```
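
The mocking pattern behind the mocked API tests can be shown with the standard library alone. In this sketch, `Pipeline` is a hypothetical stand-in for the real processing entry point, not a class from this codebase. The expensive step is replaced with `unittest.mock.patch.object`, so the test exercises the surrounding logic without loading any model.

```python
import unittest
from unittest import mock


class Pipeline:
    """Hypothetical stand-in for the real processing entry point."""

    def remove_background(self, image_bytes):
        # The real step would load large model weights; never run in unit tests.
        raise RuntimeError("model weights are not available in unit tests")

    def process(self, image_bytes):
        cutout = self.remove_background(image_bytes)
        return {"status": "ok", "size": len(cutout)}


class ProcessTest(unittest.TestCase):
    def test_process_without_model(self):
        # Swap the ML step for a canned result for the duration of the test.
        with mock.patch.object(
            Pipeline, "remove_background", return_value=b"fake-png"
        ) as fake:
            result = Pipeline().process(b"raw")
        fake.assert_called_once_with(b"raw")
        self.assertEqual(result, {"status": "ok", "size": 8})
```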
### Integration Tests

Located in the root:

- **`test_api.py`**: Validates end-to-end processing with real ML model weights and HTTP requests on a running server.

Run with:

```bash
venv\Scripts\python.exe test_api.py
```

---

_Last Updated: June 2026, EL HELAL Studio Engineering_