# EL HELAL Studio — Technical Documentation

A headless FastAPI backend for AI-powered ID card photo processing.

---

## Architecture

- **`main.py`**: FastAPI application entry point. All routes are REST-only (JSON responses).
- **`core/`**: Pure, UI-agnostic Python image processing logic.
- **`newcolor/`**: AI color correction model (`ColorUNet`) and in-memory inference code.
- **`config/`**: Global settings (`settings.json`) for retouching and layout defaults.
- **`assets/`**: Branding assets (logo) and frame overlays.

**No GUI, no desktop wrapper.** Clients interact exclusively via the REST API; the interactive API documentation is served at `/docs`.

---

## The 5-Step AI Pipeline

Every photo processed by the studio follows a strictly sequenced pipeline:

### 1. Auto-Crop & Face Detection (`core/crop.py`)

- **Technology:** OpenCV Haar Cascades.
- **Logic:** Detects the largest face, centers it, and calculates a 5:7 (4x6cm) aspect ratio crop.
- **Fallback:** If no face is detected, the crop is centered on the image so the pipeline never fails.

### 2. AI Background Removal (`core/process_images.py`)

- **Model:** **BiRefNet (RMBG-2.0)** via the `transformers` library.
- **Optimization:** Automatically detects and uses CUDA/GPU when available. Falls back to CPU with dynamic quantization.

### 3. AI Color Correction (`newcolor/inference.py`)

- **Model:** **ColorUNet**, a custom PyTorch model with its own weights.
- **Mechanism:** Predicts corrected colors at model resolution (1024x1024), fits a quadratic polynomial color transform (10 parameters) on subject pixels using the alpha mask, and applies it to the full-resolution image.
- **Optimization:** Dynamic device-aware PyTorch execution (reuses the RMBG execution device, e.g., CUDA or optimized CPU).

### 4. Surgical Retouching (`core/retouch.py`)

- **Landmarking:** Uses **MediaPipe Face Mesh** (468 points) to generate a precise skin mask, excluding eyes, lips, and hair.
- **Frequency Separation:** Splits the image into **High Frequency** (texture/pores) and **Low Frequency** (tone/color).
- **Blemish Removal:** Detects anomalies on the High-Freq layer and inpaints them using surrounding texture.

### 5. Layout Composition (`core/layout_engine.py`)

- **Rendering:** Composes a 300 DPI canvas for printing.
- **Localization:** Uses `arabic_reshaper` and `python-bidi` for correct Arabic script rendering.
- **Dynamic Assets:** Overlays IDs with specific offsets and studio branding (logos).

---

## Configuration

The system is controlled by `config/settings.json`. The layout engine hot-reloads this file on every request, so you can adjust `id_font_size`, `grid_gap`, or `retouch_sensitivity` and see the change in the next processed photo without restarting the server.

---

## Known Dependency Conflicts

- **TensorFlow vs. Transformers:** Standard `tensorflow` (especially nightly) conflicts with `transformers` and `numpy >= 2.0`.
  - **Resolution:** Uninstall TensorFlow. The pipeline is 100% PyTorch-based.
- **Pinned Versions:**
  - `numpy < 2.0.0`: Compatibility with `basicsr` and older `torchvision`.
  - `protobuf <= 3.20.3`: Prevents "Double Registration" errors in multi-model environments.

---

## Environment Setup

```bash
conda create -n idmaker python=3.10
conda activate idmaker
pip install -r requirements.txt
pip uninstall tensorflow tb-nightly tensorboard  # Remove conflicts if present
```

---

## Docker

```bash
docker-compose up --build
```

The API will be available at `http://localhost:8000` (or the port defined by `$PORT`).

---

## 🛠 Troubleshooting (Common Pitfalls)

| Issue | Root Cause | Solution |
| ----------------------------- | ------------------------------------------------------- | ---------------------------------------------------------------------------- |
| **"Tofu" Boxes in Text** | Missing or corrupted fonts. | Ensure `assets/arialbd.ttf` is the real font file and not a Git LFS pointer (its size should exceed 300 KB). |
| **NumPy AttributeError** | Conflict between NumPy 2.x and TensorFlow/Transformers. | Uninstall `tensorflow` and ensure `numpy < 2.0.0` is installed. |
| **[Errno 10048] Socket Bind** | Port 7860 is already in use by another server process. | Close the previous server instance or set a new `PORT` environment variable. |
| **Meta-Tensor Error** | Transformers 4.50+ CPU bug. | Handled by the `torch.linspace` monkeypatch in `process_images.py`. |
| **Slow Processing** | CPU bottleneck. | Ensure `torch` is using multiple threads or enable CUDA. |

---

## Testing Framework

The codebase includes a testing framework divided into lightweight, mock-based unit tests and full integration tests.

### Unit & Mocked API Tests

Located in the `tests/` directory:

- **`test_layout_engine.py`**: Validates canvas scaling, grid composition, margins, Arabic text shaping, and bidi rendering.
- **`test_crop.py`**: Validates OpenCV face-detection coordinates and the 5:7 ratio auto-crop fallback mechanism.
- **`test_white_bg.py`**: Verifies transparency compositing onto a white canvas and 300 DPI preservation.
- **`test_color_steal.py`**: Validates red/green/blue 1D LUT extraction, `.npz` caching, and `.cube` file export.
- **`test_api_mocked.py`**: Validates FastAPI endpoints (`/settings`, `/status`, `/frames`, `/upload`, `/process`) using FastAPI `TestClient` and `unittest.mock` to mock ML processing. This avoids the need for a GPU/VRAM or large weight downloads.

Run them with:

```bash
venv\Scripts\python.exe -m unittest discover -s tests -p "test_*.py"
```

### Integration Tests

Located in the root:

- **`test_api.py`**: Validates end-to-end processing with real ML model weights and HTTP requests against a running server.

Run with:

```bash
venv\Scripts\python.exe test_api.py
```

---

_Last Updated: June 2026 — EL HELAL Studio Engineering_
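
The color-correction mechanism described in step 3 (fit a quadratic polynomial transform on subject pixels, then apply it at full resolution) can be illustrated with a small NumPy sketch. This is not the code in `newcolor/inference.py`; the function names, the `[0, 1]` float image convention, the alpha threshold, and the reading of "10 parameters" as 10 coefficients per output channel are all assumptions made for illustration.

```python
import numpy as np


def quadratic_features(rgb: np.ndarray) -> np.ndarray:
    """Expand (N, 3) RGB values into 10 quadratic terms per pixel.

    Terms: 1, r, g, b, r^2, g^2, b^2, rg, rb, gb (hypothetical ordering).
    """
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.stack(
        [np.ones_like(r), r, g, b, r * r, g * g, b * b, r * g, r * b, g * b],
        axis=1,
    )


def fit_color_transform(src, target, alpha, threshold=0.5):
    """Least-squares fit of a quadratic map src -> target on subject pixels.

    src, target: (H, W, 3) float arrays in [0, 1] at model resolution.
    alpha: (H, W) mask; pixels above `threshold` count as subject.
    Returns a (10, 3) coefficient matrix, one column per output channel.
    """
    mask = alpha.reshape(-1) > threshold
    X = quadratic_features(src.reshape(-1, 3)[mask])
    Y = target.reshape(-1, 3)[mask]
    coeffs, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coeffs


def apply_color_transform(image, coeffs):
    """Apply the fitted transform to an (H, W, 3) image of any resolution."""
    h, w, _ = image.shape
    out = quadratic_features(image.reshape(-1, 3)) @ coeffs
    return np.clip(out, 0.0, 1.0).reshape(h, w, 3)
```

Because the fitted transform is a per-pixel function of color only, it does not depend on image size. That is why a fit computed at model resolution can be applied directly to the full-resolution photo without upscaling the model output.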