| # ID Maker Studio: Technical Master Documentation | |
| This document serves as the comprehensive technical map for the **EL HELAL Studio Photo Pipeline**. | |
| --- | |
| ## High-Level Architecture | |
| The system is a modular Python-based suite designed to automate the conversion of raw student portraits into professional, print-ready ID sheets. It bridges the gap between complex AI models and a production studio environment. | |
| ### Component Breakdown | |
| - **`/core` (The Brain):** Pure logic and AI processing. It is UI-agnostic and handles image math, landmark detection, and layout composition. | |
| - **`/web` (The Primary Interface):** A modern FastAPI backend coupled with a localized Arabic (RTL) frontend for batch processing. | |
| - **`/storage` (The Data):** Centralized storage for uploads, processed images, and final results. | |
| - **`/config` (The Settings):** Stores `settings.json` for global configuration. | |
| - **`/tools` (The Utilities):** Dev scripts, troubleshooting guides, and verification tools. | |
| - **`/assets` (The Identity):** Centralized storage for branding assets (logo), typography (Arabic fonts), and color grading LUTs. | |
| - **`/gui` (Legacy):** A Tkinter desktop wrapper for offline/workstation usage. | |
| --- | |
| ## The 5-Step AI Pipeline | |
| Every photo processed by the studio follows a strictly sequenced pipeline: | |
| ### 1. Auto-Crop & Face Detection (`crop.py`) | |
| - **Technology:** OpenCV Haar Cascades. | |
| - **Logic:** Detects the largest face, centers it, and calculates a 5:7 (4x6cm) aspect ratio crop. | |
| - **Fallback:** Centers the crop if no face is detected to ensure the pipeline never breaks. | |
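The crop geometry described above can be sketched in plain Python. This is a minimal illustration, not the code in `crop.py`: the function name `compute_crop`, the `face_scale` factor, and the clamping strategy are assumptions. Face boxes use the `(x, y, w, h)` layout that OpenCV's `CascadeClassifier.detectMultiScale` returns.

```python
def compute_crop(img_w, img_h, faces=(), aspect=(5, 7), face_scale=2.2):
    """Return an (x, y, w, h) crop box with the requested width:height aspect.

    faces      -- sequence of (x, y, w, h) face boxes, possibly empty
    face_scale -- hypothetical factor: crop width relative to face width
    """
    aw, ah = aspect
    # Widest crop of the target aspect that still fits inside the image.
    max_w = min(img_w, img_h * aw // ah)

    if len(faces) > 0:
        # Keep only the largest detected face (by area) and center on it.
        fx, fy, fw, fh = max(faces, key=lambda f: f[2] * f[3])
        cx, cy = fx + fw / 2, fy + fh / 2
        crop_w = min(max_w, int(fw * face_scale))
    else:
        # Fallback: no face found, so use a centered crop of maximum size.
        cx, cy = img_w / 2, img_h / 2
        crop_w = max_w

    crop_h = crop_w * ah // aw

    # Clamp the box so it never leaves the image.
    x = int(min(max(cx - crop_w / 2, 0), img_w - crop_w))
    y = int(min(max(cy - crop_h / 2, 0), img_h - crop_h))
    return x, y, crop_w, crop_h
```

Integer arithmetic is used for the aspect ratio so that the returned box keeps an exact 5:7 proportion whenever the width is divisible by 5.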
| ### 2. AI Background Removal (`process_images.py`) | |
| - **Model:** **RMBG-2.0**, which is built on the **BiRefNet** architecture. | |
| - **Optimization:** Automatically detects and utilizes CUDA/GPU. In CPU environments (like HF Spaces), it uses dynamic quantization for speed. | |
| - **Resilience:** Includes critical monkeypatches for `transformers 4.50+` to handle tied weights and meta-tensor materialization bugs. | |
| ### 3. Color Grading Style Transfer (`color_steal.py`) | |
| - **Mechanism:** Analyzes "Before" and "After" pairs to learn R, G, and B curves. | |
| - **Smoothing:** Uses **Savitzky-Golay filters** to prevent color banding. | |
| - **Application:** Applies learned styles via vectorized NumPy operations for near-instant processing. | |
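One way to picture the learn-then-apply flow is with per-channel lookup tables. The sketch below is an assumption about how such curves could be built, not the contents of `color_steal.py`: the function names are hypothetical, images are assumed to be 8-bit RGB NumPy arrays, and the smoothing step is left out (a curve could be smoothed with `scipy.signal.savgol_filter` before it is rounded).

```python
import numpy as np


def learn_curve(before, after):
    """Estimate a 256-entry lookup table for one channel from a before/after pair."""
    before = before.ravel()
    after = after.ravel().astype(np.float64)

    # Mean "after" value for every "before" level that occurs in the pair.
    sums = np.bincount(before, weights=after, minlength=256)
    counts = np.bincount(before, minlength=256)
    seen = counts > 0
    levels = np.arange(256)
    means = sums[seen] / counts[seen]

    # Levels that never occur are filled in by linear interpolation.
    curve = np.interp(levels, levels[seen], means)
    return np.clip(np.rint(curve), 0, 255).astype(np.uint8)


def learn_curves(before_rgb, after_rgb):
    """Learn one curve per channel (R, G, B)."""
    return [learn_curve(before_rgb[..., c], after_rgb[..., c]) for c in range(3)]


def apply_curves(image_rgb, curves):
    """Apply the learned curves with vectorized table lookups."""
    out = np.empty_like(image_rgb)
    for c, lut in enumerate(curves):
        out[..., c] = lut[image_rgb[..., c]]  # one indexing operation per channel
    return out
```

Because each channel is a single array-indexing operation, the cost of applying a style grows only with the pixel count, with no Python-level loop over pixels.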
| ### 4. Surgical Retouching (`retouch.py`) | |
| - **Landmarking:** Uses **MediaPipe Face Mesh** (468 points) to generate a precise skin mask, excluding eyes, lips, and hair. | |
| - **Frequency Separation:** Splits the image into **High Frequency** (texture/pores) and **Low Frequency** (tone/color). | |
| - **Blemish Removal:** Detects anomalies on the High-Freq layer and inpaints them using surrounding texture. | |
| - **Result:** Pores and skin texture are preserved because only the detected defects are modified. | |
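The frequency-separation idea can be shown on a one-dimensional row of pixel values. This is a toy sketch under stated assumptions: a box blur stands in for whatever low-pass filter `retouch.py` uses, and a simple threshold plus neighbor averaging stands in for the real anomaly detection and inpainting. All function names and the `threshold` value are hypothetical.

```python
def box_blur(signal, radius=2):
    """Low-pass filter: mean over a sliding window, truncated at the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def split_frequencies(signal, radius=2):
    """Split into low frequency (tone) and high frequency (texture) layers."""
    low = box_blur(signal, radius)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


def remove_blemishes(signal, radius=2, threshold=30.0):
    """Replace outliers in the high-frequency layer, then recombine the layers."""
    low, high = split_frequencies(signal, radius)
    cleaned = list(high)
    for i, value in enumerate(high):
        if abs(value) > threshold:
            # Fill the defect from nearby texture values that are not outliers.
            lo, hi = max(0, i - radius), min(len(high), i + radius + 1)
            neighbours = [high[j] for j in range(lo, hi)
                          if j != i and abs(high[j]) <= threshold]
            cleaned[i] = sum(neighbours) / len(neighbours) if neighbours else 0.0
    return [l + h for l, h in zip(low, cleaned)]
```

Adding the two layers back together without any change reproduces the input, which is why edits confined to the high-frequency layer leave the overall tone untouched.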
| ### 5. Layout Composition (`layout_engine.py`) | |
| - **Rendering:** Composes a 300 DPI canvas for printing. | |
| - **Localization:** Uses `arabic_reshaper` and `python-bidi` for correct Arabic script rendering. | |
| - **Dynamic Assets:** Overlays IDs with specific offsets and studio branding (logos). | |
| - **Customization:** Supports dynamic frame color selection (passed via API) for the large side panel. | |
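The pixel dimensions of a 300 DPI canvas follow directly from the physical size. The sketch below works through that arithmetic; the sheet size, the pixel-based gap, and the function names are illustrative assumptions, not values taken from `layout_engine.py`.

```python
CM_PER_INCH = 2.54


def cm_to_px(cm, dpi=300):
    """Convert a physical length in centimeters to pixels at the given DPI."""
    return round(cm / CM_PER_INCH * dpi)


def grid_capacity(sheet_cm, photo_cm, gap_px=0, dpi=300):
    """Return (columns, rows) of photos that fit on a sheet with a gap between them."""
    sheet_w, sheet_h = (cm_to_px(v, dpi) for v in sheet_cm)
    photo_w, photo_h = (cm_to_px(v, dpi) for v in photo_cm)
    # n photos need n * photo + (n - 1) * gap pixels, hence the "+ gap" on both sides.
    cols = (sheet_w + gap_px) // (photo_w + gap_px)
    rows = (sheet_h + gap_px) // (photo_h + gap_px)
    return cols, rows


# A 4x6 cm photo at 300 DPI:
print(cm_to_px(4), cm_to_px(6))  # 472 709
```

For a hypothetical 10x15 cm sheet with a 20 px gap, `grid_capacity((10, 15), (4, 6), gap_px=20)` gives a 2 by 2 grid.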
| --- | |
| ## Configuration & Real-Time Tuning | |
| The system is controlled by `config/settings.json`. | |
| - **Hot Reloading:** The layout engine reloads this file on **every request**. You can adjust `id_font_size`, `grid_gap`, or `retouch_sensitivity` and see the changes in the next processed photo without restarting the server. | |
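Hot reloading of this kind usually amounts to reading the file inside the request path instead of caching it at startup. The following is a minimal sketch of that pattern, assuming hypothetical default values and a hypothetical `load_settings` helper; the real engine may validate or merge settings differently.

```python
import json
from pathlib import Path

# Hypothetical defaults, used when a key (or the whole file) is missing.
DEFAULTS = {"id_font_size": 48, "grid_gap": 20, "retouch_sensitivity": 0.5}


def load_settings(path):
    """Read the settings file from disk on every call.

    Because nothing is cached, an edit to the file is picked up by the
    next request without restarting the server.
    """
    settings = dict(DEFAULTS)
    try:
        settings.update(json.loads(Path(path).read_text(encoding="utf-8")))
    except (FileNotFoundError, json.JSONDecodeError):
        pass  # keep the defaults if the file is absent or mid-edit
    return settings
```

Falling back to defaults on a parse error means a half-saved file cannot crash a request, at the cost of silently ignoring a broken edit.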
| ### Backup & Restoration | |
| The system supports full state backup via the web interface. | |
| - **Export:** Creates a ZIP file containing: | |
|   - Global `settings.json`. | |
|   - All custom assets (frames, logos) in `assets/`. | |
|   - Client-side preferences (theme, saved colors). | |
| - **Import:** Restores the configuration and assets from a ZIP file and refreshes the client state. | |
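An export along these lines can be assembled with the standard `zipfile` module. The sketch below is an assumption about one possible archive layout: the `export_backup` name, the `client_prefs.json` entry, and the idea that the browser sends its preferences as a dictionary are all hypothetical.

```python
import json
import zipfile
from pathlib import Path


def export_backup(settings_path, assets_dir, client_prefs, out_zip):
    """Bundle settings, assets, and client preferences into one ZIP file."""
    settings_path, assets_dir = Path(settings_path), Path(assets_dir)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        # Global configuration, stored at the archive root.
        zf.write(settings_path, "settings.json")
        # Every file under the assets directory, keeping relative paths.
        for f in sorted(assets_dir.rglob("*")):
            if f.is_file():
                arcname = (Path("assets") / f.relative_to(assets_dir)).as_posix()
                zf.write(f, arcname)
        # Client-side preferences (theme, saved colors) as a JSON document.
        zf.writestr("client_prefs.json",
                    json.dumps(client_prefs, ensure_ascii=False, indent=2))
    return out_zip
```

An import routine would reverse these steps: extract `settings.json` and `assets/` to their locations, then return the preferences to the client.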
| --- | |
| ## Environment & Dependency Management | |
| The project requires a carefully managed Python environment to avoid common AI library conflicts. | |
| ### Known Conflicts & Fixes | |
| - **TensorFlow vs. Transformers:** Standard installations of `tensorflow` (especially nightly versions) conflict with `transformers` and `numpy 2.x`, causing `AttributeError: module 'numpy' has no attribute 'dtypes'` and Protobuf descriptor errors. | |
| - **Resolution:** **Uninstall TensorFlow.** The pipeline is 100% PyTorch-based. Removing TensorFlow resolves these import crashes immediately. | |
| - **Pinned Versions:** | |
|   - `numpy < 2.0.0`: Required for compatibility with `basicsr` and older `torchvision` utilities. | |
|   - `protobuf <= 3.20.3`: Prevents "Double Registration" errors in multi-model environments. | |
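A quick startup check can catch these conflicts before the models load. The script below is a sketch, not part of the project: `check_environment` and its helpers are hypothetical, and the version parsing is deliberately simple (it reads only the leading numeric components).

```python
import re
from importlib import metadata

# Pins and unwanted packages taken from the list above.
PINS = {"numpy": ("<", (2, 0, 0)), "protobuf": ("<=", (3, 20, 3))}
BANNED = ("tensorflow",)


def parse_version(text):
    """Leading numeric components of a version string, padded to three parts."""
    matches = (re.match(r"\d+", piece) for piece in text.split(".")[:3])
    parts = [int(m.group()) for m in matches if m]
    return tuple(parts + [0] * (3 - len(parts)))


def installed_version(name):
    """Installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(get_version=installed_version):
    """Return a list of human-readable problems; empty means the pins hold."""
    problems = []
    for name, (op, bound) in PINS.items():
        found = get_version(name)
        if found is None:
            continue  # not installed, so nothing to compare
        version = parse_version(found)
        ok = version < bound if op == "<" else version <= bound
        if not ok:
            pin = ".".join(map(str, bound))
            problems.append(f"{name} {found} violates pin {op} {pin}")
    for name in BANNED:
        if get_version(name) is not None:
            problems.append(f"{name} is installed; the pipeline does not need it")
    return problems
```

Calling `check_environment()` with no arguments inspects the active environment; passing a custom `get_version` callable makes the logic easy to exercise in isolation.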
| ### Environment Setup (Conda) | |
| ```bash | |
| conda create -n idmaker python=3.10 | |
| conda activate idmaker | |
| pip install -r requirements.txt | |
| # Ensure no conflicting packages remain | |
| pip uninstall -y tensorflow tb-nightly tensorboard | |
| ``` | |
| --- | |
| ## CodeFormer Restoration API | |
| The `id-maker` system integrates with an external **CodeFormer** service for high-fidelity face restoration. This is handled via a dedicated REST API. | |
| ### Endpoint: `/api/restore` (POST) | |
| The API accepts an image and returns a JSON response containing a URL to the restored result. | |
| **Request Parameters (`multipart/form-data`):** | |
| - `image`: The source image file (JPG/PNG). | |
| - `fidelity`: (Float, 0.0 - 1.0) Controls the balance between restoration quality and fidelity to the original. In CodeFormer, lower values favor restoration quality and higher values favor fidelity to the input face. | |
| - `upscale`: (Integer, 1-4) Final output magnification. | |
| - `background_enhance`: (Boolean string, "true"/"false") Whether to enhance the non-face areas using Real-ESRGAN. | |
| - `face_upsample`: (Boolean string, "true"/"false") Whether to apply dedicated face upsampling. | |
| **Success Response (JSON):** | |
| ```json | |
| { | |
| "status": "success", | |
| "results": [ | |
| { "image_url": "https://service-url/static/results/result_uuid.png" } | |
| ], | |
| "message": "Restoration complete" | |
| } | |
| ``` | |
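A client for this endpoint mainly needs to validate the parameters, encode booleans as strings, and read the result URLs from the response. The helpers below sketch that, with several assumptions: the function names and default values are hypothetical, and treating any `status` other than `"success"` as a failure is a guess, since the error response format is not documented here.

```python
def build_restore_fields(fidelity=0.5, upscale=2,
                         background_enhance=False, face_upsample=False):
    """Validate parameters and return them as multipart form fields (strings)."""
    if not 0.0 <= fidelity <= 1.0:
        raise ValueError("fidelity must be between 0.0 and 1.0")
    if not isinstance(upscale, int) or not 1 <= upscale <= 4:
        raise ValueError("upscale must be an integer from 1 to 4")
    return {
        "fidelity": str(fidelity),
        "upscale": str(upscale),
        # The API expects boolean strings rather than JSON booleans.
        "background_enhance": "true" if background_enhance else "false",
        "face_upsample": "true" if face_upsample else "false",
    }


def extract_image_urls(payload):
    """Return the result URLs from a decoded JSON response."""
    if payload.get("status") != "success":
        # Assumption: a non-success status carries its reason in "message".
        raise RuntimeError(payload.get("message", "restoration failed"))
    return [item["image_url"] for item in payload.get("results", [])]
```

With the third-party `requests` library, the fields could be sent as `requests.post(url, files={"image": fh}, data=build_restore_fields())`, where `fh` is the image opened in binary mode, and the decoded `response.json()` passed to `extract_image_urls`.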
| ### Configuration | |
| The target API URL is controlled in `id-maker/config/settings.json` under `api.codeformer_url` or via the `CODEFORMER_API_URL` environment variable. | |
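Resolving the URL from these two sources might look like the sketch below. The precedence is an assumption: the text does not say which source wins, and this example lets the environment variable override the settings file. The `resolve_codeformer_url` name is hypothetical.

```python
import json
import os
from pathlib import Path


def resolve_codeformer_url(settings_path, env=None):
    """Return the CodeFormer API URL, or None if it is not configured."""
    env = os.environ if env is None else env

    # Assumed precedence: the environment variable overrides the file.
    url = env.get("CODEFORMER_API_URL")
    if url:
        return url

    try:
        settings = json.loads(Path(settings_path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    # Corresponds to the `api.codeformer_url` key in the settings file.
    return settings.get("api", {}).get("codeformer_url")
```

Accepting `env` as a parameter keeps the function independent of the process environment, which is convenient when the same code runs locally and in a hosted container.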
| --- | |
| ## Deployment & Cloud Readiness | |
| The project is set up to run in Docker containers and on Hugging Face Spaces. | |
| ### Docker Environment | |
| - **Base:** `python:3.10-slim`. | |
| - **System Deps:** Requires `libgl1` (OpenCV), `libraqm0` (Font rendering), and `libharfbuzz0b` (Arabic shaping). | |
| ### Hugging Face Spaces | |
| - **Transformers Fix:** Patches `PretrainedConfig` to allow custom model loading without attribute errors. | |
| - **LFS Support:** Binary files (`.ttf`, `.cube`, `.png`) are managed via Git LFS to ensure integrity. | |
| --- | |
| ## Troubleshooting (Common Pitfalls) | |
| | Issue | Root Cause | Solution | | |
| |-------|------------|----------| | |
| | **"Tofu" Boxes in Text** | Missing or corrupted fonts. | Confirm that `assets/arialbd.ttf` is the real font file (larger than 300KB) rather than a Git LFS pointer. | | |
| | **NumPy AttributeError** | Conflict between NumPy 2.x and TensorFlow/Transformers. | Uninstall `tensorflow` and ensure `numpy < 2.0.0` is installed. | | |
| | **[Errno 10048] Socket Bind** | Port 7860 is already in use by another server process. | Close the previous server instance or set a new `PORT` environment variable. | | |
| | **Meta-Tensor Error** | Transformers 4.50+ CPU bug. | Handled by `torch.linspace` monkeypatch in `process_images.py`. | | |
| | **Slow Processing** | CPU bottleneck. | Let `torch` use more CPU threads (for example via `torch.set_num_threads`) or run on a CUDA GPU. | | |
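The font problem in the first row can be checked programmatically, because a Git LFS pointer is a small text file that begins with a fixed version line. The helper below is a sketch with a hypothetical name; it reads only the first bytes of the file.

```python
# Every Git LFS pointer file starts with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is a Git LFS pointer instead of real content."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_HEADER)) == LFS_HEADER
```

If `is_lfs_pointer("assets/arialbd.ttf")` returns `True`, the binary was never downloaded; running `git lfs pull` in the repository normally fetches the real file.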
| --- | |
| *Last Updated: February 2026, EL HELAL Studio Engineering* | |