ID Maker Studio: Technical Master Documentation
This document serves as the comprehensive technical map for the EL HELAL Studio Photo Pipeline.
High-Level Architecture
The system is a modular Python-based suite designed to automate the conversion of raw student portraits into professional, print-ready ID sheets. It bridges the gap between complex AI models and a production studio environment.
Component Breakdown
- /core (The Brain): Pure logic and AI processing. It is UI-agnostic and handles image math, landmark detection, and layout composition.
- /web (The Primary Interface): A modern FastAPI backend coupled with a localized Arabic (RTL) frontend for batch processing.
- /storage (The Data): Centralized storage for uploads, processed images, and final results.
- /config (The Settings): Stores settings.json for global configuration.
- /tools (The Utilities): Dev scripts, troubleshooting guides, and verification tools.
- /assets (The Identity): Centralized storage for branding assets (logo), typography (Arabic fonts), and color grading LUTs.
- /gui (Legacy): A Tkinter desktop wrapper for offline/workstation usage.
The 5-Step AI Pipeline
Every photo processed by the studio follows a strictly sequenced pipeline:
1. Auto-Crop & Face Detection (crop.py)
- Technology: OpenCV Haar Cascades.
- Logic: Detects the largest face, centers it, and calculates a 5:7 (4x6cm) aspect ratio crop.
- Fallback: Centers the crop if no face is detected to ensure the pipeline never breaks.
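The crop geometry described above can be sketched in pure Python. This is a minimal illustration, not the contents of crop.py: the function name crop_box and the face_fill tuning value are hypothetical, and the face rectangles are assumed to arrive as (x, y, w, h) tuples, the shape returned by OpenCV's detectMultiScale.

```python
def crop_box(image_w, image_h, faces, aspect=(5, 7), face_fill=0.5):
    """Return (left, top, width, height) of a portrait crop.

    faces: iterable of (x, y, w, h) rectangles.
    face_fill: hypothetical tuning value, the fraction of the crop
    width that the face should occupy.
    """
    aw, ah = aspect
    # Widest crop of the requested ratio that still fits in the image.
    max_w = min(image_w, image_h * aw // ah)

    faces = list(faces)
    if faces:
        # Keep the largest detection by area and center on it.
        x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
        cx, cy = x + w / 2, y + h / 2
        crop_w = min(max_w, int(w / face_fill))
    else:
        # Fallback: no face found, so take the largest centered crop.
        cx, cy = image_w / 2, image_h / 2
        crop_w = max_w
    crop_h = crop_w * ah // aw

    # Center on the target point, then clamp to the image bounds.
    left = int(min(max(cx - crop_w / 2, 0), image_w - crop_w))
    top = int(min(max(cy - crop_h / 2, 0), image_h - crop_h))
    return left, top, crop_w, crop_h
```

Because the fallback branch always returns a valid box, a caller can slice the image unconditionally, which matches the "never breaks" behavior described for this step.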
2. AI Background Removal (process_images.py)
- Model: BiRefNet (RMBG-2.0).
- Optimization: Automatically detects and utilizes CUDA/GPU. In CPU environments (like HF Spaces), it uses dynamic quantization for speed.
- Resilience: Includes critical monkeypatches for transformers 4.50+ to handle tied weights and meta-tensor materialization bugs.
3. Color Grading Style Transfer (color_steal.py)
- Mechanism: Analyzes "Before" and "After" pairs to learn R, G, and B curves.
- Smoothing: Uses Savitzky-Golay filters to prevent color banding.
- Application: Applies learned styles via vectorized NumPy operations for near-instant processing.
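One way to picture the three bullets above is a per-channel lookup table (LUT). The sketch below is an assumption about how such curves could be learned, not the code in color_steal.py: it averages the "After" value for each "Before" level, smooths the result with Savitzky-Golay weights derived in NumPy, and applies the LUT by array indexing. All function names, the window length, and the polynomial order are hypothetical.

```python
import numpy as np

def savgol_weights(window, order):
    """Savitzky-Golay smoothing weights for an odd window length."""
    half = window // 2
    x = np.arange(-half, half + 1, dtype=np.float64)
    design = np.vander(x, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the fitted polynomial's value at x = 0.
    return np.linalg.pinv(design)[0]

def learn_curve(before, after, window=31, order=3):
    """Learn a 256-entry lookup table for one 8-bit channel."""
    before = before.ravel()
    after = after.ravel().astype(np.float64)
    counts = np.bincount(before, minlength=256)
    sums = np.bincount(before, weights=after, minlength=256)
    seen = counts > 0
    levels = np.arange(256)
    # Mean output per input level; unseen levels are linearly interpolated.
    curve = np.interp(levels, levels[seen], sums[seen] / counts[seen])
    # Smooth the curve; the ends are padded by repeating the edge value.
    padded = np.pad(curve, window // 2, mode="edge")
    smooth = np.convolve(padded, savgol_weights(window, order), mode="valid")
    return np.clip(np.rint(smooth), 0, 255).astype(np.uint8)

def apply_curves(image, luts):
    """Apply one lookup table per channel using vectorized indexing."""
    channels = [luts[c][image[..., c]] for c in range(image.shape[-1])]
    return np.stack(channels, axis=-1)
```

Applying a LUT is a single indexing operation per channel, which is why the cost of this step is largely independent of how complex the learned style is.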
4. Surgical Retouching (retouch.py)
- Landmarking: Uses MediaPipe Face Mesh (468 points) to generate a precise skin mask, excluding eyes, lips, and hair.
- Frequency Separation: Splits the image into High Frequency (texture/pores) and Low Frequency (tone/color).
- Blemish Removal: Detects anomalies on the High-Freq layer and inpaints them using surrounding texture.
- Result: Pores and skin texture outside the detected defects are left intact; only the defects themselves are inpainted.
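Frequency separation itself is simple arithmetic: the low-frequency layer is a blurred copy and the high-frequency layer is whatever the blur removed. The sketch below shows only the split and merge on a single-channel image with a NumPy Gaussian blur; it is an illustration under assumed names and an assumed sigma, and it omits the landmark mask and the blemish inpainting that retouch.py performs.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur for a 2-D float image (reflect padding)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="reflect")
    blur_1d = lambda v: np.convolve(v, kernel, mode="valid")
    rows = np.apply_along_axis(blur_1d, 1, padded)   # blur along each row
    return np.apply_along_axis(blur_1d, 0, rows)     # then along each column

def split_frequencies(image, sigma=4.0):
    """Return (low, high): tone/color layer and texture layer."""
    image = image.astype(np.float64)
    low = gaussian_blur(image, sigma)
    high = image - low
    return low, high

def merge_frequencies(low, high):
    """Recombine the two layers into an 8-bit image."""
    return np.clip(np.rint(low + high), 0, 255).astype(np.uint8)
```

Merging untouched layers reproduces the input exactly, so any visible change after retouching comes only from the pixels edited on one of the layers.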
5. Layout Composition (layout_engine.py)
- Rendering: Composes a 300 DPI canvas for printing.
- Localization: Uses arabic_reshaper and python-bidi for correct Arabic script rendering.
- Dynamic Assets: Overlays IDs with specific offsets and studio branding (logos).
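For the 300 DPI canvas, physical sizes convert to pixels as cm / 2.54 * 300. The sketch below applies that conversion and places photos on a centered grid. The function names, the 10x15 cm sheet, and the 20 px gap are assumptions for illustration; only the 4x6 cm photo size and the 300 DPI figure come from this document.

```python
def cm_to_px(cm, dpi=300):
    """Convert centimeters to whole pixels at the given print resolution."""
    return round(cm / 2.54 * dpi)

def grid_positions(sheet_cm, photo_cm, gap_px, dpi=300):
    """Top-left pixel position of every photo that fits on the sheet."""
    sheet_w, sheet_h = (cm_to_px(v, dpi) for v in sheet_cm)
    photo_w, photo_h = (cm_to_px(v, dpi) for v in photo_cm)

    # n photos need n * photo + (n - 1) * gap pixels.
    cols = (sheet_w + gap_px) // (photo_w + gap_px)
    rows = (sheet_h + gap_px) // (photo_h + gap_px)

    # Center the grid by splitting the leftover space into equal margins.
    margin_x = (sheet_w - (cols * photo_w + (cols - 1) * gap_px)) // 2
    margin_y = (sheet_h - (rows * photo_h + (rows - 1) * gap_px)) // 2

    return [
        (margin_x + c * (photo_w + gap_px), margin_y + r * (photo_h + gap_px))
        for r in range(rows)
        for c in range(cols)
    ]

# Hypothetical sheet: 10x15 cm holding 4x6 cm photos with a 20 px gap.
positions = grid_positions((10, 15), (4, 6), gap_px=20)
```

With these assumed values a 4x6 cm photo is 472x709 px and the sheet holds a 2x2 grid.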
Configuration & Real-Time Tuning
The system is controlled by core/settings.json.
- Hot Reloading: The layout engine reloads this file on every request. You can adjust id_font_size, grid_gap, or retouch_sensitivity and see the changes in the next processed photo without restarting the server.
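Reloading on every request can be as simple as reading the JSON file inside the request handler instead of at import time. The sketch below is a guess at the pattern, not the layout engine's code: load_settings and the default values are hypothetical, while the three key names are the ones listed above.

```python
import json
from pathlib import Path

# Hypothetical defaults, used when the file is missing or malformed.
DEFAULTS = {"id_font_size": 48, "grid_gap": 20, "retouch_sensitivity": 0.5}

def load_settings(path):
    """Read settings from disk on every call so edits apply immediately."""
    settings = dict(DEFAULTS)
    try:
        text = Path(path).read_text(encoding="utf-8")
        settings.update(json.loads(text))
    except (FileNotFoundError, json.JSONDecodeError):
        # Keep serving with defaults rather than failing the request.
        pass
    return settings
```

Calling load_settings at the start of each layout request means a saved edit is picked up by the next photo; the trade-off is one small file read per request.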
Deployment & Cloud Readiness
The project is optimized for high-availability environments.
Docker Environment
- Base: python:3.10-slim.
- System Deps: Requires libgl1 (OpenCV), libraqm0 (Font rendering), and libharfbuzz0b (Arabic shaping).
Hugging Face Spaces
- Transformers Fix: Patches PretrainedConfig to allow custom model loading without attribute errors.
- LFS Support: Binary files (.ttf, .cube, .png) are managed via Git LFS to ensure integrity.
Troubleshooting (Common Pitfalls)
| Issue | Root Cause | Solution |
|---|---|---|
| "Tofu" Boxes in Text | Missing or corrupted fonts. | Ensure assets/arialbd.ttf is the real font file and not a Git LFS pointer (the real file is larger than 300 KB). |
| Meta-Tensor Error | Transformers 4.50+ CPU bug. | Handled by torch.linspace monkeypatch in process_images.py. |
| Slow Processing | CPU bottleneck. | Ensure torch is using multiple threads or enable CUDA. |
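The font check from the first row can be automated. A Git LFS pointer is a small text file that begins with the line "version https://git-lfs.github.com/spec/v1", so a sketch like the following can tell a pointer from a real binary. The function names are hypothetical, and the 300 KB threshold is taken from the table above.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file.
LFS_SIGNATURE = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path):
    """True if the file starts with the Git LFS pointer signature."""
    with open(path, "rb") as f:
        return f.read(len(LFS_SIGNATURE)) == LFS_SIGNATURE

def font_looks_valid(path, min_bytes=300 * 1024):
    """True if the file exists, exceeds min_bytes, and is not an LFS pointer."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > min_bytes and not is_lfs_pointer(p)
```

Running font_looks_valid("assets/arialbd.ttf") at startup would surface a missing font before any text is rendered as empty boxes.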
Last Updated: February 2026 - EL HELAL Studio Engineering