EL HELAL Studio - Technical Documentation
A headless FastAPI backend for AI-powered ID card photo processing.
Architecture
- main.py: FastAPI application entry point. All routes are REST-only (JSON responses).
- core/: Pure Python image processing logic, UI-agnostic.
- newcolor/: AI color correction model (ColorUNet) and in-memory inference code.
- config/: Global settings (settings.json) for retouch and layout defaults.
- assets/: Branding assets (logo) and frame overlays.
No GUI, no desktop wrapper. Clients interact exclusively via the REST API; the interactive API documentation is served at /docs.
The 5-Step AI Pipeline
Every photo processed by the studio follows a strictly sequenced pipeline:
1. Auto-Crop & Face Detection (core/crop.py)
- Technology: OpenCV Haar Cascades.
- Logic: Detects the largest face, centers it, and calculates a 5:7 (4x6cm) aspect ratio crop.
- Fallback: Centers the crop if no face is detected to ensure the pipeline never breaks.
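The crop geometry described above can be sketched in plain Python. This is only an illustration, not the code in core/crop.py: the function names and the face_fraction parameter (how much of the crop width the face should fill) are assumptions. Face boxes are taken as (x, y, w, h) tuples, the shape OpenCV's detectMultiScale returns.

```python
def largest_face(faces):
    """Pick the biggest (x, y, w, h) box by area, or None if there are no faces."""
    return max(faces, key=lambda f: f[2] * f[3]) if len(faces) else None


def compute_crop_box(img_w, img_h, face=None, ratio=(5, 7), face_fraction=0.5):
    """Return (left, top, right, bottom) of a crop with the given width:height ratio.

    face_fraction is a hypothetical tuning knob: the share of the crop width
    that the detected face should occupy.
    """
    rw, rh = ratio
    # Widest crop of the target ratio that still fits inside the image.
    max_w = min(img_w, img_h * rw / rh)
    if face is None:
        # Fallback: no face detected, so use the largest centered crop.
        crop_w = max_w
        cx, cy = img_w / 2, img_h / 2
    else:
        x, y, w, h = face
        crop_w = min(max_w, w / face_fraction)
        cx, cy = x + w / 2, y + h / 2
    crop_h = crop_w * rh / rw
    # Clamp the box so it never leaves the image.
    left = min(max(cx - crop_w / 2, 0), img_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), img_h - crop_h)
    left, top = int(round(left)), int(round(top))
    return left, top, left + int(round(crop_w)), top + int(round(crop_h))
```

Because the fallback branch always returns a valid box, later pipeline steps receive a 5:7 image whether or not detection succeeded.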
2. AI Background Removal (core/process_images.py)
- Model: BiRefNet (RMBG-2.0) via the transformers library.
- Optimization: Automatically detects and uses CUDA/GPU. Falls back to CPU with dynamic quantization.
3. AI Color Correction (newcolor/inference.py)
- Model: ColorUNet via custom PyTorch model and weights.
- Mechanism: Predicts corrected colors at model resolution (1024x1024), fits a quadratic polynomial color transform (10 parameters) on subject pixels using the alpha mask, and applies it to the full-resolution image.
- Optimization: Dynamic device-aware PyTorch execution (reuses the RMBG execution device, e.g., CUDA or optimized CPU).
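To make the color transform step more concrete, here is a minimal NumPy sketch. It assumes that the "10 parameters" are the ten quadratic monomials of (r, g, b) per output channel (1, r, g, b, r², g², b², rg, rb, gb); the actual parameterization in newcolor/inference.py may differ. All function names and the alpha threshold are hypothetical.

```python
import numpy as np


def quad_features(rgb):
    """Expand (N, 3) RGB values into (N, 10) quadratic polynomial features."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.stack(
        [np.ones_like(r), r, g, b, r * r, g * g, b * b, r * g, r * b, g * b],
        axis=1,
    )


def fit_color_transform(src, target, alpha, threshold=0.5):
    """Least-squares fit of the transform on subject pixels only.

    src, target: (H, W, 3) float arrays in [0, 1] at model resolution.
    alpha: (H, W) mask; pixels above `threshold` count as subject (assumed rule).
    Returns a (10, 3) coefficient matrix, one column per output channel.
    """
    mask = alpha > threshold
    features = quad_features(src[mask])
    coeffs, *_ = np.linalg.lstsq(features, target[mask], rcond=None)
    return coeffs


def apply_color_transform(image, coeffs):
    """Apply fitted coefficients to an (H, W, 3) image of any resolution."""
    h, w, _ = image.shape
    out = quad_features(image.reshape(-1, 3)) @ coeffs
    return np.clip(out, 0.0, 1.0).reshape(h, w, 3)
```

Fitting on the low-resolution prediction and applying the coefficients to the original image keeps full-resolution detail intact, since the network output itself is never upscaled.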
4. Surgical Retouching (core/retouch.py)
- Landmarking: Uses MediaPipe Face Mesh (468 points) to generate a precise skin mask, excluding eyes, lips, and hair.
- Frequency Separation: Splits the image into High Frequency (texture/pores) and Low Frequency (tone/color).
- Blemish Removal: Detects anomalies on the High-Freq layer and inpaints them using surrounding texture.
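Frequency separation and anomaly detection can be illustrated with a small NumPy sketch. Two simplifications are assumed here: a box blur stands in for whatever low-pass filter core/retouch.py actually uses, and a blemish is defined as a skin pixel whose high-frequency value lies more than k standard deviations from the skin mean. Inpainting is left out, and every name is hypothetical.

```python
import numpy as np


def box_blur(img, radius):
    """Mean-filter a 2-D float array with a (2*radius+1)^2 window, edge-padded."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Sum every shifted copy of the image that falls inside the window.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def split_frequencies(img, radius=4):
    """Return (low, high) such that low + high reconstructs the input."""
    low = box_blur(img, radius)   # tone and color
    high = img - low              # texture and pores
    return low, high


def find_blemishes(high, skin_mask, k=3.0):
    """Flag skin pixels whose high-frequency value is a statistical outlier."""
    values = high[skin_mask]
    mu, sigma = values.mean(), values.std()
    return skin_mask & (np.abs(high - mu) > k * sigma)
```

Working on the high-frequency layer means a correction touches only texture, so the underlying skin tone in the low-frequency layer is left unchanged.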
5. Layout Composition (core/layout_engine.py)
- Rendering: Composes a 300 DPI canvas for printing.
- Localization: Uses arabic_reshaper and python-bidi for correct Arabic script rendering.
- Dynamic Assets: Overlays IDs with specific offsets and studio branding (logos).
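The arithmetic behind a 300 DPI print canvas is simple enough to sketch. The helper names, the 10x15 cm sheet, and the margin and gap values below are illustrative assumptions, not values taken from core/layout_engine.py.

```python
def cm_to_px(cm, dpi=300):
    """Convert a physical length in centimetres to pixels at the given DPI."""
    return round(cm / 2.54 * dpi)


def grid_positions(canvas_w, canvas_h, cell_w, cell_h, gap, margin):
    """Top-left corners of every cell that fits on the canvas, row by row."""
    cols = (canvas_w - 2 * margin + gap) // (cell_w + gap)
    rows = (canvas_h - 2 * margin + gap) // (cell_h + gap)
    return [
        (margin + c * (cell_w + gap), margin + r * (cell_h + gap))
        for r in range(rows)
        for c in range(cols)
    ]


# A 4x6 cm photo at 300 DPI is 472 x 709 pixels.
photo_w, photo_h = cm_to_px(4), cm_to_px(6)
# Hypothetical 10x15 cm sheet with a 30 px margin and a 20 px gap.
slots = grid_positions(cm_to_px(10), cm_to_px(15), photo_w, photo_h, gap=20, margin=30)
```

With these example numbers the sheet holds a 2x2 grid of photos.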
Configuration
The system is controlled by config/settings.json. The layout engine hot-reloads this file on every request. You can adjust id_font_size, grid_gap, or retouch_sensitivity and see changes in the next processed photo without restarting.
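Hot-reloading amounts to re-reading the file whenever settings are needed. The sketch below shows one way to do that; the default values and the load_settings name are assumptions, and only the three keys come from the text above.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real values live in config/settings.json.
DEFAULTS = {"id_font_size": 48, "grid_gap": 20, "retouch_sensitivity": 0.5}


def load_settings(path="config/settings.json"):
    """Re-read the settings file and overlay it on the defaults.

    Calling this at the start of each request means edits to the file
    apply to the next processed photo without a restart.
    """
    settings = dict(DEFAULTS)
    try:
        settings.update(json.loads(Path(path).read_text(encoding="utf-8")))
    except (FileNotFoundError, json.JSONDecodeError):
        # Missing or half-written file: keep the defaults for this request.
        pass
    return settings
```

Falling back to defaults on a parse error is a design choice for this sketch: a request that arrives while the file is being saved still completes.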
Known Dependency Conflicts
- TensorFlow vs. Transformers: Standard tensorflow (especially nightly) conflicts with transformers and numpy >= 2.0.
- Resolution: Uninstall TensorFlow. The pipeline is 100% PyTorch-based.
- Pinned Versions:
  - numpy < 2.0.0: Compatibility with basicsr and older torchvision.
  - protobuf <= 3.20.3: Prevents "Double Registration" errors in multi-model environments.
Environment Setup
conda create -n idmaker python=3.10
conda activate idmaker
pip install -r requirements.txt
pip uninstall tensorflow tb-nightly tensorboard # Remove conflicts if present
Docker
docker-compose up --build
The API will be available at http://localhost:8000 (or the port defined by $PORT).
Troubleshooting (Common Pitfalls)
| Issue | Root Cause | Solution |
|---|---|---|
| "Tofu" Boxes in Text | Missing or corrupted fonts. | Ensure assets/arialbd.ttf is not a Git LFS pointer (size > 300KB). |
| NumPy AttributeError | Conflict between NumPy 2.x and TensorFlow/Transformers. | Uninstall tensorflow and ensure numpy < 2.0.0 is installed. |
| [Errno 10048] Socket Bind | The server port (7860 in this example) is already in use by another process. | Close the previous server instance or set a different PORT environment variable. |
| Meta-Tensor Error | Transformers 4.50+ CPU bug. | Already handled by the torch.linspace monkeypatch in core/process_images.py. |
| Slow Processing | CPU bottleneck. | Ensure torch is using multiple threads or enable CUDA. |
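
The font check from the first row of the table can be automated. Git LFS pointer files are small text files that begin with the line "version https://git-lfs.github.com/spec/v1", so a quick sketch can test for both that header and the size threshold. The function names are hypothetical.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_HEADER)) == LFS_HEADER


def font_looks_valid(path, min_bytes=300 * 1024):
    """Heuristic check: the file exists, exceeds 300 KB, and is not an LFS pointer."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > min_bytes and not is_lfs_pointer(p)
```

Running font_looks_valid("assets/arialbd.ttf") at startup would surface the problem before any text is rendered as tofu boxes.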
Testing Framework
The codebase includes a comprehensive testing framework divided into lightweight, mock-based unit tests and full integration tests.
Unit & Mocked API Tests
Located in the tests/ directory:
- test_layout_engine.py: Validates canvas scaling, grid composition, margins, Arabic text shaping, and bidi rendering.
- test_crop.py: Validates OpenCV face-detection coordinates and the 5:7 ratio auto-crop fallback mechanism.
- test_white_bg.py: Verifies transparency compositing onto white canvas and 300 DPI preservation.
- test_color_steal.py: Validates red/green/blue 1D LUT extraction, .npz caching, and .cube file export.
- test_api_mocked.py: Validates FastAPI endpoints (/settings, /status, /frames, /upload, /process) using FastAPI TestClient and unittest.mock to mock ML processing. Avoids GPU/VRAM or large weights download requirements.
Run them instantly using:
venv\Scripts\python.exe -m unittest discover -s tests -p "test_*.py"
Integration Tests
Located in the root:
test_api.py: Validates end-to-end processing with real ML model weights and HTTP requests on a running server.
Run with:
venv\Scripts\python.exe test_api.py
Last Updated: June 2026 - EL HELAL Studio Engineering