📜 ID Maker Studio: Technical Master Documentation

This document serves as the comprehensive technical map for the EL HELAL Studio Photo Pipeline.

🏗 High-Level Architecture

The system is a modular Python-based suite designed to automate the conversion of raw student portraits into professional, print-ready ID sheets. It bridges the gap between complex AI models and a production studio environment.

🧩 Component Breakdown

/core (The Brain): Pure logic and AI processing. It is UI-agnostic and handles image math, landmark detection, and layout composition.
/web (The Primary Interface): A modern FastAPI backend coupled with a localized Arabic (RTL) frontend for batch processing.
/storage (The Data): Centralized storage for uploads, processed images, and final results.
/config (The Settings): Stores settings.json for global configuration.
/tools (The Utilities): Dev scripts, troubleshooting guides, and verification tools.
/assets (The Identity): Centralized storage for branding assets (logo), typography (Arabic fonts), and color grading LUTs.
/gui (Legacy): A Tkinter desktop wrapper for offline/workstation usage.

🚀 The 5-Step AI Pipeline

Every photo processed by the studio follows a strictly sequenced pipeline:

1. Auto-Crop & Face Detection (`crop.py`)

Technology: OpenCV Haar Cascades.
Logic: Detects the largest face, centers it, and calculates a 5:7 (4x6cm) aspect ratio crop.
Fallback: Centers the crop if no face is detected to ensure the pipeline never breaks.
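
The crop arithmetic described above can be sketched without OpenCV. This is a minimal illustration, not the contents of `crop.py`: the function names, the `(x, y, w, h)` face-box format, and the choice to take the largest crop that fits are all assumptions.

```python
def largest_face(boxes):
    """Pick the box with the largest area, or None when nothing was detected."""
    return max(boxes, key=lambda b: b[2] * b[3], default=None)


def compute_crop(img_w, img_h, face_box=None, ratio=(5, 7)):
    """Return (left, top, width, height) of a ratio-shaped crop.

    face_box is a hypothetical (x, y, w, h) tuple from a face detector.
    Passing None triggers the centered fallback.
    """
    rw, rh = ratio
    # Largest crop with the requested aspect ratio that fits inside the image.
    crop_h = min(img_h, img_w * rh // rw)
    crop_w = crop_h * rw // rh
    if face_box is None:
        cx, cy = img_w / 2, img_h / 2  # fallback: image center
    else:
        x, y, w, h = face_box
        cx, cy = x + w / 2, y + h / 2  # center of the detected face
    # Center the crop on (cx, cy), then clamp it to the image bounds.
    left = int(min(max(cx - crop_w / 2, 0), img_w - crop_w))
    top = int(min(max(cy - crop_h / 2, 0), img_h - crop_h))
    return left, top, crop_w, crop_h
```

With an empty detection list, `compute_crop(w, h, largest_face([]))` returns the centered crop, which mirrors the fallback behaviour.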

2. AI Background Removal (`process_images.py`)

Model: RMBG-2.0, which is built on the BiRefNet architecture.
Optimization: Automatically detects and utilizes CUDA/GPU. In CPU environments (like HF Spaces), it uses dynamic quantization for speed.
Resilience: Includes critical monkeypatches for transformers 4.50+ to handle tied weights and meta-tensor materialization bugs.

3. Color Grading Style Transfer (`color_steal.py`)

Mechanism: Analyzes "Before" and "After" pairs to learn R, G, and B curves.
Smoothing: Uses Savitzky-Golay filters to prevent color banding.
Application: Applies learned styles via vectorized NumPy operations for near-instant processing.
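
The learn, smooth, and apply steps can be illustrated for a single channel. This is a rough sketch under stated assumptions, not the code in `color_steal.py`: it uses plain Python lists instead of NumPy, fills unobserved levels by linear interpolation, and fixes the Savitzky-Golay filter at a 5-point quadratic window. All function names are hypothetical.

```python
SG_KERNEL = (-3, 12, 17, 12, -3)  # Savitzky-Golay weights: window 5, quadratic fit
SG_NORM = 35


def learn_curve(before, after):
    """Learn a 256-entry tone curve from paired 8-bit values of one channel."""
    sums, counts = [0.0] * 256, [0] * 256
    for b, a in zip(before, after):
        sums[b] += a
        counts[b] += 1
    known = [(i, sums[i] / counts[i]) for i in range(256) if counts[i]]
    if not known:
        return [float(i) for i in range(256)]  # nothing observed: identity
    curve = []
    for i in range(256):
        # Nearest observed levels on each side, clamped at the ends.
        lo = max((p for p in known if p[0] <= i), default=known[0])
        hi = min((p for p in known if p[0] >= i), default=known[-1])
        if lo[0] == hi[0]:
            curve.append(lo[1])
        else:
            t = (i - lo[0]) / (hi[0] - lo[0])
            curve.append(lo[1] + t * (hi[1] - lo[1]))
    return curve


def smooth_curve(curve):
    """Smooth interior samples; the two samples at each end are left as-is."""
    out = list(curve)
    for i in range(2, len(curve) - 2):
        window = curve[i - 2:i + 3]
        out[i] = sum(k * v for k, v in zip(SG_KERNEL, window)) / SG_NORM
    return out


def apply_curve(values, curve):
    """Map 8-bit values through the curve, clamping to the 0-255 range."""
    return [min(255, max(0, round(curve[v]))) for v in values]
```

One curve would be learned for each of R, G, and B. With NumPy, the final step reduces to indexing a 256-entry lookup array with the channel data, which is where the speed of the vectorized version comes from.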

4. Surgical Retouching (`retouch.py`)

Landmarking: Uses MediaPipe Face Mesh (468 points) to generate a precise skin mask, excluding eyes, lips, and hair.
Frequency Separation: Splits the image into High Frequency (texture/pores) and Low Frequency (tone/color).
Blemish Removal: Detects anomalies on the High-Freq layer and inpaints them using surrounding texture.
Result: Pores and skin texture are preserved because only the detected defects are modified.
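
Frequency separation can be shown on a tiny grayscale grid. The sketch below is illustrative only and makes several assumptions: a box blur stands in for whatever low-pass filter `retouch.py` uses, the skin mask is a plain 2D list of booleans, and a flagged pixel's detail is simply flattened to zero rather than inpainted from surrounding texture.

```python
def box_blur(img, radius=1):
    """Blur a 2D list of floats with a square box, clamping at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for dy in range(-radius, radius + 1)
                    for dx in range(-radius, radius + 1)]
            out[y][x] = sum(vals) / len(vals)
    return out


def split_frequencies(img, radius=1):
    """Return (low, high) such that low + high reconstructs img."""
    low = box_blur(img, radius)
    high = [[p - l for p, l in zip(row, lrow)] for row, lrow in zip(img, low)]
    return low, high


def remove_blemishes(img, mask, threshold=20.0, radius=1):
    """Flatten high-frequency outliers inside the mask; leave the rest intact."""
    low, high = split_frequencies(img, radius)
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            detail = high[y][x]
            if mask[y][x] and abs(detail) > threshold:
                detail = 0.0  # crude stand-in for texture-aware inpainting
            row.append(low[y][x] + detail)
        out.append(row)
    return out
```

Because every unflagged pixel is rebuilt as `low + high`, its original value is returned unchanged, which is the property that keeps fine texture intact.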

5. Layout Composition (`layout_engine.py`)

Rendering: Composes a 300 DPI canvas for printing.
Localization: Uses arabic_reshaper and python-bidi for correct Arabic script rendering.
Dynamic Assets: Overlays IDs with specific offsets and studio branding (logos).
Customization: Supports dynamic frame color selection (passed via API) for the large side panel.
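
The print arithmetic behind a 300 DPI canvas is simple enough to sketch. The helper names and the grid-packing rule are assumptions for illustration and are not taken from `layout_engine.py`.

```python
DPI = 300
CM_PER_INCH = 2.54


def cm_to_px(cm, dpi=DPI):
    """Convert a physical length in centimetres to pixels at the given DPI."""
    return round(cm / CM_PER_INCH * dpi)


def grid_positions(canvas_w, canvas_h, cell_w, cell_h, gap):
    """Top-left corners of every cell that fits on the canvas, row by row."""
    cols = (canvas_w + gap) // (cell_w + gap)
    rows = (canvas_h + gap) // (cell_h + gap)
    return [(c * (cell_w + gap), r * (cell_h + gap))
            for r in range(rows) for c in range(cols)]
```

For example, a 4 cm by 6 cm photo becomes `cm_to_px(4)` by `cm_to_px(6)` pixels, and `grid_positions` shows how a gap setting such as `grid_gap` would space copies across the sheet.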

⚙️ Configuration & Real-Time Tuning

The system is controlled by config/settings.json.

Hot Reloading: The layout engine reloads this file on every request. You can adjust id_font_size, grid_gap, or retouch_sensitivity and see the changes in the next processed photo without restarting the server.
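
A per-request reload can be as small as re-reading the JSON file each time it is needed. The sketch below is an assumption about how such a loader might look; the default values are illustrative and the fallback to defaults on a missing or half-written file is a design choice, not documented behaviour.

```python
import json
from pathlib import Path

# Illustrative defaults only; the real values live in settings.json.
DEFAULTS = {"id_font_size": 48, "grid_gap": 20, "retouch_sensitivity": 0.5}


def load_settings(path):
    """Read settings from disk on every call so edits apply to the next photo."""
    settings = dict(DEFAULTS)
    try:
        settings.update(json.loads(Path(path).read_text(encoding="utf-8")))
    except (FileNotFoundError, json.JSONDecodeError):
        pass  # keep defaults if the file is missing or caught mid-edit
    return settings
```

Calling `load_settings` at the start of each request trades a small file read for never needing a server restart.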

💾 Backup & Restoration

The system supports full state backup via the web interface.

Export: Creates a ZIP file containing:
- Global settings.json.
- All custom assets (frames, logos) in assets/.
- Client-side preferences (theme, saved colors).
Import: Restores the configuration and assets from a ZIP file and refreshes the client state.
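
The export and import steps map naturally onto the standard `zipfile` module. This is a sketch under assumptions: the archive layout (`settings.json`, `assets/...`, `client_prefs.json`) and the function names are hypothetical, and the real web interface may package things differently.

```python
import json
import zipfile
from pathlib import Path


def export_backup(zip_path, settings_path, assets_dir, client_prefs):
    """Write settings, every asset file, and client preferences into one ZIP."""
    assets_dir = Path(assets_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(settings_path, "settings.json")
        for f in sorted(assets_dir.rglob("*")):
            if f.is_file():
                arcname = (Path("assets") / f.relative_to(assets_dir)).as_posix()
                zf.write(f, arcname)
        zf.writestr("client_prefs.json", json.dumps(client_prefs))


def import_backup(zip_path, settings_path, assets_dir):
    """Restore settings and assets, then return the stored client preferences."""
    assets_dir = Path(assets_dir)
    root = assets_dir.resolve()
    with zipfile.ZipFile(zip_path) as zf:
        Path(settings_path).write_bytes(zf.read("settings.json"))
        for name in zf.namelist():
            if not name.startswith("assets/") or name.endswith("/"):
                continue
            target = assets_dir / name[len("assets/"):]
            if root not in target.resolve().parents:
                continue  # skip entries that would escape the assets folder
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(name))
        return json.loads(zf.read("client_prefs.json"))
```

The returned preferences would be handed back to the browser so it can refresh its theme and saved colors.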

🐍 Environment & Dependency Management

The project requires a carefully managed Python environment to avoid common AI library conflicts.

Known Conflicts & Fixes

TensorFlow vs. Transformers: Standard installations of tensorflow (especially nightly versions) conflict with transformers and numpy 2.x, causing AttributeError: module 'numpy' has no attribute 'dtypes' and Protobuf descriptor errors.
Resolution: Uninstall TensorFlow. The pipeline is 100% PyTorch-based. Removing TensorFlow resolves these import crashes immediately.
Pinned Versions:
- numpy < 2.0.0: Required for compatibility with basicsr and older torchvision utilities.
- protobuf <= 3.20.3: Prevents "Double Registration" errors in multi-model environments.

Environment Setup (Conda)

conda create -n idmaker python=3.10
conda activate idmaker
pip install -r requirements.txt
# Ensure no conflicting packages remain
pip uninstall tensorflow tb-nightly tensorboard

☁️ CodeFormer Restoration API

The id-maker system integrates with an external CodeFormer service for high-fidelity face restoration. This is handled via a dedicated REST API.

Endpoint: `/api/restore` (POST)

The API accepts an image and returns a JSON response containing a URL to the restored result.

Request Parameters (multipart/form-data):

image: The source image file (JPG/PNG).
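fidelity: (Float, 0.0 - 1.0) Controls the balance between restoration quality and fidelity to the original. In upstream CodeFormer, lower values favor restoration quality and higher values favor fidelity to the original.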
upscale: (Integer, 1-4) Final output magnification.
background_enhance: (Boolean string, "true"/"false") Whether to enhance the non-face areas using Real-ESRGAN.
face_upsample: (Boolean string, "true"/"false") Whether to apply dedicated face upsampling.

Success Response (JSON):

{
    "status": "success",
    "results": [
        { "image_url": "https://service-url/static/results/result_uuid.png" }
    ],
    "message": "Restoration complete"
}
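
On the client side, the form fields and the response can be handled with two small helpers. Both function names and the default values are assumptions. The error branch also assumes that a failed call still returns `status` and `message` keys, which is not documented here.

```python
def build_restore_fields(fidelity=0.5, upscale=2,
                         background_enhance=False, face_upsample=False):
    """Validate parameters and encode them as multipart form field strings."""
    if not 0.0 <= fidelity <= 1.0:
        raise ValueError("fidelity must be between 0.0 and 1.0")
    if upscale not in (1, 2, 3, 4):
        raise ValueError("upscale must be an integer from 1 to 4")
    return {
        "fidelity": str(fidelity),
        "upscale": str(upscale),
        # The API expects boolean strings, not Python booleans.
        "background_enhance": "true" if background_enhance else "false",
        "face_upsample": "true" if face_upsample else "false",
    }


def extract_image_urls(payload):
    """Return the result URLs from a decoded JSON response."""
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("message", "restoration failed"))
    return [item["image_url"] for item in payload.get("results", [])]
```

The returned fields would be sent together with the `image` file as multipart/form-data using any HTTP client.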

Configuration

The target API URL is controlled in id-maker/config/settings.json under api.codeformer_url or via the CODEFORMER_API_URL environment variable.
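
Resolving the URL from both sources might look like the sketch below. The precedence is an assumption: the environment variable is taken to override the settings file, which the text above does not state. The function name is hypothetical.

```python
import os


def resolve_codeformer_url(settings, env=None):
    """Pick the CodeFormer URL from the environment or from settings."""
    env = os.environ if env is None else env
    # Assumed precedence: CODEFORMER_API_URL wins over api.codeformer_url.
    url = env.get("CODEFORMER_API_URL") or settings.get("api", {}).get("codeformer_url")
    if not url:
        raise KeyError("no CodeFormer URL configured")
    return url.rstrip("/")
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.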

🐳 Deployment & Cloud Readiness

The project is packaged for containerized deployment and runs on Hugging Face Spaces.

Docker Environment

Base: python:3.10-slim.
System Deps: Requires libgl1 (OpenCV), libraqm0 (Font rendering), and libharfbuzz0b (Arabic shaping).

Hugging Face Spaces

Transformers Fix: Patches PretrainedConfig to allow custom model loading without attribute errors.
LFS Support: Binary files (.ttf, .cube, .png) are managed via Git LFS to ensure integrity.

🛠 Troubleshooting (Common Pitfalls)

| Issue | Root Cause | Solution |
| --- | --- | --- |
| "Tofu" Boxes in Text | Missing or corrupted fonts. | Ensure `assets/arialbd.ttf` is not a Git LFS pointer (size > 300KB). |
| NumPy AttributeError | Conflict between NumPy 2.x and TensorFlow/Transformers. | Uninstall `tensorflow` and ensure `numpy < 2.0.0` is installed. |
| [Errno 10048] Socket Bind | Port 7860 is already in use by another server process. | Close the previous server instance or set a new `PORT` environment variable. |
| Meta-Tensor Error | Transformers 4.50+ CPU bug. | Handled by `torch.linspace` monkeypatch in `process_images.py`. |
| Slow Processing | CPU bottleneck. | Ensure `torch` is using multiple threads or enable CUDA. |
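
The font check from the first row can be automated. A Git LFS pointer is a small text file that begins with a fixed version line, so reading the first bytes is enough to tell it apart from a real binary. The helper name is hypothetical.

```python
from pathlib import Path

LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True when the file is a Git LFS pointer instead of real content."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_MAGIC)) == LFS_MAGIC
```

If `is_lfs_pointer("assets/arialbd.ttf")` returns True, the font was never fetched from LFS and text will render as boxes.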

Last Updated: February 2026 — EL HELAL Studio Engineering