	# 📜 ID Maker Studio: Technical Master Documentation

	This document serves as the comprehensive technical map for the EL HELAL Studio Photo Pipeline.

	---

	## 🏗 High-Level Architecture

	The system is a modular Python-based suite designed to automate the conversion of raw student portraits into professional, print-ready ID sheets. It bridges the gap between complex AI models and a production studio environment.

	### 🧩 Component Breakdown
	- `/core` (The Brain): Pure logic and AI processing. It is UI-agnostic and handles image math, landmark detection, and layout composition.
	- `/web` (The Primary Interface): A modern FastAPI backend coupled with a localized Arabic (RTL) frontend for batch processing.
	- `/storage` (The Data): Centralized storage for uploads, processed images, and final results.
	- `/config` (The Settings): Stores `settings.json` for global configuration.
	- `/tools` (The Utilities): Dev scripts, troubleshooting guides, and verification tools.
	- `/assets` (The Identity): Centralized storage for branding assets (logo), typography (Arabic fonts), and color grading LUTs.
	- `/gui` (Legacy): A Tkinter desktop wrapper for offline/workstation usage.

	---

	## 🚀 The 5-Step AI Pipeline

	Every photo processed by the studio follows a strictly sequenced pipeline:

	### 1. Auto-Crop & Face Detection (`crop.py`)
	- Technology: OpenCV Haar Cascades.
	- Logic: Detects the largest face, centers it, and calculates a 5:7 (4x6cm) aspect ratio crop.
	- Fallback: Centers the crop if no face is detected to ensure the pipeline never breaks.
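
The crop geometry described above can be sketched as follows. This is a minimal illustration, not the contents of `crop.py`: the function names, the `face_fraction` parameter, and the detector settings are assumptions. The 5:7 ratio is taken from the text and passed as a parameter.

```python
def detect_largest_face(gray_image):
    """Return the largest face as (x, y, w, h), or None if nothing is found.

    Requires opencv-python; imported lazily so the geometry below has no
    third-party dependency. Detector parameters are illustrative.
    """
    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    faces = cascade.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    return tuple(int(v) for v in max(faces, key=lambda f: f[2] * f[3]))


def compute_crop_box(img_w, img_h, face=None, aspect=(5, 7), face_fraction=0.5):
    """Return (left, top, right, bottom) for a crop with the given aspect ratio.

    face_fraction (hypothetical) is the share of the crop width the face
    should occupy. With face=None the crop is centered on the image.
    """
    aw, ah = aspect
    # Largest box of the target ratio that fits inside the image.
    max_w = min(img_w, img_h * aw // ah)
    if face is None:
        crop_w = max_w
        cx, cy = img_w / 2, img_h / 2
    else:
        x, y, w, h = face
        crop_w = min(max_w, int(w / face_fraction))
        cx, cy = x + w / 2, y + h / 2
    crop_h = crop_w * ah // aw
    # Clamp so the box never leaves the image.
    left = int(min(max(cx - crop_w / 2, 0), img_w - crop_w))
    top = int(min(max(cy - crop_h / 2, 0), img_h - crop_h))
    return left, top, left + crop_w, top + crop_h
```

Passing `face=None` exercises the fallback path: the function still returns a valid centered box, so later steps always receive an image.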

	### 2. AI Background Removal (`process_images.py`)
	- Model: BiRefNet (RMBG-2.0).
	- Optimization: Automatically detects and utilizes CUDA/GPU. In CPU environments (like HF Spaces), it uses dynamic quantization for speed.
	- Resilience: Includes critical monkeypatches for `transformers 4.50+` to handle tied weights and meta-tensor materialization bugs.

	### 3. Color Grading Style Transfer (`color_steal.py`)
	- Mechanism: Analyzes "Before" and "After" pairs to learn R, G, and B curves.
	- Smoothing: Uses Savitzky-Golay filters to prevent color banding.
	- Application: Applies learned styles via vectorized NumPy operations for near-instant processing.
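
One way to learn and apply per-channel curves is sketched below. It assumes 8-bit images and aligned "Before"/"After" pairs. The function names, window size, and polynomial order are illustrative, and the Savitzky-Golay smoothing is written out in NumPy here rather than taken from whatever `color_steal.py` actually calls.

```python
import numpy as np


def savgol_coeffs(window, order):
    """Savitzky-Golay smoothing weights for an odd window length."""
    half = window // 2
    x = np.arange(-half, half + 1)
    design = np.vander(x, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse yields the fitted polynomial's value at x = 0.
    return np.linalg.pinv(design)[0]


def learn_curve(before, after, window=31, order=3):
    """Learn a 256-entry lookup table mapping 'before' levels to 'after' levels.

    before, after: uint8 arrays holding one channel of an aligned image pair.
    """
    b = before.ravel()
    a = after.ravel().astype(np.float64)
    counts = np.bincount(b, minlength=256)
    sums = np.bincount(b, weights=a, minlength=256)
    levels = np.arange(256)
    seen = counts > 0
    # Mean output per input level; levels absent from the sample are interpolated.
    curve = np.interp(levels, levels[seen], sums[seen] / counts[seen])
    half = window // 2
    padded = np.pad(curve, half, mode="edge")
    smooth = np.convolve(padded, savgol_coeffs(window, order)[::-1], mode="valid")
    return np.clip(np.rint(smooth), 0, 255).astype(np.uint8)


def apply_curves(image, curves):
    """Apply three 256-entry uint8 lookup tables to an HxWx3 uint8 image."""
    out = np.empty_like(image)
    for channel in range(3):
        out[..., channel] = curves[channel][image[..., channel]]
    return out
```

Because each curve is a lookup table, applying a style is a single indexing operation per channel, which is why the cost is small even for large images.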

	### 4. Surgical Retouching (`retouch.py`)
	- Landmarking: Uses MediaPipe Face Mesh (468 points) to generate a precise skin mask, excluding eyes, lips, and hair.
	- Frequency Separation: Splits the image into High Frequency (texture/pores) and Low Frequency (tone/color).
	- Blemish Removal: Detects anomalies on the High-Freq layer and inpaints them using surrounding texture.
	- Result: Pores and skin texture are preserved; only detected defects are removed.
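
The frequency-separation idea can be illustrated on a single channel. This sketch makes several assumptions: the skin mask is supplied by the caller (in the pipeline it would come from the landmark step), `sigma` and `threshold` are illustrative values, and flagged pixels are filled with the median of the surrounding high-frequency layer as a crude stand-in for real inpainting.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1-D Gaussian weights covering roughly three standard deviations."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(channel, sigma):
    """Separable Gaussian blur with reflected borders; output matches input shape."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(channel, radius, mode="reflect")

    def smooth(vector):
        return np.convolve(vector, kernel, mode="valid")

    rows = np.apply_along_axis(smooth, 1, padded)
    return np.apply_along_axis(smooth, 0, rows)


def split_frequencies(channel, sigma=2.0):
    """Return (low, high): low holds tone and color, high holds texture."""
    channel = channel.astype(np.float64)
    low = gaussian_blur(channel, sigma)
    return low, channel - low


def retouch_channel(channel, skin_mask, sigma=2.0, threshold=20.0):
    """Suppress strong high-frequency outliers inside skin_mask.

    Returns the retouched channel (float) and the boolean blemish mask.
    """
    low, high = split_frequencies(channel, sigma)
    blemish = (np.abs(high) > threshold) & skin_mask
    if blemish.any():
        clean = skin_mask & ~blemish
        # Stand-in for inpainting: reuse the typical texture value of clean skin.
        fill = np.median(high[clean]) if clean.any() else 0.0
        high = np.where(blemish, fill, high)
    return low + high, blemish
```

Only the high-frequency layer is edited, and only where the mask flags an outlier, so the low-frequency tone and the texture of unflagged pixels pass through unchanged.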

	### 5. Layout Composition (`layout_engine.py`)
	- Rendering: Composes a 300 DPI canvas for printing.
	- Localization: Uses `arabic_reshaper` and `python-bidi` for correct Arabic script rendering.
	- Dynamic Assets: Overlays IDs with specific offsets and studio branding (logos).
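
The print geometry reduces to converting centimeters to pixels at 300 DPI and placing photos on a grid. The sketch below is an assumption-laden illustration: the 10x15 cm sheet in the usage note, the gap value, and the function names are hypothetical, while the 4x6 cm photo size comes from the crop step. `shape_arabic` shows the usual `arabic_reshaper` plus `python-bidi` call sequence and needs both packages installed.

```python
def cm_to_px(cm, dpi=300):
    """Convert a length in centimeters to whole pixels at the given DPI."""
    return round(cm / 2.54 * dpi)


def grid_positions(sheet_cm, photo_cm, gap_cm, dpi=300):
    """Return top-left pixel positions for a centered grid of photos.

    sheet_cm and photo_cm are (width, height) tuples in centimeters.
    """
    sheet_w, sheet_h = (cm_to_px(v, dpi) for v in sheet_cm)
    photo_w, photo_h = (cm_to_px(v, dpi) for v in photo_cm)
    gap = cm_to_px(gap_cm, dpi)
    cols = (sheet_w + gap) // (photo_w + gap)
    rows = (sheet_h + gap) // (photo_h + gap)
    if cols == 0 or rows == 0:
        return []
    # Center the grid on the sheet.
    x0 = (sheet_w - (cols * photo_w + (cols - 1) * gap)) // 2
    y0 = (sheet_h - (rows * photo_h + (rows - 1) * gap)) // 2
    return [
        (x0 + c * (photo_w + gap), y0 + r * (photo_h + gap))
        for r in range(rows)
        for c in range(cols)
    ]


def shape_arabic(text):
    """Reshape and reorder Arabic text so it draws correctly left to right."""
    import arabic_reshaper
    from bidi.algorithm import get_display

    return get_display(arabic_reshaper.reshape(text))
```

For example, `grid_positions((10, 15), (4, 6), 0.2)` fits a 2x2 grid of photos on the assumed sheet.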

	---

	## ⚙️ Configuration & Real-Time Tuning

	The system is controlled by `core/settings.json`.
	- Hot Reloading: The layout engine reloads this file on every request. You can adjust `id_font_size`, `grid_gap`, or `retouch_sensitivity` and see the changes in the next processed photo without restarting the server.
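
Hot reloading of this kind usually amounts to reading the file at the start of each request instead of caching it at import time. A minimal sketch follows, assuming a flat JSON object. The default values and the `load_settings` name are hypothetical; only the three key names come from the text above.

```python
import json
from pathlib import Path

# Hypothetical defaults, used when a key or the whole file is missing.
DEFAULTS = {"id_font_size": 48, "grid_gap": 20, "retouch_sensitivity": 0.5}


def load_settings(path):
    """Read settings from disk on every call so edits apply to the next request."""
    settings = dict(DEFAULTS)
    try:
        settings.update(json.loads(Path(path).read_text(encoding="utf-8")))
    except (FileNotFoundError, json.JSONDecodeError):
        # Fall back to defaults rather than failing mid-batch.
        pass
    return settings
```

Re-reading a small JSON file per request is cheap next to image processing, and falling back to defaults means a half-saved edit does not interrupt a running batch.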

	---

	## 🐳 Deployment & Cloud Readiness

	The project is packaged for containerized and hosted deployment.

	### Docker Environment
	- Base: `python:3.10-slim`.
	- System Deps: Requires `libgl1` (OpenCV), `libraqm0` (Font rendering), and `libharfbuzz0b` (Arabic shaping).

	### Hugging Face Spaces
	- Transformers Fix: Patches `PretrainedConfig` to allow custom model loading without attribute errors.
	- LFS Support: Binary files (`.ttf`, `.cube`, `.png`) are managed via Git LFS to ensure integrity.

	---

	## 🛠 Troubleshooting (Common Pitfalls)

	| Issue | Root Cause | Solution |
	|-------|------------|----------|
	| "Tofu" Boxes in Text | Missing or corrupted fonts. | Ensure `assets/arialbd.ttf` is not a Git LFS pointer (size > 300 KB). |
	| Meta-Tensor Error | Transformers 4.50+ CPU bug. | Handled by `torch.linspace` monkeypatch in `process_images.py`. |
	| Slow Processing | CPU bottleneck. | Ensure `torch` is using multiple threads or enable CUDA. |
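
The font check in the first row can be automated. A Git LFS pointer is a small text file that begins with a `version https://git-lfs.github.com/spec/v1` line, so reading the first bytes is enough to tell it apart from a real font. The function names and status strings below are illustrative; the 300 KB threshold is the one from the table.

```python
from pathlib import Path

LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_HEADER)) == LFS_HEADER


def check_font(path, min_bytes=300 * 1024):
    """Classify a font file as 'missing', 'lfs-pointer', 'too-small', or 'ok'."""
    font = Path(path)
    if not font.is_file():
        return "missing"
    if is_lfs_pointer(font):
        return "lfs-pointer"
    if font.stat().st_size < min_bytes:
        return "too-small"
    return "ok"
```

Running `check_font("assets/arialbd.ttf")` at startup would surface the problem before any sheet is rendered with tofu boxes.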

	---

	Last Updated: February 2026 — EL HELAL Studio Engineering
