Esmaill1

Resolved Technical Issues & Deployment Guide

This document tracks critical problems encountered during development and deployment (especially for Docker and Hugging Face Spaces) and their corresponding solutions.

1. Font Rendering & Layout Consistency

Problem: "Boxes" instead of Arabic Text

  • Symptom: Arabic names appeared as empty boxes (tofu) or incorrectly rendered characters.
  • Cause: The system was attempting to load Windows-specific font paths (e.g., C:/Windows/Fonts/...) which do not exist in Linux/Docker environments.
  • Solution:
    • Implemented platform-agnostic font discovery in core/layout_engine.py.
    • Added automatic detection of bundled fonts in the assets/ directory.
    • Prioritized arialbd.ttf for Arabic support over legacy fonts.
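
The discovery order described above can be sketched roughly as follows. This is a minimal illustration, not the code in core/layout_engine.py: the function names, the directory list, and the second entry in `PREFERRED_FONTS` are assumptions made for the example.

```python
import sys
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical preference order; arialbd.ttf comes first, as in the fix above.
PREFERRED_FONTS = ("arialbd.ttf", "DejaVuSans-Bold.ttf")


def candidate_font_dirs(assets_dir: Path) -> list[Path]:
    """Return directories to search, with the bundled assets directory first."""
    dirs = [assets_dir]
    if sys.platform.startswith("win"):
        dirs.append(Path("C:/Windows/Fonts"))
    elif sys.platform == "darwin":
        dirs += [Path("/Library/Fonts"), Path.home() / "Library" / "Fonts"]
    else:
        dirs += [Path("/usr/share/fonts"), Path.home() / ".fonts"]
    return dirs


def find_font(names: Iterable[str], dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first preferred font found under any directory, else None."""
    existing = [d for d in dirs if d.is_dir()]
    for name in names:
        for directory in existing:
            # rglob walks subdirectories; names are compared case-insensitively
            # because font file names differ in case across platforms.
            for path in directory.rglob("*"):
                if path.is_file() and path.name.lower() == name.lower():
                    return path
    return None
```

Calling `find_font(PREFERRED_FONTS, candidate_font_dirs(Path("assets")))` returns a path that can be handed to the font loader, or `None` when nothing matches, so the caller can decide on a default.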

Problem: Settings Changes Not Reflecting

  • Symptom: Changing id_font_size in settings.json had no effect until the app was restarted.
  • Cause: Settings were loaded once at module import time and cached as global constants.
  • Solution: Modified generate_layout to call load_settings() at the start of every execution, ensuring real-time updates from the JSON file.
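
A minimal sketch of the per-call reload pattern, assuming a flat JSON file. The default value and the returned dictionary are illustrative only; the real generate_layout does considerably more.

```python
import json
from pathlib import Path

SETTINGS_PATH = Path("settings.json")  # assumed location of the settings file
DEFAULTS = {"id_font_size": 24}  # illustrative default, not taken from the project


def load_settings(path: Path = SETTINGS_PATH) -> dict:
    """Read settings from disk, falling back to defaults if the file is missing."""
    settings = dict(DEFAULTS)
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            settings.update(json.load(handle))
    return settings


def generate_layout(path: Path = SETTINGS_PATH) -> dict:
    # Load inside the function, not at import time, so that edits to the
    # JSON file are picked up by the next call without a restart.
    settings = load_settings(path)
    return {"font_size": settings["id_font_size"]}
```

The trade-off is one small file read per layout, which is negligible next to image processing.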

2. Hugging Face Spaces (Docker) Deployment

Problem: Obsolete libgl1-mesa-glx

  • Symptom: Docker build failed with E: Package 'libgl1-mesa-glx' has no installation candidate.
  • Cause: The Debian base image (Trixie/Sid) used for the HF build no longer provides this package.
  • Solution: Updated the Dockerfile to install libgl1 instead, which provides the OpenGL libraries that OpenCV needs.

Problem: Transformers 4.50+ Compatibility (BiRefNet/RMBG-2.0)

  • Symptom: AttributeError: 'BiRefNet' object has no attribute 'all_tied_weights_keys' or 'NoneType' object has no attribute 'keys'.
  • Cause: The custom model code for BiRefNet/RMBG-2.0 is incompatible with internal changes in recent transformers versions regarding tied weight tracking.
  • Solution:
    • Applied a robust monkeypatch in core/process_images.py to the PreTrainedModel class.
    • Forced all_tied_weights_keys to always return an empty dictionary {} instead of None.
    • Pinned transformers==4.48.2 in requirements.txt for a stable environment.

Problem: Meta-Tensor Initialization Bug

  • Symptom: Model failed to load on CPU due to "meta tensors" not being correctly materialized.
  • Cause: A bug in how torch.linspace interacts with transformers' low_cpu_mem_usage flag for custom models.
  • Solution: Monkeypatched torch.linspace in core/process_images.py to force materialization on CPU when a meta-tensor is detected.

3. Git & LFS Management

Problem: Binary File Rejection

  • Symptom: remote: Your push was rejected because it contains binary files.
  • Cause: Pushing large .jpg, .png, or .cube files directly to Hugging Face without Git LFS, or having them in the Git history from previous commits.
  • Solution:
    • Configured .gitattributes to track *.png, *.TTF, *.npz, and *.cube with LFS.
    • Used git filter-branch to purge large binaries (raw/, white/, and root .jpg files) from the entire Git history to reduce repo size and satisfy HF hooks.

4. Image Processing Pipeline & Dependencies

Problem: KeyError: 'setting text direction... not supported without libraqm'

  • Symptom: Application crashes on Windows when attempting to render Arabic text in the layout.
  • Cause: Pillow's direction and features parameters require the libraqm library, which is difficult to install on Windows.
  • Solution:
    • Added a safety check using PIL.features.check("raqm").
    • Implemented a fallback that relies on arabic-reshaper and python-bidi for manual shaping/reordering when raqm is missing.
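
The safety check can be sketched as below. The helper names are hypothetical; the only library call used is `PIL.features.check("raqm")`, and the keyword arguments are those accepted by Pillow's `ImageDraw.text`.

```python
def raqm_available() -> bool:
    """Return True when Pillow is installed and was built with libraqm."""
    try:
        from PIL import features
    except ImportError:
        return False
    return bool(features.check("raqm"))


def text_kwargs(has_raqm: bool) -> dict:
    """Build the extra keyword arguments for ImageDraw.text()."""
    # Pillow raises KeyError when direction, language, or features are passed
    # without libraqm, so they are included only if the check succeeded.
    if has_raqm:
        return {"direction": "rtl", "language": "ar"}
    return {}
```

A draw call would then look like `draw.text(xy, text, font=font, **text_kwargs(raqm_available()))`, with the text pre-shaped by the fallback path when the dictionary is empty.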

Problem: Background Removal failing when Retouching is enabled

  • Symptom: Background removal appeared to be ignored, or the image reverted to its original background after processing.
  • Cause: The retouch_image_pil function in core/retouch.py was converting the image to RGB for OpenCV processing, stripping the Alpha channel (transparency) created by the BG removal step.
  • Solution:
    • Updated retouch_image_pil to detect and save the Alpha channel before processing.
    • Modified the logic to restore the Alpha channel to the final retouched PIL image before returning it to the pipeline.
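
The save-and-restore step might look like this in isolation. It is a sketch under the assumption that the retouching stage can be expressed as a callable taking and returning an RGB PIL image; the function name and that callable are hypothetical, not the actual retouch_image_pil signature.

```python
from typing import Callable

from PIL import Image


def retouch_preserving_alpha(
    image: Image.Image,
    retouch_rgb: Callable[[Image.Image], Image.Image],
) -> Image.Image:
    """Run an RGB-only retouch step without losing the transparency mask."""
    # Save the alpha band before the RGB conversion discards it.
    alpha = image.getchannel("A") if "A" in image.getbands() else None

    result = retouch_rgb(image.convert("RGB"))

    if alpha is not None:
        if alpha.size != result.size:
            # Keep the mask aligned if the retouch step resized the image.
            alpha = alpha.resize(result.size)
        result = result.copy()
        result.putalpha(alpha)  # converts the RGB result to RGBA in place
    return result
```

Images without an alpha band pass through unchanged in mode, so the helper is safe to use whether or not background removal ran first.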

Problem: BiRefNet Model Inference Error

  • Symptom: TypeError or indexing errors during background removal inference.
  • Cause: Inconsistent model output formats (list of tensors vs. a single tensor) depending on the environment or transformers version.
  • Solution: Updated remove_background in core/process_images.py to check if output is a list/tuple and handle both cases robustly.
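
One way to normalize the two output shapes is shown below. Taking the last element of a list or tuple is an assumption based on how BiRefNet-style models are commonly called; the helper name is hypothetical and the actual check in remove_background may differ.

```python
from typing import Any


def last_prediction(output: Any) -> Any:
    """Unwrap list/tuple model outputs down to a single prediction object."""
    # Some environments return a (possibly nested) sequence of tensors and
    # others a bare tensor; keep the last entry at each nesting level.
    while isinstance(output, (list, tuple)):
        if not output:
            raise ValueError("model returned an empty sequence")
        output = output[-1]
    return output
```

The caller can then apply the same post-processing (for example a sigmoid and a resize) regardless of which format the model produced.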

Problem: AttributeError: module 'mediapipe' has no attribute 'solutions'

  • Symptom: Skin retouching fails in the Docker container with this error.
  • Cause: Inconsistent behavior of the mediapipe package initialization in some Linux environments.
  • Solution:
    • Explicitly imported submodules like mediapipe.solutions.face_mesh at the top of the file.
    • Switched from python:3.10-slim to the full python:3.10 image to ensure a complete build environment.
    • Added libprotobuf-dev and protobuf-compiler to the Dockerfile.

Problem: Arabic/English Text appearing as "Boxes" (Tofu)

  • Symptom: All text on the print layout appears as empty squares in the Hugging Face Space.
  • Cause: The container lacked fonts covering the required characters, and the bundled .ttf files were often not real fonts at all: they had been checked out as roughly 130-byte Git LFS pointer files instead of binary font data.
  • Solution:
    • Automated Downloads: Updated Dockerfile to use wget to pull real binary fonts directly from GitHub during the build.
    • Deep Search: Implemented a recursive font discovery system in core/layout_engine.py that scans /usr/share/fonts.
    • System Fallbacks: Installed fonts-noto-extra and fonts-dejavu-core as guaranteed system-level backups.
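
Because a pointer file has a valid .ttf name but no font data, it can help to filter such files out during discovery. The sketch below relies on the fact that Git LFS pointer files begin with a fixed version line; the helper names are hypothetical and this check is not described as part of the project's code.

```python
from pathlib import Path
from typing import Iterable

# Every Git LFS pointer file starts with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file is a Git LFS pointer and not real content."""
    with path.open("rb") as handle:
        return handle.read(len(LFS_HEADER)) == LFS_HEADER


def usable_fonts(paths: Iterable[Path]) -> list[Path]:
    """Keep only existing files that contain actual binary data."""
    return [p for p in paths if p.is_file() and not is_lfs_pointer(p)]
```

Applying `usable_fonts` to the results of the recursive scan lets the system-level fallbacks take over when a bundled font turns out to be a pointer.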

Problem: Arabic Text appearing "Connected but Reversed"

  • Symptom: Arabic letters join correctly, but the text flows left to right, so the letters of a name such as "محمد" appear in reversed order.
  • Cause: The local Windows environment lacks libraqm, while the Docker Linux image includes it. Calling get_display when Pillow already handles complex scripts through Raqm reorders the text twice, a "double reversal".
  • Solution:
    • Intelligent Detection: Updated _reshape_arabic in core/layout_engine.py to check for Raqm support via PIL.features.check("raqm").
    • Conditional Reordering: The system now only applies python-bidi reordering if Raqm is absent. This ensures perfect rendering in both environments without manual code changes.
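
The conditional path can be sketched as follows. The function name is hypothetical, and the sketch also skips reshaping when Raqm is present, which is an assumption; the project's _reshape_arabic may still reshape in that case. The fallback uses `arabic_reshaper.reshape` and `bidi.algorithm.get_display`, imported lazily so they are only required where Raqm is missing.

```python
def prepare_arabic(text: str, has_raqm: bool) -> str:
    """Return text ready for drawing, reordering only when Raqm is absent."""
    if has_raqm:
        # Raqm shapes and reorders the text at draw time; reordering here
        # as well would reverse it a second time.
        return text

    # Fallback for builds without libraqm: join the letters manually, then
    # convert from logical to visual order.
    import arabic_reshaper
    from bidi.algorithm import get_display

    return get_display(arabic_reshaper.reshape(text))
```

The `has_raqm` flag would come from `PIL.features.check("raqm")`, evaluated once at startup or per call.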

Problem: Manual Cropping ignored or shifted

  • Symptom: After manually adjusting the crop in the web interface, the result still looked like the AI auto-crop or was completely wrong.
  • Cause: The backend was using cv2.imdecode which ignores EXIF orientation tags. Since the frontend cropper works on correctly oriented thumbnails, the coordinates sent to the backend didn't match the raw image orientation on disk.
  • Solution: Updated core/crop.py to use a _load_image_exif_safe helper that uses PIL to transpose the image before converting it to OpenCV format. This ensures coordinates from the web UI always match the backend image state.
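
A rough equivalent of such a helper is sketched below using `ImageOps.exif_transpose` from Pillow and NumPy for the channel swap. It is not the project's _load_image_exif_safe; the name and signature here are assumptions for illustration.

```python
import numpy as np
from PIL import Image, ImageOps


def load_image_exif_safe(source) -> np.ndarray:
    """Load an image with EXIF orientation applied, as an OpenCV-style BGR array.

    `source` may be a file path or a binary file-like object.
    """
    with Image.open(source) as img:
        # Rotate/flip the pixels according to the EXIF Orientation tag so the
        # array matches what the browser showed to the user.
        upright = ImageOps.exif_transpose(img)
        rgb = np.asarray(upright.convert("RGB"))
    # OpenCV expects BGR channel order; reverse the last axis and make the
    # result contiguous so cv2 functions accept it.
    return np.ascontiguousarray(rgb[:, :, ::-1])
```

Crop coordinates received from the web UI can then be applied directly to the returned array, since both refer to the upright image.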