# Smile Changer API Documentation

## Overview

The Smile Changer is a facial attribute editing application built on StyleFeatureEditor. It lets users modify facial attributes such as smile, age, beard, hair style/color, glasses, and makeup using AI-powered image editing.

## Table of Contents

1. [API Endpoints](#api-endpoints)
2. [Core Functions](#core-functions)
3. [Attribute Mapping](#attribute-mapping)
4. [Configuration](#configuration)
5. [Error Handling](#error-handling)
6. [Usage Examples](#usage-examples)
7. [Model Architecture](#model-architecture)
8. [Dependencies](#dependencies)
9. [Performance Considerations](#performance-considerations)
10. [Troubleshooting](#troubleshooting)

## API Endpoints

### Main Application Interface

The application is built with Gradio and provides a web-based interface with the following components:

#### Input Parameters

| Parameter | Type | Description | Default | Range |
|-----------|------|-------------|---------|-------|
| `image` | PIL.Image | Input face image | - | Any valid image format |
| `attribute` | str | Attribute to edit | "Smile" | See [Attribute Mapping](#attribute-mapping) |
| `strength` | float | Edit intensity | 5.0 | Varies by attribute |
| `align_face` | bool | Enable face alignment | False | True/False |
| `use_bg_mask` | bool | Use background masking | False | True/False |
| `custom_text_edit` | str | Custom text prompt | "" | StyleCLIP format |

#### Output

| Parameter | Type | Description |
|-----------|------|-------------|
| `edited_image` | PIL.Image | Edited face image |

## Core Functions

### `run_edit(image, attribute, strength, align_face, use_bg_mask, custom_text_edit)`

Main editing function that processes the input image and applies the specified attribute modification.

**Parameters:**

- `image` (PIL.Image): Input face image
- `attribute` (str): Attribute name from ATTRIBUTE_MAP
- `strength` (float): Edit intensity (automatically clipped to the valid range)
- `align_face` (bool): Whether to align the face before editing
- `use_bg_mask` (bool): Whether to use background masking
- `custom_text_edit` (str): Custom text prompt for StyleCLIP edits

**Returns:**

- `PIL.Image`: Edited image

**Process Flow:**

1. Load and initialize the SimpleRunner
2. Determine editing parameters from the attribute selection
3. Clip strength to the valid range
4. Process the image through the editing pipeline
5. Return the edited result

### `get_runner() -> SimpleRunner`

Singleton function that initializes and returns the SimpleRunner instance.

**Returns:**

- `SimpleRunner`: Configured runner instance

**Features:**

- Lazy initialization
- Automatic model weight downloading
- Error handling and logging

### `ensure_weights()`

Downloads the required model weights from Hugging Face if they are not present locally.
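The lazy-singleton pattern described for `get_runner()` can be sketched as follows. This is a minimal sketch, not the app's actual implementation: the stub class, the constructor argument, and the exact `ensure_weights()` call site are assumptions.

```python
_runner = None  # module-level cache for the single runner instance


class SimpleRunnerStub:
    """Stand-in for the real SimpleRunner (hypothetical)."""

    def __init__(self, config_path):
        self.config_path = config_path


def ensure_weights():
    """Placeholder: the real function downloads model weights from Hugging Face."""


def get_runner():
    """Lazily create and cache the runner; later calls reuse the same instance."""
    global _runner
    if _runner is None:
        ensure_weights()  # download weights on first use only
        _runner = SimpleRunnerStub("configs/simple_inference.yaml")
    return _runner
```

Because the instance is cached at module level, the expensive model initialization happens once per process; every subsequent `get_runner()` call returns the same object.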
**Required Files:**

- `sfe_editor_light.pt` - Main editor model
- `stylegan2-ffhq-config-f.pt` - StyleGAN2 generator
- `e4e_ffhq_encode.pt` - Encoder model
- `shape_predictor_68_face_landmarks.dat` - Face landmark predictor
- Additional supporting models

## Attribute Mapping

The application supports the following facial attributes:

### Face Semantics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Smile | `fs_smiling` | -10.0 to 10.0 | Positive adds smile, negative removes |
| Age | `age` | -10.0 to 10.0 | Positive makes older, negative makes younger |
| Female features | `gender` | -10.0 to 7.0 | Positive adds femininity |

### Facial Hair

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Beard | `trimmed_beard` | -30.0 to 30.0 | **Negative values ADD beard** |
| Mustache/Goatee | `goatee` | -7.0 to 7.0 | **Negative values ADD goatee** |

### Accessories & Cosmetics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Glasses | `fs_glasses` | -20.0 to 30.0 | Positive adds glasses, negative removes |
| Makeup | `fs_makeup` | -10.0 to 15.0 | Positive adds makeup, negative removes |

### Hair Style

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Curly hair | `curly_hair` | 0.0 to 0.12 | Adds curly hair texture |
| Afro | `afro` | 0.0 to 0.14 | Adds afro hairstyle |

### Hair Color (Text-based)

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Orange hair (text) | `styleclip_global_a face_a face with orange hair_0.18` | 0.0 to 0.2 | Changes hair to orange |
| Blonde hair (text) | `styleclip_global_a face_a face with blonde hair_0.18` | 0.0 to 0.2 | Changes hair to blonde |

## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `CUDA_VISIBLE_DEVICES` | GPU device selection | "" (CPU) |
| `TORCH_CUDA_ARCH_LIST` | CUDA architecture | "8.0" |
| `HF_TOKEN` | Hugging Face token | - |
| `HUGGINGFACE_TOKEN` | Alternative HF token | - |

### Model Configuration

The application uses the following configuration files:

- `configs/simple_inference.yaml` - Main inference configuration
- `pretrained_models/` - Directory containing all model weights

## Error Handling

### Common Error Scenarios

1. **Missing Model Weights**
   - Automatic download from Hugging Face
   - Fallback to CPU if GPU unavailable
2. **Face Detection Failures**
   - Multiple detection thresholds attempted
   - Graceful degradation without alignment
3. **Mask Extraction Failures**
   - Continues without background masking
   - Logs warnings for debugging
4. **Alignment Failures**
   - Falls back to unaligned processing
   - Preserves original image orientation

### Logging

The application uses Python's logging module at INFO level by default, covering:

- Model initialization status
- Edit process progress
- Error details and stack traces
- File download and verification

## Usage Examples

### Basic Smile Enhancement

```python
from PIL import Image
from app import run_edit

# Load input image
image = Image.open("input.jpg")

# Apply smile enhancement
edited = run_edit(
    image=image,
    attribute="Smile",
    strength=5.0,
    align_face=False,
    use_bg_mask=False,
    custom_text_edit=""
)

# Save result
edited.save("output.jpg")
```

### Custom Text-based Editing

```python
# Add a hat using a custom text prompt
edited = run_edit(
    image=image,
    attribute="Orange hair (text)",  # Must be a text-based attribute
    strength=0.18,
    align_face=True,
    use_bg_mask=True,
    custom_text_edit="styleclip_global_a face_a face with a hat_0.18"
)
```

### Beard Addition

```python
# Add beard (use negative values)
edited = run_edit(
    image=image,
    attribute="Beard",
    strength=-15.0,  # Negative value adds beard
    align_face=False,
    use_bg_mask=False,
    custom_text_edit=""
)
```

## Model Architecture

### Core Components

1. **SimpleRunner**: Main interface for image editing
2. **FSEInferenceRunner**: Handles model inference and editing
3. **LatentEditor**: Manages different editing directions
4. **StyleGAN2**: Generator for high-quality image synthesis
5. **E4E Encoder**: Encodes images to latent space

### Editing Methods

1. **InterfaceGAN Directions**: Age, smile, gender
2. **StyleSpace Directions**: Gender, facial features
3. **StyleCLIP Global Mapper**: Text-based editing
4. **DeltaEdit**: Advanced attribute manipulation

### Processing Pipeline

1. **Input Preprocessing**: Image normalization and resizing
2. **Face Alignment**: Optional landmark-based alignment
3. **Background Masking**: Optional face segmentation
4. **Latent Encoding**: Convert the image to a latent representation
5. **Attribute Editing**: Apply the desired modifications
6. **Image Synthesis**: Generate the edited result
7. **Post-processing**: Optional unalignment and blending

## Dependencies

### Core Dependencies

```
gradio==4.44.0
torch
torchvision
Pillow>=9.5
numpy>=1.23
opencv-python-headless==4.10.0.84
```

### AI/ML Dependencies

```
omegaconf==2.1.2
einops==0.7.0
timm==1.0.3
clip @ git+https://github.com/openai/CLIP.git
```

### Utility Dependencies

```
scipy==1.10.1
networkx==3.3
fsspec==2024.3.1
gdown==4.7.1
wandb==0.15.2
pandas==2.2.2
ninja>=1.11
```

### System Dependencies

```
dlib-binary
spaces>=0.28.3
setuptools>=68
wheel>=0.41
```

## Performance Considerations

### Memory Usage

- Model weights: ~2 GB total
- GPU memory: ~4 GB recommended
- CPU fallback available

### Processing Time

- Initialization: 30-60 seconds
- Per edit: 5-15 seconds (GPU), 30-60 seconds (CPU)
- Face alignment: +2-5 seconds
- Background masking: +3-8 seconds

### Optimization Tips

1. Use a GPU when available
2. Disable alignment for faster processing
3. Use background masking only when needed
4. Batch multiple edits when possible

## Troubleshooting

### Common Issues

1. **"No module named 'piq'"**
   - Install the missing dependency: `pip install piq`
2. **CUDA initialization errors**
   - Set `CUDA_VISIBLE_DEVICES=""` for CPU-only mode
   - Check GPU compatibility
3. **Face detection failures**
   - Ensure clear, well-lit face images
   - Try different alignment settings
   - Check image resolution (minimum 256x256)
4. **Model download failures**
   - Verify the Hugging Face token
   - Check internet connectivity
   - Ensure sufficient disk space

### Debug Mode

Enable detailed logging by setting:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## License and Credits

This application is based on the StyleFeatureEditor research project. Please refer to the original repository for licensing information and citations.

## Support

For issues and questions:

1. Check the troubleshooting section
2. Review the error logs
3. Verify input image quality
4. Test with different attribute combinations
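When testing different attribute combinations, the valid strength ranges from the Attribute Mapping section can be checked up front with a small helper. This is a sketch: the `clip_strength` helper is hypothetical (the app already clips internally), and `ATTRIBUTE_RANGES` copies only a few ranges from the tables above.

```python
# A few (min, max) strength ranges copied from the Attribute Mapping tables
ATTRIBUTE_RANGES = {
    "Smile": (-10.0, 10.0),
    "Beard": (-30.0, 30.0),
    "Glasses": (-20.0, 30.0),
    "Curly hair": (0.0, 0.12),
}


def clip_strength(attribute, strength):
    """Clamp a requested strength into the valid range for the attribute."""
    lo, hi = ATTRIBUTE_RANGES[attribute]
    return max(lo, min(hi, strength))
```

For example, `clip_strength("Smile", 42.0)` returns `10.0`, so an out-of-range request degrades to the strongest valid edit instead of producing an unexpected result.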