# Smile Changer API Documentation

## Overview

The Smile Changer is a facial attribute editing application built on StyleFeatureEditor. It lets users modify facial attributes such as smile, age, beard, hair style and color, glasses, and makeup using AI-powered image editing.

## Table of Contents

1. [API Endpoints](#api-endpoints)
2. [Core Functions](#core-functions)
3. [Attribute Mapping](#attribute-mapping)
4. [Configuration](#configuration)
5. [Error Handling](#error-handling)
6. [Usage Examples](#usage-examples)
7. [Model Architecture](#model-architecture)
8. [Dependencies](#dependencies)
9. [Performance Considerations](#performance-considerations)
10. [Troubleshooting](#troubleshooting)
11. [License and Credits](#license-and-credits)
12. [Support](#support)

## API Endpoints

### Main Application Interface

The application is built with Gradio and provides a web-based interface with the following components:

#### Input Parameters

| Parameter | Type | Description | Default | Range |
|-----------|------|-------------|---------|-------|
| `image` | PIL.Image | Input face image | - | Any valid image format |
| `attribute` | str | Attribute to edit | "Smile" | See [Attribute Mapping](#attribute-mapping) |
| `strength` | float | Edit intensity | 5.0 | Varies by attribute |
| `align_face` | bool | Enable face alignment | False | True/False |
| `use_bg_mask` | bool | Use background masking | False | True/False |
| `custom_text_edit` | str | Custom text prompt | "" | StyleCLIP format |

#### Output

| Parameter | Type | Description |
|-----------|------|-------------|
| `edited_image` | PIL.Image | Edited face image |
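
The input table above can be mirrored in a small typed container that keeps the documented defaults in one place. This is only a sketch: `EditRequest` and `as_kwargs` are hypothetical names that do not come from the application, and `image` is typed loosely so the snippet runs without Pillow installed.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class EditRequest:
    """Hypothetical container mirroring the documented input parameters."""

    image: Any  # a PIL.Image in the real application
    attribute: str = "Smile"
    strength: float = 5.0
    align_face: bool = False
    use_bg_mask: bool = False
    custom_text_edit: str = ""

    def as_kwargs(self) -> Dict[str, Any]:
        """Return the fields as keyword arguments, without copying the image."""
        return {f.name: getattr(self, f.name) for f in fields(self)}


# Only the values that differ from the documented defaults need to be given.
request = EditRequest(image=None, attribute="Age", strength=-3.0)
print(request.as_kwargs())
```

The resulting dictionary has the same keys as the parameters of `run_edit`, so it could be unpacked into a call with `**request.as_kwargs()`.
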
## Core Functions

### `run_edit(image, attribute, strength, align_face, use_bg_mask, custom_text_edit)`

Main editing function that processes the input image and applies the specified attribute modification.

**Parameters:**

- `image` (PIL.Image): Input face image
- `attribute` (str): Attribute name from `ATTRIBUTE_MAP`
- `strength` (float): Edit intensity (automatically clipped to the attribute's valid range)
- `align_face` (bool): Whether to align the face before editing
- `use_bg_mask` (bool): Whether to use background masking
- `custom_text_edit` (str): Custom text prompt for StyleCLIP edits

**Returns:**

- `PIL.Image`: Edited image

**Process Flow:**

1. Load and initialize the SimpleRunner
2. Determine editing parameters from attribute selection
3. Apply strength clipping to valid range
4. Process image through the editing pipeline
5. Return edited result
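
The process flow can be sketched roughly as follows. This illustrates the order of operations and the clipping step only; it is not the application's code. `StubRunner`, its `edit` method and keyword names, and the tuple layout of `ATTRIBUTE_MAP` are hypothetical stand-ins. The two ranges are copied from the Attribute Mapping section.

```python
from typing import Any, Dict, Tuple

# Hypothetical layout: display name -> (internal name, min strength, max strength).
ATTRIBUTE_MAP: Dict[str, Tuple[str, float, float]] = {
    "Smile": ("fs_smiling", -10.0, 10.0),
    "Beard": ("trimmed_beard", -30.0, 30.0),
}


class StubRunner:
    """Stand-in for SimpleRunner; it only records what it was asked to do."""

    def edit(self, image: Any, editing_name: str, power: float, **options: bool) -> dict:
        return {"image": image, "editing_name": editing_name, "power": power, **options}


def clip_strength(strength: float, low: float, high: float) -> float:
    """Clamp strength into the closed interval [low, high]."""
    return max(low, min(high, strength))


def run_edit_sketch(runner, image, attribute, strength, align_face=False, use_bg_mask=False):
    # Step 1 (obtaining the runner) is left to the caller in this sketch.
    internal_name, low, high = ATTRIBUTE_MAP[attribute]  # step 2: look up parameters
    power = clip_strength(strength, low, high)           # step 3: clip to valid range
    # Steps 4-5: run the pipeline and hand back its result.
    return runner.edit(image, internal_name, power, align=align_face, use_mask=use_bg_mask)


# A strength of 25.0 exceeds the documented Smile range and is clipped to 10.0.
result = run_edit_sketch(StubRunner(), "input-image", "Smile", 25.0)
print(result["editing_name"], result["power"])
```
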
### `get_runner() -> SimpleRunner`

Singleton function that initializes and returns the SimpleRunner instance.

**Returns:**

- `SimpleRunner`: Configured runner instance

**Features:**

- Lazy initialization
- Automatic model weight downloading
- Error handling and logging
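
A lazily initialized singleton of this kind can be written as below. This is a minimal sketch, not the application's implementation: `get_runner_sketch` and the `factory` argument are hypothetical, and the lock is an addition for safety under concurrent requests (the source does not say whether the application uses one).

```python
import logging
import threading
from typing import Callable, Optional

logger = logging.getLogger(__name__)

_runner: Optional[object] = None
_runner_lock = threading.Lock()


def get_runner_sketch(factory: Callable[[], object]) -> object:
    """Create the runner on first use and return the same instance afterwards."""
    global _runner
    if _runner is None:
        with _runner_lock:
            if _runner is None:  # re-check: another thread may have won the race
                try:
                    _runner = factory()
                    logger.info("Runner initialized")
                except Exception:
                    # Log the stack trace, then let the caller see the failure.
                    logger.exception("Runner initialization failed")
                    raise
    return _runner
```

In the real application the factory would presumably download any missing weights and then construct `SimpleRunner`. A failed initialization leaves `_runner` unset, so the next call retries.
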
### `ensure_weights()`

Downloads required model weights from Hugging Face if they are not present locally.

**Required Files:**

- `sfe_editor_light.pt` - Main editor model
- `stylegan2-ffhq-config-f.pt` - StyleGAN2 generator
- `e4e_ffhq_encode.pt` - Encoder model
- `shape_predictor_68_face_landmarks.dat` - Face landmark predictor
- Additional supporting models
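
The check-then-download behavior can be sketched as follows. The file names come from the list above; everything else is an assumption: `missing_weights`, `ensure_weights_sketch`, and the injected `download` callable are hypothetical, and the flat directory layout may not match the real one. In practice `download` might wrap `huggingface_hub.hf_hub_download`, but the repository ID is not given in this document.

```python
from pathlib import Path
from typing import Callable, Iterable, List

# Named in the documentation; the "additional supporting models" are not listed there.
REQUIRED_FILES = (
    "sfe_editor_light.pt",
    "stylegan2-ffhq-config-f.pt",
    "e4e_ffhq_encode.pt",
    "shape_predictor_68_face_landmarks.dat",
)


def missing_weights(weights_dir: Path, required: Iterable[str] = REQUIRED_FILES) -> List[str]:
    """Return the required file names that are not present in weights_dir."""
    return [name for name in required if not (weights_dir / name).is_file()]


def ensure_weights_sketch(weights_dir: Path, download: Callable[[str, Path], None]) -> List[str]:
    """Download every missing file and return the names that were fetched."""
    weights_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in missing_weights(weights_dir):
        download(name, weights_dir)  # hypothetical downloader supplied by the caller
        fetched.append(name)
    return fetched
```

Passing the downloader in as an argument keeps the presence check testable without network access.
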
## Attribute Mapping

The application supports the following facial attributes:

### Face Semantics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Smile | `fs_smiling` | -10.0 to 10.0 | Positive adds smile, negative removes |
| Age | `age` | -10.0 to 10.0 | Positive makes older, negative makes younger |
| Female features | `gender` | -10.0 to 7.0 | Positive adds femininity |

### Facial Hair

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Beard | `trimmed_beard` | -30.0 to 30.0 | **Negative values ADD beard** |
| Mustache/Goatee | `goatee` | -7.0 to 7.0 | **Negative values ADD goatee** |

### Accessories & Cosmetics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Glasses | `fs_glasses` | -20.0 to 30.0 | Positive adds glasses, negative removes |
| Makeup | `fs_makeup` | -10.0 to 15.0 | Positive adds makeup, negative removes |

### Hair Style

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Curly hair | `curly_hair` | 0.0 to 0.12 | Adds curly hair texture |
| Afro | `afro` | 0.0 to 0.14 | Adds afro hairstyle |

### Hair Color (Text-based)

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Orange hair (text) | `styleclip_global_a face_a face with orange hair_0.18` | 0.0 to 0.2 | Changes hair to orange |
| Blonde hair (text) | `styleclip_global_a face_a face with blonde hair_0.18` | 0.0 to 0.2 | Changes hair to blonde |
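
The two text-based internal names appear to follow the pattern `styleclip_global_<neutral prompt>_<target prompt>_<number>`. The sketch below builds and parses names of that shape. The pattern is inferred from the table rows only, the helper names are hypothetical, and the meaning of the trailing number is not stated here, so the code treats it as an opaque value.

```python
from typing import NamedTuple

PREFIX = "styleclip_global_"


class StyleClipEdit(NamedTuple):
    neutral: str  # e.g. "a face"
    target: str   # e.g. "a face with orange hair"
    value: float  # trailing number; its meaning is not documented here


def build_styleclip_name(neutral: str, target: str, value: float) -> str:
    """Join the parts with underscores, as in the table's internal names."""
    if "_" in neutral or "_" in target:
        raise ValueError("prompts must not contain underscores (used as separators)")
    return f"{PREFIX}{neutral}_{target}_{value}"


def parse_styleclip_name(name: str) -> StyleClipEdit:
    """Split an internal name back into its three parts."""
    if not name.startswith(PREFIX):
        raise ValueError(f"not a StyleCLIP global edit name: {name!r}")
    parts = name[len(PREFIX):].split("_")
    if len(parts) != 3:
        raise ValueError(f"expected three underscore-separated parts: {name!r}")
    neutral, target, value = parts
    return StyleClipEdit(neutral, target, float(value))


name = build_styleclip_name("a face", "a face with orange hair", 0.18)
print(name)
print(parse_styleclip_name(name))
```
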
## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `CUDA_VISIBLE_DEVICES` | GPU device selection | "" (CPU) |
| `TORCH_CUDA_ARCH_LIST` | CUDA architecture | "8.0" |
| `HF_TOKEN` | Hugging Face token | - |
| `HUGGINGFACE_TOKEN` | Alternative HF token | - |
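
Since two variables can carry the Hugging Face token, a lookup helper has to pick an order. The sketch below checks `HF_TOKEN` first and falls back to `HUGGINGFACE_TOKEN`. That precedence is an assumption (the table only calls the second one an alternative), and `resolve_hf_token` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token found, or None when neither is set."""
    for name in ("HF_TOKEN", "HUGGINGFACE_TOKEN"):  # assumed precedence
        value = env.get(name, "").strip()
        if value:
            return value
    return None


token = resolve_hf_token()
print("token configured" if token else "no token configured")
```
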
### Model Configuration

The application uses the following configuration file and weights directory:

- `configs/simple_inference.yaml` - Main inference configuration
- `pretrained_models/` - Directory containing all model weights

## Error Handling

### Common Error Scenarios

1. **Missing Model Weights**
   - Automatic download from Hugging Face
   - Fallback to CPU if GPU unavailable
2. **Face Detection Failures**
   - Multiple detection thresholds attempted
   - Graceful degradation without alignment
3. **Mask Extraction Failures**
   - Continues without background masking
   - Logs warnings for debugging
4. **Alignment Failures**
   - Falls back to unaligned processing
   - Preserves original image orientation
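
Several of these scenarios share one pattern: try an optional step, and on failure log a warning and carry on with the unmodified input. The helper below is a generic sketch of that pattern. `with_fallback` is a hypothetical name, and the application may well handle each case inline instead.

```python
import logging
from typing import Callable, Tuple, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


def with_fallback(step_name: str, step: Callable[[T], T], value: T) -> Tuple[T, bool]:
    """Run an optional step; on failure, warn and return the input unchanged.

    The boolean tells the caller whether the step actually succeeded.
    """
    try:
        return step(value), True
    except Exception as exc:
        logger.warning("%s failed (%s); continuing without it", step_name, exc)
        return value, False


def failing_alignment(image: str) -> str:
    # Stand-in for an alignment step that cannot find a face.
    raise RuntimeError("no face detected")


image, aligned = with_fallback("Face alignment", failing_alignment, "input-image")
print(image, aligned)
```

Returning the success flag lets later stages skip work that only makes sense after alignment, such as mapping the result back to the original framing.
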
### Logging

The application uses Python's `logging` module at INFO level by default. It logs:

- Model initialization status
- Edit process progress
- Error details and stack traces
- File download and verification

## Usage Examples

### Basic Smile Enhancement

```python
from PIL import Image
from app import run_edit

# Load input image
image = Image.open("input.jpg")

# Apply smile enhancement
edited = run_edit(
    image=image,
    attribute="Smile",
    strength=5.0,
    align_face=False,
    use_bg_mask=False,
    custom_text_edit="",
)

# Save result
edited.save("output.jpg")
```

### Custom Text-based Editing

```python
from PIL import Image
from app import run_edit

image = Image.open("input.jpg")

# Add hat using custom text prompt
edited = run_edit(
    image=image,
    attribute="Orange hair (text)",  # Must be text-based attribute
    strength=0.18,
    align_face=True,
    use_bg_mask=True,
    custom_text_edit="styleclip_global_a face_a face with a hat_0.18",
)
```

### Beard Addition

```python
from PIL import Image
from app import run_edit

image = Image.open("input.jpg")

# Add beard (use negative values)
edited = run_edit(
    image=image,
    attribute="Beard",
    strength=-15.0,  # Negative value adds beard
    align_face=False,
    use_bg_mask=False,
    custom_text_edit="",
)
```

## Model Architecture

### Core Components

1. **SimpleRunner**: Main interface for image editing
2. **FSEInferenceRunner**: Handles model inference and editing
3. **LatentEditor**: Manages different editing directions
4. **StyleGAN2**: Generator for high-quality image synthesis
5. **E4E Encoder**: Encodes images to latent space

### Editing Methods

1. **InterFaceGAN Directions**: Age, smile, gender
2. **StyleSpace Directions**: Gender, facial features
3. **StyleCLIP Global Mapper**: Text-based editing
4. **DeltaEdit**: Advanced attribute manipulation

### Processing Pipeline

1. **Input Preprocessing**: Image normalization and resizing
2. **Face Alignment**: Optional landmark-based alignment
3. **Background Masking**: Optional face segmentation
4. **Latent Encoding**: Convert image to latent representation
5. **Attribute Editing**: Apply desired modifications
6. **Image Synthesis**: Generate edited result
7. **Post-processing**: Optional unalignment and blending
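
Which stages run depends on the two optional flags. The sketch below only computes the ordered stage names for a given combination. The stage labels and `build_pipeline` are hypothetical, and tying post-processing to the two flags is an assumption drawn from the word "optional" in step 7.

```python
from typing import List


def build_pipeline(align_face: bool, use_bg_mask: bool) -> List[str]:
    """Return the ordered stage names for the given options."""
    stages = ["preprocess"]
    if align_face:
        stages.append("align")  # optional landmark-based alignment
    if use_bg_mask:
        stages.append("mask")   # optional face segmentation
    stages += ["encode", "edit", "synthesize"]
    if align_face or use_bg_mask:
        # Assumed: unalignment and blending only apply after alignment or masking.
        stages.append("postprocess")
    return stages


print(build_pipeline(align_face=False, use_bg_mask=False))
print(build_pipeline(align_face=True, use_bg_mask=True))
```
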
## Dependencies

### Core Dependencies

```
gradio==4.44.0
torch
torchvision
Pillow>=9.5
numpy>=1.23
opencv-python-headless==4.10.0.84
```

### AI/ML Dependencies

```
omegaconf==2.1.2
einops==0.7.0
timm==1.0.3
clip @ git+https://github.com/openai/CLIP.git
```

### Utility Dependencies

```
scipy==1.10.1
networkx==3.3
fsspec==2024.3.1
gdown==4.7.1
wandb==0.15.2
pandas==2.2.2
ninja>=1.11
```

### System Dependencies

```
dlib-binary
spaces>=0.28.3
setuptools>=68
wheel>=0.41
```

## Performance Considerations

### Memory Usage

- Model weights: ~2 GB total
- GPU memory: ~4 GB recommended
- CPU fallback available

### Processing Time

- Initialization: 30-60 seconds
- Per edit: 5-15 seconds (GPU), 30-60 seconds (CPU)
- Face alignment: +2-5 seconds
- Background masking: +3-8 seconds

### Optimization Tips

1. Use GPU when available
2. Disable alignment for faster processing
3. Use background masking only when needed
4. Batch multiple edits when possible
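
One reading of tip 4 is to apply several edits in a single session so that the 30-60 second initialization is paid only once. The sketch below loops over `(attribute, strength)` pairs and calls an edit function with the `run_edit` keyword arguments. `run_edits` is a hypothetical helper, and whether the application supports true batched inference is not stated.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def run_edits(
    edit_fn: Callable[..., Any],
    image: Any,
    edits: Iterable[Tuple[str, float]],
) -> Dict[str, Any]:
    """Apply several (attribute, strength) edits to one image, one call per edit."""
    results: Dict[str, Any] = {}
    for attribute, strength in edits:
        results[attribute] = edit_fn(
            image=image,
            attribute=attribute,
            strength=strength,
            align_face=False,   # tip 2: alignment off for speed
            use_bg_mask=False,  # tip 3: masking off unless needed
            custom_text_edit="",
        )
    return results
```

With the real application this could be called as `run_edits(run_edit, image, [("Smile", 5.0), ("Age", -3.0)])`. Each edit is applied to the original image rather than stacked on the previous result.
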
## Troubleshooting

### Common Issues

1. **"No module named 'piq'"**
   - Install the missing dependency: `pip install piq`
2. **CUDA initialization errors**
   - Set `CUDA_VISIBLE_DEVICES=""` for CPU-only mode
   - Check GPU compatibility
3. **Face detection failures**
   - Ensure clear, well-lit face images
   - Try different alignment settings
   - Check image resolution (minimum 256x256)
4. **Model download failures**
   - Verify Hugging Face token
   - Check internet connectivity
   - Ensure sufficient disk space
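
The resolution check from item 3 is easy to run before an edit. This sketch takes a `(width, height)` tuple, which is what a Pillow image exposes as `image.size`. `meets_min_resolution` is a hypothetical helper, and the 256-pixel threshold is the minimum quoted above.

```python
from typing import Tuple

MIN_SIDE = 256  # minimum resolution named in the troubleshooting list


def meets_min_resolution(size: Tuple[int, int], min_side: int = MIN_SIDE) -> bool:
    """Return True when both width and height are at least min_side pixels."""
    width, height = size
    return width >= min_side and height >= min_side


print(meets_min_resolution((512, 512)))
print(meets_min_resolution((1024, 200)))
```
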
### Debug Mode

Enable detailed logging by setting:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
```

## License and Credits

This application is based on the StyleFeatureEditor research project. Please refer to the original repository for licensing information and citations.

## Support

For issues and questions:

1. Check the troubleshooting section
2. Review error logs
3. Verify input image quality
4. Test with different attribute combinations