# Smile Changer API Documentation

## Overview

Smile Changer is a facial attribute editing application built on StyleFeatureEditor. It lets users modify facial attributes such as smile, age, beard, hair style and color, glasses, and makeup.

## Table of Contents

1. [API Endpoints](#api-endpoints)
2. [Core Functions](#core-functions)
3. [Attribute Mapping](#attribute-mapping)
4. [Configuration](#configuration)
5. [Error Handling](#error-handling)
6. [Usage Examples](#usage-examples)
7. [Model Architecture](#model-architecture)
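8. [Dependencies](#dependencies)
9. [Performance Considerations](#performance-considerations)
10. [Troubleshooting](#troubleshooting)
11. [License and Credits](#license-and-credits)
12. [Support](#support)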

## API Endpoints

### Main Application Interface

The application is built using Gradio and provides a web-based interface with the following components:

#### Input Parameters

| Parameter | Type | Description | Default | Range |
|-----------|------|-------------|---------|-------|
| `image` | PIL.Image | Input face image | - | Any valid image format |
| `attribute` | str | Attribute to edit | "Smile" | See [Attribute Mapping](#attribute-mapping) |
| `strength` | float | Edit intensity | 5.0 | Varies by attribute |
| `align_face` | bool | Enable face alignment | False | True/False |
| `use_bg_mask` | bool | Use background masking | False | True/False |
| `custom_text_edit` | str | Custom text prompt | "" | StyleCLIP format |

#### Output

| Parameter | Type | Description |
|-----------|------|-------------|
| `edited_image` | PIL.Image | Edited face image |

## Core Functions

### `run_edit(image, attribute, strength, align_face, use_bg_mask, custom_text_edit)`

Main editing function that processes the input image and applies the specified attribute modification.

**Parameters:**
- `image` (PIL.Image): Input face image
- `attribute` (str): Attribute name from ATTRIBUTE_MAP
- `strength` (float): Edit intensity (automatically clipped to valid range)
- `align_face` (bool): Whether to align face before editing
- `use_bg_mask` (bool): Whether to use background masking
- `custom_text_edit` (str): Custom text prompt for StyleCLIP edits

**Returns:**
- `PIL.Image`: Edited image

**Process Flow:**
1. Load and initialize the SimpleRunner
2. Determine editing parameters from attribute selection
3. Apply strength clipping to valid range
4. Process image through the editing pipeline
5. Return edited result

### `get_runner() -> SimpleRunner`

Singleton function that initializes and returns the SimpleRunner instance.

**Returns:**
- `SimpleRunner`: Configured runner instance

**Features:**
- Lazy initialization
- Automatic model weight downloading
- Error handling and logging

### `ensure_weights()`

Downloads required model weights from Hugging Face if not present locally.

**Required Files:**
- `sfe_editor_light.pt` - Main editor model
- `stylegan2-ffhq-config-f.pt` - StyleGAN2 generator
- `e4e_ffhq_encode.pt` - Encoder model
- `shape_predictor_68_face_landmarks.dat` - Face landmark predictor
- Additional supporting models
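
A check for which weights still need downloading might look like the sketch below. The file names come from the list above; the helper name `missing_weights` and the flat directory layout are assumptions, and the "additional supporting models" are left out because this document does not name them.

```python
from pathlib import Path

# File names taken from the list above; a flat directory layout is assumed.
REQUIRED_WEIGHTS = [
    "sfe_editor_light.pt",
    "stylegan2-ffhq-config-f.pt",
    "e4e_ffhq_encode.pt",
    "shape_predictor_68_face_landmarks.dat",
]


def missing_weights(weights_dir, required=REQUIRED_WEIGHTS):
    """Return the required file names that are not present in weights_dir."""
    root = Path(weights_dir)
    return [name for name in required if not (root / name).is_file()]
```

Each name returned by `missing_weights("pretrained_models")` could then be handed to a download routine such as `huggingface_hub.hf_hub_download`. The source repository ID is not given in this document, so it is not shown here.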

## Attribute Mapping

The application supports the following facial attributes:

### Face Semantics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Smile | `fs_smiling` | -10.0 to 10.0 | Positive adds smile, negative removes |
| Age | `age` | -10.0 to 10.0 | Positive makes older, negative makes younger |
| Female features | `gender` | -10.0 to 7.0 | Positive adds femininity |

### Facial Hair

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Beard | `trimmed_beard` | -30.0 to 30.0 | **Negative values ADD beard** |
| Mustache/Goatee | `goatee` | -7.0 to 7.0 | **Negative values ADD goatee** |

### Accessories & Cosmetics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Glasses | `fs_glasses` | -20.0 to 30.0 | Positive adds glasses, negative removes |
| Makeup | `fs_makeup` | -10.0 to 15.0 | Positive adds makeup, negative removes |

### Hair Style

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Curly hair | `curly_hair` | 0.0 to 0.12 | Adds curly hair texture |
| Afro | `afro` | 0.0 to 0.14 | Adds afro hairstyle |

### Hair Color (Text-based)

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Orange hair (text) | `styleclip_global_a face_a face with orange hair_0.18` | 0.0 to 0.2 | Changes hair to orange |
| Blonde hair (text) | `styleclip_global_a face_a face with blonde hair_0.18` | 0.0 to 0.2 | Changes hair to blonde |
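
The range clipping applied to `strength` can be illustrated with a small lookup. The ranges below are copied from the tables above; the dictionary layout and the `clip_strength` helper are hypothetical and may differ from the real `ATTRIBUTE_MAP`.

```python
# (internal name, minimum, maximum) per attribute, copied from the tables above.
# The structure is an assumption; the text-based hair colors are omitted for brevity.
ATTRIBUTE_RANGES = {
    "Smile": ("fs_smiling", -10.0, 10.0),
    "Age": ("age", -10.0, 10.0),
    "Female features": ("gender", -10.0, 7.0),
    "Beard": ("trimmed_beard", -30.0, 30.0),
    "Mustache/Goatee": ("goatee", -7.0, 7.0),
    "Glasses": ("fs_glasses", -20.0, 30.0),
    "Makeup": ("fs_makeup", -10.0, 15.0),
    "Curly hair": ("curly_hair", 0.0, 0.12),
    "Afro": ("afro", 0.0, 0.14),
}


def clip_strength(attribute, strength):
    """Return the internal edit name and the strength clamped to its range."""
    internal, low, high = ATTRIBUTE_RANGES[attribute]
    return internal, max(low, min(high, float(strength)))
```

Under this sketch, the default strength of 5.0 would be clamped to 0.12 for "Curly hair", which is why the hair attributes call for much smaller values than the others.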

## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `CUDA_VISIBLE_DEVICES` | GPU device selection | "" (CPU) |
| `TORCH_CUDA_ARCH_LIST` | CUDA architecture | "8.0" |
| `HF_TOKEN` | Hugging Face token | - |
| `HUGGINGFACE_TOKEN` | Alternative HF token | - |
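
One way to read these variables is sketched below. The defaults are the ones in the table; the helper names and the choice to check `HF_TOKEN` before `HUGGINGFACE_TOKEN` are assumptions.

```python
import os

# Defaults copied from the table above.
DEFAULTS = {"CUDA_VISIBLE_DEVICES": "", "TORCH_CUDA_ARCH_LIST": "8.0"}


def apply_env_defaults(env=None):
    """Fill in the documented defaults without overriding values already set."""
    env = os.environ if env is None else env
    for key, value in DEFAULTS.items():
        env.setdefault(key, value)
    return env


def resolve_hf_token(env=None):
    """Return the first non-empty token, or None if neither variable is set.

    Checking HF_TOKEN first is an assumption about precedence.
    """
    env = os.environ if env is None else env
    return env.get("HF_TOKEN") or env.get("HUGGINGFACE_TOKEN") or None
```

Passing a plain dictionary instead of `os.environ` makes both helpers easy to exercise without touching the real environment.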

### Model Configuration

The application uses the following configuration files:
- `configs/simple_inference.yaml` - Main inference configuration
- `pretrained_models/` - Directory containing all model weights

## Error Handling

### Common Error Scenarios

1. **Missing Model Weights**
   - Automatic download from Hugging Face
   - Fallback to CPU if GPU unavailable

2. **Face Detection Failures**
   - Multiple detection thresholds attempted
   - Graceful degradation without alignment

3. **Mask Extraction Failures**
   - Continues without background masking
   - Logs warnings for debugging

4. **Alignment Failures**
   - Falls back to unaligned processing
   - Preserves original image orientation
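
The alignment and masking fallbacks described above follow a common pattern: attempt the optional step, log a warning on failure, and continue with the unmodified input. A minimal sketch, in which `preprocess`, `align_fn`, and `mask_fn` are hypothetical names rather than the application's own functions:

```python
import logging

logger = logging.getLogger(__name__)


def preprocess(image, align_face, use_bg_mask, align_fn, mask_fn):
    """Run the optional steps, falling back to the plain input when one fails."""
    aligned, mask = image, None
    if align_face:
        try:
            aligned = align_fn(image)
        except Exception:
            # Fall back to unaligned processing.
            logger.warning("Alignment failed; continuing unaligned", exc_info=True)
            aligned = image
    if use_bg_mask:
        try:
            mask = mask_fn(aligned)
        except Exception:
            # Continue without background masking.
            logger.warning("Mask extraction failed; continuing without a mask", exc_info=True)
            mask = None
    return aligned, mask
```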

### Logging

The application uses Python's `logging` module at INFO level by default and records:
- Model initialization status
- Edit process progress
- Error details and stack traces
- File download and verification

## Usage Examples

### Basic Smile Enhancement

```python
from PIL import Image
from app import run_edit

# Load input image
image = Image.open("input.jpg")

# Apply smile enhancement
edited = run_edit(
    image=image,
    attribute="Smile",
    strength=5.0,
    align_face=False,
    use_bg_mask=False,
    custom_text_edit=""
)

# Save result
edited.save("output.jpg")
```

### Custom Text-based Editing

```python
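from PIL import Image
from app import run_edit

# Load input image
image = Image.open("input.jpg")

# Add a hat using a custom text prompt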
edited = run_edit(
    image=image,
    attribute="Orange hair (text)",  # Must be text-based attribute
    strength=0.18,
    align_face=True,
    use_bg_mask=True,
    custom_text_edit="styleclip_global_a face_a face with a hat_0.18"
)
```
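
The text-based edit names in this document all share the shape `styleclip_global_<neutral prompt>_<target prompt>_<number>`. The helper below assembles such a string. It is a sketch inferred from those examples only: the function name is hypothetical, and the meaning of the trailing number is not stated in this document.

```python
def styleclip_edit_name(neutral, target, value):
    """Build an edit name shaped like the text-based examples in this document."""
    for prompt in (neutral, target):
        # Underscores separate the fields, so a prompt containing one would be ambiguous.
        if "_" in prompt:
            raise ValueError("prompts must not contain underscores")
    return f"styleclip_global_{neutral}_{target}_{value}"


name = styleclip_edit_name("a face", "a face with a hat", 0.18)
```

The resulting `name` could be passed as `custom_text_edit` in the call above.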

### Beard Addition

```python
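from PIL import Image
from app import run_edit

# Load input image
image = Image.open("input.jpg")

# Add a beard (use negative values)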
edited = run_edit(
    image=image,
    attribute="Beard",
    strength=-15.0,  # Negative value adds beard
    align_face=False,
    use_bg_mask=False,
    custom_text_edit=""
)
```

## Model Architecture

### Core Components

1. **SimpleRunner**: Main interface for image editing
2. **FSEInferenceRunner**: Handles model inference and editing
3. **LatentEditor**: Manages different editing directions
4. **StyleGAN2**: Generator for high-quality image synthesis
5. **E4E Encoder**: Encodes images to latent space

### Editing Methods

1. **InterFaceGAN Directions**: Age, smile, gender
2. **StyleSpace Directions**: Gender, facial features
3. **StyleCLIP Global Mapper**: Text-based editing
4. **DeltaEdit**: Advanced attribute manipulation

### Processing Pipeline

1. **Input Preprocessing**: Image normalization and resizing
2. **Face Alignment**: Optional landmark-based alignment
3. **Background Masking**: Optional face segmentation
4. **Latent Encoding**: Convert image to latent representation
5. **Attribute Editing**: Apply desired modifications
6. **Image Synthesis**: Generate edited result
7. **Post-processing**: Optional unalignment and blending
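
To make the optional stages concrete, the sketch below lists which stages would run for a given pair of flags. The stage names mirror the list above. The rule that post-processing runs only when alignment or masking was enabled is an assumption based on its description as "optional unalignment and blending".

```python
def planned_stages(align_face, use_bg_mask):
    """Return the ordered stage names that would run for the given flags."""
    stages = ["input_preprocessing"]
    if align_face:
        stages.append("face_alignment")
    if use_bg_mask:
        stages.append("background_masking")
    stages += ["latent_encoding", "attribute_editing", "image_synthesis"]
    # Assumption: unalignment and blending are only needed after alignment or masking.
    if align_face or use_bg_mask:
        stages.append("post_processing")
    return stages
```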

## Dependencies

### Core Dependencies

```
gradio==4.44.0
torch
torchvision
Pillow>=9.5
numpy>=1.23
opencv-python-headless==4.10.0.84
```

### AI/ML Dependencies

```
omegaconf==2.1.2
einops==0.7.0
timm==1.0.3
clip @ git+https://github.com/openai/CLIP.git
```

### Utility Dependencies

```
scipy==1.10.1
networkx==3.3
fsspec==2024.3.1
gdown==4.7.1
wandb==0.15.2
pandas==2.2.2
ninja>=1.11
```

### System Dependencies

```
dlib-binary
spaces>=0.28.3
setuptools>=68
wheel>=0.41
```

## Performance Considerations

### Memory Usage
- Model weights: ~2 GB total
- GPU memory: ~4 GB recommended
- CPU fallback available

### Processing Time
- Initialization: 30-60 seconds
- Per edit: 5-15 seconds (GPU), 30-60 seconds (CPU)
- Face alignment: +2-5 seconds
- Background masking: +3-8 seconds
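
The figures above can be combined into a rough per-edit estimate. The sketch below simply adds the documented ranges. Treating the alignment and masking overheads as the same on GPU and CPU is an assumption, and the helper name is hypothetical.

```python
# Per-edit ranges in seconds, copied from the figures above.
BASE_SECONDS = {"gpu": (5, 15), "cpu": (30, 60)}
ALIGN_SECONDS = (2, 5)
MASK_SECONDS = (3, 8)


def estimate_edit_seconds(device, align_face=False, use_bg_mask=False):
    """Return a (low, high) estimate in seconds for a single edit."""
    low, high = BASE_SECONDS[device]
    if align_face:
        low, high = low + ALIGN_SECONDS[0], high + ALIGN_SECONDS[1]
    if use_bg_mask:
        low, high = low + MASK_SECONDS[0], high + MASK_SECONDS[1]
    return low, high
```

For example, a GPU edit with both alignment and masking enabled comes out at roughly 10 to 28 seconds, excluding the one-time initialization.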

### Optimization Tips
1. Use GPU when available
2. Disable alignment for faster processing
3. Use background masking only when needed
4. Batch multiple edits when possible

## Troubleshooting

### Common Issues

1. **"No module named 'piq'"**
   - `piq` is not in the dependency lists above, so install it separately: `pip install piq`

2. **CUDA initialization errors**
   - Set `CUDA_VISIBLE_DEVICES=""` for CPU-only mode
   - Check GPU compatibility

3. **Face detection failures**
   - Ensure clear, well-lit face images
   - Try toggling `align_face` on or off
   - Check image resolution (minimum 256x256)

4. **Model download failures**
   - Verify Hugging Face token
   - Check internet connectivity
   - Ensure sufficient disk space

### Debug Mode

Enable detailed logging by setting the log level to DEBUG:
```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## License and Credits

This application is based on the StyleFeatureEditor research project. Please refer to the original repository for licensing information and citations.

## Support

For issues and questions:
1. Check the troubleshooting section
2. Review error logs
3. Verify input image quality
4. Test with different attribute combinations