# Smile Changer API Documentation

## Overview

The Smile Changer is a facial attribute editing application built on StyleFeatureEditor. It lets users modify facial attributes such as smile, age, beard, hair style and color, glasses, and makeup.

## Table of Contents

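1. [API Endpoints](#api-endpoints)
2. [Core Functions](#core-functions)
3. [Attribute Mapping](#attribute-mapping)
4. [Configuration](#configuration)
5. [Error Handling](#error-handling)
6. [Usage Examples](#usage-examples)
7. [Model Architecture](#model-architecture)
8. [Dependencies](#dependencies)
9. [Performance Considerations](#performance-considerations)
10. [Troubleshooting](#troubleshooting)
11. [License and Credits](#license-and-credits)
12. [Support](#support)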

## API Endpoints

### Main Application Interface

The application is built using Gradio and provides a web-based interface with the following components:

#### Input Parameters

| Parameter | Type | Description | Default | Range |
|-----------|------|-------------|---------|-------|
| `image` | PIL.Image | Input face image | - | Any valid image format |
| `attribute` | str | Attribute to edit | "Smile" | See [Attribute Mapping](#attribute-mapping) |
| `strength` | float | Edit intensity | 5.0 | Varies by attribute |
| `align_face` | bool | Enable face alignment | False | True/False |
| `use_bg_mask` | bool | Use background masking | False | True/False |
| `custom_text_edit` | str | Custom text prompt | "" | StyleCLIP edit name, e.g. `styleclip_global_a face_a face with a hat_0.18` |

#### Output

| Parameter | Type | Description |
|-----------|------|-------------|
| `edited_image` | PIL.Image | Edited face image |
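
The parameter table above can be mirrored in a small container so that defaults live in one place. This is a minimal sketch: `EditRequest` and `as_kwargs` are hypothetical names that do not exist in the application, and `image` is typed loosely so the example runs without Pillow.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class EditRequest:
    """Hypothetical container mirroring the Input Parameters table."""

    image: Any  # a PIL.Image in the real application
    attribute: str = "Smile"
    strength: float = 5.0
    align_face: bool = False
    use_bg_mask: bool = False
    custom_text_edit: str = ""

    def as_kwargs(self) -> Dict[str, Any]:
        # Shallow copy of the fields, keyed by parameter name.
        return dict(vars(self))


# Only the values that differ from the documented defaults need to be given.
request = EditRequest(image=None, strength=7.5)
print(request.as_kwargs())
```

Because the keys match the documented parameter names, such a dictionary could be unpacked into a call like `run_edit(**request.as_kwargs())`.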

## Core Functions

### `run_edit(image, attribute, strength, align_face, use_bg_mask, custom_text_edit)`

Main editing function that processes the input image and applies the specified attribute modification.

**Parameters:**
- `image` (PIL.Image): Input face image
- `attribute` (str): Attribute name from `ATTRIBUTE_MAP` (see [Attribute Mapping](#attribute-mapping))
- `strength` (float): Edit intensity (automatically clipped to valid range)
- `align_face` (bool): Whether to align face before editing
- `use_bg_mask` (bool): Whether to use background masking
- `custom_text_edit` (str): Custom text prompt for StyleCLIP edits

**Returns:**
- `PIL.Image`: Edited image

**Process Flow:**
1. Load and initialize the SimpleRunner
2. Determine editing parameters from attribute selection
3. Apply strength clipping to valid range
4. Process image through the editing pipeline
5. Return edited result
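
Step 3 of the flow above (strength clipping) can be illustrated as follows. This is a sketch rather than the application's code: `STRENGTH_RANGES` and `clip_strength` are hypothetical names, and the ranges are copied from a few rows of the Attribute Mapping tables.

```python
from typing import Dict, Tuple

# Hypothetical range table; values taken from the Attribute Mapping section.
STRENGTH_RANGES: Dict[str, Tuple[float, float]] = {
    "Smile": (-10.0, 10.0),
    "Age": (-10.0, 10.0),
    "Beard": (-30.0, 30.0),
    "Glasses": (-20.0, 30.0),
    "Curly hair": (0.0, 0.12),
}


def clip_strength(attribute: str, strength: float) -> float:
    """Clamp strength into the documented range for the given attribute."""
    low, high = STRENGTH_RANGES[attribute]  # raises KeyError for unknown names
    return max(low, min(high, strength))


print(clip_strength("Smile", 25.0))      # 10.0
print(clip_strength("Curly hair", 5.0))  # 0.12
print(clip_strength("Beard", -15.0))     # -15.0
```

Note how the default strength of 5.0 would be clipped to 0.12 for "Curly hair", which is why the valid range matters when switching attributes.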

### `get_runner() -> SimpleRunner`

Singleton function that initializes and returns the SimpleRunner instance.

**Returns:**
- `SimpleRunner`: Configured runner instance

**Features:**
- Lazy initialization
- Automatic model weight downloading
- Error handling and logging
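
The lazy-initialization behaviour listed above might look roughly like this. `StubRunner` and `get_runner_sketch` are hypothetical stand-ins: the real `SimpleRunner` constructor and its arguments are not shown in this document.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class StubRunner:
    """Hypothetical stand-in for SimpleRunner."""

    created = 0  # counts how many instances were built

    def __init__(self) -> None:
        StubRunner.created += 1


_runner: Optional[StubRunner] = None


def get_runner_sketch() -> StubRunner:
    """Build the runner on first use, then reuse the same instance."""
    global _runner
    if _runner is None:
        logger.info("Initializing runner (first call only)")
        try:
            _runner = StubRunner()
        except Exception:
            # Log the stack trace, then let the caller see the failure.
            logger.exception("Runner initialization failed")
            raise
    return _runner


first = get_runner_sketch()
second = get_runner_sketch()
print(first is second, StubRunner.created)
```

Deferring construction to the first call keeps module import fast, which matters when initialization involves loading large model weights.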

### `ensure_weights()`

Downloads required model weights from Hugging Face if not present locally.

**Required Files:**
- `sfe_editor_light.pt` - Main editor model
- `stylegan2-ffhq-config-f.pt` - StyleGAN2 generator
- `e4e_ffhq_encode.pt` - Encoder model
- `shape_predictor_68_face_landmarks.dat` - Face landmark predictor
- Additional supporting models
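
A check for which of the named files are still missing could be sketched as below. `missing_weights` is a hypothetical helper, the `pretrained_models` default is taken from the Configuration section, and the unnamed "additional supporting models" are deliberately left out because the document does not list them. The download step itself is omitted.

```python
import tempfile
from pathlib import Path
from typing import List

# Only the files named explicitly in the list above.
REQUIRED_FILES = [
    "sfe_editor_light.pt",
    "stylegan2-ffhq-config-f.pt",
    "e4e_ffhq_encode.pt",
    "shape_predictor_68_face_landmarks.dat",
]


def missing_weights(weights_dir: str = "pretrained_models") -> List[str]:
    """Return the required file names that are not present in weights_dir."""
    root = Path(weights_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demonstration against a temporary directory holding one of the files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "e4e_ffhq_encode.pt").touch()
    still_missing = missing_weights(tmp)

print(still_missing)
```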

## Attribute Mapping

The application supports the following facial attributes:

### Face Semantics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Smile | `fs_smiling` | -10.0 to 10.0 | Positive adds smile, negative removes |
| Age | `age` | -10.0 to 10.0 | Positive makes older, negative makes younger |
| Female features | `gender` | -10.0 to 7.0 | Positive adds femininity |

### Facial Hair

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Beard | `trimmed_beard` | -30.0 to 30.0 | **Negative values ADD beard** |
| Mustache/Goatee | `goatee` | -7.0 to 7.0 | **Negative values ADD goatee** |
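
Because the sign convention for facial hair is inverted, a thin wrapper can make calling code less error-prone. The helper below is hypothetical and only restates the table: a non-negative "amount to add" is mapped to a negative strength, capped at the documented limit.

```python
def add_facial_hair_strength(amount: float, limit: float) -> float:
    """Map a non-negative 'amount to add' onto the negative strength axis."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    return -min(amount, limit)


beard = add_facial_hair_strength(15.0, limit=30.0)   # Beard range is -30.0 to 30.0
goatee = add_facial_hair_strength(10.0, limit=7.0)   # Goatee range is -7.0 to 7.0
print(beard, goatee)  # -15.0 -7.0
```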

### Accessories & Cosmetics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Glasses | `fs_glasses` | -20.0 to 30.0 | Positive adds glasses, negative removes |
| Makeup | `fs_makeup` | -10.0 to 15.0 | Positive adds makeup, negative removes |

### Hair Style

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Curly hair | `curly_hair` | 0.0 to 0.12 | Adds curly hair texture |
| Afro | `afro` | 0.0 to 0.14 | Adds afro hairstyle |

### Hair Color (Text-based)

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Orange hair (text) | `styleclip_global_a face_a face with orange hair_0.18` | 0.0 to 0.2 | Changes hair to orange |
| Blonde hair (text) | `styleclip_global_a face_a face with blonde hair_0.18` | 0.0 to 0.2 | Changes hair to blonde |
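
Both internal names above follow the pattern `styleclip_global_<neutral text>_<target text>_<number>`. The sketch below builds and parses such names under that assumption. The function names are hypothetical, the meaning of the trailing number is not stated in this document, and the parser assumes the prompts contain no underscores.

```python
from typing import Tuple

PREFIX = "styleclip_global_"


def build_edit_name(neutral: str, target: str, value: float) -> str:
    """Assemble an edit name in the pattern seen in the table."""
    return f"{PREFIX}{neutral}_{target}_{value}"


def parse_edit_name(name: str) -> Tuple[str, str, float]:
    """Split an edit name back into (neutral, target, value)."""
    if not name.startswith(PREFIX):
        raise ValueError(f"not a StyleCLIP global edit name: {name!r}")
    parts = name[len(PREFIX):].split("_")
    if len(parts) != 3:
        raise ValueError(f"expected 3 underscore-separated fields: {name!r}")
    neutral, target, value = parts
    return neutral, target, float(value)


name = build_edit_name("a face", "a face with blonde hair", 0.18)
print(name)
print(parse_edit_name(name))
```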

## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `CUDA_VISIBLE_DEVICES` | GPU device selection | "" (CPU) |
| `TORCH_CUDA_ARCH_LIST` | CUDA architecture | "8.0" |
| `HF_TOKEN` | Hugging Face token | - |
| `HUGGINGFACE_TOKEN` | Alternative HF token | - |
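
Since two variables can carry the Hugging Face token, the lookup might be written as a single helper. This is a sketch: `resolve_hf_token` is a hypothetical name, and giving `HF_TOKEN` precedence over `HUGGINGFACE_TOKEN` is an assumption, not documented behaviour.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token found, or None if neither is set."""
    # Assumed precedence: HF_TOKEN first, then HUGGINGFACE_TOKEN.
    return env.get("HF_TOKEN") or env.get("HUGGINGFACE_TOKEN") or None


print(resolve_hf_token({"HUGGINGFACE_TOKEN": "example-token"}))
print(resolve_hf_token({}))
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.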

### Model Configuration

The application uses the following configuration files:
- `configs/simple_inference.yaml` - Main inference configuration
- `pretrained_models/` - Directory containing all model weights

## Error Handling

### Common Error Scenarios

1. **Missing Model Weights**
   - Automatic download from Hugging Face
   - Fallback to CPU if GPU unavailable

2. **Face Detection Failures**
   - Multiple detection thresholds attempted
   - Graceful degradation without alignment

3. **Mask Extraction Failures**
   - Continues without background masking
   - Logs warnings for debugging

4. **Alignment Failures**
   - Falls back to unaligned processing
   - Preserves original image orientation
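
The "continue without the optional step" behaviour described in scenarios 2 to 4 can be expressed with a small wrapper. This is an illustrative pattern, not the application's code: `with_fallback` and `failing_alignment` are hypothetical names.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("smile_changer.sketch")


def with_fallback(step_name: str, step: Callable[[], T], fallback: T) -> T:
    """Run an optional step; on failure, log a warning and use the fallback."""
    try:
        return step()
    except Exception:
        logger.warning("%s failed; continuing without it", step_name, exc_info=True)
        return fallback


def failing_alignment() -> str:
    # Simulates a face detector that finds no face.
    raise RuntimeError("no face detected")


result = with_fallback("alignment", failing_alignment, fallback="unaligned image")
print(result)  # unaligned image
```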

### Logging

The application uses Python's logging module with INFO level by default:
- Model initialization status
- Edit process progress
- Error details and stack traces
- File download and verification

## Usage Examples

### Basic Smile Enhancement

```python
from PIL import Image
from app import run_edit

# Load input image
image = Image.open("input.jpg")

# Apply smile enhancement
edited = run_edit(
    image=image,
    attribute="Smile",
    strength=5.0,
    align_face=False,
    use_bg_mask=False,
    custom_text_edit=""
)

# Save result
edited.save("output.jpg")
```

### Custom Text-based Editing

```python
# Add a hat using a custom text prompt.
# `image` and `run_edit` come from the basic example above.
edited = run_edit(
    image=image,
    attribute="Orange hair (text)",  # Must be text-based attribute
    strength=0.18,
    align_face=True,
    use_bg_mask=True,
    custom_text_edit="styleclip_global_a face_a face with a hat_0.18"
)
```

### Beard Addition

```python
# Add beard (use negative values)
edited = run_edit(
    image=image,
    attribute="Beard",
    strength=-15.0,  # Negative value adds beard
    align_face=False,
    use_bg_mask=False,
    custom_text_edit=""
)
```

## Model Architecture

### Core Components

1. **SimpleRunner**: Main interface for image editing
2. **FSEInferenceRunner**: Handles model inference and editing
3. **LatentEditor**: Manages different editing directions
4. **StyleGAN2**: Generator for high-quality image synthesis
5. **E4E Encoder**: Encodes images to latent space

### Editing Methods

1. **InterFaceGAN Directions**: Age, smile, gender
2. **StyleSpace Directions**: Gender, facial features
3. **StyleCLIP Global Directions**: Text-based editing
4. **DeltaEdit**: Advanced attribute manipulation

### Processing Pipeline

1. **Input Preprocessing**: Image normalization and resizing
2. **Face Alignment**: Optional landmark-based alignment
3. **Background Masking**: Optional face segmentation
4. **Latent Encoding**: Convert image to latent representation
5. **Attribute Editing**: Apply desired modifications
6. **Image Synthesis**: Generate edited result
7. **Post-processing**: Optional unalignment and blending
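
To make the optional stages concrete, the sketch below lists which stages would run for a given combination of flags. The stage names and `planned_stages` are hypothetical, and tying post-processing to either flag being set is an assumption based on its description as "unalignment and blending".

```python
from typing import List


def planned_stages(align_face: bool, use_bg_mask: bool) -> List[str]:
    """Return the ordered pipeline stages for the given options."""
    stages = ["preprocess"]
    if align_face:
        stages.append("align")
    if use_bg_mask:
        stages.append("mask")
    stages += ["encode", "edit", "synthesize"]
    # Assumption: post-processing only matters when alignment or masking ran.
    if align_face or use_bg_mask:
        stages.append("postprocess")
    return stages


print(planned_stages(align_face=False, use_bg_mask=False))
print(planned_stages(align_face=True, use_bg_mask=True))
```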

## Dependencies

### Core Dependencies

```
gradio==4.44.0
torch
torchvision
Pillow>=9.5
numpy>=1.23
opencv-python-headless==4.10.0.84
```

### AI/ML Dependencies

```
omegaconf==2.1.2
einops==0.7.0
timm==1.0.3
clip @ git+https://github.com/openai/CLIP.git
```

### Utility Dependencies

```
scipy==1.10.1
networkx==3.3
fsspec==2024.3.1
gdown==4.7.1
wandb==0.15.2
pandas==2.2.2
ninja>=1.11
```

### System Dependencies

```
dlib-binary
spaces>=0.28.3
setuptools>=68
wheel>=0.41
```

## Performance Considerations

### Memory Usage
- Model weights: ~2 GB total
- GPU memory: ~4 GB recommended
- CPU fallback available

### Processing Time
- Initialization: 30-60 seconds
- Per edit: 5-15 seconds (GPU), 30-60 seconds (CPU)
- Face alignment: +2-5 seconds
- Background masking: +3-8 seconds

### Optimization Tips
1. Use GPU when available
2. Disable alignment for faster processing
3. Use background masking only when needed
4. Batch multiple edits when possible
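
Tip 4 can be read as "apply several edits in one session so the model is initialized only once". The sketch below shows that shape with an injected edit function. `apply_edits` and `fake_edit` are hypothetical, and in the real application the injected function would presumably wrap `run_edit`. Each edit here is applied to the original image independently rather than chained.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def apply_edits(
    image: Any,
    edits: Iterable[Tuple[str, float]],
    edit_fn: Callable[[Any, str, float], Any],
) -> Dict[str, Any]:
    """Apply each (attribute, strength) pair to the same input image."""
    results: Dict[str, Any] = {}
    for attribute, strength in edits:
        results[attribute] = edit_fn(image, attribute, strength)
    return results


def fake_edit(image: Any, attribute: str, strength: float) -> str:
    # Stand-in for the real edit call; returns a description instead of an image.
    return f"{image}:{attribute}:{strength}"


results = apply_edits("face", [("Smile", 5.0), ("Age", -3.0)], fake_edit)
print(results)
```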

## Troubleshooting

### Common Issues

1. **"No module named 'piq'"**
   - Install missing dependencies: `pip install piq`

2. **CUDA initialization errors**
   - Set `CUDA_VISIBLE_DEVICES=""` for CPU-only mode
   - Check GPU compatibility

3. **Face detection failures**
   - Ensure clear, well-lit face images
   - Try different alignment settings
   - Check image resolution (minimum 256x256)

4. **Model download failures**
   - Verify Hugging Face token
   - Check internet connectivity
   - Ensure sufficient disk space
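
The minimum-resolution check mentioned under issue 3 is easy to run before submitting an image. The helper below is a hypothetical sketch that takes a `(width, height)` tuple, which is the form returned by the `size` attribute of a `PIL.Image`.

```python
from typing import Tuple

MIN_SIDE = 256  # minimum width and height from the troubleshooting list


def meets_min_resolution(size: Tuple[int, int], min_side: int = MIN_SIDE) -> bool:
    """Return True if both dimensions are at least min_side pixels."""
    width, height = size
    return width >= min_side and height >= min_side


print(meets_min_resolution((1024, 1024)))  # True
print(meets_min_resolution((300, 200)))    # False
```

With Pillow installed, the call would be `meets_min_resolution(image.size)`.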

### Debug Mode

Enable detailed logging by setting:
```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## License and Credits

This application is based on the StyleFeatureEditor research project. Please refer to the original repository for licensing information and citations.

## Support

For issues and questions:
1. Check the troubleshooting section
2. Review error logs
3. Verify input image quality
4. Test with different attribute combinations