LogicGoInfotechSpaces committed on
Commit 57bfe5c · 1 Parent(s): 0154081

API: add FastAPI endpoints, bearer auth, align default true; docs updated

Files changed (3)
  1. API_DOCUMENTATION.md +354 -0
  2. app.py +113 -3
  3. requirements.txt +2 -0
API_DOCUMENTATION.md ADDED
@@ -0,0 +1,354 @@
# Smile Changer API Documentation

## Overview

The Smile Changer is a facial attribute editing application built on StyleFeatureEditor. It lets users modify facial attributes such as smile, age, beard, hair style/color, glasses, and makeup using AI-powered image editing.

## Table of Contents

1. [API Endpoints](#api-endpoints)
2. [Core Functions](#core-functions)
3. [Attribute Mapping](#attribute-mapping)
4. [Configuration](#configuration)
5. [Error Handling](#error-handling)
6. [Usage Examples](#usage-examples)
7. [Model Architecture](#model-architecture)
8. [Dependencies](#dependencies)

## API Endpoints

### Main Application Interface

The application is built with Gradio and provides a web-based interface with the following components:

#### Input Parameters

| Parameter | Type | Description | Default | Range |
|-----------|------|-------------|---------|-------|
| `image` | PIL.Image | Input face image | - | Any valid image format |
| `attribute` | str | Attribute to edit | "Smile" | See [Attribute Mapping](#attribute-mapping) |
| `strength` | float | Edit intensity | 5.0 | Varies by attribute |
| `align_face` | bool | Enable face alignment | True | True/False |
| `use_bg_mask` | bool | Use background masking | False | True/False |
| `custom_text_edit` | str | Custom text prompt | "" | StyleCLIP format |

#### Output

| Parameter | Type | Description |
|-----------|------|-------------|
| `edited_image` | PIL.Image | Edited face image |
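This commit also exposes the editor over REST (`POST /api/edit` in `app.py`, bearer-token auth read from `API_AUTH_TOKEN`). A stdlib-only sketch of the request an external client would send; the URL, filename, and token value are assumptions, and the request is only prepared here, not sent:

```python
import io
import urllib.request
import uuid


def build_edit_request(url: str, token: str, image_bytes: bytes,
                       attribute: str, strength: float) -> urllib.request.Request:
    """Prepare a multipart/form-data POST matching the /api/edit form fields."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    # Plain form fields: attribute and strength.
    for name, value in {"attribute": attribute, "strength": str(strength)}.items():
        buf.write(f'--{boundary}\r\nContent-Disposition: form-data; '
                  f'name="{name}"\r\n\r\n{value}\r\n'.encode())
    # File part: the input face image.
    buf.write(f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
              f'filename="input.jpg"\r\nContent-Type: image/jpeg\r\n\r\n'.encode())
    buf.write(image_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return urllib.request.Request(
        url,
        data=buf.getvalue(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


req = build_edit_request("http://localhost:7860/api/edit", "logicgo_123",
                         b"<jpeg bytes>", "Smile", 5.0)
# Send with urllib.request.urlopen(req) once the Space is running.
```

The response body is the edited image as PNG (`image/png`), so a client would read the raw bytes rather than JSON.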
## Core Functions

### `run_edit(image, attribute, strength, align_face, use_bg_mask, custom_text_edit)`

Main editing function that processes the input image and applies the specified attribute modification.

**Parameters:**
- `image` (PIL.Image): Input face image
- `attribute` (str): Attribute name from ATTRIBUTE_MAP
- `strength` (float): Edit intensity (automatically clipped to the valid range)
- `align_face` (bool): Whether to align the face before editing
- `use_bg_mask` (bool): Whether to use background masking
- `custom_text_edit` (str): Custom text prompt for StyleCLIP edits

**Returns:**
- `PIL.Image`: Edited image

**Process Flow:**
1. Load and initialize the SimpleRunner
2. Determine editing parameters from the attribute selection
3. Clip strength to the valid range
4. Process the image through the editing pipeline
5. Return the edited result
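Step 3 above is a simple clamp of the requested strength into the attribute's `(min, max)` range; a minimal sketch (the helper name is illustrative):

```python
def clip_strength(strength: float, lo: float, hi: float) -> float:
    """Clamp a requested edit strength into the attribute's valid range."""
    return max(lo, min(hi, strength))


# Beard accepts -30.0 to 30.0, so an out-of-range request is clamped:
print(clip_strength(-45.0, -30.0, 30.0))  # -30.0
# Smile accepts -10.0 to 10.0; in-range values pass through unchanged:
print(clip_strength(5.0, -10.0, 10.0))    # 5.0
```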
### `get_runner() -> SimpleRunner`

Singleton function that initializes and returns the SimpleRunner instance.

**Returns:**
- `SimpleRunner`: Configured runner instance

**Features:**
- Lazy initialization
- Automatic model weight downloading
- Error handling and logging
### `ensure_weights()`

Downloads required model weights from Hugging Face if they are not present locally.

**Required Files:**
- `sfe_editor_light.pt` - Main editor model
- `stylegan2-ffhq-config-f.pt` - StyleGAN2 generator
- `e4e_ffhq_encode.pt` - Encoder model
- `shape_predictor_68_face_landmarks.dat` - Face landmark predictor
- Additional supporting models

## Attribute Mapping

The application supports the following facial attributes:

### Face Semantics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Smile | `fs_smiling` | -10.0 to 10.0 | Positive adds smile, negative removes |
| Age | `age` | -10.0 to 10.0 | Positive makes older, negative makes younger |
| Female features | `gender` | -10.0 to 7.0 | Positive adds femininity |

### Facial Hair

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Beard | `trimmed_beard` | -30.0 to 30.0 | **Negative values ADD beard** |
| Mustache/Goatee | `goatee` | -7.0 to 7.0 | **Negative values ADD goatee** |

### Accessories & Cosmetics

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Glasses | `fs_glasses` | -20.0 to 30.0 | Positive adds glasses, negative removes |
| Makeup | `fs_makeup` | -10.0 to 15.0 | Positive adds makeup, negative removes |

### Hair Style

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Curly hair | `curly_hair` | 0.0 to 0.12 | Adds curly hair texture |
| Afro | `afro` | 0.0 to 0.14 | Adds afro hairstyle |

### Hair Color (Text-based)

| Attribute | Internal Name | Range | Description |
|-----------|---------------|-------|-------------|
| Orange hair (text) | `styleclip_global_a face_a face with orange hair_0.18` | 0.0 to 0.2 | Changes hair to orange |
| Blonde hair (text) | `styleclip_global_a face_a face with blonde hair_0.18` | 0.0 to 0.2 | Changes hair to blonde |
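Internally each UI label maps to an `(internal_name, (min, max))` tuple; the REST `/api/attributes` endpoint in `app.py` unpacks exactly this shape. A trimmed sketch with values copied from the tables above:

```python
# UI label -> (internal edit name, (min strength, max strength))
ATTRIBUTE_MAP = {
    "Smile": ("fs_smiling", (-10.0, 10.0)),
    "Beard": ("trimmed_beard", (-30.0, 30.0)),
    "Afro": ("afro", (0.0, 0.14)),
    "Orange hair (text)": (
        "styleclip_global_a face_a face with orange hair_0.18", (0.0, 0.2)),
}

# Same unpacking pattern the /api/attributes endpoint uses:
edit_name, (lo, hi) = ATTRIBUTE_MAP["Beard"]
print(edit_name, lo, hi)  # trimmed_beard -30.0 30.0
```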
## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `CUDA_VISIBLE_DEVICES` | GPU device selection | "" (CPU) |
| `TORCH_CUDA_ARCH_LIST` | CUDA architecture | "8.0" |
| `HF_TOKEN` | Hugging Face token | - |
| `HUGGINGFACE_TOKEN` | Alternative HF token | - |
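Since both `HF_TOKEN` and `HUGGINGFACE_TOKEN` are accepted, token resolution presumably prefers one and falls back to the other; a sketch (the helper name and precedence order are assumptions):

```python
import os


def resolve_hf_token():
    """Return the first configured Hugging Face token, or None for anonymous access."""
    return os.getenv("HF_TOKEN") or os.getenv("HUGGINGFACE_TOKEN")


os.environ.pop("HF_TOKEN", None)
os.environ["HUGGINGFACE_TOKEN"] = "hf_example_token"  # illustrative value
print(resolve_hf_token())  # hf_example_token
```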
### Model Configuration

The application uses the following configuration files:
- `configs/simple_inference.yaml` - Main inference configuration
- `pretrained_models/` - Directory containing all model weights

## Error Handling

### Common Error Scenarios

1. **Missing model weights**
   - Automatic download from Hugging Face
   - Fallback to CPU if GPU is unavailable

2. **Face detection failures**
   - Multiple detection thresholds attempted
   - Graceful degradation without alignment

3. **Mask extraction failures**
   - Continues without background masking
   - Logs warnings for debugging

4. **Alignment failures**
   - Falls back to unaligned processing
   - Preserves the original image orientation
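Scenarios 2-4 all follow the same pattern: attempt the optional step, log a warning on failure, and continue with the unmodified image. A generic sketch (function names are illustrative):

```python
import logging

logger = logging.getLogger("smile_changer")


def try_optional_step(image, step, step_name):
    """Run an optional preprocessing step; fall back to the input on any failure."""
    try:
        return step(image), True
    except Exception as exc:
        logger.warning("%s failed, continuing without it: %s", step_name, exc)
        return image, False


def failing_align(image):
    raise ValueError("no face detected")  # simulated detector failure


img, ok = try_optional_step("raw-image", failing_align, "alignment")
print(img, ok)  # raw-image False
```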
### Logging

The application uses Python's logging module at INFO level by default, covering:
- Model initialization status
- Edit process progress
- Error details and stack traces
- File download and verification

## Usage Examples

### Basic Smile Enhancement

```python
from PIL import Image
from app import run_edit

# Load input image
image = Image.open("input.jpg")

# Apply smile enhancement
edited = run_edit(
    image=image,
    attribute="Smile",
    strength=5.0,
    align_face=False,
    use_bg_mask=False,
    custom_text_edit="",
)

# Save result
edited.save("output.jpg")
```

### Custom Text-based Editing

```python
# Add a hat using a custom text prompt
edited = run_edit(
    image=image,
    attribute="Orange hair (text)",  # must be a text-based attribute
    strength=0.18,
    align_face=True,
    use_bg_mask=True,
    custom_text_edit="styleclip_global_a face_a face with a hat_0.18",
)
```

### Beard Addition

```python
# Add a beard (negative strength ADDS beard)
edited = run_edit(
    image=image,
    attribute="Beard",
    strength=-15.0,
    align_face=False,
    use_bg_mask=False,
    custom_text_edit="",
)
```

## Model Architecture

### Core Components

1. **SimpleRunner**: Main interface for image editing
2. **FSEInferenceRunner**: Handles model inference and editing
3. **LatentEditor**: Manages the different editing directions
4. **StyleGAN2**: Generator for high-quality image synthesis
5. **E4E Encoder**: Encodes images into latent space

### Editing Methods

1. **InterfaceGAN directions**: Age, smile, gender
2. **StyleSpace directions**: Gender, facial features
3. **StyleCLIP global mapper**: Text-based editing
4. **DeltaEdit**: Advanced attribute manipulation

### Processing Pipeline

1. **Input preprocessing**: Image normalization and resizing
2. **Face alignment**: Optional landmark-based alignment
3. **Background masking**: Optional face segmentation
4. **Latent encoding**: Convert the image to a latent representation
5. **Attribute editing**: Apply the desired modifications
6. **Image synthesis**: Generate the edited result
7. **Post-processing**: Optional unalignment and blending

## Dependencies

### Core Dependencies

```
gradio==4.44.0
torch
torchvision
Pillow>=9.5
numpy>=1.23
opencv-python-headless==4.10.0.84
```

### AI/ML Dependencies

```
omegaconf==2.1.2
einops==0.7.0
timm==1.0.3
clip @ git+https://github.com/openai/CLIP.git
```

### Utility Dependencies

```
scipy==1.10.1
networkx==3.3
fsspec==2024.3.1
gdown==4.7.1
wandb==0.15.2
pandas==2.2.2
ninja>=1.11
```

### System Dependencies

```
dlib-binary
spaces>=0.28.3
setuptools>=68
wheel>=0.41
```

## Performance Considerations

### Memory Usage
- Model weights: ~2 GB total
- GPU memory: ~4 GB recommended
- CPU fallback available

### Processing Time
- Initialization: 30-60 seconds
- Per edit: 5-15 seconds (GPU), 30-60 seconds (CPU)
- Face alignment: +2-5 seconds
- Background masking: +3-8 seconds

### Optimization Tips
1. Use a GPU when available
2. Disable alignment for faster processing
3. Use background masking only when needed
4. Batch multiple edits when possible

## Troubleshooting

### Common Issues

1. **"No module named 'piq'"**
   - Install the missing dependency: `pip install piq`

2. **CUDA initialization errors**
   - Set `CUDA_VISIBLE_DEVICES=""` for CPU-only mode
   - Check GPU compatibility

3. **Face detection failures**
   - Use clear, well-lit face images
   - Try different alignment settings
   - Check image resolution (minimum 256x256)

4. **Model download failures**
   - Verify the Hugging Face token
   - Check internet connectivity
   - Ensure sufficient disk space

### Debug Mode

Enable detailed logging with:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## License and Credits

This application is based on the StyleFeatureEditor research project. Refer to the original repository for licensing information and citations.

## Support

For issues and questions:
1. Check the troubleshooting section
2. Review the error logs
3. Verify input image quality
4. Test with different attribute combinations
app.py CHANGED
@@ -4,6 +4,9 @@ import logging
 from typing import Tuple, Dict
 
 import gradio as gr
+from fastapi import FastAPI, UploadFile, File, Form, Header, HTTPException, Depends
+from fastapi.responses import StreamingResponse, JSONResponse
+import io
 from spaces import GPU
 from huggingface_hub import snapshot_download
 from PIL import Image
@@ -206,7 +209,7 @@ def build_ui() -> gr.Blocks:
             label="Attribute",
         )
         strength = gr.Slider(-15, 15, value=5, step=0.01, label="Strength (p)")
-        align_face = gr.Checkbox(value=False, label="Align face before editing")
+        align_face = gr.Checkbox(value=True, label="Align face before editing")
         use_bg_mask = gr.Checkbox(value=False, label="Use background mask (reduce artifacts)")
         custom_text = gr.Textbox(
             value="",
@@ -236,8 +239,115 @@ def build_ui() -> gr.Blocks:
     return demo
 
 
-# Expose a top-level Gradio app for Hugging Face Spaces
-app = build_ui()
+# Build Gradio UI
+demo = build_ui()
+
+# -----------------------------
+# REST API (FastAPI) endpoints
+# -----------------------------
+api = FastAPI(title="Smile Changer API")
+
+
+def _require_auth(authorization: str | None = Header(default=None)):
+    expected = os.getenv("API_AUTH_TOKEN", "logicgo_123")
+    if not authorization or not authorization.startswith("Bearer "):
+        raise HTTPException(status_code=401, detail="Missing or invalid Authorization header")
+    token = authorization.split(" ", 1)[1]
+    if token != expected:
+        raise HTTPException(status_code=401, detail="Invalid token")
+
+
+@api.get("/api/attributes")
+def list_attributes(_: None = Depends(_require_auth)):
+    items = {}
+    for k, v in ATTRIBUTE_MAP.items():
+        edit_name, (lo, hi) = v
+        items[k] = {"internal": edit_name, "min": lo, "max": hi}
+    return JSONResponse(items)
+
+
+@api.post("/api/edit")
+async def api_edit(
+    file: UploadFile = File(...),
+    attribute: str = Form(...),
+    strength: float = Form(5.0),
+    align_face: bool = Form(True),
+    use_bg_mask: bool = Form(False),
+    custom_text_edit: str = Form(""),
+    _: None = Depends(_require_auth),
+):
+    data = await file.read()
+    image = Image.open(io.BytesIO(data)).convert("RGB")
+    result = run_edit(
+        image=image,
+        attribute=attribute,
+        strength=strength,
+        align_face=align_face,
+        use_bg_mask=use_bg_mask,
+        custom_text_edit=custom_text_edit,
+    )
+    buf = io.BytesIO()
+    result.save(buf, format="PNG")
+    buf.seek(0)
+    return StreamingResponse(buf, media_type="image/png")
+
+
+@api.post("/api/edit/{attribute_name}")
+async def api_edit_by_attribute(
+    attribute_name: str,
+    file: UploadFile = File(...),
+    strength: float = Form(5.0),
+    align_face: bool = Form(True),
+    use_bg_mask: bool = Form(False),
+    custom_text_edit: str = Form(""),
+    _: None = Depends(_require_auth),
+):
+    return await api_edit(
+        file=file,
+        attribute=attribute_name,
+        strength=strength,
+        align_face=align_face,
+        use_bg_mask=use_bg_mask,
+        custom_text_edit=custom_text_edit,
+    )
+
+
+# Convenience endpoints for each attribute
+def _register_attribute_endpoint(path: str, attribute_value: str):
+    @api.post(path)
+    async def _endpoint(
+        file: UploadFile = File(...),
+        strength: float = Form(5.0),
+        align_face: bool = Form(True),
+        use_bg_mask: bool = Form(False),
+        custom_text_edit: str = Form(""),
+        _: None = Depends(_require_auth),
+    ):
+        return await api_edit(
+            file=file,
+            attribute=attribute_value,
+            strength=strength,
+            align_face=align_face,
+            use_bg_mask=use_bg_mask,
+            custom_text_edit=custom_text_edit,
+        )
+
+
+_register_attribute_endpoint("/api/smile", "Smile")
+_register_attribute_endpoint("/api/age", "Age")
+_register_attribute_endpoint("/api/female-features", "Female features")
+_register_attribute_endpoint("/api/beard", "Beard")
+_register_attribute_endpoint("/api/mustache-goatee", "Mustache/Goatee")
+_register_attribute_endpoint("/api/glasses", "Glasses")
+_register_attribute_endpoint("/api/makeup", "Makeup")
+_register_attribute_endpoint("/api/curly-hair", "Curly hair")
+_register_attribute_endpoint("/api/afro", "Afro")
+_register_attribute_endpoint("/api/orange-hair-text", "Orange hair (text)")
+_register_attribute_endpoint("/api/blonde-hair-text", "Blonde hair (text)")
+
+
+# Mount Gradio on FastAPI and expose the combined app
+app = gr.mount_gradio_app(api, demo, path="/")
 
 
 @GPU()
requirements.txt CHANGED
@@ -20,3 +20,5 @@ torchvision
 clip @ git+https://github.com/openai/CLIP.git@a1d071733d7111c9c014f024669f959182114e33
 spaces>=0.28.3
 dlib-binary
+fastapi>=0.110
+python-multipart>=0.0.9