Smile Changer API Documentation

Overview

Smile Changer is a facial attribute editing application built on StyleFeatureEditor. It lets users modify facial attributes such as smile, age, beard, hair style and color, glasses, and makeup.

Table of Contents

API Endpoints
Core Functions
Attribute Mapping
Configuration
Error Handling
Usage Examples
Model Architecture
Dependencies

API Endpoints

Main Application Interface

The application is built using Gradio and provides a web-based interface with the following components:

Input Parameters

| Parameter | Type | Description | Default | Range |
|---|---|---|---|---|
| `image` | PIL.Image | Input face image | - | Any valid image format |
| `attribute` | str | Attribute to edit | "Smile" | See Attribute Mapping |
| `strength` | float | Edit intensity | 5.0 | Varies by attribute |
| `align_face` | bool | Enable face alignment | False | True/False |
| `use_bg_mask` | bool | Use background masking | False | True/False |
| `custom_text_edit` | str | Custom text prompt | "" | StyleCLIP format |

Output

| Parameter | Type | Description |
|---|---|---|
| `edited_image` | PIL.Image | Edited face image |

Core Functions

`run_edit(image, attribute, strength, align_face, use_bg_mask, custom_text_edit)`

Main editing function that processes the input image and applies the specified attribute modification.

Parameters:

image (PIL.Image): Input face image
attribute (str): Attribute name from ATTRIBUTE_MAP
strength (float): Edit intensity (automatically clipped to valid range)
align_face (bool): Whether to align face before editing
use_bg_mask (bool): Whether to use background masking
custom_text_edit (str): Custom text prompt for StyleCLIP edits

Returns:

PIL.Image: Edited image

Process Flow:

Load and initialize the SimpleRunner
Determine editing parameters from attribute selection
Apply strength clipping to valid range
Process image through the editing pipeline
Return edited result
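
The clipping step in the flow above can be sketched as follows. This is a minimal illustration, not the application's code: `STRENGTH_RANGES` and `clip_strength` are hypothetical names, and the ranges are copied from the Attribute Mapping section of this document.

```python
# Hypothetical lookup table; the values are copied from the Attribute Mapping section.
STRENGTH_RANGES = {
    "Smile": (-10.0, 10.0),
    "Age": (-10.0, 10.0),
    "Female features": (-10.0, 7.0),
    "Beard": (-30.0, 30.0),
    "Mustache/Goatee": (-7.0, 7.0),
    "Glasses": (-20.0, 30.0),
    "Makeup": (-10.0, 15.0),
    "Curly hair": (0.0, 0.12),
    "Afro": (0.0, 0.14),
    "Orange hair (text)": (0.0, 0.2),
    "Blonde hair (text)": (0.0, 0.2),
}


def clip_strength(attribute: str, strength: float) -> float:
    """Clamp strength into the documented range for the attribute."""
    if attribute not in STRENGTH_RANGES:
        raise KeyError(f"Unknown attribute: {attribute!r}")
    low, high = STRENGTH_RANGES[attribute]
    return max(low, min(high, float(strength)))
```

Under these ranges, the default strength of 5.0 would be clipped to 0.14 for `Afro`, so hair-style and text-based edits need much smaller values than the default.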

`get_runner() -> SimpleRunner`

Singleton function that initializes and returns the SimpleRunner instance.

Returns:

SimpleRunner: Configured runner instance

Features:

Lazy initialization
Automatic model weight downloading
Error handling and logging
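
Lazy singleton initialization of this kind is commonly written as below. The sketch uses a hypothetical `SimpleRunnerStub` class in place of the real `SimpleRunner` so that it runs on its own; the lock is an added assumption, and the actual `get_runner()` may not be thread-safe.

```python
import logging
import threading
from typing import Optional

logger = logging.getLogger(__name__)


class SimpleRunnerStub:
    """Hypothetical stand-in for SimpleRunner so the sketch runs on its own."""

    def __init__(self, config_path: str) -> None:
        self.config_path = config_path


_runner: Optional[SimpleRunnerStub] = None
_runner_lock = threading.Lock()


def get_runner_sketch() -> SimpleRunnerStub:
    """Create the runner on first use and return the same instance afterwards."""
    global _runner
    with _runner_lock:
        if _runner is None:
            logger.info("Initializing runner")
            # The real application would make sure the weights exist before this point.
            _runner = SimpleRunnerStub("configs/simple_inference.yaml")
    return _runner
```

Because the instance is cached at module level, the cost of initialization is paid only by the first call.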

`ensure_weights()`

Downloads required model weights from Hugging Face if not present locally.

Required Files:

sfe_editor_light.pt - Main editor model
stylegan2-ffhq-config-f.pt - StyleGAN2 generator
e4e_ffhq_encode.pt - Encoder model
shape_predictor_68_face_landmarks.dat - Face landmark predictor
Additional supporting models
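
A check for which weights still need downloading might look like the sketch below. `missing_weights` is a hypothetical helper, and the flat layout under `pretrained_models/` is an assumption; only the file names come from the list above.

```python
from pathlib import Path
from typing import Iterable, List

# File names taken from the list above. A flat directory layout is assumed.
REQUIRED_FILES = (
    "sfe_editor_light.pt",
    "stylegan2-ffhq-config-f.pt",
    "e4e_ffhq_encode.pt",
    "shape_predictor_68_face_landmarks.dat",
)


def missing_weights(weights_dir: str = "pretrained_models",
                    required: Iterable[str] = REQUIRED_FILES) -> List[str]:
    """Return the required file names that are absent or empty in weights_dir."""
    root = Path(weights_dir)
    missing = []
    for name in required:
        path = root / name
        # Treat zero-byte files as missing, since they usually mean a failed download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Files reported as missing could then be fetched, for example with `huggingface_hub.hf_hub_download`. The source repository ID is not stated in this document, so it is left out here.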

Attribute Mapping

The application supports the following facial attributes:

Face Semantics

| Attribute | Internal Name | Range | Description |
|---|---|---|---|
| Smile | `fs_smiling` | -10.0 to 10.0 | Positive adds smile, negative removes |
| Age | `age` | -10.0 to 10.0 | Positive makes older, negative makes younger |
| Female features | `gender` | -10.0 to 7.0 | Positive adds femininity |

Facial Hair

| Attribute | Internal Name | Range | Description |
|---|---|---|---|
| Beard | `trimmed_beard` | -30.0 to 30.0 | Negative values ADD beard |
| Mustache/Goatee | `goatee` | -7.0 to 7.0 | Negative values ADD goatee |
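
Because the facial hair attributes use the opposite sign from attributes such as Smile or Glasses, a small helper can keep call sites readable. `signed_strength` and `NEGATIVE_ADDS` are hypothetical names, and the sketch assumes that positive values remove facial hair, which the table implies but does not state.

```python
# Attributes whose documented direction is inverted: negative values ADD the feature.
NEGATIVE_ADDS = {"Beard", "Mustache/Goatee"}


def signed_strength(attribute: str, magnitude: float, add: bool = True) -> float:
    """Turn an 'add or remove' intent into a signed strength value."""
    magnitude = abs(magnitude)
    inverted = attribute in NEGATIVE_ADDS
    # Negative when adding an inverted attribute or removing a regular one.
    return -magnitude if inverted == add else magnitude
```

For example, `signed_strength("Beard", 15.0)` yields the negative value needed to add a beard, while `signed_strength("Smile", 5.0)` stays positive. The helper is not meant for attributes whose range starts at 0.0, which can only be added.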

Accessories & Cosmetics

| Attribute | Internal Name | Range | Description |
|---|---|---|---|
| Glasses | `fs_glasses` | -20.0 to 30.0 | Positive adds glasses, negative removes |
| Makeup | `fs_makeup` | -10.0 to 15.0 | Positive adds makeup, negative removes |

Hair Style

| Attribute | Internal Name | Range | Description |
|---|---|---|---|
| Curly hair | `curly_hair` | 0.0 to 0.12 | Adds curly hair texture |
| Afro | `afro` | 0.0 to 0.14 | Adds afro hairstyle |

Hair Color (Text-based)

| Attribute | Internal Name | Range | Description |
|---|---|---|---|
| Orange hair (text) | `styleclip_global_a face_a face with orange hair_0.18` | 0.0 to 0.2 | Changes hair to orange |
| Blonde hair (text) | `styleclip_global_a face_a face with blonde hair_0.18` | 0.0 to 0.2 | Changes hair to blonde |
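
The internal names of the text-based attributes appear to follow the pattern `styleclip_global_<neutral prompt>_<target prompt>_<number>`. That pattern is inferred from the two rows above rather than from a specification, and this document does not say what the trailing number means. Under that assumption, the hypothetical helpers below build and split such names.

```python
from typing import Tuple

PREFIX = "styleclip_global_"


def build_text_edit(neutral: str, target: str, value: float) -> str:
    """Assemble an edit name in the shape seen in the table above."""
    if "_" in neutral or "_" in target:
        # Underscores separate the fields, so they cannot appear inside a prompt.
        raise ValueError("Prompts must not contain underscores")
    return f"{PREFIX}{neutral}_{target}_{value}"


def parse_text_edit(name: str) -> Tuple[str, str, float]:
    """Split an edit name back into (neutral prompt, target prompt, number)."""
    if not name.startswith(PREFIX):
        raise ValueError(f"Not a StyleCLIP global edit name: {name!r}")
    neutral, target, value = name[len(PREFIX):].split("_")
    return neutral, target, float(value)
```

A string built this way has the same shape as the value passed to `custom_text_edit` in the usage examples.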

Configuration

Environment Variables

| Variable | Description | Default |
|---|---|---|
| `CUDA_VISIBLE_DEVICES` | GPU device selection | "" (CPU) |
| `TORCH_CUDA_ARCH_LIST` | CUDA architecture | "8.0" |
| `HF_TOKEN` | Hugging Face token | - |
| `HUGGINGFACE_TOKEN` | Alternative HF token | - |
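
One way to read these variables is sketched below. The function names are hypothetical, and checking `HF_TOKEN` before `HUGGINGFACE_TOKEN` is an assumed order, since the table only calls the second one an alternative.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token; HF_TOKEN is checked first (assumed order)."""
    for name in ("HF_TOKEN", "HUGGINGFACE_TOKEN"):
        token = env.get(name, "").strip()
        if token:
            return token
    return None


def cpu_only_requested(env: Mapping[str, str] = os.environ) -> bool:
    """True when CUDA_VISIBLE_DEVICES is unset or empty.

    This mirrors the documented default of "" (CPU). Plain CUDA treats an
    unset variable as "all GPUs visible", so the unset case is an assumption.
    """
    return env.get("CUDA_VISIBLE_DEVICES", "") == ""
```

Passing the environment as a parameter keeps both functions easy to test without touching the real process environment.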

Model Configuration

The application uses the following configuration files:

configs/simple_inference.yaml - Main inference configuration
pretrained_models/ - Directory containing all model weights

Error Handling

Common Error Scenarios

Missing Model Weights
- Automatic download from Hugging Face
- Fallback to CPU if GPU unavailable
Face Detection Failures
- Multiple detection thresholds attempted
- Graceful degradation without alignment
Mask Extraction Failures
- Continues without background masking
- Logs warnings for debugging
Alignment Failures
- Falls back to unaligned processing
- Preserves original image orientation
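
The fallback behavior described above follows a common pattern: try the richer path, log a warning on failure, and continue with the simpler one. The sketch below illustrates that pattern with hypothetical function names; it is not taken from the application's source.

```python
import logging
from typing import Callable, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


def with_fallback(primary: Callable[[], T], fallback: Callable[[], T], what: str) -> T:
    """Run primary; on any exception, log a warning and run fallback instead."""
    try:
        return primary()
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        logger.warning("%s failed (%s); continuing without it", what, exc)
        return fallback()


# Hypothetical stand-ins for the aligned and unaligned editing paths.
def edit_with_alignment() -> str:
    raise RuntimeError("no face detected")


def edit_without_alignment() -> str:
    return "edited without alignment"


result = with_fallback(edit_with_alignment, edit_without_alignment, "Face alignment")
```

Here `result` comes from the unaligned path, and the failure is recorded as a warning instead of being raised to the caller.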

Logging

The application uses Python's logging module with INFO level by default:

Model initialization status
Edit process progress
Error details and stack traces
File download and verification

Usage Examples

Basic Smile Enhancement

```python
from PIL import Image
from app import run_edit

# Load input image
image = Image.open("input.jpg")

# Apply smile enhancement
edited = run_edit(
    image=image,
    attribute="Smile",
    strength=5.0,
    align_face=False,
    use_bg_mask=False,
    custom_text_edit=""
)

# Save result
edited.save("output.jpg")
```

Custom Text-based Editing

```python
from PIL import Image
from app import run_edit

image = Image.open("input.jpg")

# Add a hat using a custom text prompt
edited = run_edit(
    image=image,
    attribute="Orange hair (text)",  # Must be a text-based attribute
    strength=0.18,
    align_face=True,
    use_bg_mask=True,
    custom_text_edit="styleclip_global_a face_a face with a hat_0.18"
)
```

Beard Addition

```python
from PIL import Image
from app import run_edit

image = Image.open("input.jpg")

# Add a beard (use negative values)
edited = run_edit(
    image=image,
    attribute="Beard",
    strength=-15.0,  # Negative value adds beard
    align_face=False,
    use_bg_mask=False,
    custom_text_edit=""
)
```

Model Architecture

Core Components

SimpleRunner: Main interface for image editing
FSEInferenceRunner: Handles model inference and editing
LatentEditor: Manages different editing directions
StyleGAN2: Generator for high-quality image synthesis
E4E Encoder: Encodes images to latent space

Editing Methods

InterfaceGAN Directions: Age, smile, gender
StyleSpace Directions: Gender, facial features
StyleCLIP Global Mapper: Text-based editing
DeltaEdit: Advanced attribute manipulation

Processing Pipeline

Input Preprocessing: Image normalization and resizing
Face Alignment: Optional landmark-based alignment
Background Masking: Optional face segmentation
Latent Encoding: Convert image to latent representation
Attribute Editing: Apply desired modifications
Image Synthesis: Generate edited result
Post-processing: Optional unalignment and blending
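
To show how the two boolean options change the path through this pipeline, the sketch below lists the stages that would run for a given combination. `pipeline_stages` is a hypothetical helper. Running post-processing only when an optional stage ran is an assumption based on the "Optional unalignment and blending" description.

```python
from typing import List


def pipeline_stages(align_face: bool, use_bg_mask: bool) -> List[str]:
    """List the stage names that would run for the given options, in order."""
    stages = ["Input Preprocessing"]
    if align_face:
        stages.append("Face Alignment")
    if use_bg_mask:
        stages.append("Background Masking")
    stages += ["Latent Encoding", "Attribute Editing", "Image Synthesis"]
    if align_face or use_bg_mask:
        # Unalignment and blending only apply if an optional stage ran (assumption).
        stages.append("Post-processing")
    return stages
```

With both options off, only the four mandatory stages remain, which matches the advice to disable alignment and masking when speed matters.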

Dependencies

Core Dependencies

gradio==4.44.0
torch
torchvision
Pillow>=9.5
numpy>=1.23
opencv-python-headless==4.10.0.84

AI/ML Dependencies

omegaconf==2.1.2
einops==0.7.0
timm==1.0.3
clip @ git+https://github.com/openai/CLIP.git

Utility Dependencies

scipy==1.10.1
networkx==3.3
fsspec==2024.3.1
gdown==4.7.1
wandb==0.15.2
pandas==2.2.2
ninja>=1.11

System Dependencies

dlib-binary
spaces>=0.28.3
setuptools>=68
wheel>=0.41

Performance Considerations

Memory Usage

Model weights: ~2GB total
GPU memory: ~4GB recommended
CPU fallback available

Processing Time

Initialization: 30-60 seconds
Per edit: 5-15 seconds (GPU), 30-60 seconds (CPU)
Face alignment: +2-5 seconds
Background masking: +3-8 seconds
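
The figures above can be combined into a rough per-edit estimate. The sketch assumes the alignment and masking overheads simply add to the base time and are the same on GPU and CPU, which the list does not state; `estimate_edit_seconds` is a hypothetical helper.

```python
from typing import Tuple

# (low, high) seconds, copied from the figures above.
EDIT_SECONDS = {"gpu": (5, 15), "cpu": (30, 60)}
ALIGN_SECONDS = (2, 5)
MASK_SECONDS = (3, 8)


def estimate_edit_seconds(device: str, align_face: bool = False,
                          use_bg_mask: bool = False) -> Tuple[int, int]:
    """Sum the documented ranges for one edit; initialization is not included."""
    low, high = EDIT_SECONDS[device]
    if align_face:
        low, high = low + ALIGN_SECONDS[0], high + ALIGN_SECONDS[1]
    if use_bg_mask:
        low, high = low + MASK_SECONDS[0], high + MASK_SECONDS[1]
    return low, high
```

For a GPU edit with both options enabled this gives 5 + 2 + 3 = 10 seconds at the low end and 15 + 5 + 8 = 28 seconds at the high end.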

Optimization Tips

Use GPU when available
Disable alignment for faster processing
Use background masking only when needed
Batch multiple edits when possible

Troubleshooting

Common Issues

"No module named 'piq'"
- Install missing dependencies: pip install piq
CUDA initialization errors
- Set CUDA_VISIBLE_DEVICES="" for CPU-only mode
- Check GPU compatibility
Face detection failures
- Ensure clear, well-lit face images
- Try different alignment settings
- Check image resolution (minimum 256x256)
Model download failures
- Verify Hugging Face token
- Check internet connectivity
- Ensure sufficient disk space
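
The resolution requirement can be checked before an image is submitted. The helper below is a hypothetical sketch that takes a `(width, height)` pair, which is the order used by the `size` attribute of a PIL image.

```python
from typing import Tuple

MIN_SIDE = 256  # minimum resolution named in the list above


def is_large_enough(size: Tuple[int, int], min_side: int = MIN_SIDE) -> bool:
    """Check a (width, height) pair, for example the .size of a PIL image."""
    width, height = size
    return width >= min_side and height >= min_side
```

With Pillow installed, a typical call would be `is_large_enough(Image.open("input.jpg").size)`.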

Debug Mode

Enable detailed logging by setting:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

License and Credits

This application is based on the StyleFeatureEditor research project. Please refer to the original repository for licensing information and citations.

Support

For issues and questions:

Check the troubleshooting section
Review error logs
Verify input image quality
Test with different attribute combinations