# CodeFormer Face Restoration - Project Documentation

## 1. Introduction

**CodeFormer** is a robust blind face restoration algorithm designed to restore old, degraded, or AI-generated face images. It uses a **Codebook Lookup Transformer** (VQGAN-based) to predict high-quality facial features even under severe degradation, ensuring that restored faces look natural and stay faithful to the original identity.

This project wraps the core CodeFormer research code into a deployable, user-friendly **Flask Web Application**, containerized with **Docker** for easy deployment on platforms like Hugging Face Spaces.

### Key Features

* **Blind Face Restoration:** Restores faces from low-quality inputs without knowing the specific degradation details.
* **Background Enhancement:** Uses **Real-ESRGAN** to upscale and enhance the non-face background regions of the image.
* **Face Alignment & Paste-back:** Automatically detects faces, aligns them for processing, and seamlessly blends them back into the original image.
* **Adjustable Fidelity:** Users can balance restoration quality (hallucinating details) against identity fidelity (keeping the original look).

---

## 2. System Architecture

The application is built on a Python/PyTorch backend served via Flask.

### 2.1 Technology Stack

* **Framework:** Flask (Python Web Server)
* **Deep Learning:** PyTorch, TorchVision
* **Image Processing:** OpenCV, NumPy, Pillow
* **Core Libraries:** `basicsr` (Basic Super Restoration toolbox), `facelib` (face detection/alignment utilities)
* **Frontend:** HTML5, Bootstrap 5, Jinja2 Templates
* **Containerization:** Docker (CUDA-enabled)

### 2.2 Directory Structure
```
CodeFormer/
├── app.py               # Main Flask application entry point
├── Dockerfile           # Container configuration
├── requirements.txt     # Python dependencies
├── basicsr/             # Core AI framework (Super-Resolution tools)
├── facelib/             # Face detection and alignment utilities
├── templates/           # HTML Frontend
│   ├── index.html       # Upload interface
│   └── result.html      # Results display
├── static/              # Static assets (css, js, uploads)
│   ├── uploads/         # Temporary storage for input images
│   └── results/         # Temporary storage for processed output
└── weights/             # Pre-trained model weights (downloaded on startup)
    ├── CodeFormer/      # CodeFormer model (.pth)
    ├── facelib/         # Detection (RetinaFace) and parsing models
    └── realesrgan/      # Background upscaler (Real-ESRGAN)
```
### 2.3 Logic Flow

1. **Input:** User uploads an image via the Web UI.
2. **Pre-processing (`app.py`):**
   * The image is saved to `static/uploads`.
   * Parameters (fidelity, upscale factor) are parsed.
3. **Inference Pipeline:**
   * **Detection:** `facelib` detects faces in the image using RetinaFace.
   * **Alignment:** Faces are cropped and aligned to a standard 512x512 resolution.
   * **Restoration:** The **CodeFormer** model processes the aligned faces.
   * **Upscaling (Optional):** The background is upscaled using **Real-ESRGAN**.
   * **Paste-back:** Restored faces are warped back to their original positions and blended.
4. **Output:** The final image is saved to `static/results` and displayed to the user.
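The pre-processing step above can be sketched as plain parameter handling plus collision-free upload saving. This is a minimal, stdlib-only sketch: the form field names (`fidelity`, `upscale`) and helper names are illustrative assumptions, not the app's actual identifiers.

```python
import uuid
from pathlib import Path

UPLOAD_DIR = Path("static/uploads")

def parse_params(form):
    """Parse and validate restoration parameters from a form-like dict."""
    # Clamp the fidelity weight to the documented [0, 1] range.
    w = min(max(float(form.get("fidelity", 0.5)), 0.0), 1.0)
    upscale = int(form.get("upscale", 2))
    if upscale not in (1, 2, 4):
        raise ValueError("upscale must be 1, 2, or 4")
    return {"fidelity": w, "upscale": upscale}

def save_upload(filename, data, upload_dir=UPLOAD_DIR):
    """Save uploaded bytes under a unique name; return the saved path."""
    upload_dir.mkdir(parents=True, exist_ok=True)
    dest = upload_dir / f"{uuid.uuid4().hex}{Path(filename).suffix}"
    dest.write_bytes(data)
    return dest
```

Saving under a UUID-based name avoids collisions when two users upload files with the same name.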
---

## 3. Installation & Deployment

### 3.1 Docker Deployment (Recommended)

The project is optimized for Docker.

**Prerequisites:** Docker, NVIDIA GPU (optional, but recommended).

1. **Build the Image:**

   ```bash
   docker build -t codeformer-app .
   ```

2. **Run the Container:**

   ```bash
   # Run on port 7860 (standard for HF Spaces)
   docker run -it -p 7860:7860 codeformer-app
   ```

*Note: To use the GPU, add the `--gpus all` flag to the run command.*
### 3.2 Hugging Face Spaces Deployment

This repository is configured for direct deployment to Hugging Face.

1. Create a **Docker** Space on Hugging Face.
2. Push this entire repository to the Space's Git remote:

   ```bash
   git remote add hf git@hf.co:spaces/USERNAME/SPACE_NAME
   git push hf main
   ```

3. The Space will build (approx. 5-10 minutes) and launch automatically.
### 3.3 Local Development

1. **Set up the environment:**

   ```bash
   conda create -n codeformer python=3.8
   conda activate codeformer
   pip install -r requirements.txt
   ```

2. **Install basicsr:**

   ```bash
   python basicsr/setup.py install
   ```

3. **Run the app:**

   ```bash
   python app.py
   ```
---

## 4. User Guide (Web Interface)

### 4.1 Interface Controls

* **Input Image:** Supports standard formats (JPG, PNG, WEBP). Drag and drop is supported.
* **Fidelity Weight (w):**
  * **Range:** 0.0 to 1.0.
  * **0.0 (Better Quality):** The model "hallucinates" more detail. Results look very sharp and high-quality but may slightly alter the person's identity (look less like the original).
  * **1.0 (Better Identity):** The model sticks strictly to the original features. Results are faithful to the original photo but may be blurrier or contain more artifacts.
  * **Recommended:** 0.5 is a balanced default.
* **Upscale Factor:**
  * Scales the final output resolution (1x, 2x, or 4x).
  * *Note: Higher scaling requires more VRAM.*
* **Enhance Background:**
  * If checked, runs Real-ESRGAN on the non-face areas.
  * *Recommendation:* Keep checked for full-photo restoration. Uncheck if you only care about the face or are running on limited hardware.
* **Upsample Face:**
  * If checked, the restored face is also upsampled to match the background resolution.
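The fidelity weight can be thought of as a blend knob: conceptually, `w` interpolates between identity-preserving encoder features and the codebook's high-quality predictions. The snippet below is a toy illustration of that trade-off only; the real model applies this weighting inside learned feature-transformation layers, not as a plain list blend.

```python
def blend_features(encoder_feat, codebook_feat, w):
    """Toy illustration of the fidelity weight w in [0, 1]:
    w = 1.0 keeps only encoder (identity) features,
    w = 0.0 keeps only codebook (quality) features."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("fidelity weight must be in [0, 1]")
    return [w * e + (1.0 - w) * c
            for e, c in zip(encoder_feat, codebook_feat)]
```

This mirrors the UI behavior: lower `w` leans on hallucinated detail, higher `w` preserves the input's look.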
### 4.2 Viewing Results

The result page features an interactive **Before/After Slider**. Drag the handle left and right to compare the original and restored images pixel by pixel.

---
## 5. Technical Details

### 5.1 Model Weights

On startup, the application checks for and downloads the following weights to the `weights/` directory:

| Model | Path | Description |
| :--- | :--- | :--- |
| **CodeFormer** | `weights/CodeFormer/codeformer.pth` | Main restoration model. |
| **RetinaFace** | `weights/facelib/detection_Resnet50_Final.pth` | Face detection. |
| **ParseNet** | `weights/facelib/parsing_parsenet.pth` | Face parsing (segmentation). |
| **Real-ESRGAN** | `weights/realesrgan/RealESRGAN_x2plus.pth` | Background upscaler (x2). |
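The startup check boils down to: for each path in the table above, download only what is missing. A stdlib-only sketch of the check (download URLs and the actual download call are omitted; the function name is an illustrative assumption):

```python
from pathlib import Path

# Paths taken from the model weights table above.
WEIGHTS = [
    "weights/CodeFormer/codeformer.pth",
    "weights/facelib/detection_Resnet50_Final.pth",
    "weights/facelib/parsing_parsenet.pth",
    "weights/realesrgan/RealESRGAN_x2plus.pth",
]

def missing_weights(paths=WEIGHTS, root="."):
    """Return the weight files that are not yet present on disk."""
    return [p for p in paths if not (Path(root) / p).exists()]
```

Checking before downloading keeps container restarts fast once the weights are cached.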
### 5.2 Performance Notes

* **Memory:** The full pipeline (CodeFormer + Real-ESRGAN) requires significant RAM/VRAM. In CPU-only environments (such as basic HF Spaces), processing a single image may take 30-60 seconds.
* **Git LFS:** Image assets in this repository are tracked with Git LFS to keep the repository size manageable.

---
## 6. Credits & References

* **Original Paper:** [Towards Robust Blind Face Restoration with Codebook Lookup Transformer (NeurIPS 2022)](https://arxiv.org/abs/2206.11253)
* **Authors:** Shangchen Zhou, Kelvin C.K. Chan, Chongyi Li, Chen Change Loy (S-Lab, Nanyang Technological University).
* **Original Repository:** [sczhou/CodeFormer](https://github.com/sczhou/CodeFormer)