---
license: creativeml-openrail-m
base_model:
- SG161222/Realistic_Vision_V4.0_noVAE
tags:
- text-to-image
- stable-diffusion
- ip-adapter
- face-id
- identity-preservation
- portrait
- rishabh-in-code
library_name: diffusers
pipeline_tag: text-to-image
---
# TrueFace-Adapter: High-Fidelity Identity Preservation
![Model License](https://img.shields.io/badge/License-Non--Commercial-red.svg)
![Base Model](https://img.shields.io/badge/Base%20Model-Realistic%20Vision%20V4.0-blue.svg)
---
## Introduction
This is a custom, fine-tuned version of the **IP-Adapter-FaceID-PlusV2** model for Stable Diffusion 1.5. It was specifically trained to prioritize high-fidelity identity preservation while maintaining compositional realism across highly diverse prompts.
The model conditions generation on FaceID embeddings extracted with the InsightFace `buffalo_l` model; these embeddings are injected into the UNet's cross-attention layers to steer the image toward the reference identity.
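Conceptually, the adapter adds a second attention branch alongside the text cross-attention, attending to projected face tokens. A minimal PyTorch sketch of this idea (illustrative only; the real adapter uses separate learned key/value projections per branch, collapsed here for brevity):

```python
import torch
import torch.nn.functional as F

def decoupled_cross_attention(q, text_tokens, face_tokens, scale=1.0):
    """Illustrative sketch of IP-Adapter-style decoupled cross-attention.

    q:           latent queries,           shape (batch, n_latent, dim)
    text_tokens: text-prompt conditioning, shape (batch, n_text, dim)
    face_tokens: projected FaceID tokens,  shape (batch, n_face, dim)
    """
    # Standard text cross-attention branch.
    text_out = F.scaled_dot_product_attention(q, text_tokens, text_tokens)
    # Extra branch attending to the identity tokens; its contribution is
    # added with a tunable scale, so identity guidance can be dialed up or down.
    face_out = F.scaled_dot_product_attention(q, face_tokens, face_tokens)
    return text_out + scale * face_out
```

With `scale=0.0` the layer reduces to ordinary text-conditioned attention, which is why the adapter can be blended freely with the base model.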
* **Base Diffusion Model:** `SG161222/Realistic_Vision_V4.0_noVAE`
* **VAE:** `stabilityai/sd-vae-ft-mse`
* **Image Encoder:** `laion/CLIP-ViT-H-14-laion2B-s32B-b79K`
* **Dataset:** images sampled from `bitmind/celeb-a-hq`.
* **Optimization:** Joint optimization utilizing standard Diffusion Loss paired with Identity Loss (ArcFace Cosine Similarity).
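The joint objective above can be sketched as a weighted sum of the standard noise-prediction MSE and an ArcFace-style identity term (`lambda_id` is an assumed illustrative weight, not the value used in training):

```python
import torch
import torch.nn.functional as F

def joint_loss(noise_pred, noise_target, id_emb_gen, id_emb_ref, lambda_id=0.5):
    """Sketch of the training objective: diffusion MSE plus an identity term.

    `id_emb_gen` / `id_emb_ref` are ArcFace embeddings of the generated and
    reference faces; `lambda_id` balances identity against image fidelity.
    """
    # Standard denoising objective on the predicted noise.
    diffusion_loss = F.mse_loss(noise_pred, noise_target)
    # Identity loss: penalize angular distance between ArcFace embeddings.
    cos_sim = F.cosine_similarity(id_emb_gen, id_emb_ref, dim=-1).mean()
    identity_loss = 1.0 - cos_sim
    return diffusion_loss + lambda_id * identity_loss
```

Raising `lambda_id` trades distributional realism (FID) for identity retention, which matches the metric trade-off reported below.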
## Evaluation Metrics
The model was rigorously evaluated against the generic zero-shot IP-Adapter baseline. Testing involved generating multiple stylistic variations (cinematic lighting, charcoal sketch, outdoor lighting, etc.) across various seed images.
| Metric | Baseline (Zero-Shot) | Fine-Tuned (This Model) | Note |
|---|---|---|---|
| **Identity Score** (Higher is better) | 0.8327 | **0.8754** | Significant improvement in facial structure retention. |
| **FID Score** (Lower is better) | **259.27** | 283.11 | Standard distributional gap trade-off when forcing strict identity constraints. |
*Note: In 1-to-1 sample comparisons, this fine-tuned model successfully pushed specific Identity Scores as high as **0.9680**, achieving superior sample-specific realism (FID: 421.97 vs Baseline: 448.15).*
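For reference, the Identity Score is the cosine similarity between ArcFace embeddings of the reference and generated faces. Assuming embeddings are obtained via InsightFace as shown in the Usage section, the scoring itself is just a dot product of unit vectors:

```python
import numpy as np

def identity_score(emb_ref: np.ndarray, emb_gen: np.ndarray) -> float:
    """Cosine similarity between two ArcFace face embeddings.

    Embeddings can come from InsightFace, e.g. `faces[0].normed_embedding`.
    """
    # Normalize defensively so raw (un-normalized) embeddings also work.
    a = emb_ref / np.linalg.norm(emb_ref)
    b = emb_gen / np.linalg.norm(emb_gen)
    # For unit vectors, the dot product equals the cosine similarity.
    return float(np.dot(a, b))
```

InsightFace's `normed_embedding` is already L2-normalized, so the explicit normalization above only matters when passing raw embeddings.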
## Generalization to Unseen Data (CelebA-HQ)
To demonstrate that TrueFace-Adapter does not simply memorize its training subjects, we tested it on held-out subjects from the CelebA-HQ dataset across 5 distinct prompts (Cinematic, Smiling, Sunglasses, Studio, Charcoal Sketch).
**Reference Subject (Unseen Data):**
![Original](celeb_original.png)
**Baseline (Standard IP-Adapter Zero-Shot):**
*Notice the loss of the square jawline, the alteration of the eye shape, and the complete loss of identity in the sketch (far right).*
![Baseline](celeb_baseline.png)
**TrueFace-Adapter (Ours):**
*The fine-tuned model strictly preserves the subject's deep-set eyes, specific jaw structure, and maintains high-fidelity likeness even in the charcoal sketch medium.*
![Finetuned](celeb_finetuned.png)
## Usage
To use this model, you need `insightface` for face detection and embedding extraction, plus the `ip_adapter` module from the official [IP-Adapter repository](https://github.com/tencent-ailab/IP-Adapter). First, extract the FaceID embedding and an aligned face crop from a seed image:
```python
import cv2
import torch
from insightface.app import FaceAnalysis
from insightface.utils import face_align
from diffusers import StableDiffusionPipeline, DDIMScheduler, AutoencoderKL
from ip_adapter.ip_adapter_faceid import IPAdapterFaceIDPlus
# 1. Setup Face Extraction
app = FaceAnalysis(name="buffalo_l", providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
app.prepare(ctx_id=0, det_size=(640, 640))

image = cv2.imread("your_seed_image.jpg")  # cv2 loads BGR, which InsightFace expects
faces = app.get(image)
assert len(faces) > 0, "No face detected in the seed image"

# L2-normalized ArcFace embedding, plus an aligned 224x224 crop for the CLIP branch
faceid_embeds = torch.from_numpy(faces[0].normed_embedding).unsqueeze(0)
face_image = face_align.norm_crop(image, landmark=faces[0].kps, image_size=224)
# 2. Setup Pipeline
device = "cuda"
base_model_path = "SG161222/Realistic_Vision_V4.0_noVAE"
vae_model_path = "stabilityai/sd-vae-ft-mse"
image_encoder_path = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
ip_ckpt = "ip-adapter-faceid-plusv2_sd15-finetuned_RishabhInCode.bin" # This repo's file
noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    scheduler=noise_scheduler,
    vae=vae,
    safety_checker=None,
).to(device)
# 3. Load IP-Adapter with Custom Fine-Tuned Weights
ip_model = IPAdapterFaceIDPlus(pipe, image_encoder_path, ip_ckpt, device)
# 4. Generate
prompt = "a cinematic portrait of the person in cyberpunk lighting"
images = ip_model.generate(
    prompt=prompt,
    face_image=face_image,
    faceid_embeds=faceid_embeds,
    shortcut=True,  # required for the PlusV2 architecture
    s_scale=1.0,    # identity strength; lower for more prompt freedom
    num_samples=1,
    width=512,
    height=768,
    num_inference_steps=30,
)
images[0].save("output.png")
```
## Technical Lineage & Credits
This project is a specialized refinement of several foundational works in the Generative AI ecosystem.
### Base Architecture
* **Diffusion Model:** [Realistic Vision V4.0](https://huggingface.co/SG161222/Realistic_Vision_V4.0_noVAE) by SG161222.
* **Adapter Framework:** [IP-Adapter-FaceID-PlusV2](https://huggingface.co/h94/IP-Adapter-FaceID-PlusV2) by Tencent AI Lab.
### Component Acknowledgments
* **Face Embedding:** [InsightFace](https://github.com/deepinsight/insightface) (`buffalo_l`), whose ArcFace embeddings also provide the identity loss signal.
* **Image Encoding:** [CLIP-ViT-H-14-laion2B](https://huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K) for structural consistency.
* **Fine-Tuning Data:** Curated samples from the [CelebA-HQ Dataset](https://github.com/tkarras/progressive_growing_of_gans).
## License & Ethical Use
**TrueFace-Adapter** is released under a **Non-Commercial Research License**.
1. This model inherits the restrictive license of InsightFace.
2. **Ethical Guidelines:** This model is intended for artistic expression and identity-consistent portrait generation. Users are prohibited from using this tool to generate non-consensual deepfakes or misleading media.
## Citation
If you use this fine-tuned model in your research or projects, please cite it as:
```bibtex
@misc{rishabhincode2026trueface,
  author       = {RishabhInCode},
  title        = {TrueFace-Adapter: High-Fidelity Identity Preservation},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/RishabhInCode/TrueFace-Adapter}}
}
```