---
license: creativeml-openrail-m
base_model:
- SG161222/Realistic_Vision_V4.0_noVAE
tags:
- text-to-image
- stable-diffusion
- ip-adapter
- face-id
- identity-preservation
- portrait
- rishabh-in-code
library_name: diffusers
pipeline_tag: text-to-image
---

# TrueFace-Adapter: High-Fidelity Identity Preservation

![Model License](https://img.shields.io/badge/License-Non--Commercial-red.svg)
![Base Model](https://img.shields.io/badge/Base%20Model-Realistic%20Vision%20V4.0-blue.svg)

---

## Introduction

This is a custom, fine-tuned version of the **IP-Adapter-FaceID-PlusV2** model for Stable Diffusion 1.5. It was trained to prioritize high-fidelity identity preservation while maintaining compositional realism across diverse prompts.

The model uses FaceID embeddings extracted with the InsightFace `buffalo_l` model to condition image generation through the UNet cross-attention layers.

* **Base Diffusion Model:** `SG161222/Realistic_Vision_V4.0_noVAE`
* **VAE:** `stabilityai/sd-vae-ft-mse`
* **Image Encoder:** `laion/CLIP-ViT-H-14-laion2B-s32B-b79K`
* **Dataset:** Images sampled from `bitmind/celeb-a-hq`.
* **Optimization:** Joint optimization combining the standard Diffusion Loss with an Identity Loss (ArcFace Cosine Similarity).

## Evaluation Metrics

The model was evaluated against the generic zero-shot IP-Adapter baseline. Testing involved generating multiple stylistic variations (cinematic lighting, charcoal sketch, outdoor lighting, etc.) across various seed images.

| Metric | Baseline (Zero-Shot) | Fine-Tuned (This Model) | Note |
|---|---|---|---|
| **Identity Score** (Higher is better) | 0.8327 | **0.8754** | +0.0427 over the baseline; better retention of facial structure. |
| **FID Score** (Lower is better) | **259.27** | 283.11 | +23.84 over the baseline; the distributional gap trade-off of enforcing strict identity constraints. |

*Note: In 1-to-1 sample comparisons, this fine-tuned model reached Identity Scores as high as **0.9680** and a lower sample-specific FID than the baseline (421.97 vs 448.15).*

## Generalization to Unseen Data (CelebA-HQ)

To check that TrueFace-Adapter does not overfit to the training data, we tested it on unseen subjects from the CelebA-HQ dataset across 5 distinct prompts (Cinematic, Smiling, Sunglasses, Studio, Charcoal Sketch).

**Reference Subject (Unseen Data):**

![Original](celeb_original.png)

**Baseline (Standard IP-Adapter Zero-Shot):**
*Notice the loss of the square jawline, the altered eye shape, and the complete loss of identity in the sketch (far right).*

![Baseline](celeb_baseline.png)

**TrueFace-Adapter (Ours):**
*The fine-tuned model preserves the subject's deep-set eyes and specific jaw structure, and maintains a high-fidelity likeness even in the charcoal sketch medium.*

![Finetuned](celeb_finetuned.png)

## Usage

To use this model, first extract the face embedding and the aligned face image with `insightface`, then pass both to the IP-Adapter pipeline.

```python
import cv2
import torch
from insightface.app import FaceAnalysis
from insightface.utils import face_align
from diffusers import StableDiffusionPipeline, DDIMScheduler, AutoencoderKL
# `ip_adapter` is the package from the IP-Adapter repository (tencent-ailab/IP-Adapter).
from ip_adapter.ip_adapter_faceid import IPAdapterFaceIDPlus

# 1. Setup Face Extraction
app = FaceAnalysis(name="buffalo_l", providers=["CUDAExecutionProvider"])
app.prepare(ctx_id=0, det_size=(640, 640))

image = cv2.imread("your_seed_image.jpg")
if image is None:  # cv2.imread returns None instead of raising on a bad path
    raise FileNotFoundError("Could not read your_seed_image.jpg")

faces = app.get(image)
if not faces:
    raise ValueError("No face detected in the seed image")

# Use the first detected face: identity embedding plus a 224x224 aligned crop.
faceid_embeds = torch.from_numpy(faces[0].normed_embedding).unsqueeze(0)
face_image = face_align.norm_crop(image, landmark=faces[0].kps, image_size=224)

# 2. Setup Pipeline
device = "cuda"
base_model_path = "SG161222/Realistic_Vision_V4.0_noVAE"
vae_model_path = "stabilityai/sd-vae-ft-mse"
image_encoder_path = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
ip_ckpt = "ip-adapter-faceid-plusv2_sd15-finetuned_RishabhInCode.bin"  # This repo's file

noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    scheduler=noise_scheduler,
    vae=vae,
    safety_checker=None,
).to(device)

# 3. Load IP-Adapter with Custom Fine-Tuned Weights
ip_model = IPAdapterFaceIDPlus(pipe, image_encoder_path, ip_ckpt, device)

# 4. Generate
prompt = "a cinematic portrait of the person in cyberpunk lighting"
images = ip_model.generate(
    prompt=prompt,
    face_image=face_image,
    faceid_embeds=faceid_embeds,
    shortcut=True,  # enables the PlusV2 path
    s_scale=1.0,    # weight of the face-structure branch
    num_samples=1,
    width=512,
    height=768,
    num_inference_steps=30,
)
images[0].save("output.png")
```

## Technical Lineage & Credits

This project is a specialized refinement of several foundational works in the Generative AI ecosystem.

### Base Architecture

* **Diffusion Model:** [Realistic Vision V4.0](https://huggingface.co/SG161222/Realistic_Vision_V4.0_noVAE) by SG161222.
* **Adapter Framework:** [IP-Adapter-FaceID-PlusV2](https://huggingface.co/h94/IP-Adapter-FaceID-PlusV2) by Tencent AI Lab.

### Component Acknowledgments

* **Face Embedding:** Developed using [InsightFace](https://github.com/deepinsight/insightface) (buffalo_l), utilizing the ArcFace identity loss function.
* **Image Encoding:** [CLIP-ViT-H-14-laion2B](https://huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K) for structural consistency.
* **Fine-Tuning Data:** Curated samples from the [CelebA-HQ Dataset](https://github.com/tkarras/progressive_growing_of_gans).
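
The ArcFace identity loss mentioned above can be illustrated with a small, dependency-free sketch. It assumes the identity term is one minus the cosine similarity between the reference and generated face embeddings, and that it is added to the diffusion loss with a scalar weight. The function names and the `id_weight` value are hypothetical and are not taken from this model's training code.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def identity_loss(ref_embed: Sequence[float], gen_embed: Sequence[float]) -> float:
    """Assumed form of the identity term: 0 for identical directions, up to 2 for opposite ones."""
    return 1.0 - cosine_similarity(ref_embed, gen_embed)


def joint_loss(
    diffusion_loss: float,
    ref_embed: Sequence[float],
    gen_embed: Sequence[float],
    id_weight: float = 0.1,  # hypothetical weighting, not the value used in training
) -> float:
    """Diffusion loss plus a weighted identity term (illustrative combination only)."""
    return diffusion_loss + id_weight * identity_loss(ref_embed, gen_embed)
```

Because InsightFace's `normed_embedding` is already L2-normalized, the cosine similarity between two such embeddings reduces to a plain dot product. The general form above also works for unnormalized vectors.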
## License & Ethical Use

**TrueFace-Adapter** is released under a **Non-Commercial Research License**.

1. This model inherits the restrictive license of InsightFace.
2. **Ethical Guidelines:** This model is intended for artistic expression and identity-consistent portrait generation. Users are prohibited from using this tool to generate non-consensual deepfakes or misleading media.

## Citation

If you use this fine-tuned model in your research or projects, please cite it as:

```bibtex
@misc{rishabhincode2026trueface,
  author = {RishabhInCode},
  title = {TrueFace-Adapter: High-Fidelity Identity Preservation},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/RishabhInCode/TrueFace-Adapter}}
}
```