FFD-XL 2.0 (v3.1) - The Awakened Architecture (Restored Edition)
By Bl4ckspaces
Welcome to FastFourierDiffusion-XL (FFD-XL) 2.0 v3.1. This is the Restored Edition, a paradigm shift in mathematical model merging. Moving beyond simple weight averaging, FFD-XL uses frequency-domain manipulation and matrix decomposition to engineer a powerhouse for 2.5D, anime, and cinematic generation.
If you are tired of models with "foggy" lighting, broken extremity anatomy, or color bleeding, the Awakened Architecture is engineered to be your final stop.
The Foundation: FFT & RSVD
This model is built upon a surgical, multi-domain merging pipeline. Instead of blindly mixing models, we dissect them using two core mathematical principles:
- FFT (Fast Fourier Transform): We map the model's weights into the frequency domain. This allows us to separate Low Frequencies (macro-lighting, global contrast, and cinematic color grading) from High Frequencies (micro-details, skin textures, and sharp edges). The result? We inject stunning 2.5D textures without destroying the underlying lighting, achieving a pristine, "zero-fog" contrast.
- RSVD (Randomized Singular Value Decomposition): We use RSVD to extract the absolute core "skeleton" of a model's understanding. By isolating the most significant singular vectors of anatomical models (like Hasshaku), we surgically graft perfect pose structures and extremity accuracy (hands/eyes) into the base model without carrying over unwanted stylistic artifacts.
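As a rough illustration of the two ideas above, here is a minimal NumPy sketch. It is not the author's merge code, which this card does not publish. It assumes a weight tensor has already been reshaped to a 2D matrix, splits it into low- and high-frequency parts with a radial mask in the FFT domain, and computes a rank-k approximation with a basic randomized SVD. The function names, the 0.25 cutoff, and the oversampling value are all assumptions chosen for the example.

```python
import numpy as np


def split_frequency_bands(weight, cutoff=0.25):
    """Split a 2D matrix into low- and high-frequency parts (hypothetical helper)."""
    spectrum = np.fft.fft2(weight)
    # Normalized frequency coordinates in cycles per sample, range [-0.5, 0.5).
    fy = np.fft.fftfreq(weight.shape[0])[:, None]
    fx = np.fft.fftfreq(weight.shape[1])[None, :]
    low_mask = np.sqrt(fx**2 + fy**2) <= cutoff
    # The mask is symmetric, so the inverse transforms are real up to rounding.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


def randomized_svd(matrix, rank, oversample=8, seed=0):
    """Approximate the top `rank` singular triplets with a random range finder."""
    rng = np.random.default_rng(seed)
    m, n = matrix.shape
    k = min(rank + oversample, m, n)
    # Project onto a random subspace and orthonormalize to approximate the range.
    sketch = matrix @ rng.standard_normal((n, k))
    q, _ = np.linalg.qr(sketch)
    # Exact SVD of the much smaller projected matrix.
    u_small, s, vt = np.linalg.svd(q.T @ matrix, full_matrices=False)
    u = q @ u_small
    return u[:, :rank], s[:rank], vt[:rank]


rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 48))  # stand-in for a reshaped layer weight

low, high = split_frequency_bands(weight)
print(np.allclose(low + high, weight))  # the two bands sum back to the input

u, s, vt = randomized_svd(weight, rank=8)
print(u.shape, s.shape, vt.shape)  # (64, 8) (8,) (8, 48)
```

In a merge along these lines, one could take the low band from one model and the high band from another, or keep only the leading singular directions of a donor model's weight difference. How FFD-XL actually combines them is not specified here.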
Why FFD-XL 2.0 v3.1 is Superior
- Fourier-Space Sinkhorn Optimal Transport (FS-OT): Perfect cinematic color grading locked in with Phase Blending to prevent visual ghosting and color bleeding.
- Multi-Band Spectral RSVD & EAR: By dynamically allocating bone structure, we enforce 100% Empirical Asymmetric Routing (EAR) dominance on extreme appendages. This guarantees uncompromising anatomical accuracy for fingers and eyes.
- Exact TIES Restricted HF: 2.5D skin textures and micro-details are injected purely into the High-Frequency domain using an Exact Quantile Noise Filter, keeping the base generation incredibly clean.
- Pre-Flight Shield: Mathematically shielded against FP16 NaN blowouts, ensuring stable generation across all UIs.
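The card does not define the "Exact Quantile Noise Filter" or the "Pre-Flight Shield" in detail. One plausible reading, sketched below purely as an assumption, is a TIES-style trim that keeps only the largest-magnitude entries of a weight delta, followed by a check that the merged values are finite and fit in the float16 range. The helper names and the `keep_fraction` default are hypothetical.

```python
import numpy as np


def trim_by_quantile(delta, keep_fraction=0.2):
    """Zero all but the largest-magnitude entries of a weight delta (TIES-style trim)."""
    magnitude = np.abs(delta)
    threshold = np.quantile(magnitude, 1.0 - keep_fraction)
    return np.where(magnitude >= threshold, delta, 0.0)


def is_fp16_safe(tensor):
    """Return True if every value is finite and does not overflow float16."""
    fp16_max = np.finfo(np.float16).max  # 65504.0
    return bool(np.all(np.isfinite(tensor)) and np.max(np.abs(tensor)) <= fp16_max)


delta = np.array([0.1, -5.0, 0.2, 3.0, -0.05])
print(trim_by_quantile(delta, keep_fraction=0.4))  # keeps only -5.0 and 3.0
print(is_fp16_safe(np.array([1.0, 70000.0])))      # False: 70000 overflows float16
```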
Recommended Settings
To unleash the true potential and the "wild" micro-details of this model, strictly follow these parameters:
- Sampler (Crucial): DPM++ 2M Karras (Highly recommended for maximum sharpness and intricate textures). Euler a is acceptable for a softer, classic look, but DPM++ 2M Karras is where the architecture truly breathes.
- Steps: 25 - 35
- CFG Scale: 5.0 - 7.0
- Resolution & Scaling (Near-2K Support):
  - Base Native: 832x1216, 1024x1024, or 1216x832.
  - High-Res Multipliers: This model supports direct high-resolution generation up to near-2K scale. You can scale your initial dimensions by 1.2x, 1.3x, 1.4x, or up to 1.5x without the latent space breaking apart, as long as the final resolution values are multiples of 64 (e.g., 1216x1792, 1472x1472, or 1536x1536).
- Clip Skip: 2
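The resolution rule above is easy to get wrong by hand, so here is a small helper that scales a base resolution and snaps each side to a multiple of 64. Rounding down is an assumption (the card only requires multiples of 64), and the function name is hypothetical.

```python
def scale_resolution(width, height, multiplier, multiple=64):
    """Scale a base resolution and round each side down to a multiple of `multiple`."""
    if not 1.0 <= multiplier <= 1.5:
        raise ValueError("multiplier is outside the 1.0-1.5 range recommended above")
    scaled_w = int(width * multiplier) // multiple * multiple
    scaled_h = int(height * multiplier) // multiple * multiple
    return scaled_w, scaled_h


for base in [(832, 1216), (1024, 1024), (1216, 832)]:
    print(base, "->", scale_resolution(*base, 1.5))
# (832, 1216) -> (1216, 1792)
# (1024, 1024) -> (1536, 1536)
# (1216, 832) -> (1792, 1216)
```

At 1.5x, the portrait and square bases land on 1216x1792 and 1536x1536, which match two of the example resolutions listed above.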
Prompting Guide (Danbooru Syntax)
The architecture features a highly enriched, democratic vocabulary balanced between ViT-L & ViT-bigG. It responds exceptionally well to structured Danbooru tags.
Positive Prompt Structure:
(masterpiece, best quality, ultra-detailed:1.2), 1girl, solo, glowing eyes, detailed cinematic lighting, looking at viewer, [subject description], [clothing], [background], depth of field, 8k resolution
Negative Prompt:
(worst quality, low quality, normal quality:1.4), deformed, bad anatomy, bad hands, missing fingers, blurry, ugly, text, watermark, fog
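The `(text:1.2)` emphasis syntax in these prompts is interpreted by UIs such as AUTOMATIC1111 and ComfyUI. A stock diffusers pipeline passes the prompt string to the tokenizer as-is, so the weights have no effect there unless an add-on handles them. The sketch below only shows how such a prompt breaks down into weighted segments. It handles flat, non-nested groups, and the function name is hypothetical.

```python
import re

# Matches "(some text:1.2)" without nested parentheses.
_WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weighted_prompt(prompt):
    """Split a prompt into (text, weight) pairs; unweighted text gets weight 1.0."""
    segments = []
    cursor = 0
    for match in _WEIGHTED.finditer(prompt):
        if match.start() > cursor:
            segments.append((prompt[cursor:match.start()], 1.0))
        segments.append((match.group(1), float(match.group(2))))
        cursor = match.end()
    if cursor < len(prompt):
        segments.append((prompt[cursor:], 1.0))
    return segments


print(parse_weighted_prompt("(masterpiece, best quality:1.2), 1girl, solo"))
# [('masterpiece, best quality', 1.2), (', 1girl, solo', 1.0)]
```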
License & Credits
- Creator: Bl4ckspaces
- License: OpenRAIL
- Acknowledgments: Massive thanks to the creators of Wai Base, Janku, Hasshaku, Perfect Illustrious, and NTR MIX for providing the foundational checkpoints that made this surgical merge possible.
How to Use
1. ComfyUI (Recommended)
- Download the FFD_XL_v3_1_2_0_RESTORED_fp16.safetensors file.
- Place it in your ComfyUI models/checkpoints folder.
- Use a standard SDXL workflow. Load the model using the Load Checkpoint node.
- An external VAE is not required: the checkpoint already includes the SDXL FP16 Fix VAE.
2. Diffusers (Python)
You can easily load this model using the diffusers library.
from diffusers import StableDiffusionXLPipeline
import torch

repo_id = "Bl4ckSpaces/FFD-XL-2.0"
model_name = "FFD_XL_v3_1_2_0_RESTORED_fp16.safetensors"

# from_single_file accepts a Hugging Face Hub URL that points at the checkpoint file.
pipe = StableDiffusionXLPipeline.from_single_file(
    f"https://huggingface.co/{repo_id}/blob/main/{model_name}",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
pipe.to("cuda")

prompt = "(masterpiece, best quality:1.2), 1girl, solo, intricate details, cinematic lighting, cyberpunk city background"
negative_prompt = "(worst quality, low quality:1.4), bad anatomy, watermark"

# Steps and CFG follow the recommended settings (25-35 steps, CFG 5.0-7.0).
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    guidance_scale=6.0,
).images[0]
image.save("ffd_xl_masterpiece.png")
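The snippet above runs with whatever scheduler the pipeline loads by default. To use the recommended DPM++ 2M Karras sampler in diffusers, the usual equivalent is `DPMSolverMultistepScheduler` with `use_karras_sigmas=True`. The helper below is a sketch with a hypothetical name. It imports diffusers inside the function, so defining it does not require the library to be installed.

```python
def use_dpmpp_2m_karras(pipe):
    """Swap the pipeline's scheduler for DPM++ 2M with Karras sigmas (hypothetical helper)."""
    from diffusers import DPMSolverMultistepScheduler

    # Reuse the existing scheduler config so model-specific settings carry over.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config,
        use_karras_sigmas=True,
    )
    return pipe
```

Call `use_dpmpp_2m_karras(pipe)` after loading the pipeline and before generating an image.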


