Instructions to use iitolstykh/VIBE-Image-Edit-DistilledCFG with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Diffusers
How to use iitolstykh/VIBE-Image-Edit-DistilledCFG with Diffusers:
pip install -U diffusers transformers accelerate
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import load_image

    # switch to "mps" for apple devices
    pipe = DiffusionPipeline.from_pretrained("iitolstykh/VIBE-Image-Edit-DistilledCFG", dtype=torch.bfloat16, device_map="cuda")

    prompt = "Turn this cat into a dog"
    input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")

    image = pipe(image=input_image, prompt=prompt).images[0]
- Sana
How to use iitolstykh/VIBE-Image-Edit-DistilledCFG with Sana:
    # Load the model and infer image from text
    import torch
    from app.sana_pipeline import SanaPipeline
    from torchvision.utils import save_image

    sana = SanaPipeline("configs/sana_config/1024ms/Sana_1600M_img1024.yaml")
    sana.from_pretrained("hf://iitolstykh/VIBE-Image-Edit-DistilledCFG")

    image = sana(
        prompt='a cyberpunk cat with a neon sign that says "Sana"',
        height=1024,
        width=1024,
        guidance_scale=5.0,
        pag_guidance_scale=2.0,
        num_inference_steps=18,
    )
- Notebooks
- Google Colab
- Kaggle
Step distilled model?
#1
by MinhNH232331M - opened
Hi, interesting work on pushing the frontier of small instruction-based image editing models. Since the real bottleneck is now the number of inference steps, do you plan to release a step-distilled version, as has been done for other popular models such as Qwen Image Edit or Z-Image-Turbo?
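
For context, here is a rough back-of-the-envelope sketch of why the step count becomes the dominant cost once CFG has been distilled. It assumes standard classifier-free guidance needs two denoiser passes per step (conditional and unconditional) and that a CFG-distilled model needs one. The helper name `denoiser_evaluations`, the 18-step setting, and the 4-step "step-distilled" variant are all illustrative assumptions, not documented properties of this model.

```python
def denoiser_evaluations(num_steps: int, passes_per_step: int = 1) -> int:
    """Total denoiser forward passes needed to sample one image (hypothetical helper)."""
    if num_steps < 1 or passes_per_step < 1:
        raise ValueError("num_steps and passes_per_step must be positive")
    return num_steps * passes_per_step


# Standard CFG: one conditional and one unconditional pass per step.
cfg_baseline = denoiser_evaluations(num_steps=18, passes_per_step=2)

# CFG-distilled: guidance is folded into a single pass per step.
cfg_distilled = denoiser_evaluations(num_steps=18, passes_per_step=1)

# Hypothetical step-distilled variant; 4 steps is an assumed value for illustration.
step_distilled = denoiser_evaluations(num_steps=4, passes_per_step=1)

print(cfg_baseline, cfg_distilled, step_distilled)  # 36 18 4
```

Under these assumptions, CFG distillation halves the number of forward passes, while any further reduction has to come from cutting the number of steps, which is what the question is asking about.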