How to use from the Diffusers library
```bash
pip install -U diffusers transformers accelerate
```
```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

# Switch device_map to "mps" for Apple-silicon devices.
pipe = DiffusionPipeline.from_pretrained(
    "jay-jnp/F-ViTA_KAIST", dtype=torch.bfloat16, device_map="cuda"
)

# Generic editing prompt from the auto-generated snippet; for this model,
# an instruction describing visible-to-thermal translation is more appropriate.
prompt = "Turn this cat into a dog"
input_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"
)

image = pipe(image=input_image, prompt=prompt).images[0]
image.save("output.png")
```
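Since F-ViTA is built on an InstructPix2Pix diffusion model, the pipeline call can presumably be tuned with the usual InstructPix2Pix generation knobs. The sketch below collects them into a kwargs dict; `guidance_scale` and `image_guidance_scale` are real parameters of diffusers' `StableDiffusionInstructPix2PixPipeline.__call__`, but the helper function, the prompt text, and the default values here are illustrative assumptions, not settings from the model card.

```python
# Illustrative sketch: bundle InstructPix2Pix-style generation settings.
# guidance_scale controls adherence to the text instruction;
# image_guidance_scale controls adherence to the input image.
# The defaults below are common starting points, not values tuned for F-ViTA.
def generation_kwargs(prompt, image, guidance_scale=7.5, image_guidance_scale=1.5):
    return {
        "prompt": prompt,
        "image": image,
        "guidance_scale": guidance_scale,
        "image_guidance_scale": image_guidance_scale,
    }

kwargs = generation_kwargs("Translate this visible image to thermal", None)
# With a loaded pipeline and input image:
# image = pipe(**{**kwargs, "image": input_image}).images[0]
```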

F-ViTA: Foundation Model Guided Visible to Thermal Translation

This repository contains the model described in the paper F-ViTA: Foundation Model Guided Visible to Thermal Translation.

F-ViTA leverages foundation models (SAM and Grounded DINO) to guide the visible-to-thermal image translation process using an InstructPix2Pix diffusion model. This approach improves translation accuracy and generalizes well to out-of-distribution scenarios.
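As a toy illustration of the guidance idea (not the paper's actual mechanism, which conditions the diffusion model on foundation-model outputs such as masks and detections rather than on prompt text alone), one could imagine folding detected object labels into the translation instruction. The helper below is entirely hypothetical:

```python
# Toy sketch only: F-ViTA conditions the InstructPix2Pix model on
# foundation-model outputs (SAM masks, Grounded DINO detections).
# This hypothetical helper merely shows the flavor of that idea by
# folding detected labels into a text instruction.
def build_instruction(detected_labels, base="Translate this visible image to thermal"):
    if not detected_labels:
        return base + "."
    return f"{base}; the scene contains {', '.join(detected_labels)}."

print(build_instruction(["person", "car"]))
```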

Code: https://github.com/jay-jnp/F-ViTA

Pre-trained checkpoints are available for several datasets.
