metadata
license: apache-2.0
tags:
  - Art
  - Image Generation
  - Image Editing
  - Video Generation
  - Vision Translation
  - Bridge Model
pipeline_tag: any-to-any
library_name: diffusion-single-file

🎥 ViBT: Vision Bridge Transformer at Scale


This repository introduces the Vision Bridge Transformer (ViBT), a large-scale instantiation of Brownian Bridge Models designed for efficient conditional generation. ViBT directly models the trajectory between inputs and outputs, which yields an efficient data-to-data translation paradigm. The models are effective across a range of image and video translation tasks, including instruction-based image editing and complex video translation.
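
To make the "trajectory between inputs and outputs" idea concrete, here is a minimal sketch of sampling an intermediate state from a textbook Brownian bridge pinned at a source and a target. It is not taken from the ViBT codebase: the function name `brownian_bridge_sample`, the `sigma` noise scale, and the use of plain Python lists in place of latent tensors are all assumptions made for illustration, and the actual ViBT parameterization may differ.

```python
import math
import random


def brownian_bridge_sample(x0, x1, t, sigma=1.0, rng=None):
    """Sample x_t from a standard Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    x0, x1: equal-length lists of floats, standing in for flattened
            source and target latents (hypothetical representation).
    sigma:  illustrative noise scale, not a value from the ViBT repository.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(x0) != len(x1):
        raise ValueError("x0 and x1 must have the same length")
    rng = rng or random.Random()
    # The mean interpolates linearly between the two endpoints.
    # The standard deviation sigma * sqrt(t * (1 - t)) is zero at t=0 and t=1,
    # so the path starts exactly at the source and ends exactly at the target.
    std = sigma * math.sqrt(t * (1.0 - t))
    return [
        (1.0 - t) * a + t * b + std * rng.gauss(0.0, 1.0)
        for a, b in zip(x0, x1)
    ]
```

Calling `brownian_bridge_sample(source, target, 0.0)` returns the source unchanged and `t=1.0` returns the target, while intermediate values of `t` give noisy points along the path between them. A bridge model is typically trained on such intermediate states, in contrast to a noise-to-data setup where the starting point is pure Gaussian noise.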