---
license: apache-2.0
tags:
  - Art
  - Image Generation
  - Image Editing
  - Video Generation
  - Vision Translation
  - Bridge Model
pipeline_tag: any-to-any
---

🎥 ViBT: Vision Bridge Transformer at Scale


This repository introduces Vision Bridge Transformer (ViBT), a large-scale instantiation of Brownian Bridge Models designed for efficient conditional generation. ViBT directly models the trajectory between inputs and outputs, which yields an efficient data-to-data translation paradigm. The models are effective across a range of image and video translation tasks, including instruction-based image editing and complex video translation.
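
To make the "trajectory between inputs and outputs" idea concrete, here is a minimal sketch of sampling an intermediate state from a textbook Brownian bridge pinned at a source `x0` and a target `x1`. It illustrates the general bridge construction only and is not ViBT's actual training or inference code; the function name `brownian_bridge_sample`, the `sigma` parameter, and the flat list-of-floats representation are assumptions made for illustration.

```python
import math
import random


def brownian_bridge_sample(x0, x1, t, sigma=1.0, rng=None):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    Hypothetical helper for illustration; not part of the ViBT codebase.
    The marginal at time t is Gaussian with
        mean = (1 - t) * x0 + t * x1
        std  = sigma * sqrt(t * (1 - t))
    so the noise vanishes at both endpoints and peaks at t = 0.5.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(x0) != len(x1):
        raise ValueError("x0 and x1 must have the same length")
    rng = rng or random.Random()
    std = sigma * math.sqrt(t * (1.0 - t))
    # Interpolate each coordinate, then add endpoint-vanishing Gaussian noise.
    return [
        (1.0 - t) * a + t * b + std * rng.gauss(0.0, 1.0)
        for a, b in zip(x0, x1)
    ]


if __name__ == "__main__":
    source = [0.0, 1.0, 2.0]   # stand-in for a flattened input (e.g. source image)
    target = [1.0, 1.0, -2.0]  # stand-in for the desired output
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, brownian_bridge_sample(source, target, t, rng=random.Random(0)))
```

Unlike a noise-to-data diffusion process, both endpoints here are data: the sample equals the source at `t = 0` and the target at `t = 1`, with stochasticity only in between.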