---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: text-to-audio
tags:
- text-video-to-audio
- text-controlled-video-to-audio
- audio-controlled-video-to-audio
- audio-generation
library_name: diffusers
---
If you find this project useful, please consider giving it a star ⭐~
Left: Overview of the ControlFoley framework with three multimodal conditioning modes for controllable video-synchronized audio generation. Right: Performance radar chart of Video-to-Audio models.