LTX-2.3 Audio Reactive LORA V1

Still early stage and really just a proof of concept. It was created to improve how closely the changing visuals in generated videos respond to, and stay synchronized with, musical elements, AKA "Audio Reactive" content.

V1 shows a marked improvement over the base model, but further improvements are expected with more fine-tuning.

Trained exclusively on custom synthetic data (which I may open source for training other multimodal models). I expect V2 to be much improved with a broader, less abstract sampling of audio reactive content and adjusted training settings.

If you want to submit training samples, I am accepting high-quality, well-labeled data.

Recommended LORA weight:

  • 1.0-2.0; 1.4 or above is recommended for better motion.

Prompt template:

  • A continuous audio-reactive video that transitions smoothly from [Phase 1: Initial Subject & Action] to [Phase 2: First Evolution & Motion], then warps/morphs into [Phase 3: Second Evolution & Morphing] before warping/morphing into [Phase 4: Final Chaotic/Complex State], with every [Type of Visual Motion/Deformation] perfectly synchronized to the [Musical Element 1], [Musical Element 2], and [Musical Element 3] of the [Music Genre & Vibe] track.
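The template above can be filled in programmatically. The following is a minimal sketch, not part of the LORA or any library: the function name `build_prompt` and its parameters are hypothetical, and the shortened wording for 2 or 3 phases is my own assumption, since the template only shows the 4-phase form.

```python
def build_prompt(phases, motion, musical_elements, genre, verb="morph"):
    """Fill the audio-reactive prompt template (hypothetical helper).

    phases: 2 to 4 scene descriptions, in order.
    motion: the type of visual motion/deformation.
    musical_elements: 1 to 3 musical elements to sync to.
    genre: the music genre and vibe.
    verb: "warp" or "morph" (the template offers either).
    """
    if not 2 <= len(phases) <= 4:
        raise ValueError("use 2 to 4 phases")
    if verb not in ("warp", "morph"):
        raise ValueError("verb must be 'warp' or 'morph'")
    if not 1 <= len(musical_elements) <= 3:
        raise ValueError("use 1 to 3 musical elements")

    text = (
        "A continuous audio-reactive video that transitions smoothly "
        f"from {phases[0]} to {phases[1]}"
    )
    # Assumption: with fewer than 4 phases, the later clauses are dropped.
    if len(phases) >= 3:
        text += f", then {verb}s into {phases[2]}"
    if len(phases) == 4:
        text += f" before {verb}ing into {phases[3]}"

    # Join the musical elements as an English list.
    if len(musical_elements) == 1:
        joined = musical_elements[0]
    elif len(musical_elements) == 2:
        joined = " and ".join(musical_elements)
    else:
        joined = ", ".join(musical_elements[:-1]) + ", and " + musical_elements[-1]

    text += (
        f", with every {motion} perfectly synchronized to the {joined} "
        f"of the {genre} track."
    )
    return text


example = build_prompt(
    phases=[
        "a glowing jellyfish drifting in dark water",
        "a pulsing neon bloom",
        "a lattice of liquid chrome",
        "a storm of fractal shards",
    ],
    motion="pulse and ripple",
    musical_elements=["kick drum", "bassline", "hi-hats"],
    genre="driving techno",
)
print(example)
```

The example values (jellyfish, techno, and so on) are illustrative only and are not taken from the training data.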

It's recommended to use 2-4 different 'phases' for optimal motion, since this is the template the current (v1) training data uses. I expect this to improve with future versions and more diverse data. The base LTX-2.3 model still has some issues with low motion on shorter videos, so 10-20 second videos produce better results. I usually find that a batch of 4-8 generations will have some semi-usable results.
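The recommendations so far (LORA weight, number of phases, video length, batch size) can be collected into a quick pre-flight check. This is only a sketch under my own assumptions: `RunSettings` and `check_settings` are hypothetical names, the thresholds are copied from the text above, and nothing here talks to LTX-2.3 or any inference tool.

```python
from dataclasses import dataclass


@dataclass
class RunSettings:
    """Hypothetical container for one generation run's settings."""
    lora_weight: float
    duration_s: float
    batch_size: int
    num_phases: int


def check_settings(s):
    """Return a list of notes where settings fall outside the suggested ranges."""
    notes = []
    if not 1.0 <= s.lora_weight <= 2.0:
        notes.append("LORA weight is outside the recommended 1.0-2.0 range")
    elif s.lora_weight < 1.4:
        notes.append("LORA weight is below 1.4; motion may be weaker")
    if not 10 <= s.duration_s <= 20:
        notes.append("duration is outside the 10-20 second range")
    if not 2 <= s.num_phases <= 4:
        notes.append("prompt should describe 2-4 phases")
    if s.batch_size < 4:
        notes.append("batch is smaller than 4; fewer chances of a usable result")
    return notes


for note in check_settings(RunSettings(lora_weight=1.2, duration_s=6,
                                       batch_size=2, num_phases=3)):
    print(note)
```

An empty list means every value sits inside the ranges suggested in this card; it says nothing about output quality.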

I2V was trained into the LORA but I haven't tested it extensively, and I expect motion to be less stable.

Trained with the Ostris AI Toolkit Runpod template.

Trigger words

Use either of these phrases in the prompt to trigger the LORA during video generation:

  • continuous audio-reactive video

  • audio-reactive
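When prompts are written by hand, it is easy to drop the trigger phrase. A small check like the one below can catch that; `find_trigger` is a hypothetical helper name, and the case-insensitive match is my assumption, not something the card specifies.

```python
# Trigger phrases from this card, longest first so the more specific one wins.
TRIGGERS = ("continuous audio-reactive video", "audio-reactive")


def find_trigger(prompt):
    """Return the first trigger phrase found in the prompt, or None."""
    lowered = prompt.lower()
    for trigger in TRIGGERS:
        if trigger in lowered:
            return trigger
    return None


print(find_trigger("A continuous audio-reactive video of drifting smoke"))
print(find_trigger("A video of drifting smoke"))
```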

Download model

Download the weights in the Files & versions tab.

Model tree for 100percentrobot/LTX-2.3-Audio-Reactive-LORA: adapter (this model)